A 10-week collaborative KITP program with virtual and in-person components
This collaborative KITP program took place January 17 - March 24 2023. It consisted of multiple components, both virtual and in-person or hybrid:
Galaxy formation and evolution is a field that combines many complex and interdependent physical processes across different spatial and time scales. Moreover, observational data exists across a large wavelength range, and with heterogeneous spatial resolutions and observing depths, each sensitive to different physical tracers. Astrostatistics and machine learning can help us mine the huge existing and upcoming multiwavelength datasets, effectively down to the pixel and spaxel level, in order to determine the fundamental relationships between galaxy properties and evolutionary processes. Additionally, novel techniques are required to facilitate comparisons between galaxy formation theories and all observational data, including outlier galaxy populations, and to integrate galaxy formation physics in the inference of cosmological parameters. However, connecting statistics and machine learning results to an improved understanding of key astrophysical processes remains a major challenge. This program will initiate a discussion between the empirical exploration of the extremely large and diverse datasets from both observations and simulations and the deduction of the underlying physics, aiming to answer the question: How do we optimally combine the empirical intuition gained using data science tools with theoretical intuition and knowledge, with the deductive goal of understanding the physical processes that govern the formation and evolution of galaxies?
This virtual component of our KITP program took place January 17 - February 24 2023. It consisted of a sparse but regular schedule of different events, described in more detail below the schedule.
We expect all participants to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct.
The schedule will be updated with speakers and titles/topics when those are confirmed. Links to recorded talks will be added when available. Please also find all recordings (some public, some private) on the KITP program page. All times quoted are in US Pacific Time (UTC-8).
Please see the detailed program of the hybrid workshop at the CCA here. As this week is hosted by the CCA instead of KITP, it requires a separate registration and uses separate zoom links. If you have not registered yet but would like to join remotely, please let us know.
Monday February 6 | 9.00am - 9.45am | Review talk: Bridging the gap between astronomical datasets with AI - Domain Shift, Model Robustness and Failure Modes by Aleksandra Ciprijanovic (Fermilab) |
---|---|---|
9.45am - 10.30am | Moderated open discussion | |
Tuesday February 7 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Wednesday February 8 | 9.00am - 10.30am | Tutorial: Photo-z's and uncertainties by Alex Malz (Carnegie-Mellon University) |
10.30am - 11.00am | Ethics discussion: Carbon footprint | |
Thursday February 9 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Friday February 10 | 9.00am - 10.00am | Short Talks (10+5 minutes each): Millions to billions of photometric redshifts by Micol Bolzonella (INAF-OAS Bologna), Autoencoding Galaxy Spectra: Redshift Invariance and Outlier Detection by Yan Liang (Princeton University), Calibrated Predictive Distributions for Photometric Redshifts by Biprateep Dey (University of Pittsburgh) |
Monday February 13 | 9.00am - 9.45am | Review talk: A few notes about using heterogeneous and multi-wavelength data sets by Yuanyuan Zhang (NOIRLab) |
---|---|---|
9.45am - 10.30am | Moderated open discussion | |
Tuesday February 14 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Wednesday February 15 | 10.00am - 11.30am (Please note: different start time!) | Tutorial: Autodifferentiation and JAX by Andrew Hearin (Argonne National Laboratory) |
11.30am - 12.00pm (Please note: different start time!) | Networking activity | |
4.00pm - 4.15pm | Lightning & Thunder talks (1 slide / 1 min each) | |
4.15pm - 5.00pm | Short Talks (10+5 minutes each): 4HS: What can(’t) you do with 6 million low-z galaxy redshifts across the southern hemisphere? by Edward Taylor (Swinburne University of Technology), Spectroscopic Samples for Photometric Redshifts: Systematic Biases and Unknown Unknowns by Jeff Newman (University of Pittsburgh), Have we seen enough? by Shoubaneh Hemmati (Caltech / IPAC) | |
Thursday February 16 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Friday February 17 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Monday February 20 | President's Day - No Program | |
---|---|---|
Tuesday February 21 | 9.00am - 9.45am | Review talk: Identifying Kinematic Substructure with Machine Learning by Lina Necib (MIT) |
9.45am - 10.30am | Moderated discussion | |
Wednesday February 22 | No scheduled activities | Flexible time for self-organized meetings and working on projects |
Thursday February 23 | 9.00am - 10.30am | Tutorial: Probabilistic U-Nets by Hadi Sotoudeh (University of Montreal) |
10.30am - 11.00am | Ethics discussion: Equal access to expertise, to hardware, to software, to data | |
Friday February 24 | 9.00am - 10.00am | Short Talks (10+5 minutes each): Graph-Convolutional Neural Networks as a new tool to extract clustering information of the Cosmic web by Farida Farsian (University of Bologna) |
The Review Talks will provide a broad overview of and introduction to aspects of galaxy formation and evolution and statistical and machine learning applications related to the weekly topic. These review talks are all followed by a moderated discussion session, where the research and future prospects can be discussed. Three main questions will form a part of each of these review+discussion sessions:
The lightning & thunder series consists of 1 slide / 1 min talks to introduce ourselves to each other and in particular our goals for the week/weeks/whole program. A 1-slide lightning talk could highlight your expertise, describe what you would like to learn that week/during the program, and your research/collaborative goals. Ideally, if you present a lightning slide, this would be paired with a thunder slide later in the week/program where you describe what you have learned and what progress you made on your goals. We are flexible in the planning of lightning versus thunder slides, and we expect to see mostly lightning slides in the first weeks, and hopefully more thunder slides toward the end of the program.
The tutorials will address varying topics in galaxy formation and connected machine learning approaches, or consist of introductions to statistical and/or machine learning tools, techniques and applications. In the virtual schedule 1.5 hours are reserved for each tutorial, but speakers are asked to aim their tutorial for about 1 hour of interactive work to allow sufficient time for further questions and discussion. The tutorials will predominantly use the python programming language, and a familiarity with coding and python is strongly recommended.
The short networking sessions have the goal of allowing participants to meet each other, chat about common interests, and perhaps even start new collaborations together. There are currently three sessions reserved for these networking activities, occurring once every two weeks, and alternating with the ethics discussions.
We will host short discussion sessions focused around some of the ethical questions connected with machine learning techniques, applications, and access. A few of these questions are also more widely applicable in galaxy evolution research, astrophysics, and/or science as a whole. There are currently three sessions reserved for these ethics-related discussions, occurring once every two weeks, and alternating with the networking activities. Ethics discussions will not be recorded.
The short talks consist of 10-minute talks with an additional 5 minutes reserved for questions. The aim is to highlight additional views and applications, but also additional questions related to the topics of the week. Additionally, we endeavor to predominantly highlight work by more junior participants in this series.
Anyone is welcome to start a working group dedicated to a specific topic, technique, or key question. The working groups are led by volunteers and should be self-organized. During the virtual program most Thursdays are reserved for working groups to have their own meetings and discussions, but both the day and whether to organize meetings is flexible and up to the working group itself to decide. If you are interested in starting a working group on a particular topic please let us know!
We invite anyone who is interested to prepare a pre-recorded 10-minute talk about their work. These talks can be uploaded and we will host them on our dedicated youtube channel. Please also feel free to announce your pre-recorded talk on the dedicated slack channel. We especially encourage junior participants to use this opportunity to introduce themselves.
All scheduled virtual sessions will take place in Zoom, hosted by the KITP, UCSB. All participants have received the link for zoom registration. If you have any questions or problems please contact the organizers (see contact info at the bottom), or the KITP.
We encourage all participants to join the slack space, and the #galevo23-general channel there. Many asynchronous discussions and collaborations will take place there, and announcements will be shared on slack as well. If you have not received an invitation to slack, and/or have any questions please contact the organizers (see contact info at the bottom), or the KITP.
The program has its own github page, and we hope that this will make it easy for participants to share, for example, code and tutorials, further facilitating interaction and collaboration.
The hybrid Joint KITP-CCA workshop, hosted by the CCA in New York City, takes place January 30 - February 3 2023. It will have plenary talks and discussion sessions in the morning, and parallel tutorial and workshop sessions as well as free discussion/hack time in the afternoon. The program elements follow a similar setup as those for the virtual program. Additionally all asynchronous activities (Slack, pre-recorded talks) from the virtual program are also available this week.
We expect all participants to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the Flatiron Institute code of conduct.
The events during the workshop will take place at two different locations. In the mornings, all talks and discussions will be in the Ingrid Daubechies Auditorium on the 2nd floor of the Flatiron Institute at 162 Fifth Avenue (entrance around the corner on 21st Street). Breakfast, morning coffee breaks, and lunch will be available on the 2nd floor outside the auditorium. In the afternoon, the tutorials and discussions will take place at Math for America, on the 14th and 17th floor of 915 Broadway (one block from the Flatiron Institute). The Chi/Psi room is on the 14th floor, and the 17th floor hosts the Alpha/Beta room, additional spaces for impromptu discussions and hack sessions, and the lounge area where afternoon coffee breaks will be. Please note that zoom lines are connected to specific meeting rooms!
Recordings for all (review + short) talks can now be found on the CCA event website, and on this youtube playlist, and recordings of all talks and tutorials are also available in this dropbox.
The schedule will be updated with speakers and titles/topics when those are confirmed. All times in the schedule are in US Eastern Time (UTC-5).
Monday January 30 | 8.00am-9.00am | Breakfast available for in-person participants [Flatiron Institute, 2nd floor] |
---|---|---|
9.00am - 9.15am | Welcome and introduction to the workshop | |
9.15am - 10.00am | Review talk: Has deep learning solved classification in astronomy? by Marc Huertas-Company (Instituto de Astrofisica de Canarias / Observatoire de Paris) | |
10.00am - 10.30am | Moderated open discussion led by Francisco Villaescusa Navarro (Flatiron Institute), and decide topic for afternoon workshop/discussion | |
10.30am - 11.00am | Coffee break | |
11.00am - 11.15am | Lightning & Thunder talks | |
11.15am - 12.00pm | Short talks chaired by Mike Blanton: Merger identification through photometric bands, colours, and their errors by Luis Suelves (Narodowe Centrum Badan Jadrowych Poland) (remote), CANDELS Merger Identification with IllustrisTNG50 Using a Convolutional Neural Network by Aimee Schechter (University of Colorado Boulder), and Source classification in large photometric surveys by Sotiria Fotopoulou (University of Bristol) (remote) | |
12.00pm - 1.15pm | Lunch break and in-person participants walk over to Math for America | |
1.15pm - 3.45pm | Parallel sessions; Tutorial: Robust Uncertainty Estimation in Machine Learning by Aritra Ghosh (Yale University) [MfA, 17th floor Alpha/Beta room] | |
1.15pm - 3.45pm | Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room] | |
2.30pm - 3.00pm | Coffee break [MfA, 17th floor] | |
3.45pm - 4.00pm | End of hybrid program; in-person participants walk back to Flatiron Institute | |
4.00pm - 6.00pm | Reception [Flatiron Institute, 2nd floor] | |
Tuesday January 31 | 8.00am-9.00am | Breakfast available for in-person participants [Flatiron Institute, 2nd floor] |
9.00am - 10.00am | Short talks chaired by Francisco Castander Serentill: Testing Sunyaev-Zel’dovich measurements of the hot gas content of dark matter haloes using synthetic skies by Amandine Le Brun (Paris Observatory / PSL University) (remote), Constraining feedback with galaxy/weak lensing correlations with thermal SZ effect using simulations and data by Shivam Pandey (Columbia University), Model-insensitive cosmological parameter estimation from cosmological simulations by Yongseok Jo (Flatiron Institute), and Deep Learning the Circumgalactic Medium (CGM): Take 1 by Naomi Gluck (Yale University) (remote) | |
10.00am - 10.15am | Discussion led by John Wu (Space Telescope Science Institute) | |
10.15am - 10.30am | Lightning & Thunder talks | |
10.30am - 11.00am | Coffee break | |
11.00am - 11.45am | Review talk on advanced ML in astronomy by Shirley Ho (Flatiron Institute) | |
11.45am - 12.00pm | Discussion led by John Wu (Space Telescope Science Institute), and decide topic for afternoon workshop/discussion | |
12.00pm - 1.15pm | Lunch break and in-person participants walk over to Math for America | |
1.15pm - 3.45pm | Parallel sessions; Tutorial: Symbolic regression by Miles Cranmer (Princeton University) [MfA, 17th floor Alpha/Beta room] | |
1.15pm - 3.45pm | Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room] | |
2.30pm - 3.00pm | Coffee break [MfA, 17th floor] | |
3.45pm - 4.00pm | End of hybrid program; in-person participants walk back to Flatiron Institute | |
4.00pm - 5.00pm | Flexible discussion/hack time [Flatiron Institute, 5th floor Classroom] | |
Wednesday February 1 | 8.00am-9.00am | Breakfast available for in-person participants [Flatiron Institute, 2nd floor] |
9.00am - 9.45am | Review talk on Likelihood-free inference by Ben Wandelt (Sorbonne University, Institut d’Astrophysique de Paris / Flatiron Institute) (remote) | |
9.45am - 10.15am | Moderated open discussion led by John Forbes (Flatiron Institute), and decide topic for afternoon workshop/discussion | |
10.15am - 10.30am | Lightning & Thunder talks | |
10.30am - 11.00am | Coffee break | |
11.00am - 12.00pm | Short talks chaired by Sultan Hassan: Modeling the distribution of atomic gas in FIREbox with Deep Learning by Robert Feldmann (University of Zurich), AGN Feedback Effects on the Low Redshift Lyman-alpha Forest by Megan Tillman (Rutgers University), Can we build a physical model for galaxy sizes using a data-driven approach? by Festa Bucinca (City University of New York) (remote), Big works on small galaxies by Yu-Heng Lin (University of Minnesota) | |
12.00pm - 1.15pm | Lunch break and in-person participants walk over to Math for America | |
1.15pm - 3.45pm | Parallel sessions; Tutorial: Simulation-based inference by ChangHoon Hahn (Princeton University) [MfA, 17th floor Alpha/Beta room] | |
1.15pm - 3.45pm | Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room] | |
2.30pm - 3.00pm | Coffee break [MfA, 17th floor] | |
3.45pm - 4.00pm | End of hybrid program; in-person participants walk back to Flatiron Institute | |
4.00pm - 5.00pm | Flexible discussion/hack time [Flatiron Institute, 5th floor Classroom] | |
Thursday February 2 | 8.00am-9.00am | Breakfast available for in-person participants [Flatiron Institute, 2nd floor] |
9.00am - 9.45am | Review talk: Applications of Machine Learning for Studying Galaxy Evolution with JWST and Roman by Henry Ferguson (Space Telescope Science Institute) | |
9.45am - 10.15am | Moderated open discussion led by Rachel Somerville (Flatiron Institute / Rutgers University) and decide topic for afternoon workshop/discussion | |
10.15am - 10.30am | Lightning & Thunder talks | |
10.30am - 11.00am | Coffee break | |
11.00am - 12.00pm | Short talks chaired by Sarah Wellons: Obscured Growth in the age of Roman and Massively Multiplexed Spectroscopic Facilities by Andreea Petric (Space Telescope Science Institute), The Euclid Flagship Mock Galaxy Catalogue by Francisco Javier Castander Serentill (Instituto de Ciencias del Espacio) and Demo on scientific computing in cloud-based environments by Toby Brown (National Research Council of Canada) (remote) | |
12.00pm - 1.15pm | Lunch break and in-person participants walk over to Math for America | |
1.15pm - 3.45pm | Parallel sessions; Tutorial on SED-modeling by Kartheik Iyer (Columbia University) [MfA, 17th floor Alpha/Beta room] | |
1.15pm - 3.45pm | Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room] | |
2.30pm - 3.00pm | Coffee break [MfA, 17th floor] | |
3.45pm - 4.00pm | End of hybrid program; in-person participants walk back to Flatiron Institute | |
4.00pm - 5.00pm | Flexible discussion/hack time [Flatiron Institute, 7th floor Classroom] | |
Friday February 3 | 8.00am-9.00am | Breakfast available for in-person participants [Flatiron Institute, 2nd floor] |
9.00am - 9.15am | Short talks: Interdisciplinary ML for Fundamental Physics by Mariel Pettee (Lawrence Berkeley National Laboratory / Flatiron Institute) | |
9.15am - 9.45am | Review talk: From ML x Astrophysics to ML x Climate: A journey across disciplines by Viviana Acquaviva (City University of New York) | |
9.45am - 10.15am | Moderated open discussion led by Blakesley Burkhart (Rutgers University / Flatiron Institute) and decide topic for afternoon workshop/discussion | |
10.15am - 10.30am | Lightning & Thunder talks | |
10.30am - 11.00am | Coffee break | |
11.00am - 11.45am | Review talk: Unsupervised Learning: How to sift through astronomical haystacks without getting lost in metaphors, hyperboles, or in a random forest by Dovi Poznanski (Tel Aviv University) (remote) | |
11.45am - 12.00pm | Discussion | |
12.00pm - 1.15pm | Lunch break and in-person participants walk over to Math for America | |
1.15pm - 3.45pm | Parallel sessions; Tutorial: Graph Neural Networks and merger trees by Christian Kragh Jespersen (Princeton University) [MfA, 17th floor Alpha/Beta room] | |
1.15pm - 3.45pm | Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room] | |
2.30pm - 3.00pm | Coffee break [MfA, 17th floor] | |
3.45pm - 4.00pm | End of hybrid program; in-person participants walk back to Flatiron Institute | |
4.00pm - 5.00pm | Flexible discussion/hack time [Flatiron Institute, 7th floor Classroom] |
This in-person component of our KITP program will run February 27 - March 24 2023. It consists of a sparse but regular schedule of different events, some limited to in-person participants only, some hybrid, described in more detail below the schedule.
We expect all participants, in-person and those joining on zoom, to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct.
The schedule will be updated with speakers and titles/topics when those are confirmed. Links to recorded talks will be added when available. Please also find all recordings (some public, some private) on the KITP program page. All events for our program will be in the Fred Kavli Auditorium (1003). All times quoted are in US Pacific Time (UTC-8).
Monday February 27 | 11.00am - 12.00pm | (in-person only) Meet for coffee (for participants who have arrived already) |
---|---|---|
Tuesday February 28 | 11.00am - 12.00pm | (in-person only) Meet for coffee and short informal introductions |
Wednesday March 1 | 9.00am - 9.45am | (hybrid) Review talk: Machine Learning and Astronomy by Josh Peek (Space Telescope Science Institute) |
9.45am - 10.30am | (hybrid) Moderated discussion | |
Thursday March 2 | 11.00am - 12.00pm | (hybrid) Blackboard talks: Alexa Villaume (University of Waterloo) |
Friday March 3 | 10.30am | (in-person only) Joint informal meeting with the cosmic-web program participants |
Monday March 13 | 11.00am - 12.00pm | (in-person only) Meet for coffee and introductions/Lightning slides (location: Simons Amphitheater) |
---|---|---|
Tuesday March 14 | 11.00am - 12.00pm | (hybrid) Blackboard talks: The unreasonable efficiency of GNNs for physics by Christian Kragh Jespersen (Princeton University), Pushing models to the limit in the search for new physics by Chris Lovell (University of Portsmouth) and Sultan Hassan (New York University) (location: Simons Amphitheater) |
Wednesday March 15 | 9.00am - 9.45am | (hybrid) Review talk: Learning from pictures by Mike Walmsley (University of Manchester) (location: Simons Amphitheater) |
9.45am - 10.30am | (hybrid) Moderated discussion (location: Simons Amphitheater) | |
Thursday March 16 | 11.00am - 12.00pm | (hybrid) Blackboard talks: Data-driven model of galaxy - supermassive black hole connection by Haowen Zhang, The intrinsic dimensionality / information content of galaxy observations by Kartheik Iyer (Columbia University) [notes and references] (location: Simons Amphitheater) |
Friday March 17 | 11.00am - 12.00pm | (in-person only) Thunder slides (location: Simons Amphitheater) |
Monday March 20 | 10.30am - 11.00am | (in-person only) Blackboard talk: Deciphering and learning the mapping between quantum noise and galaxies by Martin Rey (Oxford University) |
---|---|---|
11.00am - 12.00pm | (in-person only) Moderated discussion | |
Tuesday March 21 - Friday March 24 | 9.00am - 5.00pm | Conference |
While the program is nominally continuing, the schedule will consist of the associated KITP conference, for which a separate registration is required.
The associated KITP Conference will take place March 21 - March 24 2023. All participants, those in-person and those joining on zoom, are expected to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct. The full schedule of the conference can be found here, and recordings of the talks are available here.