Program description

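This collaborative KITP program took place January 17 - March 24 2023. It consisted of multiple components, both virtual and in-person or hybrid: a virtual program, a joint KITP-CCA workshop, an in-person/hybrid KITP program, and a KITP conference.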


Motivation

Galaxy formation and evolution is a field that combines many complex and interdependent physical processes across different spatial and time scales. Moreover, observational data exist across a large wavelength range, with heterogeneous spatial resolutions and observing depths, each sensitive to different physical tracers. Astrostatistics and machine learning can help us mine the huge existing and upcoming multiwavelength datasets, effectively down to the pixel and spaxel level, in order to determine the fundamental relationships between galaxy properties and evolutionary processes. Additionally, novel techniques are required to facilitate comparisons between galaxy formation theories and all observational data, including outlier galaxy populations, and to integrate galaxy formation physics into the inference of cosmological parameters. However, connecting statistics and machine learning results to an improved understanding of key astrophysical processes remains a major challenge. This program will initiate a discussion between the empirical exploration of the extremely large and diverse data sets from both observations and simulations and the deduction of the underlying physics, aiming to answer the question: How do we optimally combine the empirical intuition gained using data science tools with theoretical intuition and knowledge, with the deductive goal of understanding the physical processes that govern the formation and evolution of galaxies?




Virtual Program


This virtual component of our KITP program took place January 17 - February 24 2023. It consisted of a sparse but regular schedule of different events, described in more detail below the schedule.

We expect all participants to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct.


Detailed schedule

The schedule will be updated with speakers and titles/topics when those are confirmed. Links to recorded talks will be added when available. Please also find all recordings (some public, some private) on the KITP program page. All times quoted are in US Pacific Time (UTC-8).



Program Elements


  • Review talks and moderated open discussion

    The Review Talks will provide a broad overview of, and introduction to, aspects of galaxy formation and evolution and the statistical and machine learning applications related to the weekly topic. Each review talk is followed by a moderated discussion session, where the research and future prospects can be discussed. Three main questions will form a part of each of these review+discussion sessions:

    • How do we properly compare results between theory and observations?
    • What advantages/disadvantages are there to using ML/data science approaches in this context?
    • Can data science results in this context be connected to physical interpretation? Should they? And if yes, how?


  • Lightning & Thunder talks

    The lightning & thunder series consists of 1 slide / 1 min talks to introduce ourselves to each other, and in particular our goals for the week/weeks/whole program. A 1-slide lightning talk could highlight your expertise, describe what you would like to learn that week/during the program, and state your research/collaborative goals. Ideally, if you present a lightning slide, this would be paired with a thunder slide later in the week/program where you describe what you have learned and what progress you made on your goals. We are flexible in the planning of lightning versus thunder slides, and we expect to see mostly lightning slides in the first weeks, and hopefully more thunder slides toward the end of the program.


  • Tutorials

    The tutorials will address varying topics in galaxy formation and connected machine learning approaches, or consist of introductions to statistical and/or machine learning tools, techniques and applications. In the virtual schedule, 1.5 hours are reserved for each tutorial, but speakers are asked to aim for about 1 hour of interactive work to allow sufficient time for further questions and discussion.
    The tutorials will predominantly use the Python programming language, and a familiarity with coding and Python is strongly recommended.


  • Networking activities

    The short networking sessions are intended to allow participants to meet each other, chat about common interests, and perhaps even start new collaborations together. There are currently three sessions reserved for these networking activities, occurring once every two weeks, and alternating with the ethics discussions.


  • Ethics discussions

    We will host short discussion sessions focused on some of the ethical questions connected with machine learning techniques, applications, and access. A few of these questions are also more widely applicable in galaxy evolution research, astrophysics, and/or science as a whole. There are currently three sessions reserved for these ethics-related discussions, occurring once every two weeks, and alternating with the networking activities. Ethics discussions will not be recorded.


  • Short talks

    The short talks consist of 10-minute talks with an additional 5 minutes reserved for questions. The aim is to highlight additional views and applications, but also additional questions related to the topics of the week. Additionally, we endeavor to predominantly highlight work by more junior participants in this series.


  • Working groups

    Anyone is welcome to start a working group dedicated to a specific topic, technique, or key question. The working groups are led by volunteers and should be self-organized. During the virtual program most Thursdays are reserved for working groups to have their own meetings and discussions, but both the day and whether to organize meetings at all are flexible and up to each working group to decide. If you are interested in starting a working group on a particular topic, please let us know!


  • Asynchronous pre-recorded talks

    We invite anyone who is interested to prepare a pre-recorded 10-minute talk about their work. These talks can be uploaded and we will host them on our dedicated YouTube channel. Please also feel free to announce your pre-recorded talk on the dedicated Slack channel. We especially encourage junior participants to use this opportunity to introduce themselves.



Program Tools


  • Zoom sessions

    All scheduled virtual sessions will take place on Zoom, hosted by the KITP, UCSB. All participants have received the link for Zoom registration. If you have any questions or problems, please contact the organizers (see contact info at the bottom) or the KITP.


  • Slack space

    We encourage all participants to join the Slack space, and the #galevo23-general channel there. Many asynchronous discussions and collaborations will take place there, and announcements will be shared on Slack as well. If you have not received an invitation to Slack and/or have any questions, please contact the organizers (see contact info at the bottom) or the KITP.


  • GitHub page

    The program has its own GitHub page, and we hope that this will make it easy to share, for example, code and tutorials among participants, further facilitating interaction and collaboration.




Joint KITP-CCA Workshop


The hybrid Joint KITP-CCA workshop, hosted by the CCA in New York City, takes place January 30 - February 3 2023. It will have plenary talks and discussion sessions in the morning, and parallel tutorial and workshop sessions as well as free discussion/hack time in the afternoon. The program elements follow a similar setup to those of the virtual program. Additionally, all asynchronous activities (Slack, pre-recorded talks) from the virtual program are also available this week.

We expect all participants to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the Flatiron Institute code of conduct.


The events during the workshop will take place at two different locations. In the mornings, all talks and discussions will be in the Ingrid Daubechies Auditorium on the 2nd floor of the Flatiron Institute at 162 Fifth Avenue (entrance around the corner on 21st Street). Breakfast, morning coffee breaks, and lunch will be available on the 2nd floor outside the auditorium. In the afternoon, the tutorials and discussions will take place at Math for America, on the 14th and 17th floors of 915 Broadway (one block from the Flatiron Institute). The Chi/Psi room is on the 14th floor, and the 17th floor hosts the Alpha/Beta room, additional spaces for impromptu discussions and hack sessions, and the lounge area where afternoon coffee breaks will be held. Please note that Zoom lines are connected to specific meeting rooms!


Recordings of all (review + short) talks can now be found on the CCA event website and on this YouTube playlist, and recordings of all talks and tutorials are also available in this Dropbox.

Detailed schedule

The schedule will be updated with speakers and titles/topics when those are confirmed. All times in the schedule are in US Eastern Time (UTC-5).


Monday January 30
8.00am - 9.00am Breakfast available for in-person participants [Flatiron Institute, 2nd floor]
9.00am - 9.15am Welcome and introduction to the workshop
9.15am - 10.00am Review talk: Has deep learning solved classification in astronomy? by Marc Huertas-Company (Instituto de Astrofisica de Canarias / Observatoire de Paris)
10.00am - 10.30am Moderated open discussion led by Francisco Villaescusa Navarro (Flatiron Institute), and decide topic for afternoon workshop/discussion
10.30am - 11.00am Coffee break
11.00am - 11.15am Lightning & Thunder talks
11.15am - 12.00pm Short talks chaired by Mike Blanton: Merger identification through photometric bands, colours, and their errors by Luis Suelves (Narodowe Centrum Badan Jadrowych Poland) (remote), CANDELS Merger Identification with IllustrisTNG50 Using a Convolutional Neural Network by Aimee Schechter (University of Colorado Boulder), and Source classification in large photometric surveys by Sotiria Fotopoulou (University of Bristol) (remote)
12.00pm - 1.15pm Lunch break and in-person participants walk over to Math for America
1.15pm - 3.45pm Parallel sessions; Tutorial: Robust Uncertainty Estimation in Machine Learning by Aritra Ghosh (Yale University) [MfA, 17th floor Alpha/Beta room]
1.15pm - 3.45pm Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room]
2.30pm - 3.00pm Coffee break [MfA, 17th floor]
3.45pm - 4.00pm End of hybrid program; in-person participants walk back to Flatiron Institute
4.00pm - 6.00pm Reception [Flatiron Institute, 2nd floor]

Tuesday January 31
8.00am - 9.00am Breakfast available for in-person participants [Flatiron Institute, 2nd floor]
9.00am - 10.00am Short talks chaired by Francisco Castander Serentill: Testing Sunyaev-Zel’dovich measurements of the hot gas content of dark matter haloes using synthetic skies by Amandine Le Brun (Paris Observatory / PSL University) (remote), Constraining feedback with galaxy/weak lensing correlations with thermal SZ effect using simulations and data by Shivam Pandey (Columbia University), Model-insensitive cosmological parameter estimation from cosmological simulations by Yongseok Jo (Flatiron Institute), and Deep Learning the Circumgalactic Medium (CGM): Take 1 by Naomi Gluck (Yale University) (remote)
10.00am - 10.15am Discussion led by John Wu (Space Telescope Science Institute)
10.15am - 10.30am Lightning & Thunder talks
10.30am - 11.00am Coffee break
11.00am - 11.45am Review talk on advanced ML in astronomy by Shirley Ho (Flatiron Institute)
11.45am - 12.00pm Discussion led by John Wu (Space Telescope Science Institute), and decide topic for afternoon workshop/discussion
12.00pm - 1.15pm Lunch break and in-person participants walk over to Math for America
1.15pm - 3.45pm Parallel sessions; Tutorial: Symbolic regression by Miles Cranmer (Princeton University) [MfA, 17th floor Alpha/Beta room]
1.15pm - 3.45pm Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room]
2.30pm - 3.00pm Coffee break [MfA, 17th floor]
3.45pm - 4.00pm End of hybrid program; in-person participants walk back to Flatiron Institute
4.00pm - 5.00pm Flexible discussion/hack time [Flatiron Institute, 5th floor Classroom]

Wednesday February 1
8.00am - 9.00am Breakfast available for in-person participants [Flatiron Institute, 2nd floor]
9.00am - 9.45am Review talk on Likelihood-free inference by Ben Wandelt (Sorbonne University, Institut d’Astrophysique de Paris / Flatiron Institute) (remote)
9.45am - 10.15am Moderated open discussion led by John Forbes (Flatiron Institute), and decide topic for afternoon workshop/discussion
10.15am - 10.30am Lightning & Thunder talks
10.30am - 11.00am Coffee break
11.00am - 12.00pm Short talks chaired by Sultan Hassan: Modeling the distribution of atomic gas in FIREbox with Deep Learning by Robert Feldmann (University of Zurich), AGN Feedback Effects on the Low Redshift Lyman-alpha Forest by Megan Tillman (Rutgers University), Can we build a physical model for galaxy sizes using a data-driven approach? by Festa Bucinca (City University of New York) (remote), and Big works on small galaxies by Yu-Heng Lin (University of Minnesota)
12.00pm - 1.15pm Lunch break and in-person participants walk over to Math for America
1.15pm - 3.45pm Parallel sessions; Tutorial: Simulation-based inference by ChangHoon Hahn (Princeton University) [MfA, 17th floor Alpha/Beta room]
1.15pm - 3.45pm Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room]
2.30pm - 3.00pm Coffee break [MfA, 17th floor]
3.45pm - 4.00pm End of hybrid program; in-person participants walk back to Flatiron Institute
4.00pm - 5.00pm Flexible discussion/hack time [Flatiron Institute, 5th floor Classroom]

Thursday February 2
8.00am - 9.00am Breakfast available for in-person participants [Flatiron Institute, 2nd floor]
9.00am - 9.45am Review talk: Applications of Machine Learning for Studying Galaxy Evolution with JWST and Roman by Henry Ferguson (Space Telescope Science Institute)
9.45am - 10.15am Moderated open discussion led by Rachel Somerville (Flatiron Institute / Rutgers University) and decide topic for afternoon workshop/discussion
10.15am - 10.30am Lightning & Thunder talks
10.30am - 11.00am Coffee break
11.00am - 12.00pm Short talks chaired by Sarah Wellons: Obscured Growth in the age of Roman and Massively Multiplexed Spectroscopic Facilities by Andreea Petric (Space Telescope Science Institute), The Euclid Flagship Mock Galaxy Catalogue by Francisco Javier Castander Serentill (Instituto de Ciencias del Espacio), and Demo on scientific computing in cloud-based environments by Toby Brown (National Research Council of Canada) (remote)
12.00pm - 1.15pm Lunch break and in-person participants walk over to Math for America
1.15pm - 3.45pm Parallel sessions; Tutorial on SED-modeling by Kartheik Iyer (Columbia University) [MfA, 17th floor Alpha/Beta room]
1.15pm - 3.45pm Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room]
2.30pm - 3.00pm Coffee break [MfA, 17th floor]
3.45pm - 4.00pm End of hybrid program; in-person participants walk back to Flatiron Institute
4.00pm - 5.00pm Flexible discussion/hack time [Flatiron Institute, 7th floor Classroom]

Friday February 3
8.00am - 9.00am Breakfast available for in-person participants [Flatiron Institute, 2nd floor]
9.00am - 9.15am Short talks: Interdisciplinary ML for Fundamental Physics by Mariel Pettee (Lawrence Berkeley National Laboratory / Flatiron Institute)
9.15am - 9.45am Review talk: From ML x Astrophysics to ML x Climate: A journey across disciplines by Viviana Acquaviva (City University of New York)
9.45am - 10.15am Moderated open discussion led by Blakesley Burkhart (Rutgers University / Flatiron Institute) and decide topic for afternoon workshop/discussion
10.15am - 10.30am Lightning & Thunder talks
10.30am - 11.00am Coffee break
11.00am - 11.45am Review talk: Unsupervised Learning: How to sift through astronomical haystacks without getting lost in metaphors, hyperboles, or in a random forest by Dovi Poznanski (Tel Aviv University) (remote)
11.45am - 12.00pm Discussion
12.00pm - 1.15pm Lunch break and in-person participants walk over to Math for America
1.15pm - 3.45pm Parallel sessions; Tutorial: Graph Neural Networks and merger trees by Christian Kragh Jespersen (Princeton University) [MfA, 17th floor Alpha/Beta room]
1.15pm - 3.45pm Parallel sessions; Workshop/Discussion, topic determined in the morning discussion [MfA, 14th floor Chi/Psi room]
2.30pm - 3.00pm Coffee break [MfA, 17th floor]
3.45pm - 4.00pm End of hybrid program; in-person participants walk back to Flatiron Institute
4.00pm - 5.00pm Flexible discussion/hack time [Flatiron Institute, 7th floor Classroom]



KITP in-person/hybrid Program


This in-person component of our KITP program will run February 27 - March 24 2023. It consists of a sparse but regular schedule of different events, some limited to in-person participants only, some hybrid, described in more detail below the schedule.

We expect all participants, in-person and those joining on Zoom, to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct.


Detailed schedule

The schedule will be updated with speakers and titles/topics when those are confirmed. Links to recorded talks will be added when available. Please also find all recordings (some public, some private) on the KITP program page. All events for our program will be in the Fred Kavli Auditorium (1003).
All times quoted are in US Pacific Time (UTC-8 through March 11; UTC-7 from March 12, when daylight saving time begins).





KITP Conference


The associated KITP Conference will take place March 21 - March 24 2023. All participants, those in-person and those joining on Zoom, are expected to treat each other respectfully, and to abide by our code of conduct. Participation also requires agreeing to follow the KITP code of conduct.

The full schedule of the conference can be found here, and recordings of the talks are available here.



Contact

datadrivengalaxyevolution [at] gmail.com