In order to validate our understanding of the world around us, we want to compare theoretical models to data we have actually observed. Often, these models are functions of parameters, and we want to know the values of those parameters such that the models most closely represent the world. For example, we may believe the concentration of one molecule in a chemical reaction should decrease exponentially with time. However, we also want to know the rate constant, the parameter in the model that multiplies time in the exponential, such that the model exponential curve actually resembles a specific reaction that we observe. This is the problem of parameter inference, for which we often turn to Bayesian methods, especially when working with complex models and/or many parameters..
In June 2023, astronomers and statisticans flocked to “Happy Valley’” Pennsylvania for the eighth installment of the Statistical Challenges in Modern Astronomy, a bidecadal conference. The meeting, hosted at Penn State University, marked a transition in leadership from founding members Eric Feigelson and Jogesh Babu to Hyungsuk Tak, who led the proceedings. While the astronomical applications varied widely, including modeling stars, galaxies, supernovae, X-ray observations, and gravitational waves, the methods displayed a strong Bayesian bent. Simulation based inference (SBI), which uses synthetic models to learn an approximate function for the likelihood of physical parameters given data, featured prominently in the distribution of talk topics. This article features work presented in two back-to-back talks on a probabilistic method for modeling (point) sources of light in astronomical images, for example stars or galaxies, delivered by Prof. Jeffrey Regier and Ismael Mendoza from the University of Michigan-Ann Arbor.
One of the key goals of science is to create theoretical models that are useful at describing the world we see around us. However, no model is perfect. The inability of models to replicate observations is often called the “synthetic gap.” For example, it may be too computationally expensive to include a known effect or to vary a large number of known parameters. Or, there may be unknown instrumental effects associated with variability in conditions during the data acquisition.
Large volumes of data are pouring in every day from scientific experiments like CERN and the Sloan Digital Sky Survey. Data is coming in so fast, that researchers struggle to keep pace with the analysis and are increasingly developing automated analysis methods to aid in this herculean task. As a first step, it is now commonplace to perform dimension reduction in order to reduce a large number of measurements to a set of key values that are easier to visualize and interpret.