Multicomponent Separation of Spectra
This post is about a recent project by Ana Sofia Uzsoy, a third-year graduate student in Astrophysics at Harvard University. Ana Sofia is an expert in statistical methods and in applying them to astrophysical problems. Here, she talks to me about a project where she uses statistics to separate the spectrum of a galaxy from the background sky and noise in spectra taken by the DESI instrument. You can read the full paper here. I go into much more detail in my Substack post here!
First, some background! Spectroscopy is the bread and butter of astronomy. By looking at the light emitted by distant astrophysical objects, we can understand what they are made of and how they evolve over time. From very nearby objects, like asteroids and even satellites, to the most distant objects, like high-redshift galaxies, the spectrum is our main way of working out the details of their composition. Astronomers therefore do a lot of hard work to observe, process, and interpret the spectra they take of astronomical objects.
The objects of interest for Ana Sofia’s project are Lyman-Alpha Emitters (LAEs). An LAE is a type of galaxy whose spectrum shows strong emission in one line, the Lyman-alpha line, which is often the only line clearly detected. Lyman-alpha is emitted when an electron falls from the n=2 level to the n=1 level in neutral hydrogen. If you find this line in a galaxy, it is usually a sign of ongoing star formation in the galaxy. LAEs are therefore galaxies where a LOT of star formation is ongoing.
Ana Sofia’s project is working with spectra of LAEs taken by DESI, the Dark Energy Spectroscopic Instrument. Specifically, the goal is to 1) determine whether the spectrum is an LAE spectrum and 2) find the cosmological redshift of that LAE.
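To make the second goal concrete, here is a minimal sketch of how a redshift follows from a single identified line. It assumes you already know the observed wavelength of the Lyman-alpha line; the function names and the example wavelength are made up for illustration and are not taken from the paper. The only physical input is the rest-frame wavelength of Lyman-alpha, about 1215.67 Angstroms.

```python
# Rest-frame (vacuum) wavelength of the Lyman-alpha line, in Angstroms.
LYA_REST_ANGSTROM = 1215.67


def redshift_from_line(observed_angstrom, rest_angstrom=LYA_REST_ANGSTROM):
    """Redshift z defined by observed = rest * (1 + z)."""
    return observed_angstrom / rest_angstrom - 1.0


def observed_wavelength(z, rest_angstrom=LYA_REST_ANGSTROM):
    """Inverse relation: where a rest-frame line lands at redshift z."""
    return rest_angstrom * (1.0 + z)


# Hypothetical example: a line seen at 4862.68 Angstroms.
z = redshift_from_line(4862.68)
print(f"z is approximately {z:.3f}")
```

The hard part in practice is not this arithmetic but deciding which feature in a noisy spectrum really is Lyman-alpha, which is the problem the project addresses.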
Image of DESI looking at the sky, taken from https://www.desi.lbl.gov/photos/
Now, deciding whether a spectrum taken by DESI is of an LAE, and then finding the redshift of that LAE, can involve some complicated math. Because DESI is on the ground and looks through the Earth's atmosphere, any spectrum it takes is the sum of the emission from the atmosphere and the emission from the astronomical object it is looking at. In particular, the spectrum will be the sum of three components: 1) the sky (or atmosphere), 2) the actual target, and 3) background noise.
In this project, the authors decompose spectra taken by DESI into these three components using their prior expectation for each component. This method lets them define which components they want to decompose the data into, and then get those components out of the data. Moreover, they can do all that without losing any of the data! You can contrast this with something like principal component analysis (PCA), which is another method of decomposing data into a sum of components. PCA produces as many components as the data has dimensions, and in practice you keep only the first few, so you lose whatever the discarded components carried. PCA also doesn’t let you impose any physical meaning on the components it finds. As such, this approach, which uses priors defined by covariance matrices, is very robust for component separation!
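Here is a toy sketch of the idea, not the authors' actual pipeline. It assumes each component has a zero-mean Gaussian prior described by a covariance matrix, in which case the expected value of component k given the data x is C_k (C_sky + C_target + C_noise)^-1 x. All the covariances, sizes, and names below are invented for illustration.

```python
import numpy as np


def separate_components(x, covariances):
    """Posterior-mean estimate of each component of x.

    Assumes x is the sum of independent zero-mean Gaussian components,
    each described by one matrix in `covariances`.
    """
    c_total = sum(covariances)
    weights = np.linalg.solve(c_total, x)  # C_total^-1 x
    return [c @ weights for c in covariances]


rng = np.random.default_rng(0)
n_pix = 50
pix = np.arange(n_pix)

# Toy priors (all hypothetical):
# sky varies smoothly, so neighbouring pixels are strongly correlated
c_sky = 4.0 * np.exp(-0.5 * ((pix[:, None] - pix[None, :]) / 10.0) ** 2)
# target is a single narrow emission line centred on pixel 25
profile = np.exp(-0.5 * ((pix - 25) / 1.5) ** 2)
c_target = 9.0 * np.outer(profile, profile)
# noise is independent from pixel to pixel
c_noise = 0.25 * np.eye(n_pix)

# Fake observed spectrum: smooth sky + one line + random noise.
sky = 2.0 + np.sin(pix / 8.0)
target = 3.0 * profile
noise = rng.normal(0.0, 0.5, n_pix)
x = sky + target + noise

sky_hat, target_hat, noise_hat = separate_components(
    x, [c_sky, c_target, c_noise]
)

# The estimated components add back up to the data: nothing is thrown away.
print(np.allclose(sky_hat + target_hat + noise_hat, x))
# The recovered target component peaks where the line prior puts it.
print(int(np.argmax(np.abs(target_hat))))
```

Because the three weight matrices sum to the identity, the estimated components always add back up to the original spectrum, which is the "no data lost" property described above. With PCA, keeping only the first few components generally leaves a residual that is simply discarded.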
Using this methodology, the authors are able to both classify and find the redshift of every LAE observed by DESI automatically! Before this, the only way to find the redshift of an LAE was to classify the spectrum by eye, and then guess which line in the spectrum corresponded to the Lyman-alpha line of that galaxy.
You can read the full description of this project, including the nitty-gritty math on my post here. You can find Ana Sofia’s paper here.