Astronomers at the Institute for Advanced Study and the Flatiron Institute, along with their collaborators, have used artificial intelligence to improve how the masses of colossal galaxy clusters are estimated. The AI revealed that adding a single simple term to an existing equation yields much more accurate mass estimates than before.
The improved estimates will let scientists determine the fundamental properties of the universe more precisely, the astrophysicists report in the Proceedings of the National Academy of Sciences.
“It’s such a simple thing; that’s the beauty of this,” says study co-author Francisco Villaescusa-Navarro, a research scientist at the Flatiron Institute’s Center for Computational Astrophysics (CCA) in New York City. “Even though it’s so simple, nobody before found this term. People have been working on this for decades, and still they were not able to find this.”
The work was led by Digvijay Wadekar of the Institute for Advanced Study in Princeton, New Jersey, along with researchers from the CCA, Princeton University, Cornell University, and the Center for Astrophysics | Harvard & Smithsonian.
Understanding the universe requires knowing where and how much stuff there is. Galaxy clusters are the most massive objects in the universe: A single cluster can contain anything from hundreds to thousands of galaxies, along with plasma, hot gas, and dark matter. The cluster’s gravity holds these components together. Understanding such galaxy clusters is crucial to pinning down the origin and continuing evolution of the universe.
Perhaps the most crucial quantity determining the properties of a galaxy cluster is its total mass. But measuring this quantity is difficult — galaxies cannot be ‘weighed’ by placing them on a scale. The problem is further complicated because the dark matter that makes up much of a cluster’s mass is invisible. Instead, scientists deduce the mass of a cluster from other observable quantities.
In the early 1970s, Rashid Sunyaev, current distinguished visiting professor at the Institute for Advanced Study’s School of Natural Sciences, and his collaborator Yakov B. Zel’dovich developed a new way to estimate galaxy cluster masses. Their method relies on the fact that as gravity squashes matter together, the matter’s electrons push back. That electron pressure alters how the electrons interact with particles of light called photons. As photons left over from the Big Bang’s afterglow hit the squeezed material, scattering off its energetic electrons shifts the photons’ energies. The size of that shift depends on how strongly gravity is compressing the material, which in turn depends on the galaxy cluster’s heft. By measuring the photons, astrophysicists can estimate the cluster’s mass.
However, this ‘integrated electron pressure’ is not a perfect proxy for mass, because the changes in the photon properties vary from one galaxy cluster to another. Wadekar and his colleagues thought an artificial intelligence tool called ‘symbolic regression’ might find a better approach. The tool essentially tries out different combinations of mathematical operators, such as addition and subtraction, applied to various variables to see which equation best matches the data.
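To make the idea concrete, here is a minimal sketch of symbolic regression as a brute-force search. It is a toy written for this article, not the program the researchers used: the variable names `x` and `y`, the four operators, the depth limit, and the synthetic data are all assumptions chosen for illustration. The search builds every small expression it can from the operators and variables, scores each one against the data, and keeps the simplest expression with the lowest error.

```python
import itertools
import math
import random

# Binary operators the search is allowed to combine (an illustrative choice).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def evaluate(expr, env):
    """Evaluate an expression tree: a variable name or an (op, left, right) tuple."""
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, env), evaluate(right, env))

def size(expr):
    """Number of nodes in the tree, used as a simplicity measure."""
    if isinstance(expr, str):
        return 1
    return 1 + size(expr[1]) + size(expr[2])

def to_string(expr):
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return f"({to_string(left)} {op} {to_string(right)})"

def candidates(variables, depth):
    """Enumerate every expression tree up to the given nesting depth."""
    exprs = list(variables)
    for _ in range(depth):
        pairs = itertools.product(exprs, repeat=2)
        exprs = list(variables) + [(op, a, b) for a, b in pairs for op in OPS]
    return exprs

def mse(expr, rows):
    """Mean squared error of an expression over (env, target) rows."""
    total = 0.0
    for env, target in rows:
        try:
            total += (evaluate(expr, env) - target) ** 2
        except ZeroDivisionError:
            return math.inf  # discard expressions that divide by zero
    return total / len(rows)

def symbolic_search(rows, variables=("x", "y"), depth=2):
    """Return the candidate with the lowest error, preferring smaller trees on ties."""
    return min(candidates(variables, depth), key=lambda e: (mse(e, rows), size(e)))

# Synthetic data from a hidden rule, x * (1 - y), that the search must rediscover.
rng = random.Random(0)
rows = []
for _ in range(50):
    env = {"x": rng.uniform(1.0, 2.0), "y": rng.uniform(0.1, 0.9)}
    rows.append((env, env["x"] - env["x"] * env["y"]))

best = symbolic_search(rows)
print(to_string(best), mse(best, rows))
```

Exhaustive enumeration only works for tiny expressions like these, because the number of candidates grows very quickly with depth. Practical symbolic regression tools typically rely on smarter search strategies and also fit numerical constants, neither of which this sketch attempts.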
Wadekar and his collaborators ‘fed’ their AI program a state-of-the-art universe simulation containing many galaxy clusters. Next, their program, written by CCA research fellow Miles Cranmer, searched for and identified additional variables that might make the mass estimates more accurate.
AI is useful for identifying new parameter combinations that human analysts might overlook. For example, while it is easy for human analysts to identify two significant parameters in a dataset, AI can sift through large volumes of data with many parameters, often revealing unexpected influencing factors.
“Right now, a lot of the machine-learning community focuses on deep neural networks,” Wadekar explained. “These are very powerful, but the drawback is that they are almost like a black box. We cannot understand what goes on in them. In physics, if something is giving good results, we want to know why it is doing so. Symbolic regression is beneficial because it searches a given dataset and generates simple mathematical expressions in the form of simple equations that you can understand. It provides an easily interpretable model.”
The researchers’ symbolic regression program handed them a new equation, which was able to better predict the mass of the galaxy cluster by adding a single new term to the existing equation. Wadekar and his collaborators then worked backward from this AI-generated equation and found a physical explanation. They realized that gas concentration correlates with the regions of galaxy clusters where mass inferences are less reliable, such as the cores of galaxies where supermassive black holes lurk. Their new equation improved mass inferences by downplaying the importance of those complex cores in the calculations. In a sense, the galaxy cluster is like a spherical doughnut. The new equation extracts the jelly at the center of the doughnut that can introduce larger errors, and instead concentrates on the doughy outskirts for more reliable mass inferences.
The researchers tested the AI-discovered equation on thousands of simulated universes from the CCA’s CAMELS suite. They found that the equation reduced the variability in galaxy cluster mass estimates by around 20 to 30 percent for large clusters compared with the currently used equation.
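The sketch below shows one way such a reduction in variability could be measured. Everything in it is assumed for illustration: the clusters are synthetic, the coefficient `A` and the gas-concentration variable `c_gas` are made-up stand-ins, and the form of the correction (a power law multiplied by a factor linear in gas concentration) is a hypothetical choice rather than the expression from the paper cited below. The data are built so that the correction works by construction, so the size of the improvement it prints says nothing about real clusters. The point is only how scatter is compared between two estimators.

```python
import math
import random
import statistics

A = 0.4  # hypothetical coefficient, chosen only for this toy example

# Build synthetic clusters: (true mass, gas concentration, pressure-based signal).
rng = random.Random(1)
clusters = []
for _ in range(2000):
    true_mass = 10 ** rng.uniform(14.0, 15.0)  # arbitrary mass units
    c_gas = rng.uniform(0.1, 0.9)              # made-up gas concentration
    noise = math.exp(rng.gauss(0.0, 0.05))     # residual log-normal scatter
    # By construction, the signal overshoots the mass in gas-concentrated clusters.
    y_signal = (true_mass * noise / (1.0 - A * c_gas)) ** (5.0 / 3.0)
    clusters.append((true_mass, c_gas, y_signal))

def log_scatter(estimates, truths):
    """Standard deviation of log(estimate / truth).

    Any constant normalization factor drops out, so only the spread matters.
    """
    residuals = [math.log(e / t) for e, t in zip(estimates, truths)]
    return statistics.pstdev(residuals)

truths = [m for m, _, _ in clusters]

# Baseline estimator: a power law in the signal alone.
baseline = [y ** 0.6 for _, _, y in clusters]

# Augmented estimator: the same power law times one extra term in c_gas.
corrected = [y ** 0.6 * (1.0 - A * c) for _, c, y in clusters]

scatter_baseline = log_scatter(baseline, truths)
scatter_corrected = log_scatter(corrected, truths)
reduction = 1.0 - scatter_corrected / scatter_baseline

print(f"baseline scatter:  {scatter_baseline:.3f}")
print(f"corrected scatter: {scatter_corrected:.3f}")
print(f"fractional reduction: {reduction:.0%}")
```

Measuring the spread of the logarithm of the ratio, rather than the raw difference, treats a cluster overestimated by 10 percent the same way whether it is small or large.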
The new equation can provide observational astronomers engaged in upcoming galaxy cluster surveys with better insights into the mass of the objects they observe. “There are quite a few surveys targeting galaxy clusters [that] are planned in the near future,” Wadekar noted. “Examples include the Simons Observatory, the Stage 4 CMB experiment, and an X-ray survey called eROSITA. The new equations can help us in maximizing the scientific return from these surveys.”
Wadekar also hopes that this publication will be just the tip of the iceberg when it comes to using symbolic regression in astrophysics. “We think that symbolic regression is highly applicable to answering many astrophysical questions,” he said. “In a lot of cases in astronomy, people make a linear fit between two parameters and ignore everything else. But nowadays, with these tools, you can go further. Symbolic regression and other artificial intelligence tools can help us go beyond existing two-parameter power laws in a variety of different ways, ranging from investigating small astrophysical systems like exoplanets, to galaxy clusters, the biggest things in the universe.”
Reference: “Augmenting astrophysical scaling relations with machine learning: Application to reducing the Sunyaev–Zeldovich flux–mass scatter” by Digvijay Wadekar, Leander Thiele, Francisco Villaescusa-Navarro, J. Colin Hill, Miles Cranmer, David N. Spergel, Nicholas Battaglia, Daniel Anglés-Alcázar, Lars Hernquist and Shirley Ho, 17 March 2023, Proceedings of the National Academy of Sciences.