50 New Planets Confirmed in Machine Learning First – AI Distinguishes Between Real and “Fake” Planets

Researchers developed an algorithm using machine learning to distinguish genuine planets from false ones in massive candidate collections from telescope missions like NASA’s Kepler and TESS.

New machine learning algorithm designed by astronomers and computer scientists from the University of Warwick confirms new exoplanets in telescope data
Sky surveys find thousands of planet candidates, and astronomers have to separate the true planets from fake ones
Algorithm was trained to distinguish between signs of real planets and false positives
New technique is faster than previous techniques, can be automated, and can be improved with further training

Fifty potential planets have had their existence confirmed by a new machine learning algorithm developed by University of Warwick scientists.

For the first time, astronomers have used a process based on machine learning, a form of artificial intelligence, to analyze a sample of potential planets and determine which are real and which are ‘fakes’, or false positives, by calculating the probability that each candidate is a true planet.

Their results are reported in a new study published in the Monthly Notices of the Royal Astronomical Society, in which they also perform the first large-scale comparison of such planet validation techniques. Their conclusions make the case for using multiple validation techniques, including their machine learning algorithm, when statistically confirming future exoplanet discoveries.

Many exoplanet surveys search through huge amounts of data from telescopes for the signs of planets passing between the telescope and their star, known as transiting. This results in a telltale dip in light from the star that the telescope detects, but it could also be caused by a binary star system, interference from an object in the background, or even slight errors in the camera. These false positives can be sifted out in a planetary validation process.
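As a rough illustration of how small that dip is, the fraction of starlight blocked during a transit is approximately the square of the ratio of the planet's radius to the star's radius. The short Python sketch below applies that standard approximation to an Earth-size and a Neptune-size planet crossing a Sun-like star. It is not taken from the study: the function names are hypothetical, the radii are rounded reference values, and it assumes a uniformly bright star with no contaminating light.

```python
# Rounded reference radii in kilometres (illustrative values, not from the study).
R_SUN_KM = 695_700.0
R_EARTH_KM = 6_371.0
R_NEPTUNE_KM = 24_622.0


def transit_depth(planet_radius_km: float, star_radius_km: float = R_SUN_KM) -> float:
    """Approximate fractional drop in starlight during a transit.

    Assumes a uniformly bright stellar disc, a planet that crosses it fully,
    and no extra light from the planet or from neighbouring stars.
    """
    return (planet_radius_km / star_radius_km) ** 2


def to_ppm(fraction: float) -> float:
    """Convert a fractional depth to parts per million."""
    return fraction * 1e6


earth_ppm = to_ppm(transit_depth(R_EARTH_KM))
neptune_ppm = to_ppm(transit_depth(R_NEPTUNE_KM))

print(f"Earth-size planet, Sun-like star:   about {earth_ppm:.0f} ppm")
print(f"Neptune-size planet, Sun-like star: about {neptune_ppm:.0f} ppm")
```

Under these assumptions an Earth-size planet dims a Sun-like star by less than a hundredth of a percent, which is why instrumental errors and background objects can so easily mimic a planet.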

Researchers from Warwick’s Departments of Physics and Computer Science, as well as The Alan Turing Institute, built a machine learning based algorithm that can separate out real planets from fake ones in the large samples of thousands of candidates found by telescope missions such as NASA’s Kepler and TESS.

It was trained to recognize real planets using two large samples of confirmed planets and false positives from the now-retired Kepler mission. The researchers then applied the algorithm to a dataset of still-unconfirmed planetary candidates from Kepler, yielding fifty newly confirmed planets, the first to be validated by machine learning. Previous machine learning techniques have ranked candidates, but have never by themselves determined the probability that a candidate is a true planet, a required step for planet validation.

Those fifty planets range from worlds as large as Neptune to smaller than the Earth, with orbits ranging from as long as 200 days to as short as a single day. By confirming that these fifty planets are real, astronomers can now prioritize them for further observations with dedicated telescopes.

Dr. David Armstrong, from the University of Warwick Department of Physics, said: “The algorithm we have developed lets us take fifty candidates across the threshold for planet validation, upgrading them to real planets. We hope to apply this technique to large samples of candidates from current and future missions like TESS and PLATO.

“In terms of planet validation, no one has used a machine learning technique before. Machine learning has been used for ranking planetary candidates but never in a probabilistic framework, which is what you need to truly validate a planet. Rather than saying which candidates are more likely to be planets, we can now say what the precise statistical likelihood is. Where there is less than a 1% chance of a candidate being a false positive, it is considered a validated planet.”
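The difference between ranking and validating can be sketched in a few lines of Python. The example below is only an illustration of the 1% rule described in the quote, not the researchers' code: the candidate names and probabilities are invented, the function names are hypothetical, and it assumes each candidate is either a planet or a false positive, so the two probabilities sum to one.

```python
# Maximum tolerated false-positive probability (the 1% figure quoted above).
VALIDATION_THRESHOLD = 0.01


def false_positive_probability(planet_probability: float) -> float:
    """Probability that a candidate is NOT a planet (two-outcome assumption)."""
    if not 0.0 <= planet_probability <= 1.0:
        raise ValueError("planet_probability must lie between 0 and 1")
    return 1.0 - planet_probability


def is_validated(planet_probability: float,
                 threshold: float = VALIDATION_THRESHOLD) -> bool:
    """True when the false-positive probability is strictly below the threshold."""
    return false_positive_probability(planet_probability) < threshold


# Hypothetical candidates with made-up probabilities, purely for illustration.
candidates = {"candidate-A": 0.999, "candidate-B": 0.95, "candidate-C": 0.40}

# Ranking only orders the candidates from most to least planet-like.
ranked = sorted(candidates, key=candidates.get, reverse=True)

# Validation needs the probability itself, compared against the threshold.
validated = [name for name in ranked if is_validated(candidates[name])]

print("Ranked:   ", ranked)
print("Validated:", validated)
```

In this toy case all three candidates can be ranked, but only the one with a false-positive probability below 1% would count as validated; a 95% planet probability ranks highly yet still falls short.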

Dr. Theo Damoulas from the University of Warwick Department of Computer Science, and Deputy Director, Data Centric Engineering and Turing Fellow at The Alan Turing Institute, said: “Probabilistic approaches to statistical machine learning are especially suited for an exciting problem like this in astrophysics that requires incorporation of prior knowledge — from experts like Dr. Armstrong — and quantification of uncertainty in predictions. A prime example when the additional computational complexity of probabilistic methods pays off significantly.”

Once built and trained, the algorithm is faster than existing techniques and can be completely automated, making it ideal for analyzing the potentially thousands of planetary candidates observed in current surveys like TESS. The researchers argue that it should be one of the tools used collectively to validate planets in the future.

Dr. Armstrong adds: “Almost 30% of the known planets to date have been validated using just one method, and that’s not ideal. Developing new methods for validation is desirable for that reason alone. But machine learning also lets us do it very quickly and prioritize candidates much faster.

“We still have to spend time training the algorithm, but once that is done it becomes much easier to apply it to future candidates. You can also incorporate new discoveries to progressively improve it.

“A survey like TESS is predicted to have tens of thousands of planetary candidates and it is ideal to be able to analyze them all consistently. Fast, automated systems like this that can take us all the way to validated planets in fewer steps let us do that efficiently.”

Reference: “Exoplanet Validation with Machine Learning: 50 new validated Kepler planets” by David J Armstrong, Jevgenij Gamper, Theodoros Damoulas, 20 August 2020, Monthly Notices of the Royal Astronomical Society.
DOI:10.1093/mnras/staa2498

Dr. Armstrong’s research was supported by the Science and Technology Facilities Council (STFC), part of UK Research and Innovation, through an Ernest Rutherford Fellowship.