Are Scientists Being Fooled by Bacteria? New Machine Learning Algorithm Reveals the Truth About DNA

Previous studies of a genetic on/off switch may have been confounded by contamination, but Mount Sinai scientists have created a new tool for accurately determining whether it plays a role in human disease.

For decades, a small group of cutting-edge medical researchers has been studying a biochemical DNA tagging system that switches genes on or off. It has long been studied in bacteria, and some researchers have recently found signs of it in plants, flies, and even human brain tumors. A new study by scientists at the Icahn School of Medicine at Mount Sinai, however, suggests there may be a hitch: much of the evidence for its presence in higher organisms may be due to bacterial contamination, which was difficult to detect with current experimental methods.

To solve this problem, the researchers developed a tailored gene sequencing method that relies on a new machine learning algorithm to accurately measure both the source and the level of tagged DNA. This helped them distinguish bacterial DNA from that of human and other non-bacterial cells. The results, published in Science, supported the idea that this mechanism may occur naturally in non-bacterial cells, but the levels were much lower than some earlier studies reported and were easily skewed by bacterial contamination or by current experimental methods. Experiments on human brain cancer cells produced similar results.

“Pushing the boundaries of medical research can be challenging. Sometimes the ideas are so novel that we have to rethink the experimental methods we use to test them out,” said Gang Fang, PhD, Associate Professor of Genetics and Genomic Sciences at Icahn Mount Sinai. “In this study, we developed a new method for effectively measuring this DNA mark in a wide variety of species and cell types. We hope this will help scientists uncover the many roles these processes may play in evolution and human disease.”

DNA Tagging System

Researchers at the Icahn School of Medicine at Mount Sinai developed an advanced method for determining whether cells may use an obscure DNA tagging system for turning genes on or off. Credit: Courtesy of Do lab, Mount Sinai, N.Y., N.Y.

The study focused on DNA adenine methylation, a biochemical reaction which attaches a chemical, called a methyl group, to an adenine, one of the four building block molecules used to construct lengthy DNA strands and encode genes. This can “epigenetically” activate or silence genes without actually altering DNA sequences. For instance, it is known that adenine methylation plays a critical role in how some bacteria defend themselves against viruses.

For decades, scientists thought that adenine methylation strictly happened in bacteria whereas human and other non-bacterial cells relied on the methylation of a different building block—cytosine—to regulate genes. Then, starting around 2015, this view changed. Scientists spotted high levels of adenine methylation in plant, fly, mouse, and human cells, suggesting a wider role for the reaction throughout evolution.

However, the scientists who performed these initial experiments faced difficult trade-offs. Some used techniques that can precisely measure adenine methylation levels from any cell type but do not have the capacity to identify which cell each piece of DNA came from, while others relied on methods that can spot methylation in different cell types but may overestimate reaction levels.

In this study, Dr. Fang’s team developed a method called 6mASCOPE which overcomes these trade-offs. In it, DNA is extracted from a sample of tissue or cells and chopped up into short strands by proteins called enzymes. The strands are placed into microscopic wells and treated with enzymes that make new copies of each strand. An advanced sequencing machine then measures in real time the rate at which each nucleotide building block is added to a new strand. Methylated adenines slightly delay this process. The results are then fed into a machine learning algorithm which the researchers trained to estimate methylation levels from the sequencing data.
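To make the idea of estimating methylation from sequencing delays concrete, here is a minimal Python sketch. It is not the 6mASCOPE algorithm, which uses a trained machine learning model; it is a toy two-component mixture estimate. The function name, the notion of a per-adenine "delay ratio," and the reference values 1.0 and 3.0 are all assumptions made for illustration.

```python
from statistics import mean


def estimate_methylated_fraction(delay_ratios,
                                 unmethylated_mean=1.0,
                                 methylated_mean=3.0):
    """Toy estimate of the fraction of methylated adenines.

    Assumes (hypothetically) that each adenine yields a delay ratio whose
    average is `unmethylated_mean` when unmethylated and `methylated_mean`
    when methylated. The observed average is then a weighted mix of the two,
    and solving for the weight gives the methylated fraction.
    """
    if methylated_mean <= unmethylated_mean:
        raise ValueError("methylated_mean must exceed unmethylated_mean")
    observed = mean(delay_ratios)
    fraction = (observed - unmethylated_mean) / (methylated_mean - unmethylated_mean)
    # Clamp to [0, 1] because noise can push the raw estimate out of range.
    return min(1.0, max(0.0, fraction))


# Hypothetical data: nine unmethylated adenines and one methylated adenine.
sample = [1.0] * 9 + [3.0]
print(estimate_methylated_fraction(sample))  # roughly 0.1
```

A real model would have to account for sequence context and measurement noise, which is where a trained algorithm comes in.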

“The DNA sequences allowed us to identify which cells, human or bacterial, methylation occurred in, while the machine learning model quantified the levels of methylation in each species separately,” said Dr. Fang.
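The two-step logic Dr. Fang describes, first assigning each DNA fragment to a species and then quantifying methylation per species, can be sketched in a few lines of Python. This is an illustrative assumption about how such bookkeeping might look, not the team's software: the function name, the tuple layout, and the example numbers are hypothetical.

```python
from collections import defaultdict


def methylation_by_species(reads):
    """Aggregate methylation levels separately for each species.

    `reads` is an iterable of (species, adenine_count, methylated_estimate)
    tuples, where the species label is assumed to come from matching each
    read's sequence to a reference genome, and the methylated estimate from
    some upstream model. Returns methylated adenines per adenine, by species.
    """
    totals = defaultdict(lambda: [0, 0.0])  # species -> [adenines, methylated]
    for species, adenines, methylated in reads:
        totals[species][0] += adenines
        totals[species][1] += methylated
    return {
        species: (methylated / adenines if adenines else 0.0)
        for species, (adenines, methylated) in totals.items()
    }


# Hypothetical reads from a mixed sample.
reads = [
    ("human", 1000, 0),
    ("human", 1000, 1),
    ("bacteria", 500, 10),
]
print(methylation_by_species(reads))
```

Pooling all three reads together would give 11 methylated adenines out of 2,500, a figure that describes neither species well. Keeping the tallies separate is what lets the human and bacterial levels be reported independently.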

Initial experiments on simple, single-cell organisms, such as green algae, suggested that the 6mASCOPE method was effective in that it could detect differences between two organisms that both had high levels of adenine methylation.

The method also appeared to be effective at quantifying adenine methylation in complex organisms. For example, previous studies had suggested that high levels of methylation may play a role in the early growth of the fruit fly Drosophila melanogaster and of the flowering weed Arabidopsis thaliana. In this study, the researchers found that these high levels of methylation were mostly the result of contaminating bacterial DNA. In reality, the fly and the plant DNA from these experiments only had trace amounts of methylation.
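A short worked calculation shows why a small amount of bacterial DNA can dominate a bulk measurement. The sketch below is a simple weighted average, and every number in it is hypothetical, chosen only to illustrate the effect. None of the values are taken from the study.

```python
def apparent_methylation(host_level, bacterial_level, bacterial_fraction):
    """Bulk methylation level of a host sample mixed with bacterial DNA.

    Levels are methylated adenines per adenine. `bacterial_fraction` is the
    share of the sample's adenines that come from bacteria. The bulk level
    is the average of the two sources, weighted by their shares.
    """
    if not 0.0 <= bacterial_fraction <= 1.0:
        raise ValueError("bacterial_fraction must be between 0 and 1")
    return ((1.0 - bacterial_fraction) * host_level
            + bacterial_fraction * bacterial_level)


# Hypothetical values: trace methylation in the host, heavy methylation in
# the bacteria, and only 1% of the DNA coming from bacteria.
host = 2e-6        # 2 parts per million
bacteria = 0.02    # 2%
bulk = apparent_methylation(host, bacteria, bacterial_fraction=0.01)
print(f"bulk estimate: {bulk * 1e6:.0f} ppm vs. host alone: {host * 1e6:.0f} ppm")
```

Under these assumed numbers, 1% contamination raises the apparent level from 2 ppm to about 202 ppm, roughly a hundredfold overestimate if the bulk figure were attributed to the host.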

Likewise, experiments on human cells suggested that methylation occurs at very low levels in both healthy and disease conditions. Immune cell DNA obtained from patient blood samples had only trace amounts of methylation.

Similar results were also seen with DNA isolated from glioblastoma brain tumor samples. This result differed from that of a previous study, which reported much higher levels of adenine methylation in tumor cells. However, as the authors note, more research may be needed to determine how much of this discrepancy is due to differences in tumor subtypes or to other potential sources of methylation.

Finally, the researchers found that plasmid DNA, a tool that scientists use regularly to manipulate genes, may be contaminated with high levels of methylation that originated from bacteria, suggesting this DNA could be a source of contamination in future experiments.

“Our results show that the manner in which adenine methylation is measured can have profound effects on the result of an experiment. We do not mean to exclude the possibility that some human tissues or disease subtypes may have highly abundant DNA adenine methylation, but we do hope 6mASCOPE will help scientists fully investigate this issue by excluding the bias from bacterial contamination,” said Dr. Fang. “To help with this we have made the 6mASCOPE analysis software and a detailed operating manual widely available to other researchers.”

Reference: “Critical assessment of DNA adenine methylation in eukaryotes using quantitative deconvolution” by Yimeng Kong, Lei Cao, Gintaras Deikus, Yu Fan, Edward A. Mead, Weiyi Lai, Yizhou Zhang, Raymund Yong, Robert Sebra, Hailin Wang, Xue-Song Zhang and Gang Fang, 3 February 2022, Science.
DOI: 10.1126/science.abe7489

This work was supported by the National Institutes of Health (GM139655, HG011095, AG071291); the Icahn Institute for Genomics and Multiscale Biology; the Irma T. Hirschl/Monique Weill-Caulier Trust; the Nash Family Foundation; and the Department of Scientific Computing at the Icahn School of Medicine at Mount Sinai. Validation of the method using mass spectrometry was supported by collaborators at the Chinese Academy of Sciences (XDPB2004) and the National Natural Science Foundation of China (22021003).
