
New research adds to our understanding of the function of the human genome.
An international team of researchers has made significant progress in understanding how gene expression is regulated across the human genome. In a recent study, they conducted a comprehensive analysis of cis-regulatory elements (CREs)—DNA sequences that control gene transcription. This research provides valuable insights into how CREs drive cell-specific gene expression and how mutations in these regions can impact health and contribute to disease.
CREs, such as enhancers and promoters, play a critical role in determining when and where genes are activated or silenced. Although their importance is well known, analyzing their activity on a large scale has been a longstanding challenge.
“The human genome contains a myriad of CREs, and mutations in these regions are thought to play a major role in human diseases and evolution,” explained Dr. Fumitaka Inoue, one of the co-first authors of the study. “However, it has been very difficult to comprehensively quantify their activity across the genome.”
Innovative Technology Enables Large-Scale CRE Analysis
To address this, the team used the lentivirus-based massively parallel reporter assay (lentiMPRA), a technology the authors had previously developed. The approach measures thousands of CREs simultaneously by tagging each one with unique DNA barcodes that track its activity.
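The barcode idea can be illustrated with a short Python sketch. It assumes a common convention for reporter assays, in which a CRE's activity is scored as the log2 ratio of its barcodes' share of RNA reads to their share of DNA reads. The function name, the input dictionaries, and the pseudocount are hypothetical choices for illustration and are not taken from the study's own pipeline.

```python
import math
from collections import defaultdict


def cre_activity(barcode_to_cre, dna_counts, rna_counts, pseudocount=1.0):
    """Score each CRE as log2(RNA share / DNA share), pooled over its barcodes.

    barcode_to_cre: dict mapping barcode sequence -> CRE identifier
    dna_counts, rna_counts: dicts mapping barcode sequence -> read count
    pseudocount: added to each CRE's pooled counts to avoid log2(0)
    """
    # Library sizes, used to turn raw counts into shares of each library.
    dna_total = sum(dna_counts.values())
    rna_total = sum(rna_counts.values())

    # Pool the reads of all barcodes that tag the same CRE.
    dna_by_cre = defaultdict(float)
    rna_by_cre = defaultdict(float)
    for barcode, cre in barcode_to_cre.items():
        dna_by_cre[cre] += dna_counts.get(barcode, 0)
        rna_by_cre[cre] += rna_counts.get(barcode, 0)

    scores = {}
    for cre in dna_by_cre:
        dna_share = (dna_by_cre[cre] + pseudocount) / dna_total
        rna_share = (rna_by_cre[cre] + pseudocount) / rna_total
        scores[cre] = math.log2(rna_share / dna_share)
    return scores
```

Under this convention, a score above zero means the element's barcodes are over-represented in RNA relative to DNA, which suggests the element drives transcription of the reporter. Pooling several barcodes per element reduces the influence of any single noisy barcode.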
Applying lentiMPRA, the researchers examined as many as 680,000 candidate CREs in three widely used cell types: hepatocytes (cells from the liver), lymphocytes (a type of white blood cell), and induced pluripotent stem cells (a type of artificial stem cell made from a normal body cell).

The study revealed several key insights. Across the three cell types, approximately 41.7% of the analyzed CREs exhibited activity. Promoters, which start gene transcription, showed a dependence on sequence orientation but were less specific to cell types. Enhancers, which boost gene transcription, were active regardless of their orientation and exhibited cell-type specificity. These findings highlight fundamental differences in how these two types of CREs function.
Machine Learning Advances Predictive Gene Regulation
In the study, several machine learning models were developed to predict the regulatory activity of CREs from the large-scale experimental data. Of these, MPRALegNet, trained on the lentiMPRA dataset, proved the most accurate and efficient at predicting the regulatory activity of a given DNA sequence. Its predictions align closely with experimental results, in some cases agreeing with the measurements as well as experimental replicates agree with each other.
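One way to make the comparison with experimental replicates concrete is to compute two correlations on the same set of sequences: one between two replicate measurements, and one between the model's predictions and the measurements. The sketch below uses the Pearson correlation as the agreement measure. That choice is an assumption for illustration, since the article does not say which metric the study used, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

If `pearson(predicted, measured)` comes out close to `pearson(replicate_1, replicate_2)`, the model agrees with the experiment about as well as the experiment agrees with itself. Replicate agreement is a practical ceiling here, because measurement noise limits how well any prediction can match a single measurement.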
The model also demonstrated its ability to identify important transcription factor binding motifs—that is, short DNA sequences that determine CRE activity—thus providing insights into how specific factors drive cell-type-specific gene expression. For example, the study identified HNF4 and GATA motifs as crucial for activity in hepatocytes and lymphocytes, respectively.
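To show what a binding motif looks like in sequence terms, the sketch below scans a DNA sequence on both strands for a consensus pattern written in IUPAC nucleotide codes. This is a deliberately simpler technique than the model-based motif discovery described in the study, and it is swapped in only for illustration. `WGATAR` is the widely cited consensus for GATA factors, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, consensus):
    """Return sorted (start, strand) hits of an IUPAC consensus in seq.

    start is 0-based on the forward strand, also for '-' strand hits.
    """
    seq = seq.upper()
    # A lookahead lets overlapping matches be reported as separate hits.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(consensus)
    # Convert reverse-strand positions back to forward-strand coordinates.
    rc = reverse_complement(seq)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)
```

For example, `find_motif("CCAGATAACC", "WGATAR")` reports a single forward-strand hit starting at position 2. A consensus scan only finds exact pattern matches. It says nothing about how strongly a site contributes to activity, which is what the model-based analysis aims to estimate.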
By enabling the precise identification and quantification of enhancer activity, the study opens avenues for exploring the molecular mechanisms of human diseases. Future research will focus on applying this approach to study genetic polymorphisms, the variations in DNA sequence that contribute to individual differences and disease susceptibility.
“Recently, the nearly complete human genome has been sequenced, but much of its functional regions remain unknown. Our findings link DNA sequence information with its functional roles. We hope that these results will contribute to a deeper understanding of biological phenomena, including human diseases and evolution,” said Dr. Inoue.
The study also contributes a publicly accessible database of CRE activity to the ENCODE portal, providing a valuable resource for researchers worldwide. By integrating large-scale experimental data with machine learning, the work lays a foundation for future discoveries in genomics and personalized medicine. Tools like lentiMPRA and MPRALegNet will also better equip researchers to unravel the complexities of gene regulation and to explore the uncharted territories of the human genome.
Reference: “Massively parallel characterization of transcriptional regulatory elements” by Vikram Agarwal, Fumitaka Inoue, Max Schubach, Dmitry Penzar, Beth K. Martin, Pyaree Mohan Dash, Pia Keukeleire, Zicong Zhang, Ajuni Sohota, Jingjing Zhao, Ilias Georgakopoulos-Soares, William S. Noble, Galip Gürkan Yardımcı, Ivan V. Kulakovskiy, Martin Kircher, Jay Shendure and Nadav Ahituv, 15 January 2025, Nature.
DOI: 10.1038/s41586-024-08430-9