Close Menu
    Facebook X (Twitter) Instagram
    SciTechDaily
    • Biology
    • Chemistry
    • Earth
    • Health
    • Physics
    • Science
    • Space
    • Technology
    Facebook X (Twitter) Pinterest YouTube RSS
    SciTechDaily
    Home»Biology»From Theory to Therapy: MIT’s Computational Breakthrough in Protein Optimization
    Biology

    From Theory to Therapy: MIT’s Computational Breakthrough in Protein Optimization

    By Anne Trafton, Massachusetts Institute of TechnologyApril 29, 20241 Comment7 Mins Read
    Facebook Twitter Pinterest Telegram LinkedIn WhatsApp Email Reddit
    Share
    Facebook Twitter LinkedIn Pinterest Telegram Email Reddit
    Computational Technique To Engineer Useful Proteins
    MIT researchers have developed a computational approach that makes it easier to predict mutations that will lead to optimized proteins, based on a relatively small amount of data. Credit: MIT News; iStock

    MIT researchers plan to search for proteins that could be used to measure electrical activity in the brain.

    To engineer proteins with useful functions, researchers usually begin with a natural protein that has a desirable function, such as emitting fluorescent light, and put it through many rounds of random mutation that eventually generate an optimized version of the protein.

    This process has yielded optimized versions of many important proteins, including green fluorescent protein (GFP). However, for other proteins, it has proven difficult to generate an optimized version. MIT researchers have now developed a computational approach that makes it easier to predict mutations that will lead to better proteins, based on a relatively small amount of data.

    Advancements in Computational Protein Design

    Using this model, the researchers generated proteins with mutations that were predicted to lead to improved versions of GFP and a protein from adeno-associated virus (AAV), which is used to deliver DNA for gene therapy. They hope it could also be used to develop additional tools for neuroscience research and medical applications.

    “Protein design is a hard problem because the mapping from DNA sequence to protein structure and function is really complex. There might be a great protein 10 changes away in the sequence, but each intermediate change might correspond to a totally nonfunctional protein. It’s like trying to find your way to the river basin in a mountain range, when there are craggy peaks along the way that block your view. The current work tries to make the riverbed easier to find,” says Ila Fiete, a professor of brain and cognitive sciences at MIT, a member of MIT’s McGovern Institute for Brain Research, director of the K. Lisa Yang Integrative Computational Neuroscience Center, and one of the senior authors of the study.

    Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health at MIT, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science at MIT, are also senior authors of an open-access paper on the work, which will be presented at the International Conference on Learning Representations in May. MIT graduate students Andrew Kirjner and Jason Yim are the lead authors of the study. Other authors include Shahar Bracha, an MIT postdoc, and Raman Samusevich, a graduate student at Czech Technical University.

    Optimizing Proteins

    Many naturally occurring proteins have functions that could make them useful for research or medical applications, but they need a little extra engineering to optimize them. In this study, the researchers were originally interested in developing proteins that could be used in living cells as voltage indicators. These proteins, produced by some bacteria and algae, emit fluorescent light when an electric potential is detected. If engineered for use in mammalian cells, such proteins could allow researchers to measure neuron activity without using electrodes.

    While decades of research have gone into engineering these proteins to produce a stronger fluorescent signal, on a faster timescale, they haven’t become effective enough for widespread use. Bracha, who works in Edward Boyden’s lab at the McGovern Institute, reached out to Fiete’s lab to see if they could work together on a computational approach that might help speed up the process of optimizing the proteins.

    “This work exemplifies the human serendipity that characterizes so much science discovery,” Fiete says. “It grew out of the Yang Tan Collective retreat, a scientific meeting of researchers from multiple centers at MIT with distinct missions unified by the shared support of K. Lisa Yang. We learned that some of our interests and tools in modeling how brains learn and optimize could be applied in the totally different domain of protein design, as being practiced in the Boyden lab.”

    For any given protein that researchers might want to optimize, there is a nearly infinite number of possible sequences that could generated by swapping in different amino acids at each point within the sequence. With so many possible variants, it is impossible to test all of them experimentally, so researchers have turned to computational modeling to try to predict which ones will work best.

    Computational Modeling and Predictions

    In this study, the researchers set out to overcome those challenges, using data from GFP to develop and test a computational model that could predict better versions of the protein.

    They began by training a type of model known as a convolutional neural network (CNN) on experimental data consisting of GFP sequences and their brightness — the feature that they wanted to optimize.

    The model was able to create a “fitness landscape” — a three-dimensional map that depicts the fitness of a given protein and how much it differs from the original sequence — based on a relatively small amount of experimental data (from about 1,000 variants of GFP).

    These landscapes contain peaks that represent fitter proteins and valleys that represent less fit proteins. Predicting the path that a protein needs to follow to reach the peaks of fitness can be difficult, because often a protein will need to undergo a mutation that makes it less fit before it reaches a nearby peak of higher fitness. To overcome this problem, the researchers used an existing computational technique to “smooth” the fitness landscape.

    Once these small bumps in the landscape were smoothed, the researchers retrained the CNN model and found that it was able to reach greater fitness peaks more easily. The model was able to predict optimized GFP sequences that had as many as seven different amino acids from the protein sequence they started with, and the best of these proteins were estimated to be about 2.5 times fitter than the original.

    “Once we have this landscape that represents what the model thinks is nearby, we smooth it out and then we retrain the model on the smoother version of the landscape,” Kirjner says. “Now there is a smooth path from your starting point to the top, which the model is now able to reach by iteratively making small improvements. The same is often impossible for unsmoothed landscapes.”

    Proof-of-Concept

    The researchers also showed that this approach worked well in identifying new sequences for the viral capsid of adeno-associated virus (AAV), a viral vector that is commonly used to deliver DNA. In that case, they optimized the capsid for its ability to package a DNA payload.

    “We used GFP and AAV as a proof-of-concept to show that this is a method that works on data sets that are very well-characterized, and because of that, it should be applicable to other protein engineering problems,” Bracha says.

    The researchers now plan to use this computational technique on data that Bracha has been generating on voltage indicator proteins.

    “Dozens of labs having been working on that for two decades, and still there isn’t anything better,” she says. “The hope is that now with generation of a smaller data set, we could train a model in silico and make predictions that could be better than the past two decades of manual testing.”

    Reference: “Improving Protein Optimization with Smoothed Fitness Landscapes” by Andrew Kirjner, Jason Yim, Raman Samusevich, Shahar Bracha, Tommi Jaakkola, Regina Barzilay and Ila Fiete, 3 March 2024, Quantitative Biology > Biomolecules.
    arXiv:2307.00494

    The research was funded, in part, by the U.S. National Science Foundation, the Machine Learning for Pharmaceutical Discovery and Synthesis consortium, the Abdul Latif Jameel Clinic for Machine Learning in Health, the DTRA Discovery of Medical Countermeasures Against New and Emerging threats program, the DARPA Accelerated Molecular Discovery program, the Sanofi Computational Antibody Design grant, the U.S. Office of Naval Research, the Howard Hughes Medical Institute, the National Institutes of Health, the K. Lisa Yang ICoN Center, and the K. Lisa Yang and Hock E. Tan Center for Molecular Therapeutics at MIT.

    Never miss a breakthrough: Join the SciTechDaily newsletter.
    Follow us on Google and Google News.

    Artificial Intelligence Biomedical Engineering CSAIL Gene Therapy MIT Protein
    Share. Facebook Twitter Pinterest LinkedIn Email Reddit

    Related Articles

    MIT’s “FrameDiff” – Generative AI Imagines New Protein Structures That Could Transform Medicine

    Accelerating Drug Discovery With the AI Behind ChatGPT – Screening 100 Million Compounds a Day

    MIT Uses Artificial Intelligence to Identify Powerful New Antibiotic

    How tRNA Modifications Help Cells Survive Toxic Chemicals

    How Chronic Inflammation of Organs Can Become Cancerous

    Crowding Causes Internal Cell Structure Alignment

    Proteins in Mucus Offer Immune Protection

    Infrared Spectroscopy Analyzes Proteins in Picoseconds

    A Quest to Map Brain Connections and Understand Connectomes

    1 Comment

    1. Liz on April 30, 2024 9:56 am

      Oh sure! And, while we’re at it let’s do some gain of function research while we’re at it; God only knows how much there’s a severe lack of slave like workers in the world that can’t think for themselves. So; who’s really behind this research?

      Reply
    Leave A Reply Cancel Reply

    • Facebook
    • Twitter
    • Pinterest
    • YouTube

    Don't Miss a Discovery

    Subscribe for the Latest in Science & Tech!

    Trending News

    Breakthrough Bowel Cancer Trial Leaves Patients Cancer-Free for Nearly 3 Years

    Natural Compound Shows Powerful Potential Against Rheumatoid Arthritis

    100,000-Year-Old Neanderthal Fossils in Poland Reveal Unexpected Genetic Connections

    Simple “Gut Reset” May Prevent Weight Gain After Ozempic or Wegovy

    2.8 Days to Disaster: Scientists Warn Low Earth Orbit Could Suddenly Collapse

    Common Food Compound Shows Surprising Power Against Superbugs

    5 Simple Ways To Remember More and Forget Less

    The Atomic Gap That Could Cost the Semiconductor Industry Billions

    Follow SciTechDaily
    • Facebook
    • Twitter
    • YouTube
    • Pinterest
    • Newsletter
    • RSS
    SciTech News
    • Biology News
    • Chemistry News
    • Earth News
    • Health News
    • Physics News
    • Science News
    • Space News
    • Technology News
    Recent Posts
    • Scientists Develop Bioengineered Chewing Gum That Could Help Fight Oral Cancer
    • Popular Weight-Loss Drugs Found To Cut Heart Attack and Stroke Risk
    • After 37 Years, the World’s Longest-Running Soil Warming Experiment Uncovers a Startling Climate Secret
    • NASA Satellite Captures First-Ever High-Res View of Massive Pacific Tsunami
    • ADHD Isn’t Just a Deficit: Study Reveals Powerful Hidden Strengths
    Copyright © 1998 - 2026 SciTechDaily. All Rights Reserved.
    • Science News
    • About
    • Contact
    • Editorial Board
    • Privacy Policy
    • Terms of Use

    Type above and press Enter to search. Press Esc to cancel.