
    Straightening Out AI: How MIT Researchers Bridge the Gap Between Human and Machine Vision

    By Adam Zewe, Massachusetts Institute of Technology, May 9, 2023
    MIT researchers have discovered that training computer vision models using adversarial training can improve their perceptual straightness, making them more similar to human visual processing. Perceptual straightness enables models to better predict object movements, potentially improving the safety of autonomous vehicles. Adversarially trained models are more robust, retaining a stable representation of objects despite slight changes in images. The researchers aim to use their findings to create new training schemes and further investigate why adversarial training helps models mimic human perception.

    Researchers identify a property that helps computer vision models learn to represent the visual world in a more stable, predictable way.

    MIT researchers found that adversarial training improves perceptual straightness in computer vision models, making them more similar to human visual processing and enabling better prediction of object movements.

    Imagine sitting on a park bench, watching someone stroll by. While the scene may constantly change as the person walks, the human brain can transform that dynamic visual information into a more stable representation over time. This ability, known as perceptual straightening, helps us predict the walking person’s trajectory.

    Unlike humans, computer vision models don’t typically exhibit perceptual straightness, so they learn to represent visual information in a highly unpredictable way. But if machine-learning models had this ability, it might enable them to better estimate how objects or people will move.

    MIT researchers have discovered that a specific training method can help computer vision models learn more perceptually straight representations, like humans do. Training involves showing a machine-learning model millions of examples so it can learn a task.

    The researchers found that training computer vision models using a technique called adversarial training, which makes them less reactive to tiny errors added to images, improves the models’ perceptual straightness.

    Training Machines To Learn More Like Humans Do
    MIT researchers discovered that a specific training technique can enable certain types of computer vision models to learn more stable, predictable visual representations, which are more similar to those humans learn using a biological property known as perceptual straightening. Credit: MIT News with iStock

    The team also discovered that perceptual straightness is affected by the task one trains a model to perform. Models trained to perform abstract tasks, like classifying images, learn more perceptually straight representations than those trained to perform more fine-grained tasks, like assigning every pixel in an image to a category.

    For example, the nodes within the model have internal activations that represent “dog,” which allow the model to detect a dog when it sees any image of a dog. Perceptually straight representations retain a more stable “dog” representation when there are small changes in the image. This makes them more robust.

    By gaining a better understanding of perceptual straightness in computer vision, the researchers hope to uncover insights that could help them develop models that make more accurate predictions. For instance, this property might improve the safety of autonomous vehicles that use computer vision models to predict the trajectories of pedestrians, cyclists, and other vehicles.

    “One of the take-home messages here is that taking inspiration from biological systems, such as human vision, can both give you insight about why certain things work the way that they do and also inspire ideas to improve neural networks,” says Vasha DuTell, an MIT postdoc and co-author of a paper exploring perceptual straightness in computer vision.

    Joining DuTell on the paper are lead author Anne Harrington, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); Ayush Tewari, a postdoc; Mark Hamilton, a graduate student; Simon Stent, research manager at Woven Planet; Ruth Rosenholtz, principal research scientist in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of CSAIL. The research is being presented at the International Conference on Learning Representations.

    Studying Straightening

    After reading a 2019 paper from a team of New York University researchers about perceptual straightness in humans, DuTell, Harrington, and their colleagues wondered if that property might be useful in computer vision models, too.

    They set out to determine whether different types of computer vision models straighten the visual representations they learn. They fed each model the frames of a video and then examined the representation at different stages in its learning process.

    If the model’s representation changes in a predictable way across the frames of the video, that model is straightening. At the end, its output representation should be more stable than the input representation.

    “You can think of the representation as a line, which starts off really curvy. A model that straightens can take that curvy line from the video and straighten it out through its processing steps,” DuTell explains.
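
    The "curvy line" picture can be made concrete with a small sketch. The article does not give a formula, so the measure below is an assumption: curvature is taken as the average angle between successive steps of the trajectory that a representation traces across video frames, and straightening as the drop in that angle from input to output. The function name and the toy trajectories are hypothetical and stand in for real video frames and model activations.

```python
import numpy as np

def mean_curvature_deg(trajectory):
    """Mean angle in degrees between successive steps of a trajectory.

    trajectory: array of shape (n_frames, ...), one entry per video frame.
    Assumes consecutive frames are never identical (non-zero step length).
    """
    x = np.asarray(trajectory, dtype=float)
    x = x.reshape(len(x), -1)                   # flatten each frame to a vector
    steps = np.diff(x, axis=0)                  # displacement between frames
    unit = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    cosines = np.sum(unit[:-1] * unit[1:], axis=1)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.degrees(angles).mean())

# Hypothetical toy data: a zigzag "input" path and a straight "output" path.
t = np.arange(6, dtype=float)
curvy = np.stack([t, t % 2], axis=1)        # (0,0), (1,1), (2,0), (3,1), ...
straight = np.stack([t, 2.0 * t], axis=1)   # points along a single line

# Positive values mean the output path is straighter than the input path.
straightening = mean_curvature_deg(curvy) - mean_curvature_deg(straight)
```

    In this toy case the zigzag turns by a right angle at every step while the straight path never turns, so the difference is large and positive. With a real model, the same function would be applied to the raw frames and to the activations at a chosen stage.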

    Most models they tested didn’t straighten. Of the few that did, those which straightened most effectively had been trained for classification tasks using the technique known as adversarial training.

    Adversarial training involves subtly modifying images by slightly changing each pixel. While a human wouldn’t notice the difference, these minor changes can fool a machine so it misclassifies the image. Adversarial training makes the model more robust, so it won’t be tricked by these manipulations.
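
    As a rough illustration of the idea, the sketch below trains a tiny classifier on perturbed inputs. It rests on several assumptions that are not from the article: the perturbation is the fast gradient sign method (one common choice; the article does not say which attack the researchers used), a logistic regression stands in for a vision model, and the 16-pixel "images", the budget `eps`, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Shift every pixel by at most eps in the direction that raises the loss."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w                 # gradient of the logistic loss w.r.t. x
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def adversarial_train(X, Y, eps=0.05, lr=0.5, epochs=200, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Craft perturbed copies of the inputs against the current model...
        X_adv = np.array([fgsm_perturb(x, y, w, b, eps) for x, y in zip(X, Y)])
        # ...then take an ordinary gradient step on those perturbed copies.
        p = sigmoid(X_adv @ w + b)
        w -= lr * X_adv.T @ (p - Y) / len(Y)
        b -= lr * np.mean(p - Y)
    return w, b

# Hypothetical toy data: class 1 "images" are brighter than class 0.
rng = np.random.default_rng(1)
X = np.vstack([rng.uniform(0.0, 0.4, size=(50, 16)),
               rng.uniform(0.6, 1.0, size=(50, 16))])
Y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = adversarial_train(X, Y)
X_attack = np.array([fgsm_perturb(x, y, w, b, 0.05) for x, y in zip(X, Y)])
clean_acc = np.mean((sigmoid(X @ w + b) > 0.5) == Y)
adv_acc = np.mean((sigmoid(X_attack @ w + b) > 0.5) == Y)
```

    The point of the sketch is the loop structure: each update is computed on inputs that were nudged, within a small per-pixel budget, to be as misleading as possible for the current model. Comparing `clean_acc` with `adv_acc` shows how well the trained model holds up under the same kind of nudge.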

    Because adversarial training teaches the model to be less reactive to slight changes in images, this helps it learn a representation that is more predictable over time, Harrington explains.

    “People have already had this idea that adversarial training might help you get your model to be more like a human, and it was interesting to see that carry over to another property that people hadn’t tested before,” she says.

    But the researchers found that adversarially trained models only learn to straighten when they are trained for broad tasks, like classifying entire images into categories. Models tasked with segmentation — labeling every pixel in an image as a certain class — did not straighten, even when they were adversarially trained.

    Consistent Classification

    The researchers tested these image classification models by showing them videos. They found that the models which learned more perceptually straight representations tended to correctly classify objects in the videos more consistently.

    “To me, it is amazing that these adversarially trained models, which have never even seen a video and have never been trained on temporal data, still show some amount of straightening,” DuTell says.

    The researchers don’t know exactly what about the adversarial training process enables a computer vision model to straighten, but their results suggest that stronger training schemes cause the models to straighten more, she explains.

    Building off this work, the researchers want to use what they learned to create new training schemes that would explicitly give a model this property. They also want to dig deeper into adversarial training to understand why this process helps a model straighten.

    “From a biological standpoint, adversarial training doesn’t necessarily make sense. It’s not how humans understand the world. There are still a lot of questions about why this training process seems to help models act more like humans,” Harrington says.

    “Understanding the representations learned by deep neural networks is critical to improve properties such as robustness and generalization,” says Bill Lotter, assistant professor at the Dana-Farber Cancer Institute and Harvard Medical School, who was not involved with this research. “Harrington et al. perform an extensive evaluation of how the representations of computer vision models change over time when processing natural videos, showing that the curvature of these trajectories varies widely depending on model architecture, training properties, and task. These findings can inform the development of improved models and also offer insights into biological visual processing.”

    “The paper confirms that straightening natural videos is a fairly unique property displayed by the human visual system. Only adversarially trained networks display it, which provides an interesting connection with another signature of human perception: its robustness to various image transformations, whether natural or artificial,” says Olivier Hénaff, a research scientist at DeepMind, who was not involved with this research. “That even adversarially trained scene segmentation models do not straighten their inputs raises important questions for future work: Do humans parse natural scenes in the same way as computer vision models? How to represent and predict the trajectories of objects in motion while remaining sensitive to their spatial detail? In connecting the straightening hypothesis with other aspects of visual behavior, the paper lays the groundwork for more unified theories of perception.”

    Reference: “Exploring Perceptual Straightness in Learned Visual Representations” by Anne Harrington, Vasha DuTell, Ayush Tewari, Mark Hamilton, Simon Stent, Ruth Rosenholtz and William T. Freeman, ICLR 2023.

    The research is funded, in part, by the Toyota Research Institute, the MIT CSAIL METEOR Fellowship, the National Science Foundation, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator.

