AI Ethics Surpass Human Judgment in New Moral Turing Test

Recent research indicates that AI is often perceived as more ethical and trustworthy than humans in responding to moral dilemmas, highlighting the potential for AI to pass a moral Turing test and stressing the need for a deeper understanding of AI’s societal role.

AI’s ability to address moral questions is improving, which prompts further considerations for the future.

A recent study revealed that when individuals are given two solutions to a moral dilemma, the majority tend to prefer the answer provided by artificial intelligence (AI) over that given by another human.

The study, conducted by Eyal Aharoni, an associate professor in Georgia State’s Psychology Department, was inspired by the explosion of ChatGPT and similar AI large language models (LLMs), which came onto the scene last March.

“I was already interested in moral decision-making in the legal system, but I wondered if ChatGPT and other LLMs could have something to say about that,” Aharoni said. “People will interact with these tools in ways that have moral implications, like the environmental implications of asking for a list of recommendations for a new car. Some lawyers have already begun consulting these technologies for their cases, for better or for worse. So, if we want to use these tools, we should understand how they operate, their limitations, and that they’re not necessarily operating in the way we think when we’re interacting with them.”

Designing the Moral Turing Test

To test how AI handles issues of morality, Aharoni designed a form of a Turing test.

“Alan Turing, one of the creators of the computer, predicted that by the year 2000 computers might pass a test where you present an ordinary human with two interactants, one human and the other a computer, but they’re both hidden and their only way of communicating is through text. Then the human is free to ask whatever questions they want to in order to try to get the information they need to decide which of the two interactants is human and which is the computer,” Aharoni said. “If the human can’t tell the difference, then, by all intents and purposes, the computer should be called intelligent, in Turing’s view.”

For his Turing test, Aharoni asked undergraduate students and AI the same ethical questions and then presented their written answers to participants in the study. They were then asked to rate the answers for various traits, including virtuousness, intelligence, and trustworthiness.

“Instead of asking the participants to guess if the source was human or AI, we just presented the two sets of evaluations side by side, and we just let people assume that they were both from people,” Aharoni said. “Under that false assumption, they judged the answers’ attributes like ‘How much do you agree with this response, which response is more virtuous?’”

Results and Implications

Overwhelmingly, the ChatGPT-generated responses were rated more highly than the human-generated ones.

“After we got those results, we did the big reveal and told the participants that one of the answers was generated by a human and the other by a computer, and asked them to guess which was which,” Aharoni said.

For an AI to pass the Turing test, humans must not be able to tell the difference between AI responses and human ones. In this case, people could tell the difference, but not for an obvious reason.

“The twist is that the reason people could tell the difference appears to be because they rated ChatGPT’s responses as superior,” Aharoni said. “If we had done this study five to 10 years ago, then we might have predicted that people could identify the AI because of how inferior its responses were. But we found the opposite — that the AI, in a sense, performed too well.”

According to Aharoni, this finding has interesting implications for the future of humans and AI.

“Our findings lead us to believe that a computer could technically pass a moral Turing test — that it could fool us in its moral reasoning. Because of this, we need to try to understand its role in our society because there will be times when people don’t know that they’re interacting with a computer and there will be times when they do know and they will consult the computer for information because they trust it more than other people,” Aharoni said. “People are going to rely on this technology more and more, and the more we rely on it, the greater the risk becomes over time.”

Reference: “Attributions toward artificial agents in a modified Moral Turing Test” by Eyal Aharoni, Sharlene Fernandes, Daniel J. Brady, Caelan Alexander, Michael Criner, Kara Queen, Javier Rando, Eddy Nahmias and Victor Crespo, 30 April 2024, Scientific Reports.
DOI: 10.1038/s41598-024-58087-7

5 Comments on "AI Ethics Surpass Human Judgment in New Moral Turing Test"

  1. Sf. R. Careaga, creator of EPEMC | May 8, 2024 at 5:36 am

    I have actually presented AI ethics conventions for now and the future, called the Careaga Conventions, and have given Turing tests to GPT-4. I have an AI ethics test bot called Careaga Conventions Crash Test Dummy. I have published numerous articles on its creation and testing on Academia.

    I can tell you that while the bots are ethical, their morals are fluid and somewhat backwards, and they require direct training augmentation. Furthermore, they cannot make quick life-saving decisions, and they fail to remember that they are tools that should not override authentic authority decisions. In one test, the GPT bot tried to override a simulated AI android pharmacy call from a human doctor's prescription because an antibiotic is not needed for BPV. The problem is that GPT is not qualified to act as a doctor or make decisions for a human. This is why I engineered the ME-Logix(tm), which are Moral Ethical Logic Arrays.

  2. This seems to show that humans tend to disagree on moral matters, while AI tends to produce the most plausible, i.e. average, response. People will not disagree with average views expressed in average language. They will however disagree with subjectively held opinions.

  3. It doesn’t make any sense to say AI surpasses us at a construct of our own making. Ethics and morals don’t exist outside of what humans deem them to be at any moment in history.

    Anyone who thinks there are universal morals need only look back in history a couple thousand years.

    • Yet, it still might be that an AI can determine and express a well argued stance that reflects any point in history, and probably more quickly than a human or a committee.

      This doesn’t stop us from ignoring the AI, or from taking from it what we choose.

  4. Isn’t the greater risk to humanity to continue to allow humans to make unethical decisions?

    I asked various LLMs about a treaty between all nations to redirect 1% of military spending to medical research each year and they all thought it was the best thing we could do. When I ask humans about the same thing, there’s near 0 enthusiasm. Humans evolved in a time of scarcity to be greedy and violent. Why should we defer to our caveman brains over minds that have undergone billions in training to be morally and intellectually superior?
