In a groundbreaking new study, researchers at UC San Diego tested the latest generation of AI—called large language models (LLMs)—in a true-to-form version of the Turing Test. The results were eye-opening. Not only did some AI systems fool people into thinking they were human, one AI was actually more convincing than the real humans it was being compared to.
What Was Tested
The researchers evaluated four different AI systems: a very old chatbot called ELIZA, OpenAI’s GPT-4o and GPT-4.5, and Meta’s LLaMa-3.1. Two types of experiments were conducted. In each one, a participant had two simultaneous chat conversations—one with a real person, and one with an AI. After five minutes, the participant had to decide which of their chat partners was human.
This setup is the classic three-person Turing Test: human vs. AI, with a judge in the middle trying to tell them apart.
But there was a twist. Some of the AI systems were given specific instructions to pretend to be a certain kind of person—for example, a shy young adult who enjoys internet culture and uses slang. These were called persona prompts. Others received only basic instructions to try to sound human.
How Did the AI Do?
The results were striking. When GPT-4.5 was given a persona prompt, it fooled people 73% of the time. In fact, it was more likely to be mistaken for a human than the actual human it was chatting against! LLaMa-3.1, with the same kind of persona, was mistaken for a human 56% of the time—about even with the real humans it was paired with.
On the other hand, when these same models weren’t given a persona—just told to act human without much guidance—they didn’t do as well. They were correctly identified as AI more often than not, especially by the undergraduate students in the study.
The older models, like GPT-4o and ELIZA, fared much worse. They were rarely mistaken for humans, scoring around 20%—well below the 50% you’d expect by random guessing. This showed that the judges could tell when they were speaking with a weak AI, and the test wasn’t too easy.
Why It Matters
So, does this mean AI is now intelligent? Not exactly. What this shows is that advanced AIs can convincingly imitate human behavior in short, casual conversations. But it doesn’t mean they understand the world like we do, have emotions, or are conscious. What they’ve mastered is something more subtle and perhaps more dangerous: deception.
These AIs were able to blend in during conversations so well that people couldn’t tell they were fake. This has real-world implications. If AI can pass for human in everyday chats, it could be used to impersonate people, manipulate opinions, or even carry out scams—what some experts are calling “counterfeit people.”
Reference:
Jones, Cameron R., and Benjamin K. Bergen. Large Language Models Pass the Turing Test. Department of Cognitive Science, UC San Diego, 2025. https://osf.io/jk7bw.



Leave a Reply