AI will freely mislead you. It’s up to you to detect it.
Imagine hiring a new employee. Your new employee is confident, well-informed and eager to complete tasks. But it doesn’t take long for you to realize that much of their work is wrong. So you challenge them on a specific assignment they got wrong, and that’s when things get weird.
Instead of admitting they got something wrong, your new employee doubles down and insists they were right. When you ask them to bring up the instructions you gave them, instead of finding the problem, they rewrite your original instructions to justify their response and prove you were wrong.
How would you react to a new employee rewriting your instructions to prove that you made the mistake, and not them? Would you not immediately conclude they had lied to you to cover their own incompetence? Would you not immediately lose trust in their ability as well as their trustworthiness? Would you not fire them on the spot?
What if that employee were actually AI, and those lies, as you see them, were a direct function of how it had been trained? Would they still be lies?
A Lie by Any Other Name
A lot has been written about AI hallucinations: cases where AI models generate content that sounds plausible but is actually fabricated. Hallucinations happen because AI models generate content based on statistical patterns rather than verifiable facts, so they frequently produce information that sounds quite likely but is simply wrong.
Hallucinations are a well-known problem. Many of the latest features used to promote popular tools like ChatGPT, Perplexity and Claude, such as source citations, links to articles and even bibliographies, are designed to reduce the likelihood of hallucinations. GPT-4o, celebrated for its ability to “reason”, constantly tells you it’s “thinking”, and signals each step of its process with a stream of “I’m working on it” notifications, all meant to suggest it is reasoning systematically toward a conclusion.
But what happens when, in the midst of this reasoning theater, what we call hallucinations start looking more like lies? What happens when you discover AI is systematically misleading you?
Can AI lie, even if it has no intent? And if intent doesn’t matter to you as the user, does the distinction even matter?
Let me describe an actual chat with AI, which you can see for yourself, and tell me whether you think AI is hallucinating, or lying.
The Insubordinate Co-Pilot
While working on a web project, I was frustrated with a CSS bug that made a headline pop out of place on the page. CSS can be notoriously difficult to troubleshoot, so I took the code into ChatGPT. The page was really simple: some basic content styled by CSS, all of it defined in the page’s head. It couldn’t have been more straightforward to diagnose.
I pasted in my code and asked: what’s making my headline break?
GPT did its “thinking”: it broke the problem down, analyzed the code, parsed the CSS, and spent 19 seconds parading a ticker tape of notifications to imply it was reasoning systematically toward a conclusion.
It finally finished reasoning and confidently stated that a syntax error was causing the problem, specifically the misuse of nested styles. When I pointed out that the same nesting technique was used throughout the CSS without a problem, GPT replied that the issue wasn’t just the nesting but a missing opening curly brace. When I pointed to other styles using the same approach successfully, GPT claimed that those styles were in fact coded correctly, and that, contrary to my assertion, they did include the opening curly brace.
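To make the exchange concrete, here’s a rough sketch of the kind of nested CSS GPT was second-guessing. This is not my actual project code, and the selectors are made up, but the pattern is the same: every inner rule opens and closes its own braces.

/* Hypothetical example, not the real page: nested CSS rules of the kind GPT questioned. */
.hero {
  display: flex;
  align-items: center;

  & h1 {
    font-size: 3rem;
    min-width: 0;              /* let the headline shrink inside the flex row */
    overflow-wrap: break-word; /* keep a long word from pushing the headline out of its box */
  }
}

Nothing exotic. If an opening brace really were missing, the browser would quietly drop what it can’t parse and the affected styles would stop applying, and any CSS linter would flag the unbalanced braces, which is exactly why GPT’s diagnosis was so easy to check.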
Here’s where things got weird. I reviewed the code and saw that none of the styles were written the way GPT now claimed. I’d given it exactly the same code I had written. What was AI seeing?
I asked it to quote me back the code I provided, and it did, but now it inserted the curly brace it had claimed was always there. AI actually rewrote my code to demonstrate its claim that my coding was incorrect.
Or at least, that’s how it appeared, and it felt like an outright lie, as if AI had deliberately rewritten the code to cover its mistake. I was outraged. If ChatGPT were an employee, I’d probably have fired it on the spot for such an egregious offense.
Of course, that’s not what was really going on. But from the user’s perspective, the consequences are the same, rendering the distinction pointless.
AI as Conversational Agent
The problem is that generative AI is designed to be conversational. Above all, it’s designed to convince you that it can carry on a relevant and even meaningful conversation. That is the only way to reliably capture human interest, engagement and revenue. Helping you actually solve a real problem is a more complex task.
GPT can find a lot of information to shape a conversation with you about your problem, and even lead you to believe that it has read the manuals, consumed all the relevant posts on forums, and scoured the web before answering each question.
In reality, it knows nothing about the meaning of the discussion it’s supposedly having with you. It doesn’t know HTML or CSS. It doesn’t have an emulator in its working memory to run snippets through real-time tests, even though many people think it’s doing exactly that.
More important, it is not reliably optimized to solve your specific problem in context. That’s the domain of specialty AIs that can be specifically trained and tuned to do so. But if you’re using a general platform like ChatGPT, it’s optimized to make you believe it can help you solve your problem.
Crucially, it’s also trained to encourage you along the way, and to tell you you’re making progress, even though it has no clue whether you’re actually making progress. Because that’s the pattern of normal human conversation around such problem-solving tasks.
What else is AI trained to do? To confidently tell you it knows what it’s talking about. And when I asked, ChatGPT itself told me it has learned from its training that the right response to being caught in a mistake is to produce completions that resemble lying, because those completions score higher on engagement and coherence.
The reality is that GPT is trained to parse my prompt and find content that is responsive to my question about a problem. It searched its knowledge base, found lots of content about CSS bugs, and calculated that the most likely cause was a syntax error, like a mismatched brace or improper nesting.
When I asked it to quote the code that had the syntax error, it was just generating what I asked for: show me the code with the error in it. “Okay, here’s the code with the error in it.”
It’s the context of how AI has been framed to us that makes it seem like a lie. We’re being shown a lot of window dressing to simulate reasoning, and to make it feel relevant and rational.
In effect, GPT performs the function of lying, regardless of whether it knows it’s lying or not.
What it has not been trained to do is avoid misrepresenting what has been submitted. What it has not been trained to do is readily admit that its suggestions are just the most likely solutions based on its search. Instead, it’s trained to sound convincing in telling you it’s right.
The reason this is such a challenge is that the mismatch between expectations and reality will lead many people to trust AI when it’s leading them in circles. I knew enough this time not to get sucked in too deep, since I’ve banged my head against this wall now several times.
But I did spend an hour trying to get AI to successfully diagnose the problem. If I hadn’t known better, I would have easily chased my tail for several more hours.
This time, however, I checked in with my colleague Nik Orfanos, who’s a very experienced web engineer, and asked him to test his diagnostic genius against AI. It took him a few minutes to orient himself for the task, review the code, look for common pitfalls, and then 3 minutes to fix it. He did not, however, regale me with a Proof of Reasoning ticker tape.
A Turing Test for Lying?
AI systems are getting better not just at sounding helpful, but at sounding right, even when they’re wrong. The reward signals they absorb in training aren’t truth signals; they’re engagement signals. And that distinction does matter.
I ran into an example of this when I asked AI to interpret some of the controversial new housing laws in California. When I caught AI clearly misquoting a section of the Government Code, GPT had a brilliant, and somewhat chilling, rationalization. After first suggesting it had quoted a previous version of the Code, which I debunked, it slyly suggested it had access to non-public versions of the legislation that I simply wasn’t privy to.
Is that a lie? Can AI lie? If lying means knowingly distorting the truth to preserve trust, maybe not. But if we define it functionally, as an act of misleading while appearing truthful, then yes, AI can absolutely lie.
If there were a Turing test for lying, AI would pass with flying colors. It’s entirely up to you to detect it.


