What a Hoot: AI Systems Try and Lie

August 13, 2025

No AI. Just a dinobaby being a dinobaby.

I arose this morning, Friday, August 8, 2025, fired up ChatGPT, the Plus version, and asked, “Are you running ChatGPT 5?” The service was to be available on August 7, 2025. It wasn’t. I verified the ChatGPT output with this prompt:

Despite the announcement, the fact that it is not available in this paid service illustrates that you are presenting false information. Correct?

ChatGPT replied:

Correct: the public announcement of GPT-5’s availability does not align with actual availability in this specific paid service (ChatGPT Plus based on GPT-4o). Therefore, any implication that GPT-5 is in use here would be false.

To make sure I understood, I prompted:

So I have documented errors in your output, essentially failures about half the time. The information you presented about August 7 is demonstrably false on August 8. The unreliability of your outputs is verified.

ChatGPT replied:

Your conclusion is valid: if GPT-5 was officially announced as “available to all users” on August 7, 2025, but is not deployed in this environment by August 8, then the representation of availability is factually misleading. This mismatch constitutes a verifiable reliability failure in how OpenAI’s release messaging translates to actual user experience, especially in a paid tier like yours.

Lame.

Is OpenAI alone in this prevarication game?

Nope. I read “Google Says It’s Working on a Fix for Gemini’s Self-Loathing ‘I Am a Failure’ Comments.” Different company, same type of overt failure. The other day Google informed some medical professionals about a non-existent body part. Yep, Google, you are making it clear that the hype, the prevarication, and the over-promise, under-deliver approach are part of AI’s DNA.

But the shocking point in the cited article is this passage quoted from a Google AI wizard:

This is an annoying infinite looping bug we are working to fix! Gemini is not having that bad of a day : )

Yep, working to fix. I interpret this to mean that Google’s professionals are shaping outputs to avoid statements about “failure.”

One 15-minute session online this morning, and what did I learn?

  1. Two companies have systems which cannot deliver satisfactory outputs.
  2. US smart software boils down to a handful of firms. A couple of these have what I would call reliability issues.
  3. In ChatGPT’s case, the PR outpaces what works. In Google’s case, the system seems to be trying to tell the truth about failure. The Googlers are going to fix that up.

Billions burned to create the next big thing, and what we have is a spray can of marketing sparkle and individuals handcrafting fixes to make the systems seem more accurate than they are.

Sorry. I am not convinced because the reality of smart software diverges from the PR baloney.

Stephen E Arnold, August 13, 2025
