Software That “Works.” Okay, What Does “Work” Mean?

February 9, 2026

Another dinobaby post. No AI unless it is an image. This dinobaby is not Grandma Moses, just Grandpa Arnold.

I don’t want to write about the US government. I don’t want to write about consulting trends. I don’t want to write about AI. Once I wrote about search and retrieval. Now that’s just AI-ized, and it still does not “work” for many work-related processes. Why am I negative? Well, folks, AI-infused search just outputs information that can be wrong. When keyword search can’t find a document a user just created on a laptop connected to the company network, an AI-infused system may not find it either. The AI system could just fabricate it or output a match that is close enough for horseshoes.

Therefore, I perked up when I read “Your Job Is to Deliver Code You Have Proven to Work.” The title sounded a bit like the other dinobabies whom I meet for lunch. Let’s take a look.

The author Simon Willison states:

Your job is to deliver code you have proven to work.

I don’t want to be a Negative Nancy, but my team and I encounter quite a bit of software that does not “work.” The key to understanding Mr. Willison’s point of view and mine is to define “work.” Mr. Willison writes:

We need to deliver code that works—and we need to include proof that it works as well.

Okay, someone somewhere has taken the time to write a tight specification or just waved hands and said, “I need something to do X.” In order to demonstrate that the software works, one has to show the customer, the user, or the other components with which the new code interacts that it outputs what the spec calls for.


Thanks, Venice.ai. Good enough. Where have you heard that before?

Do you spot the flaw? Many “modern” software systems have what I call a “sort of spec.” The idea is that no one has the time or the information to create a detailed specification. The spec is just good enough. What happens?

Here’s a current example. Use ChatGPT in Edge. The smart software will say to click the “plus” or the “icon” to perform a task. Okay, but there is no icon. There is no plus. The reason is that ChatGPT does not display certain controls in Edge. The same weird half-complete implementation surfaces in other smart software. Ever try Comfy or Gemini in Edge? What about Perplexity in the Yandex browser? Who has time for this silliness? Certainly not the programmer / developer. The interfaces are good enough.

The flaw is that “works” is relative. A boss may not look at the code, much less review it. The person could be an MBA from a far-off land who studied at a French graduate school. Excel is about the limit of that individual’s technical expertise. Some managers don’t use the software at all. I have been in meetings for one reason: to demo the software. Why? The boss had no clue how to use the product.

The essay concludes with:

A computer can never be held accountable. That’s your job as the human in the loop. Almost anyone can prompt an LLM to generate a thousand-line patch and submit it for code review. That’s no longer valuable. What’s valuable is contributing code that is proven to work. Next time you submit a PR, make sure you’ve included your evidence that it works as it should.
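As a minimal sketch of what “evidence that it works” can look like in practice, consider a hypothetical spec: “collapse runs of whitespace in a title to single spaces and strip the ends.” The function name and the spec here are illustrative, not from Mr. Willison’s essay; the point is that the PR carries runnable assertions alongside the change:

```python
# Hypothetical spec: "Collapse runs of whitespace in a title to single
# spaces and strip leading/trailing whitespace." The assertions below
# are the kind of proof-of-work a PR could include with the patch.

def normalize_title(title: str) -> str:
    """Collapse internal whitespace and strip the ends of a title."""
    # str.split() with no argument splits on any whitespace run and
    # drops leading/trailing whitespace, so joining with single spaces
    # satisfies the (hypothetical) spec in one step.
    return " ".join(title.split())

# Evidence the code does what the spec says, runnable as-is.
assert normalize_title("  Software   That Works ") == "Software That Works"
assert normalize_title("\tTabs\nand\nnewlines\t") == "Tabs and newlines"
assert normalize_title("") == ""
```

Whether the evidence is a handful of assertions, a pytest run, or a screen capture matters less than the habit: the claim “it works” ships with something a reviewer can execute or inspect.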

I want to point out that some people look for one throat to choke. Not many, I agree, but some are out there. The problem, in my opinion, is the attitude: the commitment or determination to do a job, work to the best of one’s ability, and make sure whatever the spec calls for actually gets delivered. Then check the result on other systems.

Sure, you can use smart software. But at some point yours will be the throat to choke. Why die on the Hill of Ineptitude? “Works” is subjective, but you can avoid immolation by a superior’s flame-thrower-like outputs. Maybe the entire information technology department will burn in that white-hot leadership fire.

Stephen E Arnold, February 9, 2026
