I think your post is fundamentally wrong. You're comparing AI responses with written code, which may not be a fair comparison. A better comparison would be code generated by AI vs. code written by an engineer.
The original author seems to view the AI application as itself a software application, one with desired or undesired, and predictable or unpredictable, behaviors. That doesn't seem like an invalid thing to talk about merely because there are other software-related conversations we could have about AIs (or other code-quality-related conversations).
I think you misunderstood my post? I'm comparing AI as a system vs. written code as a system. Both systems can have flaws, but the ways in which they fail are different. The danger comes when non-technical people try to apply intuitions about software failures to AI, because those intuitions break down when applied to AI.