Not sure why you're getting so many downvotes. HN is clearly very pro-AI.

I think you're correct about the training data problem. This is starting to remind me of Asimov's story "The Last Question": "Insufficient data for meaningful answer". In that story the AIs kept progressing, but I think in reality we will be stuck with training data of steadily diminishing quality.

Just consider post-Copilot GitHub. Presumably there is now publicly available code that was generated by an AI. Next time somebody slurps up GitHub to train a new model, some of that code will be included. Overfitting ensues.