Yes, I understand. And sorry to hear that. But I'm trying to understand how it is related to AI. How come this is happening with AI crawlers but not with traditional web index crawlers? If the pattern is so common (which is confirmed by multiple credible sources), there must be some interesting and potentially useful explanation.
Search engines link to websites. They want the websites up, so it's worth a little extra work to avoid harming them. LLMs seek to replace the websites.
Search engine crawlers are more mature and better written.
I suspect a lot of LLM crawling and development is done under time pressure, to get things done while the investors' money is still coming in to fund it. Do stuff in a hurry, and it will be less competently done.
Ordinary search indices don't contain the entire target site, while LLM-style so-called AI does consume it all. I would guess some of these crawlers are subcontractors rather than "AI" companies, i.e. they compete on having the most complete and fresh dataset you could rent for "training".
Whenever the market decides the Internet is too full of slop to be usable for "training", the one that has the most copies of the pre-"AI" Internet wins. Some of the traffic is likely "AI" "tool use", i.e. bots scraping as part of running some LLM, as in "AI" "research".
The big scraping bots have gone from stupid to ruthless. Previously it was irritating that some of them got stuck traversing cyclical link paths on your site or on-the-fly generated pages. Now it's like your silly family blog suddenly got very popular for no good reason, and it puts a lot of load on the tiny amount of hardware it's served from.
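For anyone wondering what "not stupid and not ruthless" might look like in practice, here is a rough sketch, not a description of how any real crawler works. It assumes a single host, a caller-supplied `fetch_links` function (hypothetical, standing in for the actual HTTP fetch and HTML parsing), and robots.txt rules passed in as lines. The function name `polite_crawl`, the `example-bot` user agent, and the default delay are made up for illustration. A set of seen URLs avoids the cyclical-link trap, and a delay between requests keeps the load on small servers low.

```python
import time
from collections import deque
from urllib.parse import urldefrag, urlparse
from urllib.robotparser import RobotFileParser


def polite_crawl(start_url, fetch_links, robots_lines, user_agent="example-bot",
                 max_pages=100, delay_seconds=1.0, sleep=time.sleep):
    """Breadth-first crawl of one host that visits each URL at most once.

    fetch_links(url) is a hypothetical callable returning the absolute
    links found on a page; robots_lines are the lines of robots.txt.
    """
    robots = RobotFileParser()
    robots.parse(robots_lines)  # rules supplied by the caller, no network here
    host = urlparse(start_url).netloc

    seen = {start_url}          # every URL ever queued, so cycles terminate
    queue = deque([start_url])
    visited = []                # URLs actually fetched, in order

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue            # respect Disallow rules
        if visited:
            sleep(delay_seconds)  # pause between requests to limit load
        visited.append(url)
        for link in fetch_links(url):
            link, _fragment = urldefrag(link)  # "/b#top" and "/b" are one page
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

The page cap and the delay are the parts that matter to a tiny server: even a crawler that wants the whole site can take it slowly and stop re-fetching pages it has already seen.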