> b) Training datasets can only be made by humans.
AI can make training datasets too.
For example, to replace the human-generated data used for Stable Diffusion, you could couple some random-ish image generator with some sort of image classification AI. As long as you have a good enough classification AI (or even more than one) that tells you what the images are, you can focus on the random-ish image generator algorithms and use their labeled output as training data for another AI that generates images from descriptions.
(This is obviously with lots of handwaving, and there will be problems that need to be solved, e.g. avoiding 99% of the generated training data being stuff like "noise on noise" and getting some form of variety instead :-P. But the point is that AIs generating data for other AIs isn't far-fetched, and you don't need to think in terms of a single AI either.)
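To make the handwaving a bit more concrete, here is a toy Python sketch of that loop: a random-ish generator proposes images, a classifier describes them, and a filter throws away the "noise" and keeps the labels balanced. Everything in it is made up for illustration: `random_image`, `classify` and `build_dataset` are hypothetical names, the images are tiny 8x8 bitmaps, and the "classifier" is a hand-written heuristic standing in for a real classification AI.

```python
import random

SIZE = 8  # tiny black/white images, just for illustration


def random_image(rng):
    """Random-ish generator: pick a pattern family, then randomize it."""
    kind = rng.choice(["noise", "hstripes", "vstripes", "rect"])
    if kind == "noise":
        return [[rng.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]
    if kind == "hstripes":
        period = rng.randint(1, 3)
        return [[(y // period) % 2] * SIZE for y in range(SIZE)]
    if kind == "vstripes":
        period = rng.randint(1, 3)
        return [[(x // period) % 2 for x in range(SIZE)] for _ in range(SIZE)]
    # Filled rectangle with a random position and size.
    x0, y0 = rng.randint(0, 3), rng.randint(0, 3)
    x1, y1 = rng.randint(x0 + 2, SIZE - 1), rng.randint(y0 + 2, SIZE - 1)
    return [[1 if x0 <= x <= x1 and y0 <= y <= y1 else 0
             for x in range(SIZE)] for y in range(SIZE)]


def classify(img):
    """Stand-in for the classification AI (assumption: a real model goes here).

    It only looks at the pixels, not at how the image was generated.
    """
    rows_uniform = all(len(set(row)) == 1 for row in img)
    cols_uniform = all(len({img[y][x] for y in range(SIZE)}) == 1
                       for x in range(SIZE))
    if rows_uniform and cols_uniform:
        return "blank"
    if rows_uniform:
        return "horizontal stripes"
    if cols_uniform:
        return "vertical stripes"
    ones = [(x, y) for y in range(SIZE) for x in range(SIZE) if img[y][x]]
    xs = [x for x, _ in ones]
    ys = [y for _, y in ones]
    box_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    # The set pixels exactly fill their bounding box: a solid rectangle.
    return "rectangle" if box_area == len(ones) else "noise"


def build_dataset(n_per_label, seed=0, max_attempts=10_000):
    """Collect (image, description) pairs, dropping noise and capping each label."""
    rng = random.Random(seed)
    counts = {"horizontal stripes": 0, "vertical stripes": 0, "rectangle": 0}
    dataset = []
    for _ in range(max_attempts):
        if all(c >= n_per_label for c in counts.values()):
            break
        img = random_image(rng)
        label = classify(img)
        # Skip "noise"/"blank" and any label that already has enough samples.
        if label not in counts or counts[label] >= n_per_label:
            continue
        counts[label] += 1
        dataset.append((img, label))
    return dataset


if __name__ == "__main__":
    for img, label in build_dataset(2):
        print(label)
        print("\n".join("".join("#" if p else "." for p in row) for row in img))
```

The resulting (image, description) pairs are what the second, image-generating AI would train on. The per-label cap is the crude answer to the "99% noise" problem here; a real setup would need something much smarter on both the generator and the classifier side.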