
> b) Training datasets can only be made by humans.

AI can make training datasets too.

For example, to replace the human-generated material used to train Stable Diffusion, you could couple a random-ish image generator with some sort of image classification AI. As long as you have a good enough classification AI (or even more than one) that tells you what the images are, you can focus on random-ish image generator algorithms that produce training data for another AI, which then learns to generate images from descriptions.

(This obviously involves lots of handwaving and there will be problems that need to be solved, e.g. avoiding 99% of the generated training data being stuff like "noise on noise" while keeping some form of variety :-P. But the point is that AIs generating data for other AIs isn't far-fetched, and you don't need to think in terms of a single AI either.)
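To make the handwaving slightly more concrete, here is a toy sketch in Python of the generator + classifier loop. Everything in it is made up for illustration: `random_image`, `classify` and `build_dataset` are hypothetical names, the "images" are 8x8 grids of 0/1 pixels, and the "classification AI" is a hand-written rule standing in for a real model. It only shows the shape of the pipeline, not how anyone actually does this.

```python
import random
from collections import Counter

SIZE = 8  # toy "images" are SIZE x SIZE grids of 0/1 pixels


def random_image(rng):
    """Random-ish generator: draws a stripe, a 3x3 square, or pure noise."""
    kind = rng.choice(["hstripe", "vstripe", "square", "noise"])
    img = [[0] * SIZE for _ in range(SIZE)]
    if kind == "hstripe":
        img[rng.randrange(SIZE)] = [1] * SIZE
    elif kind == "vstripe":
        col = rng.randrange(SIZE)
        for r in range(SIZE):
            img[r][col] = 1
    elif kind == "square":
        top, left = rng.randrange(SIZE - 3), rng.randrange(SIZE - 3)
        for r in range(top, top + 3):
            for c in range(left, left + 3):
                img[r][c] = 1
    else:
        img = [[rng.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]
    return img


def classify(img):
    """Stand-in for the classification AI: looks only at pixels, not at how
    the image was generated, and returns a text description."""
    lit = sum(map(sum, img))
    full_rows = sum(1 for row in img if all(row))
    full_cols = sum(1 for c in range(SIZE) if all(img[r][c] for r in range(SIZE)))
    if full_rows == 1 and lit == SIZE:
        return "horizontal line"
    if full_cols == 1 and lit == SIZE:
        return "vertical line"
    rows = [r for r in range(SIZE) if any(img[r])]
    cols = [c for c in range(SIZE) if any(img[r][c] for r in range(SIZE))]
    if lit == 9 and len(rows) == 3 and len(cols) == 3 \
            and rows[-1] - rows[0] == 2 and cols[-1] - cols[0] == 2:
        return "square"
    return "noise"


def build_dataset(n_candidates, per_label_cap, seed=0):
    """Generate candidates, label them, and cap each label so the result is
    not dominated by one kind of image (the "noise on noise" problem)."""
    rng = random.Random(seed)
    kept = Counter()
    dataset = []
    for _ in range(n_candidates):
        img = random_image(rng)
        label = classify(img)
        if kept[label] < per_label_cap:
            kept[label] += 1
            dataset.append((label, img))  # (description, image) training pair
    return dataset


if __name__ == "__main__":
    data = build_dataset(n_candidates=500, per_label_cap=20)
    print(Counter(label for label, _ in data))
```

The per-label cap is the crude part that enforces variety; a real version would need something much smarter there, and the resulting (description, image) pairs would be what you feed to the description-to-image AI.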



You can also feed it Google Street View or equivalent for a good start.

(though https://commons.wikimedia.org/wiki/Commons:Freedom_of_panora... may matter here)


I believe Tesla has been doing some form of this to expand their training datasets.


How is the generated data not just worse than already existing data?


this sounds like Artificial Imagination



