
> First of all, training off of data generated by another AI is generally a bad idea because you'll end up with a strictly less accurate model (usually).

That is not true at all.

We have known how to solve this for at least 2 years now.

All the latest state-of-the-art models depend heavily on training on synthetic data.




Key point from your linked paper:

> We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models

No one is training on indiscriminate synthetic data. It's very much discriminated (filtered and curated before training), but still synthetic.
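
To make "discriminated" concrete, here is a minimal sketch of the idea, not anyone's actual pipeline. It assumes a toy setup where every synthetic sample can be checked by a verifier (here, recomputing a sum). The names `generate_candidates`, `verify`, and `filter_synthetic` are hypothetical, and the 30% error rate is an arbitrary choice for illustration.

```python
import random

def generate_candidates(n, seed=0):
    """Hypothetical stand-in for a model: emits addition problems,
    roughly 30% of which carry a deliberately wrong answer."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        answer = a + b
        if rng.random() < 0.3:
            # Corrupt the answer by a nonzero offset so it is really wrong.
            answer += rng.choice([-2, -1, 1, 2])
        samples.append({"a": a, "b": b, "answer": answer})
    return samples

def verify(sample):
    """Hypothetical verifier: recompute the sum and compare."""
    return sample["a"] + sample["b"] == sample["answer"]

def filter_synthetic(samples):
    """Keep only the samples that pass verification."""
    return [s for s in samples if verify(s)]

candidates = generate_candidates(1000)
kept = filter_synthetic(candidates)
print(f"kept {len(kept)} of {len(candidates)} synthetic samples")
```

The training set is still entirely model-generated, but nothing unverified gets through. Real pipelines would presumably use stronger checks (unit tests, reward models, human review), which this toy does not attempt to model.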



