Stack Exchange data really is the world's best open Q&A dataset. Far cleaner and more reliable than anything else.
But LLM trainers are going to use it no matter what. It’s not like they care about copyright or licenses.