
What condescending nonsense is this? I use all the major LLM systems, mostly with their most expensive models, and when I ask them for sources, including in many cases sources for legal questions, half the time the linked source is not remotely relevant and does not remotely substantiate the claim it is being cited for. Almost never is the output free of an error of some significance. They all still hallucinate very consistently if you push them into areas that are complicated and non-obvious: when they can't figure out an answer, they make one up. The reduction in apparent hallucinations in recent models seems to be more that they've learned specific cases where they should say they don't know, not that the problem has been solved in any broader sense.

This is true for first-party applications as well as for custom integrations where I can explicitly verify that the context is grounding them with all of the relevant facts. It doesn't matter; that isn't enough. You can tell me I'm holding it wrong, but we've consulted with experts from Anthropic and OpenAI and with people who have done major AI integrations at some of the most prominent AI-consuming companies. I'm not holding it wrong. It's just a horribly flawed piece of technology that must be used with extreme thoughtfulness if you want to do anything non-trivial without massive risk.

I remain convinced that the people who can't see the massive flaws in current LLM systems must be negligently incompetent in how they perform their jobs. I use LLMs every day in my work and they are a great help to my productivity, but learning to use them effectively is all about understanding the countless ways in which they fail, the things they cannot be relied on for, and where they actually provide value.

They do provide value for me in legal research, because sometimes they point me in the direction of caselaw or legal considerations that hadn’t occurred to me. But the majority of the time, the vast majority, their summaries are incorrect, and their arguments are invalid.

LLMs are not capable of reasoning that requires non-obvious jumps of logic more than one small step removed from the examples they've seen in their training. If you attempt to use them to reason about a legal situation, you will immediately see them tie themselves in knots, because they are not capable of that kind of reasoning, on top of their inability to actually understand and summarize case documents and statutes accurately.



There's a simpler explanation: they are comparing LLM performance to that of regular humans, not perfection.

Where do you think LLMs learned this behavior from? Go spend time in the academic literature outside of computer science and you will find an endless sea of material with BS citations that don't substantiate the claims being made, entirely made-up claims with no evidence, citations of retracted papers, nonsensical numbers, etc. And that's for papers that take months to write and have numerous coauthors, peer reviewers, and editors involved (theoretically).

Now read some newspapers or magazines and it's the same except the citations are gone.

If an LLM can meet that same level of performance in a few seconds, it's objectively impressive, unless you compare it to a theoretical ideal.



