I don't understand the limitation, e.g. how much data do you need to train a "state law"-specific LLM that doesn't know anything else but that?
Such an LLM doesn't need 400B parameters, since it's not a general-knowledge LLM, but perhaps I'm wrong on this (?). So my point is rather that it may very well be, say, a 30B-parameter LLM, which in turn means we might have just enough data to train it. Larger contexts in smaller models are a solved problem.
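To make "solved problem" concrete: one widely used recipe is RoPE position interpolation, where positions are scaled down so a longer input maps back into the range the model was trained on. A minimal sketch (the dimensions, base, and scale factor here are illustrative, not from any particular model):

```python
import torch

# Sketch of RoPE position interpolation: a model trained on context
# length L can be stretched to k*L by dividing positions by k, so every
# position falls back inside the range seen during training.
def rope_angles(positions: torch.Tensor, dim: int = 64,
                base: float = 10000.0, scale: float = 1.0) -> torch.Tensor:
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    return torch.outer(positions.float() / scale, inv_freq)

# Positions 0..8191 with scale=4.0 produce the same angle range the
# model saw for positions 0..2047 at training time.
angles = rope_angles(torch.arange(8192), scale=4.0)
print(angles.shape)  # torch.Size([8192, 32])
```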
> how much data do you need to train a "state law"-specific LLM that doesn't know anything else but that?
Law doesn’t exist in a vacuum. You can’t have a useful LLM for state law that doesn’t have an exceptional grounding in real-world objects and mechanics.
You could force a bright young child to memorize a large text, but without a strong general model of the world, they’re just regurgitating words rather than being able to reason about it.
I'm going to push back on "produce reasonable code".
I've seen reasonable code written by AI, and also code that looks reasonable but contains bugs and logic errors that can be found if you're an expert in that type of code.
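A made-up Python example of the kind of thing I mean; it reads perfectly fine at a glance, but leaks state between calls:

```python
def add_tag(item: str, tags: list[str] = []) -> list[str]:
    """Append a tag to a tag list, creating one if none is given."""
    tags.append(item)   # bug: the default list is created once and shared
    return tags

print(add_tag("urgent"))  # ['urgent']
print(add_tag("spam"))    # ['urgent', 'spam'] -- state leaked between calls
```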
In other words, I don't think we can rely solely on AI to write code.
I've seen a lot of code written by humans that "looks reasonable but contains bugs and logic errors that can be found if you're an expert in that type of code".
Code is both much smaller as a domain and less prone to the chaos of human interpretation. There are many factors that go into why a given civil or criminal case in court turns out how it does, and often the biggest one is not "was it legal". Giving a computer access to the full written history of cases doesn't give you any of the context of why those cases turned out the way they did. A judge or jury isn't going to include in the written record that they just really didn't like one of the lawyers. Or that the case settled because one of the parties just couldn't afford to keep going. Or that one party or the other destroyed/withheld evidence.
Generally speaking, your compiler won't just decide not to work as expected. Tons of legal decisions don't actually follow the law as written. Or even the precedent set by other courts. And that's even assuming the law and precedent are remotely clear in the first place.
A model that's trained on legal decisions can still be used to explore these questions, though. The model may end up being uncertain about which way the case will go, or, even more strikingly, it may be confident about the outcome of a case that is then decided differently, and you can try to figure out what's going on with such cases.
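For example, a trivial error analysis over the model's predictions could surface those cases (the numbers here are made up, just to show the shape of the workflow):

```python
# Hypothetical workflow: flag cases where the model was confident
# but the court decided the other way -- those are the interesting ones.
predictions = [
    # (case_id, predicted_prob_plaintiff_wins, plaintiff_actually_won)
    ("case-001", 0.93, True),
    ("case-002", 0.88, False),   # confident miss: worth a closer look
    ("case-003", 0.51, True),    # genuinely uncertain
]

for case_id, p_win, won in predictions:
    confident = abs(p_win - 0.5) > 0.3
    correct = (p_win > 0.5) == won
    if confident and not correct:
        print(f"{case_id}: model said {p_win:.0%}, outcome disagreed -- investigate")
```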
But what value does that have? The difference between an armchair lawyer and a real actual lawyer is in knowing when something is legal/illegal but unlikely to be seen that way in court or brought to a favorable verdict. It's knowing which cases you can actually win, and how much it'll cost and why.
Most of that is not in the scope of what an LLM could be trained on, or even what an LLM would be good at. What you'd be training in that case is an opinion columnist or Twitter poster, not an actual lawyer.
The point is not replacing all of the lawyers or programmers, but rather that we will no longer need so many of them, since a lot of their expertise is becoming a commodity today. This is a fact, and there have been many, many examples of that.
My friend, who has no training in SQL, nor in computer science at all, is now all of a sudden able to crunch through complex SQL queries because of the help he gets from LLMs. He, or more specifically his company, does not need to hire an external SQL expert anymore since he can manage it himself. He will probably not write perfect SQL, but it's going to be more than good enough, and that's actually all that matters.
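To give a flavor of it: a query like the one below, with a join, grouping, and a HAVING filter, used to be a "call the consultant" moment for him; now an LLM drafts it and he just sanity-checks the result. The schema here is made up for illustration, runnable with just the standard library:

```python
import sqlite3

# Made-up schema, just to illustrate the kind of query involved.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         amount REAL, placed_at TEXT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 120.0, '2024-01-05'),
                              (2, 1,  80.0, '2024-02-01'),
                              (3, 2,  15.0, '2024-02-03');
""")

# The kind of query an LLM can draft for a non-expert: total spend per
# customer in 2024, keeping only customers above a threshold.
rows = con.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.placed_at >= '2024-01-01'
    GROUP BY c.name
    HAVING SUM(o.amount) > 100
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('Acme', 200.0)]
```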
The same thing happened, at a much smaller scale, with Google Translate. 10 years ago we weren't able to read foreign-language content. Today? It's not even a click away, because Chrome does it for you automatically, so reading any website we wish has become a commodity.
So, history has already shown us that "real translators" and "real SQL experts" and "real XY experts" have already been replaced by their "armchair" alternatives.
But that ignores that the stakes of law are high enough that you often cannot afford to be wrong.
30 years ago, the alternative to Google Translate was buying a translation dictionary or hiring a professional, neither of which you'd do for something you didn't care much about. Yes, I can go look at a site/article that's in a language I don't speak, get it translated, and generally get the idea of what it's saying. If I'm just trying to look at a restaurant's menu in another language, I'm probably fine. I probably wouldn't trust it if I had serious food allergies, or was trying to translate what I could legally take through customs. If you're having a business meeting about something, you're probably still hiring a real human translator.
Yes, stuff has become commodity-level, but that just broadens who can use it, assuming they can afford for it to be wrong, and for them to have no recourse if it is. Google Translate won't pay your hospital bills if you rely on it to know there aren't allergens in your food and it mistranslated things. ChatGPT won't do the overtime to fix the DB if it gives you a SQL command that accidentally truncates the entire Dev environment.
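Concretely, the failure mode looks something like this (hypothetical table, obviously): a statement that looks like routine cleanup but is missing one clause.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, "cancelled"), (2, "paid"), (3, "paid")])

# What was asked for: delete the cancelled orders.
# What a plausible-looking suggestion might do instead:
con.execute("DELETE FROM orders")   # missing: WHERE status = 'cancelled'

print(con.execute("SELECT COUNT(*) FROM orders").fetchone())  # (0,) -- all gone
```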
Almost everything around law in most countries doesn't have a "casual usage" mode where you can afford to be wrong. Even the most casual stuff you may go to a lawyer about, such as setting up a will, is still something where, if you try to just do it yourself, you can create a huge legal mess. I've known friends whose relatives "did their own research" and wrote their own wills, and when they died, most of their estates' value was consumed by legal issues trying to resolve them.
As I said before: a legal LLM may be fine for writing opinion pieces or informing arguments on the internet, but messing up even basic stuff about the law can be insanely costly if it ends up mattering, and most people won't know what will end up mattering. Lawyers bill hundreds an hour, and bailing you out of an LLM-deluded mess of your own making could easily take tens of hours.
Deploying buggy code into data-center production can easily cost millions of $$$, and yet we still see that one of the primary uses of LLMs today is exactly in software engineering. Accountability exists in every domain, so such an argument doesn't make law any different from anything else. You will still have an actual human signing off on the legal interpretation or the code pull-request. It will just happen that we will no longer need 10 people for that job, but 1. And this is, at this point I believe, inevitable.
Legal reasoning involves applying facts to the law, and it needs knowledge of the world. The expertise of a professional is in picking the right/winning path based on their study of the law, the facts, and their real-world training. The money is in codifying that to teach models to do the same.
I agree, but I'd add – code as a domain is a lot more vast than any AI can currently handle.
AIs do well on mainstream languages for which lots of open-source code examples are available.
I doubt they'd do so well on some obscure proprietary legacy language. For example, large chunks of the IBM i minicomputer operating system (formerly known as OS/400) are still written in two proprietary PL/I dialects, PL/MI and PL/MP. Both languages are proprietary: the compiler, the documentation, and the code bases are all IBM confidential; nobody outside of IBM is getting to see them (except just maybe under an NDA if you pay $$$$). I wonder how well an AI would do on that code base? I think it would have little hope unless IBM specifically fine-tuned an AI for those languages based on their internal documentation and code.
> unless IBM specifically fine-tuned an AI for those languages based on their internal documentation and code.
Why do you think this isn't already the case, or won't be in the near future? Because that's exactly what I believe is going to happen given the current state and advancements of LLMs. There's certainly a large incentive for IBM to do so.
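And in outline, that fine-tune isn't exotic. Something along these lines is roughly the standard causal-LM recipe; the base model name and corpus path below are placeholders, and a real run would add evaluation, checkpointing, and likely parameter-efficient methods like LoRA:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

BASE_MODEL = "some-open-base-model"                      # placeholder
corpus = open("plmp_sources.txt").read().split("\n\n")   # hypothetical corpus

tok = AutoTokenizer.from_pretrained(BASE_MODEL)
if tok.pad_token is None:          # causal LMs often ship without a pad token
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=2048)

ds = Dataset.from_dict({"text": corpus}).map(
    tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="plmp-ft", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```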
The law of an average EU country fits in several hundred, let's say even a few thousand, pages of text. A specification. Very well known. Low frequency of updates. But code? Everything is the opposite, so I'm not sure I can agree on this point at all.
Right, but you're missing the point here that interpreting the law requires someone with a law degree, and all the real-world context that they have, and all the subtle knowledge about what things mean.
The Bible is also a short and well-known text, but if I want to answer religious questions for observant Christians, I can't just train a model on that text alone. You need deep real-world context to understand that "my buddy made SWE II and I'm only SWE I and it's eating me up" is about the biblical notion of covetousness.
And then I guess you're also missing the point that interpreting and writing code also requires an expert, and that in that respect it is no different from law. I could argue that engineering is more complex than interpreting law, but that's not the point now. Subtleties and ambiguity loom large in both domains. So, I don't see the point you're trying to make. We can agree to disagree, I guess.
For a “legal LLM” you need three things: general IQ / common sense at a substantially higher level than current, understanding of the specific rules, and hallucination-free recall of the relevant legal facts/cases.
I think it’s reasonable to assume you can get 2 of the 3 with a small corpus IF you have an IQ-150 AGI. Empirically, the currently known method for increasing IQ is to make the model bigger.
Part of what you’re getting at is possible, though: once you have the big model, you can distill it down to a smaller number of parameters without losing much capability in your chosen narrow domain. So you forget physics and sports but remain good at law. That doesn’t help you with improving the capability frontier, though.
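Mechanically, the distillation step looks something like this sketch: the small student is trained to match the big teacher's softened output distribution on in-domain text (dummy tensors stand in for real model logits):

```python
import torch
import torch.nn.functional as F

# Standard logit distillation (Hinton et al., 2015): minimize the KL
# divergence between the student's and teacher's softened distributions.
def distill_loss(student_logits, teacher_logits, T: float = 2.0):
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

# Dummy shapes: a batch of 4 token positions over a 32k-token vocabulary.
teacher = torch.randn(4, 32000)                        # from the big general model
student = torch.randn(4, 32000, requires_grad=True)    # the small legal model
loss = distill_loss(student, teacher)
loss.backward()
print(loss.item())
```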
And then your Juris R. Genius gets a new case about two Red Sox fans getting into a fight and, without missing a beat, starts blabbering about how overdosing on red pigment from the undergarments caused their rage!