
Code has always been nondeterministic. Which engineer wrote it? What was their past experience? This just feels like we are accepting subpar quality because we have no good way to ensure the code we generate is reasonable and won't mayyyybe rm -rf our server as a fun easter egg.

Code written by humans has always been nondeterministic, but generated code has always been deterministic before now. Dealing with nondeterministically generated code is new.

> generated code has always been deterministic

Technically you are right… but in practice no. Ask an LLM to do any reasonably complex task and you will get different results. This is because the model changes periodically and we have no control over the host system's source of entropy. It's effectively non-deterministic.


Determinism vs. nondeterminism is not, and has never been, an issue. Also, all LLMs are 100% deterministic; what is non-deterministic are the sampling parameters used by the inference engine, which by the way can easily be made 100% deterministic by simply turning off things like batching. This only matters for cloud-based API providers, since you as the end user don't have access to the inference engine; if you run any of your models locally in llama.cpp, turning off some server startup flags will get you deterministic results. Cloud-based API providers have no choice but to keep batching on, as they are serving millions of users and wasting precious VRAM slots on a single user is wasteful and stupid. See my code and video as evidence if you want to run any local LLM 100% deterministically: https://youtu.be/EyE5BrUut2o?t=1
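
For what it's worth, here is a minimal sketch of the same idea (this is not the code from the video; it assumes the llama-cpp-python bindings and a hypothetical local GGUF model at ./model.gguf):

    from llama_cpp import Llama

    def run(prompt: str) -> str:
        # a fresh single-user instance with a fixed RNG seed and no server-side batching
        llm = Llama(model_path="./model.gguf", seed=42, n_ctx=2048, verbose=False)
        out = llm(prompt, temperature=0.7, max_tokens=64)
        return out["choices"][0]["text"]

    a = run("Write a haiku about determinism.")
    b = run("Write a haiku about determinism.")
    print(a == b)  # same seed, same single-sequence setup: expect identical text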

That's not an interesting difference, from my point of view. The black box we all use is non-deterministic, period. It doesn't matter where on the inside the system stops being deterministic: if I hit the black box twice, I get two different replies. And that doesn't even matter, which you also said.

The more important property is that, unlike compilers, type checkers, linters, verifiers and tests, the output is unreliable. It comes with no guarantees.

One could be pedantic and argue that bugs affect all of the above. Or that cosmic rays make everything unreliable. Or that people are non-deterministic. All true, but the rate of failure, measured in orders of magnitude, is vastly different.


My man, did you even check my video? Did you even try the app? This is not "bug related"; nowhere did I say it was a bug. Batch processing is a FEATURE that is intentionally turned on in the inference engine by large-scale providers. That does not mean it has to be on. If they turn off batch processing, all LLM API calls will be 100% deterministic, but it will cost them more money to provide the service, as now you are stuck providing one API call per GPU. "If I hit the black box twice, I get two different replies" is 100% verifiably wrong. Just because someone chose to turn on a feature in the inference engine to save money does not mean LLMs are non-deterministic. LLMs are stateless: their weights are frozen, you never "run" an LLM, you can only sample it, just like a hologram. And the inference sampling settings you use are what determines the outcome...

Correct me if I'm wrong, but even with batch processing turned off, they are still only deterministic as long as you set the temperature to zero? Which also has the side-effect of decreasing creativity. But maybe there's a way to pass in a seed for the pseudo-random generator and restore determinism in this case as well. Determinism, in the sense of reproducible. But even if so, "determinism" means more than just mechanical reproducibility for most people - including parent, if you read their comment carefully. What they mean is: in some important way predictable for us humans. I.e. no completely WTF surprises, as LLMs are prone to produce once in a while, regardless of batch processing and temperature settings.
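
For the "pass in a seed" part specifically, here's a toy illustration of reproducibility in that mechanical sense (pure NumPy, not any real inference engine): with a fixed seed, sampling at temperature > 0 is still reproducible.

    import numpy as np

    def sample(logits, temperature, seed, n=5):
        # temperature sampling with an explicit seed: same seed -> same draws
        rng = np.random.default_rng(seed)
        probs = np.exp(np.asarray(logits) / temperature)
        probs /= probs.sum()
        return rng.choice(len(probs), size=n, p=probs)

    logits = [2.0, 1.0, 0.5, 0.1]
    a = sample(logits, temperature=0.8, seed=123)
    b = sample(logits, temperature=0.8, seed=123)
    print(np.array_equal(a, b))  # True: temperature > 0, yet fully reproducible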

You can change ANY sampling parameter once batch processing is off and you will keep the deterministic behavior: temperature, repetition penalty, etc. I've got to say I'm a bit disappointed to see this on Hacker News, as I'd expect it from Reddit. The whole matter is handed to you on a silver platter: the video describes in detail how any sampling parameter can be used, and I provide the whole code open source so anyone can try it themselves without taking my claims as hearsay. Well, you can lead a horse to water, as they say...

It’s called TDD: ya write a bunch of little tests to make sure your code is doing what it needs to do and not what it’s not. In short, little blocks of easily verifiable code to verify your code.
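Something like this, as a toy sketch (the function and its tests are made up for illustration):

    def slugify(title: str) -> str:
        # the code under test: turn a title into a URL slug
        return "-".join(title.lower().split())

    def test_slugify_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"

    def test_slugify_collapses_extra_whitespace():
        assert slugify("  Hello   World ") == "hello-world"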

But seriously, what is this article even? It feels like we are reinventing the wheel or maybe just humble AI hype?


Hi Stephen, is that you?

Jokes and sales pitches aside, we kinda have that already: we have platforms that allow us to run the same code on x86, ARM, WASM… and so on. It’s just that there is no consensus on which platform to use. Nor should there be, since that would slow the progress of new and better ways to program.

We will never have one language to span graphics, full stack, embedded, scientific, high performance, etc without excessive bloat.


Many VCs today would vouch that that would be agentic AI.

No. I work in VC-backed startups. Externally they might say that for investors, but talk to a software engineer. Agentic AI is not working; it will regurgitate code from examples online, but if you get it to do something complex or slightly novel, it falls flat on its face.

Maybe it will someday be good enough, but not today, and probably not for at least 5 years.


Meanwhile there are people delivering solutions in iPaaS tools, which is quite far from the traditional programming that gets discussed on HN.

Some of those tools aren't fully there yet, but they also aren't completely dumb; they get more done in a day than trying to build the same workflows with classical programming.

Workato, Boomi, Powerapps, Opal,...


How is that relevant to this thread?

Those are high-level programming tools, nowadays backed by AI, being adopted by the big corps; some are sponsored with VC money, which you are sceptical about.

Well good luck to them. I don’t want to be stuck without a job because the bubble burst along with all the capital my job depends on.

I’m not sure about that. I used to use LabView and its various libraries often. The whole thing felt scattered and ossified. I’d take the Python standard library any day.

Yet most EE engineers would rather use a graphical tool like LabView or Simulink.

Not everyone is keen on scripting from the command line with vi.


I once interned at a lab that used a piece of surely overpriced hardware that integrated with Simulink. You would make a Simulink model, and you’d click something and the computer would (IIRC) compile it to C and upload it to the hardware. On the bright side, you didn’t waste time bikeshedding about how to structure things. On the other hand, actually implementing any sort of nontrivial logic was incredibly unpleasant.

Hahaha, yep, so much clicking. After one day my finger was actually sore.

Maybe it’s different for those actually working in the profession, and this is n=1, but in my (many) years of studying EE I never used these tools even once.

No, not really. Depending on the application, C++ or Python has been the language of choice in the lab. LabView was used because it was seen as an easy way to make UIs for operators in production facilities, but even that was a regrettable decision. We ended up rewriting the LV business logic in C# and importing it as a lib into a LV front end.

Try the 6 legged Eni dog next!

I love posts like this. Amazing work!

You’re right! Strange. I thought I commented this on the Seattle thread… bug?

Did you post this comment on the wrong article? Nothing here has anything to do with AI whatsoever.

But it’s not. I think most can agree that there really has not been any real entertainment from genAI beyond novelty crap like seeing Lincoln pulling a nice trick at a skate park. No one wants to watch genAI slop videos, no one wants to listen to genAI video essays, and most people do not want to read genAI blog posts. Music is a maybe, based on leaderboards, but it is not like we ever had a lack of music to listen to.

Eventually it will be good enough that you won't know the difference.

I have a feeling that's already happened to me.


Bro. You and your cohorts said the exact same thing about LLMs and coding when ChatGPT first came out. The status quo is obvious, so no one is talking about that.

Draw the trendline into the future. What will happen when the content is indistinguishable and AI is so good it produces something that moves people to tears?


Bro, not sure if you noticed, but ChatGPT isn’t that great at coding end to end. It can regurgitate common examples well, but if you’re working on large technical code bases it does more harm than good. It needs constant oversight, so why don’t I just write the code myself? We are at an infrastructure limit; I’m not sure we are going to see order-of-magnitude improvements any more.

I no longer write code. I’ve been a SWE for over a decade. AI writes all my code following my instructions. My code output is now expected to be 5x what it was before because we are now augmented by AI. All my coworkers use AI. We don’t use ChatGPT, we use Anthropic. If I didn’t use AI I would be fired for being too slow.

What I work on is large and extremely technical.

And no, we are not at an infrastructure limit. That statement is insane. We are literally only a couple of years into LLMs becoming popular. Everything we see now is just the beginning. You can only make a good judgement call about whether we are at our limit in 10 years.

Because the transition hit so quickly, a lot of devs and companies haven’t fully embraced AI yet. Culture is still lagging capability. What you’re saying about ChatGPT was true a year ago, and now, one year later, it isn’t remotely true anymore. The pace is frightening. So I don’t blame you for not knowing. Yes, AI needs to be managed, but it’s at a point where the management no longer hinders you; instead it augments your capabilities.


> I will not allow AI to be pushed down my throat just to justify your bad investment.

Pretty much my sentiment too.


The neat thing about all this is that you don’t get a choice!

Your favorite services are adding “AI” features (and raising prices to boot), your data is being collected and analyzed (probably incorrectly) by AI tools, you are interacting with AI-generated responses on social media, viewing AI-generated images and videos, and reading articles generated by AI. Business leaders are making decisions about your job and your value using AI, and political leaders are making policy and military decisions based on AI output.

It’s happening, with you or to you.


I do have a choice: I just stop using the product. When Messenger added AI assistants, I switched to WhatsApp. Now WhatsApp has one too, so I’m using Signal. My wife brought home a Win11 laptop; I didn’t like the cheeky AI integration, so now it runs Linux.

Sadly, almost none of my friends care or understand (older family members or non-tech people). If I tried to convince friends to move to Signal because of my disdain for AI profiteering, they'd react as if I were trying to get them to join a church.

Reasonably far off topic:

Visa hasn't worked for online purchases for me for a few months, seemingly because of a rogue fraud-detection AI their customer service can't override.

Is there any chance that's just a poorly implemented traditional solution rather than feeding all my data into an LLM?


If by "traditional solution" you mean a bunch of data is fed into creating an ML model and then your individual transaction is fed into that, and it spits out a fraud score, then no, they'd not using LLMs, but at this high a level, what's the difference? If their ML model uses a transformers-based architecture vs not, what difference does it make?

> what difference does it make

Traditional fraud-detection models have quantified type-I/II error rates, and somebody typically chooses parameters such that those errors are within acceptable bounds. If somebody decided to use a transformers-based architecture in roughly the same setup as before, then there would be no issue, but if somebody listened to some exec's harebrained idea to "let the AI look for fraud" and just came up with a prompt/API wrapping a modern LLM, then there would be huge issues.
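
A toy sketch of what "choosing parameters so the errors stay within acceptable bounds" can look like (the synthetic labels, scores, and the 1% bound are all made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    # held-out labels (1 = fraud) and fraud scores from whatever model was trained
    labels = rng.integers(0, 2, size=10_000)
    scores = np.clip(0.6 * labels + rng.normal(0.3, 0.2, size=10_000), 0.0, 1.0)

    max_fpr = 0.01  # acceptable false-positive rate, chosen ahead of time
    threshold = np.quantile(scores[labels == 0], 1 - max_fpr)  # flags ~1% of legit traffic

    flagged = scores >= threshold
    print(f"fpr={flagged[labels == 0].mean():.2%}  recall={flagged[labels == 1].mean():.1%}")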


One hallucinates data, one does not?

I run a small online software business and I am continually getting cards refused for blue-chip customers (big companies, universities, etc.). My payment processor (2Checkout/Verifone) says it is 3DS authentication failures and not their fault. The customers tell me that their banks say it isn't the bank's fault. The problem is particularly acute for UK customers. It is costing me sales. It has happened before as well:

https://successfulsoftware.net/2022/04/14/verifone-seems-to-...


I've recently found myself having to pay for a few things online with bitcoin, not because they have anything to do with bitcoin, but because bitcoin payments actually worked and Visa/MC didn't!

For all the talk in the early days of Bitcoin comparing it to Visa and how it couldn't reach the scale of Visa, I never thought it would be that Visa just decided to place itself lower than Bitcoin.

Kind of the same as Windows getting so bad it got worse than Linux, actually...


Even if my favorite service is irreplaceable, I can still use it without touching the AI part of it. If the majority who use a popular service never touch the AI features, it will inevitably send a message to the owner one way or another: you are wasting money on AI.

Nah, the owner will get a filtered truth from the middle managers, who present them with information saying everything's going great with AI and the lost money is actually because of those greedy low-level employees drinking up all the profit by working from home! The entire software industry has a massive truth-to-power problem that just keeps getting worse. I'd say the software industry in this day and age feels like Lord of the Flies, but honestly even that feels too kind.

Exactly this. "AI usage is 20% of our customer base" "AI usage has increased 5% this quarter" "Due to our xyz campaign, AI usage has increased 10%"

It writes a narrative of success even if it's embellished. Managers respond to data and the people collecting the data are incentivised to indicate success.


almost the same as RTO mandates:

we’ll force you to come back to justify sunk money in office space.


I personally think all the gains in productivity that happened with WFH were just because people were stressed and WFH acted like a pressure relief. But too much of a good thing and people get lazy (I'm seeing it right now: some people are filling in full timesheets without even starting, let alone getting through, a day of work in a week), so the right balance is somewhere in the middle.

Perhaps… the right balance is actually working only 4 days a week, always from the office, and just having the 5th day properly off instead.

I think people go through “grinds” to get big projects done, and then plateaus of “cooling down”. I think every person only has so much grind to give, and extra days don’t mean more work, so the ideal employee is one you pay for only 3-4 days per week.


We just need a metric that can't be gamed which will reliably show who is performing and who is not, and we can rid ourselves of the latter. Everyone else can continue to work wherever the hell they want.

But that's a tall order, so maybe we just need managers to pay attention. It doesn't take that much effort to stay involved enough to know who is slacking and who is pulling their weight, and a good manager can do it without seeming to micromanage. Maybe they'll do this when they realize that what they're doing now could largely be replaced by an LLM...


Not for nothing did the endless WSJ and Forbes articles about "commuting for one hour into expensive downtown offices is good, actually" show up around the same time RTO mandates did.

Don't forget about the poor local businesses. Someone needs to pay to keep the executives' lunch spots open.

Well, not if rents crash because all the offices moved out of the area, and the lunch spot can afford to stay open and lower prices.

We don't talk enough about how the real estate industry is a gigantic drag on the economy.


Hey now. Little coffee shops and lunch spots and dry cleaners are what make cities worth living in in the first place.

It really gives me the same vibes as the sort of products that go all in on influencer marketing. Nothing has made me less likely to try "Raid Shadow Legends" than a bunch of youtubers faking enthusiasm about it.

It's a sort of pushiness that hints not even the people behind the product are very confident in its appeal.


I see comments like this one* and I wonder if the whole AI trend is a giant scam we're getting forced to play along with.

* https://news.ycombinator.com/item?id=46096603


Which distro do you run? Python is a part of the OS in many cases, isn't it?

It’s a fair angle you’re taking here, but I would only expect to see it on hardened servers.

