I mean, yes? The cameras do help solve a ton of crime. The real issue is using them for surveillance without cause, and that, imo, is what should be under scrutiny. But trying to fight cameras existing in general is a lost cause.
> The instant you start having to make the client smart enough to think about business logic, you are doomed.
Could you explain more here? What do you consider "business logic"? Context: I have a client app for flying a drone using gamepad, mouse, and keyboard, with video feedback, maps, drone tasking, etc.
Streamlit apps or similar are doing a great job at this where I'm at.
As simple to build and deploy as Excel, but with the right data types, the right UI, the right access and version control, the right programming language that LLMs understand, the right SW ecosystem and packages, etc.
I'm curious what makes you so confident about this? I confess I expect that people are often far more cognizant of the last thing they want to say when they start.
I don't think you do a random walk through the words of a sentence as you conceive it. But it is hard not to think that people center themes and moods in their minds as they compose their thoughts into sentences.
Similarly, have you ever looked into how actors learn their lines? It is often in a way that is a lot closer to diffusion than to one token at a time.
I think there is a wide range of ways to "turn something in the head into words", and sometimes you use the "this is the final point, work towards it" approach and sometimes you use the "not sure what will happen, let's just start talking and go wherever" approach. Different approaches have different tradeoffs, and of course different people have different defaults.
I can confess to not always knowing where I'll end up when I start talking. Then again, not every time I open my mouth am I just starting without a plan; sometimes I do have a goal and a conclusion in mind.
They're speaking literally. When talking to someone (or writing), you ultimately say the words in order (edits or corrections notwithstanding). If you look at the gifs of how the text is generated - I don't know of anyone who has ever written like that. Literally writing disconnected individual words of the actual draft ("during," "and," "the") in the middle of a sentence and then coming back and filling in the rest. Even speaking like that would be incredibly difficult.
Which is not to say that it's wrong or a bad approach. And I get why people are feeling a connection to the "diffusive" style. But, at the end of the day, all of these methods have as their ultimate goal a coherent sequence of words that follow one after the other. It's just a difference of how much insight you have into the process.
Weird anecdote, but one of the reasons I have always struggled with writing is precisely that my process seems highly nonlinear. I start with a disjoint mind map of ideas I want to get out, often just single words, and need to somehow cohere that into text, which often happens out-of-order. The original notes are often completely unordered, diffusion-like scrawling, the difference being I had less idea what the final positions of the words were going to be when I wrote them.
I can believe that your abstract thoughts in latent space are diffusing/forming progressively when you are thinking.
But I can't believe the actual literal words are diffusing when you're thinking.
When asked "How are you today?", there is no way your thoughts literally go "Alpha zulu banana" => "I banana coco" => "I banana good" => "I am good". The diffusion does not happen at the output token layer; it happens much earlier, at a higher level of abstraction.
"I ____ ______ ______ ______ and _____ _____ ______ ____ the ____ _____ _____ _____."
If the images in the article are to be considered an accurate representation, the model is putting meaningless bits of connective tissue way before the actual ideas. Maybe it's not working like that. But the "token-at-a-time" model is also obviously not literally looking at only one word at a time either.
It's just too far of an analogy; it starts in the familiar SWE tarpit of human brain = lim(n matmuls) as n => infinity.
Then, glorifies wrestling in said tarpit: how do people actually compose sentences? Is an LLM thinking or writing? Can you look into how actors memorize lines before responding?
The error beyond the tarpit is that these are all ineffable questions that assume a singular answer to an underspecified question across many bags of sentient meat.
Taking a step back to the start, we're wondering:
Do LLMs plan for token N + X, while purely working to output token N?
In order to model poetry autoregressively, you're going to need a variable that captures rhyme scheme. At the point where you've ended the first line, the model needs to keep track of the rhyme that was used, just like it does for something like coreference resolution.
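To make that concrete, here's a toy sketch (mine, not anything from the paper; the rhyme dictionary and helper are made up for illustration) of what "a variable that captures rhyme scheme" looks like if you write it out explicitly: the only extra state beyond ordinary context is the rhyme sound committed to at the end of line 1.

```python
# Toy illustration only: an autoregressive poetry generator's extra "state" is
# just the rhyme sound it committed to on line 1. Everything here is made up.
import random

RHYMES = {
    "-ight": ["night", "light", "bright", "sight"],
    "-ee":   ["sea", "free", "tree", "me"],
}

def end_of_line(rhyme_state=None):
    """Pick a line-ending word; if a rhyme sound was already chosen, stay in it."""
    if rhyme_state is None:
        rhyme_state = random.choice(list(RHYMES))  # the variable "emerges" at line 1
    return random.choice(RHYMES[rhyme_state]), rhyme_state

word1, state = end_of_line()        # line 1 fixes the rhyme scheme
word2, _ = end_of_line(state)       # line 2 only has to respect it
print(f"...{word1} / ...{word2}")
```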
I don't think that the mentioned paper shows that the model engages in a preplanning phase in which it plans the rhyme that will come. In fact such would be impossible. Model state is present only in so-far-generated text. It is only after the model has found itself in a poetry generating context and has also selected the first line-ending word, that a rhyme scheme "emerges" as a variable. (Now yes, as you increase the posterior probability of 'being in a poem' given context so far, you would expect that you also increase the probability of the rhyme-scheme variable's existing.)
I’m confused: the blog shows they A) predict the end of line 2 using the state at the end of line 1 and B) can choose the end of line 2 by altering state at end of line 1.
Might I trouble you for help getting from there to "such would be impossible", where "such" is "the model…plans the rhyme to come"?
Edit: I’m surprised to be at -2 for this. I am representing the contents of the post accurately. It's unintuitive for sure, but it's the case.
I agree, the post above you is patently wrong / hasn't read the paper they are dismissing. I also got multiple downvotes for disagreeing, with no actual rebuttal.
You're my fav new-ish account, spent about 5 minutes Googling froobius yesterday tryna find more content. :) Concise, clear, no-BS takes; no falling for high-minded nonsense that sounds technical. HN's such a hellhole for LLM stuff, the people who are hacking ain't here, and the people who are, well, they mostly like yapping about how it connects to some unrelated grand idea they misremember from undergrad. Cheers.
(n.b. been here 16 years and this is such a classic downvote scenario the past two years. people overindexing on big words that are familiar to them, and on any sort of challenging tone. That's almost certainly why I got mine, I was the dummy who read the article and couldn't grasp the stats nonsense, and "could I bother you to help" or w/e BS I said, well, was BS)
> Model state is present only in so-far-generated text
Wrong. There's "model state", (I assume you mean hidden layers), not just in the generated text, but also in the initial prompt given to the model. I.e. the model can start its planning from the moment it's given the instruction, without even having predicted a token yet. That's actually what they show in the paper above...
> It is only after the model has found itself in a poetry generating context and has also selected the first line-ending word, that a rhyme scheme "emerges" as a variable
This is an assertion based on flawed reasoning.
(Also, these ideas should really be backed up by evidence and experimentation before asserting them so definitively.)
> far more cognizant of the last thing they want to say when they start
This can be captured by generating reasoning tokens (outputting some representation of the desired conclusion in token form, then using it as context for the actual tokens), or even by an intermediate layer of a model not using reasoning.
If a certain set of nodes are strong contributors to generate the concluding sentence, and they remain strong throughout all generated tokens, who's to say if those nodes weren't capturing a latent representation of the "crux" of the answer before any tokens were generated?
(This is also in the context of the LLM being able to use long-range attention to not need to encode in full detail what it "wants to say" - just the parts of the original input text that it is focusing on over time.)
Of course, this doesn't mean that this is the optimal way to build coherent and well-reasoned answers, nor have we found an architecture that allows us to reliably understand what is going on! But the mechanics for what you describe certainly can arise in non-diffusion LLM architectures.
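For what it's worth, here's a minimal sketch of that reasoning-token idea, purely my own illustration (the prompts and the `generate_fn` stand-in are made up, not any real API): the conclusion is emitted first as ordinary tokens, and then every later token can attend to it.

```python
# Minimal sketch of "reasoning tokens": the model first emits a token-form
# representation of where it wants to end up, and that draft conclusion then
# sits in the context for the final answer. `generate_fn` is a stand-in for
# any autoregressive sampler; nothing here is a real API.
from typing import Callable

def answer_with_plan(question: str, generate_fn: Callable[[str], str]) -> str:
    # Pass 1: produce the "last thing you want to say" up front, as ordinary tokens.
    plan = generate_fn(f"Question: {question}\nState the conclusion you will argue for:")
    # Pass 2: the conclusion is now just context, so every later token can attend to it.
    return generate_fn(f"Question: {question}\nConclusion to work towards: {plan}\nFull answer:")

# Dummy sampler so the sketch runs on its own.
if __name__ == "__main__":
    dummy = lambda prompt: "(model output for: " + prompt.splitlines()[-1] + ")"
    print(answer_with_plan("Why is the sky blue?", dummy))
```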
It must be the case that some smart people have studied how we think, right?
The first person experience of having a thought, to me, feels like I have the whole thought in my head, and then I imagine expressing it to somebody one word at a time. But it really feels like I’m reading out the existing thought.
Then, if I’m thinking hard, I go around a bit and argue against the thought that was expressed in my head (either because it is not a perfect representation of the actual underlying thought, or maybe because it turns out that thought was incorrect once I expressed it sequentially).
At least that’s what I think thinking feels like. But, I am just a guy thinking about my brain. Surely philosophers of the mind or something have queried this stuff with more rigor.
People don't come up with things their brain does.
Words rise from an abyss and are served to you, you have zero insight into their formation. If I tell you to think of an animal, one just appears in your "context", how it got there is unknown.
So really there is no argument to be made, because we still don't mechanistically understand how the brain works.
aeonik says >"We don't know exactly how consciousness works in the human brain, but we know way more than "comes from the abyss"."<
You are undoubtedly technically correct, but I prefer the simplicity, purity and ease-of-use of the abysmal model, especially in comparison with other similar competing models, such as the below-discussed "tarpit" model.
Like most people I jump back and forth when I speak, disclaiming, correcting, and appending to previous utterances. I do this even more when I write, eradicating entire sentences and even the ideas they contain, within paragraphs that, by the time they were finished, made the sentence seem unnecessary or inconsistent.
I did it multiple times while writing this comment, and it is only four sentences. The previous sentence once said "two sentences," and after I added this statement it was changed to "four sentences."
>You 100% do pronounce or write words one at a time sequentially.
It's statements like these that make me wonder if I am the same species as everyone else. Quite often, I've picked adjectives and idioms first, and then filled in around them to form sentences. Often because there is some pun or wordplay, or just something that has a nice ring to it, and I want to lead my words in that direction. If you're only choosing them one at a time and sequentially, have you ever considered that you might just be a dimwit?
It's not like you don't see this happening all around you in others. Sure, you can't read minds, but have you never once watched someone copyedit something they've written, where they move phrases and sentences around, where they switch out words for synonyms, and so on? There are at least dozens of fictional scenes of this in popular media; you must have seen one. You have to have noticed hints at some point in your life that this occurs. Please. Just tell me that you spoke hastily to score internet argument points, and that you don't believe this thing you've said.
All of that can still be seen as a linear sequence of actions from the perspective of human I/O with the environment.
What happens in the black box of the human mind to determine the next word to write/say is exactly what is made irrelevant at this level of abstraction: regardless of how it happens, it still results in a linear sequence of actions as observed by the environment.
Are you able to pronounce multiple words in superposition at the same time? Are you able to write multiple words in superposition? Can you read the following sentence: "HWeolrllod!"
Clearly communication is sequential.
LLMs are not more sequential than your vocal cords or your handwriting. They also plan ahead before writing.
(Just to expand on that, it's true not just for the first token. There's a lot of computation, including potentially planning ahead, before each token is output.)
That's why saying "it's just predicting the next word" is a misguided take.
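To be concrete about "a lot of computation before each token": the decode loop looks roughly like the toy below (my sketch, with a stand-in "model" that just recites a fixed string). The output is emitted strictly one token at a time, but the whole context so far is processed before every single one.

```python
# Toy decode loop: before every output token, the model runs a full forward
# pass over the entire context so far. The "model" here is a stand-in that
# just picks the next character of a fixed string; the shape of the loop is
# what matters, not the model itself.
def forward_pass(context: str) -> str:
    target = "planning happens before each token"
    return target[len(context)] if len(context) < len(target) else ""

context = ""
while True:
    next_token = forward_pass(context)   # all the computation lives here, every step
    if not next_token:
        break
    context += next_token                # only then is one token "said" sequentially
print(context)
```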
Around 90% of the maps currently on the site are from the US government (USGS). The other 10% are from public institutions and libraries, and this is the newest segment that I'm actively working on growing. My hope is to flip this ratio with time.
I also have a few partnerships in the works with some private collections, but those have proven trickier to actually get to a "yes". It also involves a lot of bespoke work to process and ingest each individual source, so I'm not focusing as hard on this type of sourcing anymore.
I keep telling my euro-friends that food and health regulation could potentially be enforced by the free market more effectively than by corruptible government, and this is a perfect example of this.
I'd want to see all products I can buy in there, with all possible chemicals, ingredients, and nutrients, and clear indications of good/bad, a little bit like in Yuka. You should maybe even partner with them!
I agree "enforce" is a poor choice of words. It does not need to be "enforced" using state violence if any consumer can access facts with such transparency. What's missing today is this level of transparency with which the market will just naturally benefit to producer of sane and safe goods in a much more natural way.
Also, speaking of the "more free market in the US", my answer is that you don't hate capitalism, you hate crony capitalism.
> you don't hate capitalism, you hate crony capitalism
What distinguishes this from 'you don't hate socialism, you just hate every so-called socialist government'? I know this seems like lazy smartarsery, but I'm genuinely curious whether you think we have real-world examples of countries doing capitalism right -- and, if not, why that's not a bad sign in the same way that a dearth of examples of socialist success stories is a bad sign.
This is great, but a crucial piece of advice for the developer would be to reduce the number of players. When a game has 80 players, statistics mean a given player has roughly a 70% chance of getting obliterated in the first few minutes. So you end up sitting through a lot of frustrating losses, and only rarely make it to the end game.