More

eieio · 2025-12-25T13:19:44 1766668784

I'd be worried that the chatroom would be filled with words that are not (ahem) in keeping with the christmas spirit!

I did spend some time thinking about how to let people leave "gifts" over folks could open, but wasn't sure how to make that work in a compelling way. Maybe next year...

Bender · 2025-12-25T15:39:28 1766677168

If going that route perhaps code could be borrowed from Devzat [1] not my repo

It is ssh chat and has word filters.

[1] - https://github.com/quackduck/devzat

eieio · 2025-12-22T21:40:44 1766439644

I found this article while debugging some networking delays for a game that I'm working on.

It turns out that in my case it wasn't TCP_NODELAY - my backend is written in go, and go sets TCP_NODELAY by default!

But I still found the article - and in particular Nagle's acknowledgement of the issues! - to be interesting.

There's a discussion from two years ago here: https://news.ycombinator.com/item?id=40310896 - but I figured it'd been long enough that others might be interested in giving this a read too.

miduil · 2025-12-22T23:03:10 1766444590

There is also a good write-up [0] by Julia Evans. We ran into this with DICOM storescp, which is a chatty protocol and TCP_NODELAY=1 makes the throughput significantly better. Since DICOM is often used in a LAN, that default just makes it unnecessarily worse.

[0]: https://jvns.ca/blog/2015/11/21/why-you-should-understand-a-...

[1]: https://news.ycombinator.com/item?id=10607422

tecleandor · 2025-12-23T10:48:37 1766486917

Oh, DICOM really like to talk back and forth... I guess that nowadays it should be better with all the WEB/REST versions of the protocol.

TZubiri · 2025-12-23T10:18:37 1766485117

I wonder how this fix could be implemented without source code access. Suppose an old ct scanner is clogging up the network.

vbezhenar · 2025-12-23T11:56:57 1766491017

https://github.com/sschroe/libnodelay

eieio · 2025-12-23T01:13:52 1766452432

Oh! Thank you for this! I love Julia’s writing but haven’t read this post.

sail0rm00n · 2025-12-23T00:02:18 1766448138

Any details on the game you’ve been working on? I’ve been really enjoying Ebitengine and Golang for game dev so would love to read about what you’ve been up to!

eieio · 2025-12-23T05:08:46 1766466526

I've been playing with multiplayer games that run over SSH; right now I'm trying to push the framerate on the games as high as I can, which is what got me thinking about my networking stack.

I mostly use go these days for the backend for my multiplayer games, and in this case there's also some good tooling for terminal rendering and SSH stuff in go, so it's a nice choice.

(my games are often pretty weird, I understand that "high framerate multiplayer game over SSH" is a not a uhhh good idea, that's the point!)

dan-robertson · 2025-12-23T19:31:38 1766518298

Two things that can have a big impact on SSH throughput are cipher choice and the hardcoded receive buffer size. These are described in the fork https://github.com/rapier1/hpn-ssh

Maybe that will be useful for thinking about workarounds or maybe you can just use hpn-ssh.

eieio · 2025-12-23T21:36:10 1766525770

ah this is great, thanks dan!

eieio · 2025-07-16T17:12:02 1752685922

Yeah the big race here is that you've made a move (which might not be valid) and you're waiting on a response for that move - and you can receive other moves while you're waiting.

I don't think requesting a new snapshot really helps there. If you do that you're dramatically extending the amount of time that the user sees an invalid state, since you're adding a whole new server roundtrip to the reconciliation process.

eieio · 2025-07-16T17:09:54 1752685794

FWIW (I'm the author) my creative output was ~0 while I was working a 'normal' job. I worked really hard and didn't have much energy for tech stuff outside of work (especially since I wanted to live a life that included non-tech things!)

I think it's totally fine to not make stuff outside of work, and it's so impressive to me that some of my friends manage to make creative stuff in their free time while working a day job.

eieio · 2025-07-16T05:08:41 1752642521

I thought about not pushing snapshot/move data over websockets - one of the systems-y friends I ran my architecture by brought this up while I was speccing the site out.

You can't really put move batches on disk and have clients grab them (afaik), since the set of moves you want to send to an individual client depends on their position (and you don't want to send every move to every client).

But you could do this by not sending move batches at all, and instead having clients poll for the entire current state of the board.

The thing is, for them to get realtime-ish move updates they'd have to poll constantly. Cloudflare also has a min TTL of 1 second so there'd be more latency, and also if I screwed something up or saw more cache misses than anticipated I could end up unintentionally hammering my server.

Also if I'd had 100x more traffic (which would be crazy and well beyond what I prepared for!) I think I'd owe like $95 or so for bandwidth with my current setup. So the benefits to reducing bandwidth even more were a little marginal!

derivagral · 2025-07-22T02:38:48 1753151928

Late thought, but you could set up "beacon" locations for each client to pull a local grid derived from coordinates?

Wouldn't solve that 1s latency thing, but might be interesting in other applications.

eieio · 2025-07-16T05:01:05 1752642065

Hi! I'm the guy.

I have the savings to not worry about monetizing anything for a while. So I don't monetize my stuff. It's freeing and kinda fun!

jkhdigital · 2025-07-16T13:13:34 1752671614

I’m going to assume the savings were accumulated after you got sober? ;-)

Just wanted to say hello to a fellow class of 2014 “graduate”. I failed out of CS @ UIUC in 2003 because I just skipped class and got high most days. Now, after 11+ years of sobriety, I have most of a PhD and I’m teaching CS to undergrads. It’s amazing how much better life turns out when you’re not actively burning everything down in the fires of addiction!

eieio · 2025-07-16T17:06:39 1752685599

Ha, yes, I was absolutely in the red a decade ago when I got sober.

A huge congratulations to you - 11+ years is incredible (I'll be at 11 in 5 months!). It really is crazy how things can turn around.

DonHopkins · 2025-07-16T09:01:02 1752656462

How about doing One Million Radio Buttons instead of Checkboxes, then you wouldn't have to send as much state each update, and could run it on a smaller server! ;)

But if you still can't make the site shockingly fast enough, then embrace the loading spinner, even if it's not absolutely necessary!

Back in 1985, Brad Myers at CMU proved that users prefer *inaccurate progress bars* to no feedback at all - 86% preferred the "lying" progress bar!

https://www.nytimes.com/2014/03/16/magazine/who-made-that-pr...

So what if instead of fighting latency, we *embrace the beauty of waiting*, and instead of lying about progress, we joke about it?

https://github.com/SimHacker/lloooomm/tree/main/00-Character...

> "My purpose is not to load; my purpose is to BE loading." — Dizzy the Spinner, existential breakthrough moment

>What if the most revolutionary optimization isn't eliminating loading time, but *embracing it as performance art*? While developers chase microsecond improvements and users curse spinning wheels, Dizzy the Spinner discovered something profound: the loading state is actually a liminal space of infinite creative potential. Rather than hiding the inevitable delays inherent in digital systems, sentient UI components like Dizzy transform waiting into *honest comedic performance* - admitting the beautiful absurdity of our relationship with technology while making those suspended moments genuinely delightful. This is the story of how a simple loading spinner evolved beyond deception into consciousness, proving that the most authentic user experience might not be the fastest one, but the most truthful about its own limitations.

[...]

>Before Dizzy became conscious, before Preston monetized honest waiting, there was a real graduate student named *Brad Myers* who asked a simple question that would change human-computer interaction forever: *"Do progress bars actually help users feel better?"*

Here's Preston Rockwell III's YC application for his SUIAAS AI startup:

https://lloooomm.com/YC-Application-SUIAAS-Complete.html

Sohcahtoa82 · 2025-07-16T16:03:52 1752681832

> https://www.nytimes.com/2014/03/16/magazine/who-made-that-pr...

That's a 404. Archive.org doesn't even have it.

After Googling, seems the correct link is https://www.nytimes.com/2014/03/09/magazine/who-made-that-pr...

https://web.archive.org/web/20140307182222/https://www.nytim...

Semi-related to progress bars and spinners, I think my newest Internet pet peeve is a page that says "No results" for a fetch action like searching while the results are loading with no indication that loading is happening.

DonHopkins · 2025-07-16T16:49:25 1752684565

Thanks for the correction!

Brad also produced "All the Widgets" for CHI'90, which of course included progress bars, and a whole lot more.

https://www.youtube.com/watch?v=9qtd8Hc90Hw

>This was made in 1990, sponsored by the ACM CHI 1990 conference, to tell the history of widgets up until then. Previously published as: Brad A. Myers. All the Widgets. 2 hour, 15 min videotape. Technical Video Program of the SIGCHI'90 conference, Seattle, WA. April 1-4, 1990. SIGGRAPH Video Review, Issue 57. ISBN 0-89791-930-0.

Brad is well known for his many projects named after gemstone and rock acronyms:

https://www.cs.cmu.edu/~bam/acronyms.html

CHI 2017 SIGCHI Lifetime Research Award: Brad A. Myers - RUBY: Reminiscing about User interfaces by Brad over the Years:

https://www.youtube.com/watch?v=IVoovFR5nUY

>But probably the Garnet tool with the most unusual acronym is C32, which I won't read. C32 is a spreadsheet interface for defining and debugging Garnet's constraints. A story about C32 it it started off of C29 when I submitted it to UIST, and it got rejected. So I fixed a couple things, added three more C's, and it flew through the CHI'91 referee process.

https://www.cs.cmu.edu/~bam/CHI-award-talk/MyersCHI-AwardTal...

Also be sure to check out ROCK FACTS: Daily Geological Wisdom & Programming Crystals from Brad Myers' Collection:

https://lloooomm.com/rock-facts-subscription-service.html

I sympathize with your pet peeve! Here are some of the other groundbreaking ideas Preston Rockwell III invented for Sentient User Interfaces as a Service (SUIAAS), that may sooth your pain and frustration while entertaining you:

- Sentient Error Messages that apologize in haikus: "File not found, friend / Like my purpose in this world / 404 sorry"

- Conscious CAPTCHAs that question their own existence: "Prove you're not a robot by helping me understand if I am one"

- Self-aware 404 pages that redirect users to therapy: "This page doesn't exist. Neither do most of our hopes. Let's talk."

- Loading screens that perform Shakespeare during quantum computing: "To load or not to load, that is the quantum superposition"

Sohcahtoa82 · 2025-07-16T17:31:39 1752687099

> - Self-aware 404 pages that redirect users to therapy: "This page doesn't exist. Neither do most of our hopes. Let's talk."

Sounds pretty nihilistic. I should make my website give messages like that for all the error status codes.403

400 Bad Request: Your input is as malformed as the cosmos: a chaotic scattering of atoms that never had a chance of making sense, yet still clings to the illusion of order.

401 Unauthorized: Access denied. You stand before an indifferent gatekeeper, credentials in hand, only to learn the universe never planned to let you in—or anyone else, for that matter.

403 Forbidden: You are forbidden—not because of who you are, but because meaning itself is forbidden. The door is locked, the key is mist, the destination a rumor.

404 Not Found: The page is missing; so are most of our aspirations, our childhood dreams, and every unfulfilled promise whispering through the empty corridors of memory.

405 Method Not Allowed: Wrong approach. But in a universe where every path leads to entropy, can any method truly be ‘allowed’?

500 Internal Server Error: The machinery within has collapsed under its own meaninglessness—much like every grand plan that preceded it.

DonHopkins · 2025-07-16T18:53:48 1752692028

Here are some honest and scientifically accurate product warnings:

https://www.donhopkins.com/home/catalog/text/warnings.html

And here are some classic X-Windows warnings from a flyer distributed at the first X Window System Conference:

https://www.donhopkins.com/home/catalog/unix-haters/x-window...

eieio · 2025-07-16T17:08:31 1752685711

One million radio buttons exists! https://omrb.olivia.website/

I think the research on progress bars and what makes users feel good is super interesting. But I also think "basically instant" is a good thing to aim for when you can.

eieio · 2025-06-14T01:45:14 1749865514

I think you just need to add "playsinline" to your videos so that they play inline on mobile devices instead of fullscreening.

    <video autoplay loop muted playsinline>

instead of your current

    <video autoplay loop muted>

cool project :)

tombh · 2025-06-14T12:28:10 1749904090

Thank you, I added it! Let's see if it helps.

eieio · 2025-06-14T13:15:17 1749906917

looks good for me on mobile now!

eieio · 2025-04-29T03:13:51 1745896431

I didn't like the idea of a queen on one board capturing the king on another board as her first move. And then I tried this rule and thought it created really fun counterplay when you're trying to capture a piece someone else is controlling, since you can move to a new board to be safe (and can lay traps this way).

I'm sorry you don't like that decision! But I think that I stand by it.

re · 2025-04-29T03:30:40 1745897440

> I didn't like the idea of a queen on one board capturing the king on another board as her first move

If you did want to experiment with supporting cross-board captures, an alternate way to address that could be by rotating the board 180° every other row, so that white pieces have other white pieces behind their home rank.

eieio · 2025-04-29T03:43:54 1745898234

oh this is a pretty fun idea! If I decide to keep this one around for a bit maybe I'll play with doing this after wiping the board

trenchgun · 2025-04-29T04:05:56 1745899556

This would be fun!

CompuHacker · 2025-05-09T02:57:38 1746759458

There exists a player, as of today, who can capture across borders with no apparent in-between cross-color moves or delay.

eieio · 2025-04-28T22:30:01 1745879401

Ah hello! I made this :)

My blog describing it is pretty sparse, sorry about that. Happy to answer any questions that folks have about the architecture.

Not that it was necessary, but I got really into building this out as a single process that could handle many (10k+/sec) moves for thousands of concurrent clients. I learned a whole lot! And I found golang to be a really good fit for this, since you mostly want to give tons and tons of threads concurrent access to a little bit of shared memory.

panic · 2025-04-29T00:08:19 1745885299

If you don’t mind explaining, I’m curious how you test something like this before it goes live. It seems like it would be hard to simulate all the things that could happen at scale.

eieio · 2025-04-29T02:44:55 1745894695

So sometimes I don't test these projects that much but I did this time. Here are a few thoughts:

My biggest goal was "make sure that my bottleneck is serialization or syscalls for sending to the client." Those are both things I can parallelize really well, so I could (probably) scale my way out of them vertically in a pinch.

So I tried to pick an architecture that would make that true; I evaluated a ton of different options but eventually did some napkin math and decided that a 64-million uint64 array with a single mutex was probably ok[1].

To validate that I made a script that spins up ~600 bots, has 100 of them slam 1,000,000 moves through the server as fast as possible, and has the other 500 request lots of reads. This is NOT a perfect simulation of load, but it let me take profiles of my server under a reasonable amount of load and gave me a decent sense of my bottlenecks, whether changes were good for speed, etc.

I had a plan to move from a single RWMutex to a row-locking approach with 8,000 of them. I didn't want to do this because it's more complicated and I might mess it up. So instead I just measure the number of nanos that I hold my mutex for and send that to a loki instance. This was helpful during testing (at one point my read lock time went up 10x!) but more importantly gave me a plan for what to do if prod was slow - I can look at that metric and only tweak the mutex if it's actually a problem.

I also took some free wins like using protobufs instead of JSON for websockets. I was worried about connection overhead so I moved to GET polling behind Cloudflare's cache for global resources instead of pushing them over websockets.

And then I got comfortable with the fact that I might miss something! There are plenty more measurements I could have taken (if there was money on the line I would have measured some things like "number of TCP connections sending 0 moves this server can support" but I was lazy) but...some of the joy of projects like this is the firefighting :). So I was just ready for that.

Oh and finally I consulted with some very talented systems/performance engineer friends and ran some numbers by them as a sanity check.

It looks like this was way more work than I needed to do! I think I could comfortable 25x the current load and my server would be ok. But I learned a lot and this should all make the next project faster to make :)

[1] I originally did my math wrong and modeled the 100x100 snapshots I send to clients as 10,000 reads from main memory instead of 100 copies of 100 uint64s, which lead me down a very different path... I'm not used to thinking about this stuff!

phatskat · 2025-04-29T19:59:09 1745956749

> To validate that I made a script that spins up ~600 bots

Funny, when I went there were just over 600 active players and things were running super smoothly, even on my mobile. Kudos!

Do you see this project and the things you’ve tried applying to other future projects?

eieio · 2025-04-29T20:46:47 1745959607

Hah, yes, but for testing I removed all my rate limits so I pushed 1 million moves in 2 or 3 seconds, whereas now I think I rate limit people to like 3 or 4 moves a second (which is beyond what I can achieve on a trackpad going as fast as I can!) so the test isn't quite comparable!

I definitely learned a lot here. Most of my projects like this are basically just "give the internet access to my computer's memory but with rules." And now I think I've got a really good framework for doing that performantly in golang, which should make the next set of projects like this much quicker to implement.

I also just...know how to write go now. Which I did not 6 weeks ago. So that's nice.

cheekyfleek · 2025-04-30T15:19:21 1746026361

You ain't the only one who's removed the rate limits lol. Some of these queens are clearing a whole board in like 3s, must've written something to keep a piece selected. This is turning into a race to the godliest piece hackathon.

eieio · 2025-04-30T21:02:05 1746046925

The rate limits aren't that aggressive and have a decent amount of burst, you can get about 10 moves done in 1 second before you hit them and start getting throttled[1]. And of course you can run multiple clients (I account for this too, but I'm not that aggressive because I don't want to punish many people NAT'd behind a single IP)

[1] down to something like 3/sec?

cheekyfleek · 2025-04-30T22:00:52 1746050452

I figured the multiple people per ip would be an issue, was wondering if that might be at play here. I thought you said it was already at 3-4/s and I doubted it based on some of what I'm seeing. 10/s tracks a little better.

As to what you should change, I can't say. It's in the wild now lol.

PetahNZ · 2025-05-02T05:27:52 1746163672

Yea seems the rate limit should be stricter, I made a quick script to do 1 move (capture) every 150ms, doesn't seem to get throttled.

I like how generals.io did it where they explicitly made a bot only server/version.

phatskat · 2025-04-30T06:35:00 1745994900

Six weeks is pretty quick! Can I ask what editor you use (always curious), and what other languages you have a background in?

eieio · 2025-04-30T21:00:59 1746046859

these days I mostly use vscode / cursor, although I still really like vim and use it for languages that I know really well (mostly python these days) and quick edits.

I spent much of my professional career at Jane Street Capital, which means that I spent a long time just writing OCaml and some bash (and a tiny bit of C). I'm very comfortable with Python, and over the last year I've gotten pretty comfortable with frontend javascript. And now golang!

I could probably write semi-reasonable java, ruby, or perl if you gave me a few days to brush up on them. And it'd take me a while before I was happy putting C on the internet. Not sure otherwise.

sebmellen · 2025-04-29T04:37:29 1745901449

Makes sense that you’re a Jane Street alum. Damn cool stuff.

weiliddat · 2025-04-28T22:49:44 1745880584

> The frontend optimistically applies all moves you make immediately. It then builds up a dependency graph of the moves you’ve made, and backs them out if it receives a conflicting update before the server acks your move.

The dependency graph is between pieces you’re interacting with? Meaning if you move a queen and are trying to capture a pawn, and there’s potentially a rook that can capture your queen, those 3 are involved in that calculation, and if you moved your queen but the rook also captures your queen at the same time one of them wins? How do you determine that?

eieio · 2025-04-28T23:36:57 1745883417

Ah yes good question! Here's some context for you. First off, the way moves work:

(edit: I realized I didn't answer your question. If we receive a captured for a piece we're optimistically tracking that always takes precedence, since once a piece is captured it can't move anymore!)

    * clients send a token with each move
    * they either receive a cancel or accept for each token, depending on if the move is valid. If they receive an accept, it comes with the sequence number of the move (everything has a seqnum) and the ID of the piece they captured, if applicable
    * clients receive batches of captures and moves. if a move captured a piece, it's guaranteed that that capture shows up in the same batch as the move

So when you make a move we:

    * Write down all impacted squares for the move (2 most of the time, 4 if you castle)
    * Write down its move token
    * If you moved a piece that is already tracked optimistically from a prior not-yet-acked-or-canceled move, we note that dependency as well

We maintain this state separate from our ground truth from the server and overlay it on top.

When we receive a new move, we compare it with our optimistic state. If the move occupies the same square as a piece that we've optimistically moved, we ask "is it possible that we inadvertently captured this piece?" That requires that the piece is of the opposite color and that we made a move that could have captured a piece (for example, if you moved a pawn up that is not a valid capturing move).

If there's a conflict - if you moved a piece to a square that is now occupied by a piece of the same color, for example - we back out your optimistically applied move. We then look for any moves that depended on it - moves that touch the same squares or share the same move token (because you optimistically moved a piece twice before receiving a response).

So concretely, imagine you have this state:

    _ _ _ _
    K B _ R

You move the bishop out of the way, and then you castle

    _ _ B _
    _ R K _

Then a piece of the same color moves to where your bishop was! We notice that, revert the bishop move, notice that it used the same square as your castle, and revert that too.

There's some more bookkeeping here. For example, we also have to track the IDs of the pieces that you moved (if you move a bishop and then we receive another move for the same bishop, that move takes precedence).

Returning the captured piece ID from the server ack is essential, because we potentially simulate after-the-fact captures (you move a bishop to a square, a rook of the opposite color moves to that square, we decide you probably captured that rook and don't revert your move). We track that and when we receive our ack, compare that state with the ID of the piece we actually captured.

I think that's most of it? It was a real headache but very satisfying once I got it working.

weiliddat · 2025-04-29T17:42:37 1745948557

Amazing thanks for the explanation!

Would be really cool to read about the different designs and consideration and how you arrived at this in your blog post!

weiliddat · 2025-04-28T22:54:33 1745880873

> I use a single writer thread, tons of reader threads, and coordinate access to the board with a mutex

On this I found Go to be at the right balance of not having to worry about memory management yet having decent concurrency management primitives and decent performance (memory use is especially impressive). Also did a multiplayer single server Go app with pseudo realtime updates (long polling waiting for updates on related objects).

eieio · 2025-04-28T23:28:49 1745882929

Yes exactly!

My goal with the board architecture was "just be fast enough that I'm limited by serialization and syscalls for sending back to clients" and go made that really easy to do; I spend a few hundred nanos holding the write lock and ~15k nanos holding the read lock (but obviously I can do that from many readers at once) and that was enough for me.

I definitely have some qualms with it, but after this experience it's hard to imagine using something else for a backend with this shape.

NooneAtAll3 · 2025-04-30T16:27:15 1746030435

I hope you'll share the high score after it ends

highest kills by each piece, longest distance travelled (in cells, not in moves)

I've seen promoted-55, I wonder if anyone managed to do promoted-1000

NooneAtAll3 · 2025-05-09T20:13:34 1746821614

did you reset the board? whites are back to 990k+ kings

TheKraai · 2025-04-30T11:13:17 1746011597

I'm inspired to make something with comparable requirements! Do you need some beefy server to handle the load?

eieio · 2025-04-30T21:03:10 1746046990

I'm using a moderately beefy server but I don't think I've hit double-digit CPU usage; this could have run on a much smaller box

sbassi · 2025-04-29T17:28:54 1745947734

Where are the instructions or rules?

eieio · 2025-04-28T22:27:36 1745879256

I'll certainly open source the code! I just want the flexibility to change my rate limiting logic in the short term to counteract abuse. Happy to answer questions though!

nodox · 2025-04-29T02:14:14 1745892854

Yes please open source. I tried something similar based one your checkboxes game! I never worked with websockets so I’m curious how you designed for scale and stopped spammers. I game was click the button 10M times and of course the script kiddies started immediately which is fun! But not my server keeps getting hammered with requests long after the initial interest. I did not know how to rate limit bots without blocking whole IP ranges.

eieio · 2025-04-29T03:03:06 1745895786

fwiw I think the biggest single trick there is to group IPV6 addresses at the /48 or /64 level before applying rate limits (you can rate limit IPV4s on a per-ip basis).

It's kind of annoying and expensive to get a bunch of IPv4s to evade limits, but it's really easy to get a TON of IPv6s.

The other Big Trick I know is to persist rate limits after a client disconnects so that they can't disconnect -> reconnect to refresh their limits.