
Funnily enough, no matter how big these things get, they never seem to make me a lot more productive.

If only my CI build time would at least go down.

Software is so slow compared to hardware. It's embarrassing that we haven't moved even a hundredth of what hardware has in the last 30 years.

Why get this?



People get set in their ways. Sometimes entire industries need a shake-up.

This never occurs voluntarily, and people will wail and thrash about even as you try to help them get out of their rut.

Builds being slow is one of my pet peeves also. Modern "best practices" are absurdly wasteful of the available computer power, but because everyone does it the same way, nobody seems to accept that it can be done differently.

A typical modern CI/CD pipeline is like a smorgasbord of worst-case scenarios for performance. Let me list just some of them:

- Everything is typically done from scratch, with minimal or no caching.

- Synchronous I/O from a single thread, often to a remote cloud-hosted replicated disk... for an ephemeral build job.

- Tens of thousands of tiny files, often smaller than the physical sector size.

- Layers upon layers of virtualisation.

- Many small HTTP downloads from a single thread, with no pipelining. Often uncached despite being identified by stable identifiers -- and hence safely cacheable forever (see the sketch after this list).

- Spinning up giant, complicated, multi-process workflows for trivial tasks such as file copies. (CD agent -> shell -> cp command) Bonus points for generating more kilobytes of logs than the kilobytes of files processed.

- Repeating the same work over and over (C++ header compilation).

- Generating reams of code only to shrink it again through expensive processes or just throw it away (Rust macros).

I could go on, but it's too painful...
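
To make the "stable identifiers" point concrete: anything addressed by its own content hash can be cached forever and verified on the way in. A minimal sketch in Go (the names, paths, and digest parameter are made up for illustration, error handling kept terse):

    package dlcache

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "io"
        "net/http"
        "os"
        "path/filepath"
    )

    // FetchCached returns a local path for url, downloading it at most once.
    // The cache key is the artifact's own digest, so an entry can never go stale.
    func FetchCached(cacheDir, url, wantSHA256 string) (string, error) {
        dst := filepath.Join(cacheDir, wantSHA256) // content-addressed cache entry
        if _, err := os.Stat(dst); err == nil {
            return dst, nil // already cached: no network round trip at all
        }

        resp, err := http.Get(url)
        if err != nil {
            return "", err
        }
        defer resp.Body.Close()

        tmp, err := os.CreateTemp(cacheDir, "download-*")
        if err != nil {
            return "", err
        }
        defer os.Remove(tmp.Name()) // harmless no-op once the rename succeeds

        h := sha256.New()
        _, err = io.Copy(io.MultiWriter(tmp, h), resp.Body)
        tmp.Close()
        if err != nil {
            return "", err
        }
        if got := hex.EncodeToString(h.Sum(nil)); got != wantSHA256 {
            return "", fmt.Errorf("digest mismatch for %s: got %s, want %s", url, got, wantSHA256)
        }
        // Only verified bytes ever land under the cache key.
        return dst, os.Rename(tmp.Name(), dst)
    }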


> Builds being slow is one of my pet peeves also. Modern "best practices" are absurdly wasteful of the available computer power, but because everyone does it the same way, nobody seems to accept that it can be done differently.

A lot of what you describe originates from lessons learned in "classic" build environments:

- broken (or in some cases, regular) build attempts leaving files behind that confuse later build attempts (e.g. because someone forgot to do a git clean before checkout step)

- someone hot-fixing something on a build server which never got documented and/or impacted other builds, leading to long and weird debugging efforts when setting up more build servers

- its worse counterpart, someone setting up the environment on a build server and never documenting it, leading to serious issues when that person inevitably left and then something broke

- OS package upgrades breaking things (e.g. Chrome/FF upgrades and puppeteer using it), and a resulting reluctance to upgrade the build servers' software

- attackers hacking build systems because of vulnerabilities, in the worst case embedding malware into deliverables or stealing (powerful) credentials

- colliding versions of software stacks / libraries / other dependencies leading to issues when new projects are to be built on old build servers

In contrast to that, my current favourite system of running GitLab CI on AWS EKS with ephemeral runner pods is orders of magnitude better:

- every build gets its own fresh checkout of everything, so no chance of leftover build files or an attacker persisting malware without being noticed (remember, everything comes out of git)

- no SSH or other access to the k8s nodes possible

- every build gets a reproducible environment, so when something fails in a build, it's trivial to replicate locally, and all changes are documented


Right, but build environments are awfully stupid about it. Re-downloading deps that did not change is utter waste that achieves nothing, same as re-compiling stuff that did not change.

> broken (or in some cases, regular) build attempts leaving files behind that confuse later build attempts (e.g. because someone forgot to do a git clean before checkout step)

But thanks to CI you will never fix such a broken build system!


Not a fan of it, but I have some experience with why this happens!

Download gets corrupted (randomly) at some point. Corrupted file gets stuffed into the cache. Now, since everyone is using the cache, everything/everyone is broken because the cache is ‘always good’, and the key didn’t change!

So then, someone figured it out and turned off caching - and it fixed it.

So now caching is always off.
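
(The saner fix would have been to verify cache entries against their digest and evict on mismatch, rather than turning caching off entirely -- roughly something like this sketch, with made-up names -- but that's not what happened.)

    package cache

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "os"
    )

    // Get returns the cached file at path only if it still matches its expected
    // digest. A corrupted entry is evicted so the next build simply re-downloads
    // it instead of poisoning every pipeline that shares the cache.
    func Get(path, wantSHA256 string) ([]byte, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            return nil, err
        }
        sum := sha256.Sum256(data)
        if hex.EncodeToString(sum[:]) != wantSHA256 {
            os.Remove(path) // evict: the bad entry heals itself on the next run
            return nil, fmt.Errorf("cache entry %s is corrupted, evicted", path)
        }
        return data, nil
    }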


That's a bug to be filed with the language tooling...


And of course everyone has time to set up a reproducible test case for this random data corruption bug, and can wait for the language tooling to get fixed while all their builds break…

Or they just turn off caching and forget about it.


You forgot:

- you can get lunch while you wait for your build to finish.


Not if your pipeline is decent. Parallelize and cache as much as you can.

The only stack that routinely throws wrenches into pipeline optimization is Maven. I'd love to run, say, Sonarqube, OWASP Dependency Checker, regular unit tests and end-to-end tests in parallel in different containers, but Maven - even if you pass the entire target / */target folders through - insists on running all steps prior to the goal you attempt to run. It's not just dumb and slow, it makes the runner images large AF because they have to carry everything needed for all steps in one image, including resources like RAM and CPU.


> OS package upgrades breaking things

Heh. docker.io package on Ubuntu did this recently, whereby it stopped honouring the "USER someuser" clause in Dockerfiles. Completely breaks docker builds.

No idea if it's fixed yet; we just updated our systems to not pull in docker.io 20.10.25-0ubuntu1~20.04 or newer.


Docker developers being clueless, what else is new...


Indeed: Optimize to reduce developer time spent on bad builds first.

One of the rules of this approach is to filter all extraneous variable input likely to disrupt the build results, especially and including artifacts from previous failed builds.


You’re the perfect example of the cranky experienced developer stuck in their ways and fighting against better solutions.

Most of the issues you’ve described are consequences of yet more issues such as not caching with the correct cache key.

I argue that all of the problems are eminently solvable. You’re arguing for leaving massive issues in the system because of… other massive issues. Only one of these two approaches to problem solving gets to a solution without massive problems.


We've gotten very good at using containerization and virtualization to isolate and abstract away ugliness, and as a result we have been able to build unspeakable towers of horror that would have been impossible without these innovations.


A few weeks ago, I decided to lock myself in my apartment and write a build system. I have always felt like developers wait ages for Docker, and I wanted to see why. (Cycle times on the app I develop are a minute for my crazy combination of shell scripts that I use, or up to 5 minutes for what most of my teammates do. This is incomprehensibly unproductive.)

It turns out, it's all super crazy at every level. Things like Docker use incredibly slow algorithms like SHA256 and gzip by default. For example, it takes 6 seconds to gzip a 150MB binary, while zstd --fast=3 achieves the same ratio and does it in 100 milliseconds! The OCI image spec allows Zstandard compression, so this is something you can just do to save build time and container startup time. (gzip, unsurprisingly, is not a speed demon when decompressing either.)

SHA256, used everywhere in the OCI ecosystem, is also glacial; a significant amount of the CPU used by starting or building containers is just running this algorithm. Blake3 is 17 times faster! (Blake2b, a fast and more-trusted hash than blake3, is about 6x faster.) But unfortunately, Docker/OCI only support SHA256, so you are stuck waiting every time you build or pull a container. (On the building side, you actually have to compute layer SHA256s twice: once for the compressed data and once for the uncompressed data. I don't know what happens if you don't do this; I just filled in every field the way the standard mandated and things worked.)
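
To make the double hashing concrete, here's roughly what it looks like in Go for a zstd layer (a sketch: I'm assuming the klauspost/compress zstd package here; the corresponding OCI media type is application/vnd.oci.image.layer.v1.tar+zstd):

    package layers

    import (
        "crypto/sha256"
        "encoding/hex"
        "io"
        "os"

        "github.com/klauspost/compress/zstd"
    )

    // Digests returns both hashes the OCI spec wants for a layer tarball: the
    // DiffID (digest of the uncompressed tar, stored in the image config's
    // rootfs.diff_ids) and the blob digest (digest of the compressed layer,
    // stored in the manifest's layer descriptor).
    func Digests(tarPath string) (diffID, blobDigest string, err error) {
        f, err := os.Open(tarPath)
        if err != nil {
            return "", "", err
        }
        defer f.Close()

        uncompressed := sha256.New()
        compressed := sha256.New()

        enc, err := zstd.NewWriter(compressed, zstd.WithEncoderLevel(zstd.SpeedFastest))
        if err != nil {
            return "", "", err
        }
        // One pass over the data: hash the plain tar, and feed the same bytes
        // through zstd, whose output is hashed as the compressed blob.
        if _, err := io.Copy(io.MultiWriter(uncompressed, enc), f); err != nil {
            return "", "", err
        }
        if err := enc.Close(); err != nil {
            return "", "", err
        }
        return "sha256:" + hex.EncodeToString(uncompressed.Sum(nil)),
            "sha256:" + hex.EncodeToString(compressed.Sum(nil)), nil
    }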

This was on HN a couple years ago and was a real eye opener for me: https://jolynch.github.io/posts/use_fast_data_algorithms/

There are also things that Dockerfiles preclude, like building each layer in parallel. I don't use layers or shell commands for anything; I just put binaries into a layer. With a builder that doesn't use Dockerfiles, you can build all the layers in parallel and push some of the earlier layers while the later ones are building. (One of the reasons I wrote my own image assembler is because we produce builds for each architecture. The build machine has to run an arm64 qemu emulator so that a Dockerfile-based build can run `[` to select the right third-party binary to extract. This is crazy to me; the decision is static and unchanging, so no code needs to be run. But I know that it's designed for stuff like "FROM debian; RUN apt-get update; RUN apt-get upgrade" which is ... not needed for anything I do.)
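
Roughly, "just put a binary into a layer" boils down to writing a one-file tar, and nothing stops you from doing that per architecture in parallel. A sketch (the paths and architecture list are made up, and a real builder would collect errors):

    package layers

    import (
        "archive/tar"
        "bytes"
        "os"
        "sync"
    )

    // BinaryLayer wraps a single executable in a tar stream -- that's the whole layer.
    func BinaryLayer(hostPath, imagePath string) ([]byte, error) {
        data, err := os.ReadFile(hostPath)
        if err != nil {
            return nil, err
        }
        var buf bytes.Buffer
        tw := tar.NewWriter(&buf)
        if err := tw.WriteHeader(&tar.Header{Name: imagePath, Mode: 0o755, Size: int64(len(data))}); err != nil {
            return nil, err
        }
        if _, err := tw.Write(data); err != nil {
            return nil, err
        }
        if err := tw.Close(); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }

    // BuildAll builds the per-architecture layers concurrently; the choice of
    // binary is static, so no emulation (or any code execution) is ever needed.
    func BuildAll() map[string][]byte {
        arches := []string{"amd64", "arm64"} // hypothetical build outputs
        out := make(map[string][]byte, len(arches))
        var mu sync.Mutex
        var wg sync.WaitGroup
        for _, arch := range arches {
            wg.Add(1)
            go func(arch string) {
                defer wg.Done()
                layer, err := BinaryLayer("bin/"+arch+"/app", "app")
                if err != nil {
                    return // a real builder would surface the error
                }
                mu.Lock()
                out[arch] = layer
                mu.Unlock()
            }(arch)
        }
        wg.Wait()
        return out
    }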

The other thing that surprises me about pushing images is how slow a localhost->localhost container push is. I haven't looked into why because the standard registry code makes me cry, but I plan to just write a registry that stores blobs on disk, share the disk between the build environment and the k8s cluster (hostpath provisioner or whatever), and have the build system just write the artifacts it's building into that directory; thus there is no push step required. When the build is complete, the artifacts are available for k8s to "pull".

The whole thing is a work in progress, but with a week of hacking I got the cycle time down from 1 minute to about 5 seconds, and many more improvements are available. (Eventually I plan to build everything with Bazel and remote execution, but I needed the container builder piece for multi-architecture releases; Bazel will have to be invoked independently for each architecture because of its design, and then the various artifacts have to be assembled into the final image list.)


> Blake3 is 17 times faster! (Blake2b, a fast and more-trusted hash than blake3, is about 6x faster.)

I'm a little surprised to see that 6x figure. Just going off the red bar chart at blake2.net, I wouldn't expect to see much more than a 2x difference, unless you're measuring a sub-optimal SHA256 implementation. And recent x86 CPUs have hardware acceleration for SHA256, which makes it faster than BLAKE2b. But those CPUs also have wide vector registers and lots of cores, so BLAKE3's relative advantage tends to grow even as BLAKE2b falls behind.

But in any case, yes, builds and containers tend to be great use cases for BLAKE3. You've got big files that are getting hashed over and over, and they're likely to be in cache. An expensive AWS machine can hit crazy numbers like 100 GB/s on that sort of workload, where the bottleneck ends up being memory bandwidth rather than CPU speed.
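
If you'd rather measure than trust anyone's bar chart, a crude single-threaded comparison on your own machine looks something like this (a sketch; I'm assuming the lukechampine.com/blake3 Go package, and note that BLAKE3's headline numbers also rely on multithreading and SIMD width, which this doesn't exercise):

    package main

    import (
        "crypto/rand"
        "crypto/sha256"
        "fmt"
        "time"

        "lukechampine.com/blake3"
    )

    func main() {
        // 256 MiB of random data that sits in memory, roughly like a layer
        // blob that is already in the page cache.
        buf := make([]byte, 256<<20)
        _, _ = rand.Read(buf)

        start := time.Now()
        _ = sha256.Sum256(buf)
        fmt.Printf("sha256: %v\n", time.Since(start))

        start = time.Now()
        _ = blake3.Sum256(buf)
        fmt.Printf("blake3: %v\n", time.Since(start))
    }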


Yeah, I'm not using any special implementation of sha256. Just crypto/sha256 from the standard library. It's also worth noting that I'm testing on a zen2 chip. I have read a lot of papers saying that sha2 is much faster because CPUs have special support; didn't see it in practice. (No AVX-512 here, but I believe those algorithms predate AVX-512 anyway.)


The normal sha256sum CLI program in Fedora Linux seems to use these extensions.

On a couple older Intel x86_64 systems without the extension, I get around 300 MiB/s hashing a ~15 MB file.

On a Ryzen 3700u laptop I get around 1250 MiB/s. And I also get around 1380 MiB/s on an aarch64 ARM vCPU on AWS (t4g instance type).


It’s included with the AES-NI instruction set, which is quite old now and included on most computers.


> Software is so slow compared to hardware. It's embarrassing that we haven't moved even a hundredth of what hardware has in the last 30 years

I don’t understand this mentality. What, exactly, did you expect to get faster? If you run the same software on older hardware it’s going to be much slower. We’re just doing more because we can now.

From my perspective, things are pretty darn fast on modern hardware compared to what I was dealing with 5-10 years ago.

I had embedded systems builds that would take hours and hours on my local machine years ago. Now they’re done in tens of minutes on my machine that uses a relatively cheap consumer CPU. I can clean build a complete kernel in a couple minutes.

In my text editor I can do a RegEx search across large projects and get results nearly instantly! Having NVMe SSDs and high core count consumer CPUs makes amazing things possible.

Software is improving, too. Have you seen how fast the new Bun package manager is? I can pick from dozens of open source database options for different jobs that easily enable high performance, large scale operations that would have been unthinkable or required expensive enterprise software a decade ago (even with today’s hardware).

> Why get this?

If you really think nothing has improved, you might be experiencing a sort of hedonic adaptation: Every advancement gets internalized as the new baseline and you quickly forget how slow things were previously.

I remember the same thing happened when SSDs came out: They made an amazing improvement in desktop responsiveness over mechanical HDDs, but many people almost immediately forgot how slow HDDs were. It’s only when you go back and use a slow HDD-based desktop that you realize just how slow things were in the past.


> We’re just doing more because we can now.

Are we though?

Our computers are orders of magnitude faster than they were. What new features justify consuming 100x or more CPU, RAM, network and disk space?

Is my email software doing orders of magnitude more work to render an email compared to the 90s? Does discord have that many more features compared to mIRC that it makes sense for it to take several seconds to open on my 8 core M1 laptop? For reference, mIRC was a 2mb binary and I swear it opened faster on my pentium 2 than discord takes to open on my 2023 laptop. By the standards of 1995 we all walk around with supercomputers in our pockets. But you wouldn't know it, because the best hardware in the world still can't keep pace with badly written software. As the old line from the 1990s goes, "what Andy giveth, Bill taketh away."[1] (Andy Grove was Intel's CEO at the time.)

My instinct is that as more and more engineers work "up the stack" we're collectively forgetting how to write efficient code. Or just not bothering. Why optimize your react web app when everyone will have new phones in a few years with more RAM? If the users complain, blame them for having old hardware.

I find this process deeply disrespectful to our users. Our users pay thousands of dollars for good computer hardware because they want their computer to run fast and well. But all of that capacity is instead chewed through by developers the world over trying to save a buck during development time. Every hardware upgrade our users make just becomes the new baseline for how lazy we can be.

Slow CI/CD pipelines are a completely artificial problem. There's absolutely no technical reason that they need to run so slowly.

[1] https://en.wikipedia.org/wiki/Andy_and_Bill%27s_law


Why is opening Word or Excel or PowerPoint not instantaneous? Plenty of people, me included, have to sift through vast numbers of documents and constantly open/close them.


Application "bloat" is one obvious thing, but operating systems are also doing a lot more "work" to open an app these days -- address space randomization, checking file integrity, checking for developer signing certs, mitigations against hardware side channel attacks like rowhammer or whatever, etc.

Those things aren't free. But, I don't know the relative performance hit there compared to good old software bloat.


> address space randomization, checking file integrity, checking for developer signing certs, mitigations against hardware side channel attacks like rowhammer or whatever, etc.

OSes and applications were getting slower long before any of those were a thing.


Maybe they are waiting to make them right before making them fast? /s

(Make it work, make it right, make it fast)


And yet, just a few minutes ago (the time it took to reboot my laptop), I clicked open the "Insert" menu in a Google doc and the machine hung. Even a 128k Macintosh with a 68000 CPU could handle that.


ClarisWorks would regularly hang my Mac Classic II. And with cooperative multitasking, I had to reboot the machine.


Yeah, I remember compressing 4k120hz video of my AI upscaled ray traced RPG play through in real time and streaming it to all my friends' phones in the 90s. Times never change.

But really, it's easy to forget the massive leaps we make every year.


Maybe I'm just getting old, but diablo 4 doesn't look that much better to my eyes than diablo 3 did. I'm sure the textures are higher resolution and so on, but 5 minutes in I don't notice or care. It's how the game plays that matters, and that has almost never been hardware limited.

That said, I'm really looking forward to the day we can embed LLMs and generative AI into video games for like world generation & dialog. I can't wait to play games where on startup, the game generates truly unique cities for me to explore filled with fictional characters. I want to wander around unique worlds and talk - using natural language - with the people who populate them.

I'm giddy with excitement over the amazing things video games can bring in the next few years. I feel like a kid again looking forward to christmas.


There was a video posted to Hacker News about core Windows apps taking longer to launch even on newer hardware: stuff like Calc, Word, and Paint take longer to start on Win11 than on Win2000, even though machines have much faster everything. So I think the point stands - why is everything slower now?


Windows sucks.


I get your take here, but as someone who's worked very hard at times to optimize builds (amongst other things), the business just generally doesn't respect those efforts and certainly doesn't reward them. Oftentimes they're actively punished, with a reflexive assumption that they're not "serious" efforts worth the time of the business. (There's the odd exception, but this is very widespread in my experience.)

Sure, there's a balance to be made between cutting wood and sharpening the saw. Who do we blame when the boss-man won't allow anyone to sharpen the tools even though we're obviously wasting outrageous amounts of time? You blame the people that won't allow those investments to be made.

Multiply that across an entire industry, add some trendy, fashionable tech (that's also just fast enough to be tolerable), and this is how we end up in the shitty circumstance you describe.

And yet I still wouldn't trade my fancy IDE and slow CI pipelines for a copy of Turbo Pascal 7, as fast as it would be!


> Funny enough, no matter how big these things get, they never seem to make me a lot more productive.

Maybe because the stuff you do isn't bottlenecked by compute? In my case every hardware upgrade resulted in a big productivity improvement.

Better CPU: Cut my C++ build times in half (from 10 to 5 minutes if I change an important .h)

Better GPUs: Cut my AI training time by a few X, massively improving iteration times. Also allow me to run bigger models that more easily reach my target accuracy.



It's a variation of Parkinson's Law -- we just keep expanding what we are doing to fit the hardware available to us, then claiming that nothing has changed.

CI is a fairly new thing. The idea of constantly doing all that compute work again and again was unfathomable not that long ago for most teams. We layer on and load in loads of ancillary processes, checks, lints, and so on, because we have the headroom. And then we reminisce about the days when we did a bi-monthly build on a "build box", forgetting how minimalist it actually was.


This is true, but there's still choice in how we expand and the default seems to be to do it as wastefully as possible.


As wastefully as we can get away with, no?

Not the same thing.


Yes, your wording is better, I don't believe people are actively trying to be as wasteful as possible.


It's what the economy rewards. Simple as that.


I don't totally agree. It's what a short-term view of the economy rewards, for sure. But even if that were the only view of the economy, I've seen plenty of low-performance software written purely out of cargo-culting and/or an inability or lack of will to do anything better.



