> A new pid is always more than the previously assigned pid during an OS uptime
This isn't true. The pid increments most of the time, but occasionally the new pid jumps down to a seemingly arbitrary value to start a new range. It is switching between unused ranges. This happens on my Linux machines all the time.
However, it doesn't matter for what the article describes: it doesn't depend on the pid always increasing, it only depends on the pid for each process being unique.
Pid reuse doesn't matter often, but it matters occasionally with how signals are often used. Many scripts are written to signal a process that might have terminated already, e.g. using a pid stored in an environment variable or file. This usually just returns an error, but if there has been enough system activity in the meantime, the pid may belong to a new process that the script does not intend to signal, so a random process may be killed unexpectedly.
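To make that concrete, here's a minimal sketch of the stale-pidfile pattern (the path and daemon name are hypothetical). Nothing ties the number in the file to the process it originally named:

```c
/* Sketch of the stale-pidfile hazard: after enough pid churn, the
 * kill() below can land on an unrelated process that happened to get
 * the recycled pid. */
#include <signal.h>
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/var/run/mydaemon.pid", "r"); /* hypothetical pidfile */
    if (!f) return 1;

    int pid;
    if (fscanf(f, "%d", &pid) != 1) { fclose(f); return 1; }
    fclose(f);

    /* Usually this hits the old, now-dead process and fails with ESRCH;
     * occasionally it hits a brand-new process instead. */
    if (kill((pid_t)pid, SIGTERM) == -1)
        perror("kill");
    return 0;
}
```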
I feel a bit nerd sniped: On Linux PIDs are only unique within a PID namespace. But that doesn't matter because the entire point of a PID namespace is to isolate processes.
But it's worth mentioning because it can mitigate the problem you've laid out. If you have a long-running process that may need to kill its children, you can start it within a new PID namespace so that it can only kill its own children (or descendants).
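To illustrate, a minimal Linux-only sketch: clone() with CLONE_NEWPID makes the child pid 1 of a fresh PID namespace (this needs CAP_SYS_ADMIN or a user namespace; the child function is made up):

```c
/* The clone()'d child starts a new PID namespace; kill() from inside
 * it can only ever reach its own descendants. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char stack[1024 * 1024];          /* stack for the cloned child */

static int child_main(void *arg) {
    (void)arg;
    printf("pid inside the namespace: %d\n", getpid());  /* prints 1 */
    /* fork()/kill() from here is confined to this namespace */
    return 0;
}

int main(void) {
    pid_t p = clone(child_main, stack + sizeof(stack),
                    CLONE_NEWPID | SIGCHLD, NULL);
    if (p == -1) { perror("clone"); return 1; }
    waitpid(p, NULL, 0);
    return 0;
}
```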
D12238 sez “If set to a value of (100 <= kern.randompid <= (PID_MAX - 100)), it will be used as modulus when kern_fork.c tries to allocate new PID. Formula is lastpid += arc4random() % kern.randompid”
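For the curious, a toy model of what that formula does; this is a sketch of the idea, not FreeBSD's actual kern_fork.c (which also has to skip pids still in use). arc4random() is native on the BSDs and in glibc since 2.36:

```c
/* Each allocation advances lastpid by a random step bounded by the
 * kern.randompid sysctl, so pids keep moving forward but with
 * unpredictable gaps. */
#include <stdlib.h>

#define PID_MAX 99999        /* FreeBSD's default pid ceiling */

static int lastpid = 0;
static int randompid = 1000; /* stands in for the kern.randompid sysctl */

int next_pid(void) {
    lastpid += arc4random() % randompid;   /* the formula from D12238 */
    if (lastpid > PID_MAX)
        lastpid %= PID_MAX;                /* wrap around the ceiling */
    return lastpid;
}
```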
What's wrong with sockets? They have quite a few advantages for IPC over Unix signals.
- They're scoped within the program (no global signal handler that you need to trampoline out of)
- The number of IPC channels you can create is effectively unbounded (not really true, but much less limiting than signals). If another process or part of the program needs IPC you can just open a new channel without breaking code or invariants relied upon by any other IPC channels.
- Reads/writes can be handled asynchronously without interrupting any thread in the program.
- You can use them across the network (AF_INET), VMs (AF_VSOCK), or restrict locally (AF_UNIX)
- Unix sockets can be used to send file descriptors around, even if they're opened by a child after forking. That includes using Unix sockets to send the file descriptor of other sockets (eg program 1 is talking to program 2 on the same machine and program 2 opens an IPC channel with program 3 and wants to send it back to program 1). A sketch of the send side follows this list.
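Here's that last point as a hedged sketch: the send side of fd passing (SCM_RIGHTS) over a connected AF_UNIX socket. The receiver mirrors this with recvmsg(); send_fd() is a made-up helper name:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd) {
    char byte = 0;                        /* must send at least one real byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    char cbuf[CMSG_SPACE(sizeof(int))];   /* ancillary space for one fd */
    memset(cbuf, 0, sizeof(cbuf));

    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;           /* "this message carries fds" */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```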
I feel like just because you can use signals for IPC doesn't mean you should.
So, nothing is wrong with sockets. Sockets (both unix domain and TCP) are the overwhelming choice for IPC mechanisms in new code, have been for the last few decades, and will be when we all retire.
Nonetheless it's not uncommon to have a communication style where (1) messages are extremely simple, with either no metadata or just a single number associated with them, (2) may be sent to any of a large number of receiving processes. And there, signals have a few advantages: you don't need to manage a separate file descriptor per receiver, you don't need to write a full protocol parser, you can test or trigger your API from the command line with e.g. /usr/bin/kill. They're good stuff.
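For the "single number" case specifically, POSIX realtime signals are the closest fit: sigqueue() carries one int, and SIGRTMIN..SIGRTMAX are queued rather than coalesced. A sketch (notify() is a made-up wrapper; the target pid is assumed known):

```c
#include <signal.h>
#include <stdio.h>

int notify(pid_t target, int value) {
    union sigval sv;
    sv.sival_int = value;           /* the one number that rides along */
    if (sigqueue(target, SIGRTMIN, sv) == -1) {
        perror("sigqueue");
        return -1;
    }
    return 0;
}
```

Plain kill(1) can still deliver the same signal from a shell, just without the payload; sigqueue is what carries the number.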
But do be aware of the inherent synchronization problems with traditional handlers. Signals interrupt the target's code and can't block to wait for it (they act like "interrupts" in that sense), so traditional synchronization strategies don't work and there are traps everywhere. If you're writing new signal code you really want to be using signalfd (which, yeah, re-introduces the "one extra file descriptor per receiver" issue).
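A minimal signalfd sketch of what that looks like (Linux-only): block the signal for normal delivery and read it as data instead, so no handler ever runs and none of the async-signal-safety rules apply:

```c
#include <signal.h>
#include <stdio.h>
#include <sys/signalfd.h>
#include <unistd.h>

int main(void) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);

    /* Block SIGUSR1 so it queues for the fd instead of interrupting us. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) return 1;

    int fd = signalfd(-1, &mask, 0);
    if (fd == -1) return 1;

    struct signalfd_siginfo si;
    /* A plain blocking read here; real code would put this fd in the
     * same epoll/poll loop as its sockets. */
    if (read(fd, &si, sizeof(si)) == sizeof(si))
        printf("got SIGUSR1 from pid %d\n", (int)si.ssi_pid);

    close(fd);
    return 0;
}
```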
Using a signal handler for IPC isn't really "simple" though, since the handler needs to be async signal safe itself. You don't need a "full protocol parser" for sockets either. You can send/receive C structs just fine. It's also not hard to write to sockets from the command line.
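For what it's worth, a sketch of the "just send a C struct" style: socketpair() with SOCK_SEQPACKET so message boundaries are preserved. This assumes both ends are the same binary and ABI; see the replies below for why a specified protocol still matters across trust boundaries:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

struct msg { int kind; int value; };   /* fixed-layout message */

int main(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1) return 1;

    struct msg out = { .kind = 1, .value = 42 };
    write(sv[0], &out, sizeof(out));   /* one message, one struct */

    struct msg in;
    if (read(sv[1], &in, sizeof(in)) == sizeof(in))
        printf("kind=%d value=%d\n", in.kind, in.value);
    return 0;
}
```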
> But do be aware of the inherent synchronization problems with traditional handlers.
This is why you should almost never use signal handlers for IPC, because they're full of footguns and not actually simple to use.
Increasing the number of file descriptors doesn't seem like much of a burden. If your app actually pushes on those limits you need to be messing with ulimits anyway.
> You don't need a "full protocol parser" for sockets either. You can send/receive C structs just fine. It's also not hard to write to sockets from the command line.
Yeah, that's how you get vulnerabilities. For the same reason that "Thou shalt not write async signal handlers without extreme care", thou must never send data down a socket without a clear specification for communication. It doesn't (and shouldn't) have to be a "complicated" protocol. But it absolutely must be a fully specified one.
And signals, as mentioned, don't really have that shortcoming due to their extreme simplicity. They're just interrupts and don't pass data (except in the sense that they cause the target to re-inspect state, etc...). They're a tool in the box and you should know how they work.
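Concretely, the "cause the target to re-inspect state" pattern is usually just a flag; a sketch (SIGHUP-triggered config reload is a stand-in example):

```c
/* The handler flips a flag and nothing else; the main loop notices and
 * does the real work outside signal context. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int sig) {
    (void)sig;
    reload_requested = 1;      /* the only thing the handler does */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigaction(SIGHUP, &sa, NULL);

    for (;;) {
        pause();               /* wait for any signal */
        if (reload_requested) {
            reload_requested = 0;
            printf("re-reading config\n");   /* safely outside the handler */
        }
    }
}
```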
Mostly the point is just "IPC is hard", and choosing tools isn't really the hard part. Signals have their place.
Author here. I agree with you on preferring sockets to signals for IPC. They're more straightforward and stable to use. This article is one of a series I'm writing on IPC. Writing about them ensures I understand them better. I wrote one on sockets[1].
If you look at the link to the Cloudflare post [1], they are comparing a bunch of different IPC methods. I bet what they did is: for all the other ones (shared memory, mmapped files, unix sockets, etc.) they sent 1KB messages, but for signals they just sent the signal and the associated int. It's the only thing that makes sense to me.
That's a bit misleading without mentioning all the mess that comes with signals.
Threads are a problem. Reentrancy is a problem, so you can't just call printf from your signal handler. Libraries are a problem, independent users of signals will easily step on each other's toes. There's a lack of general purpose signals, which compounds the problem. Signals are inherited when forking. Signals are not queued and therefore can be lost.
Signals carry no data, so they can't tell the program "This specific pipe closed", you have to figure out yourself which specific thing the signal is relevant to.
I'm probably forgetting more stuff.
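On the reentrancy point above: the classic workaround is the self-pipe trick. The handler does the one async-signal-safe thing it's allowed to (a write), and the main loop does the printf. A sketch:

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int pipefd[2];   /* [1] for the handler, [0] for the main loop */

static void handler(int sig) {
    char c = (char)sig;
    (void)write(pipefd[1], &c, 1);   /* write() is async-signal-safe */
}

int main(void) {
    if (pipe(pipefd) == -1) return 1;

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sa.sa_flags = SA_RESTART;   /* so the blocking read() resumes after the handler */
    sigaction(SIGUSR1, &sa, NULL);

    char c;
    while (read(pipefd[0], &c, 1) == 1)        /* back in normal context */
        printf("signal %d arrived\n", (int)c); /* printf is fine here */
    return 0;
}
```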
TL;DR: Signals are a pain in the butt and IMO undesirable to use under most modern circumstances. If you have something better you can use, you probably should. It would be nice to have kdbus or something similar with more functionality, and less footguns.
> We can start our processes with the same process groups and send signals to each other using kill(0, signum). With this, there’s no need for pid exchange, and IPC can be carried out in blissful pids ignorance.
Just make sure your processes don't launch any other processes not written by you: you don't know who else might be sneaky enough to use this method. Compounded by the fact that Linux process groups don't nest, this leads to very "funny" debugging sessions, when signals not supposed to arrive suddenly do or the signals that are supposed to arrive don't.
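A sketch of the kill(0, signum) pattern and where it bites: pid 0 means "everyone in my process group", so the broadcast also reaches any process that joined the group without you knowing about it:

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Put ourselves in a fresh process group; children forked from
     * here inherit it. */
    if (setpgid(0, 0) == -1) { perror("setpgid"); return 1; }

    /* Ignore our own broadcast so we don't terminate ourselves. */
    signal(SIGUSR1, SIG_IGN);

    /* ... fork cooperating workers here ... */

    /* Delivered to every member of the group, including any process a
     * library or subprocess quietly left in it. */
    if (kill(0, SIGUSR1) == -1) perror("kill");
    return 0;
}
```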
Anyone who thinks they understand unix signals is fooling themselves. Anyway, the basis of the claim that you can exchange half a million small messages per second using signals is a misunderstanding: the benchmark suite in question passes no data, it only ping-pongs the signal.