> A new pid is always more than the previously assigned pid during an OS uptime
This isn't true. The pid increments most of the time, but occasionally the new pid jumps down to a seemingly arbitrary value to start a new range. It is switching between unused ranges. This happens on my Linux machines all the time.
However, it doesn't matter for what the article describes: it doesn't depend on the pid always increasing, it only depends on the pid for each process being unique.
Pid reuse doesn't matter often, but it matters occasionally with how signals are often used. Many scripts are written to signal a process that might have terminated already, e.g. using a pid stored in an environment variable or file. This usually just returns an error, but if there has been enough system activity in the meantime, the pid may belong to a new process that the script does not intend to signal, so a random process may be killed unexpectedly.
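To make that concrete, here's a minimal sketch of the stale-pidfile pattern (the path and daemon name are hypothetical). Nothing ties the number in the file to the process it originally named:

```c
/* Sketch of the stale-pidfile hazard: after enough pid churn, the
 * kill() below can land on an unrelated process that happened to get
 * the recycled pid. */
#include <signal.h>
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/var/run/mydaemon.pid", "r"); /* hypothetical pidfile */
    if (!f) return 1;

    int pid;
    if (fscanf(f, "%d", &pid) != 1) { fclose(f); return 1; }
    fclose(f);

    /* Usually this hits the old, now-dead process and fails with ESRCH;
     * occasionally it hits a brand-new process instead. */
    if (kill((pid_t)pid, SIGTERM) == -1)
        perror("kill");
    return 0;
}
```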
I feel a bit nerd sniped: On Linux PIDs are only unique within a PID namespace. But that doesn't matter because the entire point of a PID namespace is to isolate processes.
But it's worth mentioning because it can mitigate the problem you've laid out. If you have a long-running process that may need to kill its children, you can start it within a new PID namespace so that it can only kill its own children (or descendants).
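To illustrate, a minimal Linux-only sketch: clone() with CLONE_NEWPID makes the child pid 1 of a fresh PID namespace (this needs CAP_SYS_ADMIN or a user namespace; the child function is made up):

```c
/* The clone()'d child starts a new PID namespace; kill() from inside
 * it can only ever reach its own descendants. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char stack[1024 * 1024];          /* stack for the cloned child */

static int child_main(void *arg) {
    (void)arg;
    printf("pid inside the namespace: %d\n", getpid());  /* prints 1 */
    /* fork()/kill() from here is confined to this namespace */
    return 0;
}

int main(void) {
    pid_t p = clone(child_main, stack + sizeof(stack),
                    CLONE_NEWPID | SIGCHLD, NULL);
    if (p == -1) { perror("clone"); return 1; }
    waitpid(p, NULL, 0);
    return 0;
}
```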
D12238 sez “If set to a value of (100 <= kern.randompid <= (PID_MAX - 100)), it will be used as modulus when kern_fork.c tries to allocate new PID. Formula is lastpid += arc4random() % kern.randompid”
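For the curious, a toy model of what that formula does; this is a sketch of the idea, not FreeBSD's actual kern_fork.c (which also has to skip pids still in use). arc4random() is native on the BSDs and in glibc since 2.36:

```c
/* Each allocation advances lastpid by a random step bounded by the
 * kern.randompid sysctl, so pids keep moving forward but with
 * unpredictable gaps. */
#include <stdlib.h>

#define PID_MAX 99999        /* FreeBSD's default pid ceiling */

static int lastpid = 0;
static int randompid = 1000; /* stands in for the kern.randompid sysctl */

int next_pid(void) {
    lastpid += arc4random() % randompid;   /* the formula from D12238 */
    if (lastpid > PID_MAX)
        lastpid %= PID_MAX;                /* wrap around the ceiling */
    return lastpid;
}
```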
What's wrong with sockets? They have quite a few advantages for IPC over Unix signals.
- They're scoped within the program (no global signal handler that you need to trampoline out of)
- The number of IPC channels you can create is effectively unbounded (not really true, but much less limiting than signals). If another process or part of the program needs IPC you can just open a new channel without breaking code or invariants relied upon by any other IPC channels.
- Reads/writes can be handled asynchronously without interrupting any thread in the program.
- You can use them across the network (AF_INET), VMs (AF_VSOCK), or restrict locally (AF_UNIX)
- Unix sockets can be used to send file descriptors around, even if they're opened by a child after forking. That includes using Unix sockets to send the file descriptor of other sockets (eg program 1 is talking to program 2 on the same machine and program 2 opens an IPC channel with program 3 and wants to send it back to program 1). A sketch of the send side follows this list.
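Here's that last point as a hedged sketch: the send side of fd passing (SCM_RIGHTS) over a connected AF_UNIX socket. The receiver mirrors this with recvmsg(); send_fd() is a made-up helper name:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd) {
    char byte = 0;                        /* must send at least one real byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    char cbuf[CMSG_SPACE(sizeof(int))];   /* ancillary space for one fd */
    memset(cbuf, 0, sizeof(cbuf));

    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;           /* "this message carries fds" */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```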
I feel like just because you can use signals for IPC doesn't mean you should.
So, nothing is wrong with sockets. Sockets (both unix domain and TCP) are the overwhelming choice for IPC mechanisms in new code, have been for the last few decades, and will be when we all retire.
Nonetheless it's not uncommon to have a communication style where (1) messages are extremely simple, with either no metadata or just a single number associated with them, (2) may be sent to any of a large number of receiving processes. And there, signals have a few advantages: you don't need to manage a separate file descriptor per receiver, you don't need to write a full protocol parser, you can test or trigger your API from the command line with e.g. /usr/bin/kill. They're good stuff.
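For the "single number" case specifically, POSIX realtime signals are the closest fit: sigqueue() carries one int, and SIGRTMIN..SIGRTMAX are queued rather than coalesced. A sketch (notify() is a made-up wrapper; the target pid is assumed known):

```c
#include <signal.h>
#include <stdio.h>

int notify(pid_t target, int value) {
    union sigval sv;
    sv.sival_int = value;           /* the one number that rides along */
    if (sigqueue(target, SIGRTMIN, sv) == -1) {
        perror("sigqueue");
        return -1;
    }
    return 0;
}
```

Plain kill(1) can still deliver the same signal from a shell, just without the payload; sigqueue is what carries the number.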
But do be aware of the inherent synchronization problems with traditional handlers. Signals interrupt the target's code and can't block to wait for it (they act like "interrupts" in that sense), so traditional synchronization strategies don't work and there are traps everywhere. If you're writing new signal code you really want to be using signalfd (which, yeah, re-introduces the "one extra file descriptor per receiver" issue).
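A minimal signalfd sketch of what that looks like (Linux-only): block the signal for normal delivery and read it as data instead, so no handler ever runs and none of the async-signal-safety rules apply:

```c
#include <signal.h>
#include <stdio.h>
#include <sys/signalfd.h>
#include <unistd.h>

int main(void) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);

    /* Block SIGUSR1 so it queues for the fd instead of interrupting us. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) return 1;

    int fd = signalfd(-1, &mask, 0);
    if (fd == -1) return 1;

    struct signalfd_siginfo si;
    /* A plain blocking read here; real code would put this fd in the
     * same epoll/poll loop as its sockets. */
    if (read(fd, &si, sizeof(si)) == sizeof(si))
        printf("got SIGUSR1 from pid %d\n", (int)si.ssi_pid);

    close(fd);
    return 0;
}
```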
Using a signal handler for IPC isn't really "simple" though, since the handler needs to be async signal safe itself. You don't need a "full protocol parser" for sockets either. You can send/receive C structs just fine. It's also not hard to write to sockets from the command line.
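For what it's worth, a sketch of the "just send a C struct" style: socketpair() with SOCK_SEQPACKET so message boundaries are preserved. This assumes both ends are the same binary and ABI; see the replies below for why a specified protocol still matters across trust boundaries:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

struct msg { int kind; int value; };   /* fixed-layout message */

int main(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1) return 1;

    struct msg out = { .kind = 1, .value = 42 };
    write(sv[0], &out, sizeof(out));   /* one message, one struct */

    struct msg in;
    if (read(sv[1], &in, sizeof(in)) == sizeof(in))
        printf("kind=%d value=%d\n", in.kind, in.value);
    return 0;
}
```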
> But do be aware of the inherent synchronization problems with traditional handlers.
This is why you should almost never use signal handlers for IPC, because they're full of footguns and not actually simple to use.
Increasing the number of file descriptors doesn't seem like much of a burden. If your app actually pushes on those limits you need to be messing with ulimits anyway.
> You don't need a "full protocol parser" for sockets either. You can send/receive C structs just fine. It's also not hard to write to sockets from the command line.
Yeah, that's how you get vulnerabilities. For the same reason that "Thou shalt not write async signal handlers without extreme care", thou must never send data down a socket without a clear specification for communication. It doesn't (and shouldn't) have to be a "complicated" protocol. But it absolutely must be a fully specified one.
And signals, as mentioned, don't really have that shortcoming due to their extreme simplicity. They're just interrupts and don't pass data (except in the sense that they cause the target to re-inspect state, etc...). They're a tool in the box and you should know how they work.
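Concretely, the "cause the target to re-inspect state" pattern is usually just a flag; a sketch (SIGHUP-triggered config reload is a stand-in example):

```c
/* The handler flips a flag and nothing else; the main loop notices and
 * does the real work outside signal context. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int sig) {
    (void)sig;
    reload_requested = 1;      /* the only thing the handler does */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigaction(SIGHUP, &sa, NULL);

    for (;;) {
        pause();               /* wait for any signal */
        if (reload_requested) {
            reload_requested = 0;
            printf("re-reading config\n");   /* safely outside the handler */
        }
    }
}
```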
Mostly the point is just "IPC is hard", and choosing tools isn't really the hard part. Signals have their place.
Author here. I agree with you on preferring sockets to signals for IPC. They're more straightforward and stable to use. This article is one of a series I'm writing on IPC. Writing about them ensures I understand them better. I wrote one on sockets[1].
If you look at the link to the Cloudflare post [1], they are comparing a bunch of different IPC methods. I bet what they did is: for all the other ones (shared memory, mmapped files, unix sockets, etc.) they sent 1KB messages, but for signals they just sent the signal and the associated int. It's the only thing that makes sense to me.
That's a bit misleading without mentioning all the mess that comes with signals.
Threads are a problem. Reentrancy is a problem, so you can't just call printf from your signal handler. Libraries are a problem, independent users of signals will easily step on each other's toes. There's a lack of general purpose signals, which compounds the problem. Signals are inherited when forking. Signals are not queued and therefore can be lost.
Signals carry no data, so they can't tell the program "This specific pipe closed", you have to figure out yourself which specific thing the signal is relevant to.
I'm probably forgetting more stuff.
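On the reentrancy point above: the classic workaround is the self-pipe trick. The handler does the one async-signal-safe thing it's allowed to (a write), and the main loop does the printf. A sketch:

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int pipefd[2];   /* [1] for the handler, [0] for the main loop */

static void handler(int sig) {
    char c = (char)sig;
    (void)write(pipefd[1], &c, 1);   /* write() is async-signal-safe */
}

int main(void) {
    if (pipe(pipefd) == -1) return 1;

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sa.sa_flags = SA_RESTART;   /* so the blocking read() resumes after the handler */
    sigaction(SIGUSR1, &sa, NULL);

    char c;
    while (read(pipefd[0], &c, 1) == 1)        /* back in normal context */
        printf("signal %d arrived\n", (int)c); /* printf is fine here */
    return 0;
}
```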
TL;DR: Signals are a pain in the butt and IMO undesirable to use under most modern circumstances. If you have something better you can use, you probably should. It would be nice to have kdbus or something similar with more functionality, and less footguns.
> We can start our processes with the same process groups and send signals to each other using kill(0, signum). With this, there’s no need for pid exchange, and IPC can be carried out in blissful pids ignorance.
Just make sure your processes don't launch any other processes not written by you: you don't know who else might be sneaky enough to use this method. Compounded by the fact that Linux process groups don't nest, this leads to very "funny" debugging sessions, when signals not supposed to arrive suddenly do or the signals that are supposed to arrive don't.
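A sketch of the kill(0, signum) pattern and where it bites: pid 0 means "everyone in my process group", so the broadcast also reaches any process that joined the group without you knowing about it:

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Put ourselves in a fresh process group; children forked from
     * here inherit it. */
    if (setpgid(0, 0) == -1) { perror("setpgid"); return 1; }

    /* Ignore our own broadcast so we don't terminate ourselves. */
    signal(SIGUSR1, SIG_IGN);

    /* ... fork cooperating workers here ... */

    /* Delivered to every member of the group, including any process a
     * library or subprocess quietly left in it. */
    if (kill(0, SIGUSR1) == -1) perror("kill");
    return 0;
}
```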
Anyone who thinks they understand unix signals is fooling themselves. Anyway, the basis of the claim that you can exchange half a million small messages per second using signals is a misunderstanding: the benchmark suite in question passes no data, it only ping-pongs the signal.