Nice post and +1 for having a small "hardening" section.
I wish that every systemd example/sample/template came with _extensive_ hardening, since I find it quite confusing. I've used systemd-analyze security <SERVICE> to try to figure out what was needed. For Elixir, I've come up with:
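A minimal sketch (exact options vary per application, and since recent BEAM releases ship a JIT, MemoryDenyWriteExecute stays off):

[Service]
# throwaway unprivileged user, no privilege escalation
DynamicUser=yes
NoNewPrivileges=yes
# read-only OS, no home dirs, private /tmp and /dev
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
PrivateDevices=yes
# only the socket families the BEAM actually needs
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
RestrictNamespaces=yes
LockPersonality=yes
SystemCallFilter=@system-service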
At the beginning I wanted to add all of that information and those options, but I thought it could be overwhelming in this article. I wanted to focus on Erlang <-> systemd communication and basic options.
However, it may make a nice follow-up article where I will describe the full hardening process.
'systemd-analyze security' is a fine thing; just make sure to use the latest version of systemd, because it was really buggy in the past. For example, the version that ships in RHEL 8 is so buggy it's practically useless.
I came up with this for most of my services that do require a JIT compiler (so Java, dotnet, etc.):
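Something along these lines (a sketch; paths are illustrative, and MemoryDenyWriteExecute is deliberately absent because a JIT needs memory that is both writable and executable):

[Service]
DynamicUser=yes
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
PrivateDevices=yes
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectKernelLogs=yes
ProtectControlGroups=yes
ProtectClock=yes
ProtectHostname=yes
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes
LockPersonality=yes
CapabilityBoundingSet=
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# hide directories holding external binaries; example paths only
InaccessiblePaths=/usr/sbin /usr/local/bin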
You might want to drop InaccessiblePaths if your application calls external binaries; the stuff I typically write shouldn't need to.
Some of these flags are not strictly necessary because they should be implied by other switches, but I prefer to keep them to make the configuration more obvious and to mitigate possible bugs (there have been some in the past).
If your application needs to store anything locally, add some combination of these:
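For example (directory names are placeholders; the directives create the matching directories under /run, /var/lib, /var/cache, /var/log and /etc respectively):

RuntimeDirectory=myapp
StateDirectory=myapp
CacheDirectory=myapp
LogsDirectory=myapp
ConfigurationDirectory=myapp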
and you can read the resulting paths from the environment variables RUNTIME_DIRECTORY / STATE_DIRECTORY / CACHE_DIRECTORY / LOGS_DIRECTORY / CONFIGURATION_DIRECTORY if your systemd is new enough.
systemd will make sure that your limited user can read and write these paths, including their content.
Add this if your application does not use a JIT compiler:
MemoryDenyWriteExecute=yes
And add this to prevent it from listening on the wrong ports in the event of misconfiguration:
SocketBindDeny=any
SocketBindAllow=tcp:5000
These firewalling flags can be useful if your service does not do much networking to external APIs:
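For instance (addresses are placeholders; this is the per-unit IP filter, so it covers every process in the service's cgroup):

IPAddressDeny=any
IPAddressAllow=localhost
IPAddressAllow=10.0.0.0/8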
With that many options it seems easy to miss something. A whitelist approach would be preferable. The application failing because you missed something would be more obvious than some subtle security hole.
This leaves me wishing there was some kind of report-only tooling where you could do something like, say, freeze the system calls used in a test run to reduce trial-and-error, similar to how you can use SELinux with audit2allow to get a decent starting point. Does anything like that already exist?
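The closest I have found in systemd itself is SystemCallLog=, which reports instead of blocking (this needs a fairly new systemd, around v247; the "~" negates the set, so this logs every syscall outside the group you plan to allow):

SystemCallLog=~@system-service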
Oh, I didn't know about `SocketBindDeny=` and `SocketBindAllow=`. This option may be a little troublesome in the case of Distributed Erlang, but in recent versions that can be worked around. Thanks, I will add it as a better option than adding capabilities.
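For example (all port numbers are placeholders, and port ranges in SocketBindAllow= need a recent systemd), one can pin the distribution port range on the Erlang side and then allow only those known ports:

# Erlang side, e.g. in vm.args:
#   -kernel inet_dist_listen_min 9100 inet_dist_listen_max 9105
SocketBindDeny=any
SocketBindAllow=tcp:5000
SocketBindAllow=tcp:9100-9105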
> Different tools take different approaches to solving that issue. systemd takes an approach derived from launchd: do not do stuff that is not needed. It achieves that by merging D-Bus into systemd itself, then making all services D-Bus daemons (which are started on request); additionally, it provides a bunch of triggers for those daemons. We can trigger on actions of other services (obviously), but also on things like socket activity, path creation/modification, mounts, connection or disconnection of a device, time events, etc.
Does understanding this require a good understanding of D-Bus? Because I got completely lost here...
I'm not sure the author has an understanding of what D-Bus is. The article doesn't really discuss D-Bus at all.
D-Bus is not merged into systemd. When systemd notices that a service it started is named "dbus", it registers itself as a D-Bus service; programs (like `systemctl` or `poweroff`) then use D-Bus to send it commands. For comparison, sysvinit accepts commands via a named pipe (`/run/initctl`, or `/dev/initctl` for older versions of sysvinit).
Most services are not made to be D-Bus daemons. A service is only made a D-Bus daemon if its unit file says `Type=dbus`; none of the examples in the article say this. You can use D-Bus to ask systemd to start or stop a given service, just as you could use `telinit` (which talks to /run/initctl) to tell sysvinit to change the runlevel; that does not make the service started this way a D-Bus service.
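For reference, a `Type=dbus` unit looks roughly like this (bus name and binary are hypothetical); systemd considers such a service started once the name appears on the bus:

[Service]
Type=dbus
BusName=org.example.MyDaemon
ExecStart=/usr/bin/mydaemon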
The NOTIFY_SOCKET and watchdog functionality that the article discusses... this functionality has nothing to do with D-Bus.
This article? Not at all. Do you need to understand D-Bus to use systemd? Not at all. It is just an implementation detail. I have an idea to leverage that in the future to slap distributed service management on top of systemd, but in normal operation you will probably never notice that everything is D-Bus backed.
JFYI, with podman [0] you can get all the security benefits mentioned in the article with containers.
This is pretty neat and I love how cleanly the application code reads. I’m curious: is the Erlang VM super fast? I would have expected VM startup time to dominate the overall time to start.
The Erlang VM is plenty fast at starting up stuff that is long running, but I wouldn't use it for things that you need to launch hundreds of times per second. But that's also not how it is intended to be used. As for the other kind of fast: number crunching performance is not what Erlang is for. It's for long-running, complex applications with a lot of moving parts, redundancy, and extreme reliability.
I know about Podman, I just wanted to focus on systemd without additional tooling.
Erlang VM startup is OK, but it is not ultra fast, and it can easily be slowed down by releases with many modules or by slow applications. Additionally, as was said, Erlang works best with long-running instances, where the VM handles spawning and managing short-lived internal processes.
The first draft of this article also included the socket activation and FD passing sections, but these were making the article way too long, so I moved them to Part 2, where I will have more space for them. With socket activation, Erlang VM startup time is a negligible problem.
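The idea, roughly: a socket unit makes systemd listen immediately and start the service on the first connection, so clients never see the VM boot (unit name and port are placeholders):

# myapp.socket
[Socket]
ListenStream=5000

[Install]
WantedBy=sockets.target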
Yes, it does start fairly quickly, nothing like the JVM. It's not nearly as fast while it's running, though. It only recently got a JIT, and currently doesn't do any optimization of the generated code.
No, because these states are set on shutdown, and processes are killed in reverse order. So first the `draining` message will be set and then `drained`.
This won’t tell you a host died; a distributed service needs remote monitoring.
Is there a standard protocol for either health probes or watchdog timers? It seems like people aren’t defaulting to SNMP like they used to, and I haven’t heard much about RFC 6241 NETCONF which might be intended to replace it. At work we just probe health with trivial RPCs, but it’d be better for the industry to converge on something.
> I wish that every systemd example/sample/template came with _extensive_ hardening, since I find it quite confusing. I've used systemd-analyze security <SERVICE> to try to figure out what was needed. For Elixir, I've come up with:
Plus the use of TemporaryFileSystem and BindPaths to limit the file system.
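For example (paths are placeholders, and the service's own binaries need to be bound back in too): mount an empty read-only tmpfs over the root and bind back only what the app needs:

TemporaryFileSystem=/:ro
BindReadOnlyPaths=/opt/myapp /etc/ssl
BindPaths=/var/lib/myapp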