With a h/t to Dave Halliday here:
Interesting development in systemd-land:
We have this lovely gem from that link:
How to Crash Systemd in One Tweet
The following command, when run as any user, will crash systemd:
NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""
After running this command, PID 1 is hung in the pause system call. You can no longer start and stop daemons. inetd-style services no longer accept connections. You cannot cleanly reboot the system. The system feels generally unstable (e.g. ssh and su hang for 30 seconds since systemd is now integrated with the login system). All of this can be caused by a command that’s short enough to fit in a Tweet.
Edit (2016-09-28 21:34): Some people can only reproduce if they wrap the command in a while true loop. Yay non-determinism!
The bug is remarkably banal. The above systemd-notify command sends a zero-length message to the world-accessible UNIX domain socket located at /run/systemd/notify. PID 1 receives the message and fails an assertion that the message length is greater than zero. Despite the banality, the bug is serious, as it allows any local user to trivially perform a denial-of-service attack against a critical system component.
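To make the "empty string as a test case" point concrete, here is a minimal Python sketch of a receiver that treats an empty or malformed datagram as input to reject, not as an invariant to assert. It is not systemd's code (which is C). The function name parse_notify_message is hypothetical, and the newline-separated KEY=VALUE layout is assumed from the sd_notify convention.

```python
from typing import Dict, Optional


def parse_notify_message(data: bytes) -> Optional[Dict[str, str]]:
    """Parse a notify-style datagram; return None instead of crashing."""
    # The zero-length case: reject it, never assert on it.
    if not data:
        return None
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None

    fields: Dict[str, str] = {}
    for line in text.split("\n"):
        if not line:
            continue  # tolerate blank lines and a trailing newline
        key, sep, value = line.partition("=")
        if not sep or not key:
            return None  # not a KEY=VALUE assignment
        fields[key] = value
    # A message with no assignments at all is also rejected.
    return fields or None
```

A caller would log and drop any message for which this returns None, so untrusted input of any length can never take the receiving process down.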
The immediate question raised by this bug is what kind of quality assurance process would allow such a simple bug to exist for over two years (it was introduced in systemd 209). Isn’t the empty string an obvious test case? One would hope that PID 1, the most important userspace process, would have better quality assurance than this. Unfortunately, it seems that crashes of PID 1 are not unusual, as a quick glance through the systemd commit log reveals commit messages such as:
* coredump: turn off coredump collection only when PID 1 crashes, not when journald crashes
* coredump: make sure to handle crashes of PID 1 and journald special
* coredump: turn off coredump collection entirely after journald or PID 1 crashed
Systemd’s problems run far deeper than this one bug. Systemd is defective by design. Writing bug-free software is extremely difficult. Even good programmers would inevitably introduce bugs into a project of the scale and complexity of systemd. However, good programmers recognize the difficulty of writing bug-free software and understand the importance of designing software in a way that minimizes the likelihood of bugs or at least reduces their impact. The systemd developers understand none of this, opting to cram an enormous amount of unnecessary complexity into PID 1, which runs as root and is written in a memory-unsafe language.
Some degree of complexity is to be expected, as systemd provides a number of useful and compelling features (although they did not invent them; they were just the first to aggressively market them). Whether or not systemd has made the right trade-off between features and complexity is a matter of debate. What is not debatable is that systemd’s complexity does not belong in PID 1. As Rich Felker explained, the only job of PID 1 is to execute the real init system and reap zombies. Furthermore, the real init system, even when running as a non-PID 1 process, should be structured in a modular way such that a failure in one of the riskier components does not bring down the more critical components. For instance, a failure in the daemon management code should not prevent the system from being cleanly rebooted.
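As a rough illustration of how small that job is, here is a Python sketch of the "start the real init, then reap" loop. It is only a sketch under assumptions: the function name run_and_reap is hypothetical, a real PID 1 would loop forever (this version returns once the main child exits so it can be tried as an ordinary user), and minimal inits of this kind are normally written in C.

```python
import os
import sys
from typing import List


def run_and_reap(argv: List[str]) -> int:
    """Start argv as the 'real init' and reap children until it exits."""
    pid = os.fork()
    if pid == 0:
        # Child: become the real init system.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)

    # Parent: do nothing except collect exited children (zombies).
    while True:
        reaped, status = os.wait()
        if reaped == pid:
            return os.waitstatus_to_exitcode(status)
```

For example, run_and_reap([sys.executable, "-c", "raise SystemExit(3)"]) starts a child that exits with status 3 and hands that status back. There is no parsing, no socket handling, and nothing else in the loop that could fail on hostile input.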
In particular, any code that accepts messages from untrustworthy sources like systemd-notify should run in a dedicated process as an unprivileged user. The unprivileged process parses and validates messages before passing them along to the privileged process. This is called privilege separation and has been a best practice in security-aware software for over a decade. Systemd, by contrast, does text parsing on messages from untrusted sources, in C, running as root in PID 1. If you think systemd doesn’t need privilege separation because it only parses messages from local users, keep in mind that in the Internet era, local attacks tend to acquire remote vectors. Consider Shellshock, or the presentation at this year’s systemd conference which is titled “Talking to systemd from a Web Browser.”
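The privilege separation pattern described above can be sketched in Python roughly as follows. This is a toy under stated assumptions, not how any real daemon does it: the names validate and parse_in_child, the 4096-byte limit, and the use of uid/gid 65534 as an unprivileged account are all hypothetical choices for illustration. The privilege drop only happens when the sketch is run as root.

```python
import json
import os
from typing import Dict, Optional

MAX_MESSAGE = 4096  # hypothetical upper bound on message size


def validate(raw: bytes) -> Optional[Dict[str, str]]:
    """Runs in the unprivileged child: reject anything malformed."""
    if not raw or len(raw) > MAX_MESSAGE:
        return None
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or not key.isidentifier():
            return None
        fields[key] = value
    return fields or None


def parse_in_child(raw: bytes) -> Optional[Dict[str, str]]:
    """Parse untrusted bytes in a forked child; the parent sees only JSON."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the only process that parses the untrusted input.
        os.close(read_fd)
        status = 1
        try:
            if os.geteuid() == 0:
                # Assumed unprivileged ids; drop groups, then gid, then uid.
                os.setgroups([])
                os.setgid(65534)
                os.setuid(65534)
            os.write(write_fd, json.dumps(validate(raw)).encode("utf-8"))
            status = 0
        finally:
            os._exit(status)

    # Parent (the privileged side): read the validated result.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        payload = pipe.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or not payload:
        return None  # the parser died: drop the message and carry on
    return json.loads(payload)
```

The design point is in the last few lines: if the parser crashes on hostile input, the privileged process loses one message instead of going down with it.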
The article then goes on to several other interesting tech bits. I’m leaving most of them for you to read in the link. This DNS one caught my eye. Why? Because DNS is absolutely critical to security and performance and is best run in a dedicated, very locked-down way. If someone can pollute your DNS they can, at minimum, break system-to-system communications; at worst, they can direct you to their spoofed system, get you to enter your credentials, and steal all your permissions (which, if you are a privileged sysadmin type, means stealing the citadel…)
Consider systemd’s DNS resolver. DNS is a complicated, security-sensitive protocol. In August 2014, Lennart Poettering declared that “systemd-resolved is now a pretty complete caching DNS and LLMNR stub resolver.” In reality, systemd-resolved failed to implement any of the documented best practices to protect against DNS cache poisoning. It was vulnerable to Dan Kaminsky’s cache poisoning attack which was fixed in every other DNS server during a massive coordinated response in 2008 (and which had been fixed in djbdns in 1999). Although systemd doesn’t force you to use systemd-resolved, it exposes a non-standard interface over DBUS which they encourage applications to use instead of the standard DNS protocol over port 53. If applications follow this recommendation, it will become impossible to replace systemd-resolved with a more secure DNS resolver, unless that DNS resolver opts to emulate systemd’s non-standard DBUS API.
And folks wonder why I hate SystemD with a passion… Designed wrong, implemented badly, doing things it ought not do, in ways that are broken. And that’s just at first glance… now we know that anyone can hang your system in a non-recoverable state and the DNS can be poisoned. Oh Joy. /sarc;
Guess it’s time for me to put a bit more effort into that non-systemd R.Pi source build project…
I can just see college kids worldwide enjoying the fun of hanging the campus servers, locking up their roomies’ machines, and generally creating mayhem. Heck, just put that line of text in a command named, oh, ‘dir’ to catch the Windows Rejects or ‘lsl’ for those who are used to having that aliased… put that in a normal command directory, then walk away… Over the course of weeks, at various times, it will trigger, the system will go unstable, and systemd will lock up. Happy debugging…
Then there are the opportunities for causing grief in ANY critical system running on a systemd box. One trivial command insertion that does not need root privilege and your hospital email or medical records servers go down… or patient monitoring gear if it is based on this.
For the tech types, more in the link.