SystemD: Can’t shut down with it, or without it…

After a few decades of doing something, you get a sense for what is likely to cause “problems” and just what kind of problem is likely to be sneaking up on you. Like a person who has been in battle a long time and just knows that the quiet is just ‘wrong’…

Well a lot of us ‘old hands’ have taken incoming rocks from those in love with SystemD, mostly for our ‘excessive caution’ about it. “Hey, it works fine!” they say. Which is true, right up until it doesn’t. It is in that “doesn’t” part where the Spidey Sense tingles…

https://bugs.launchpad.net/ubuntu/+source/mysql-5.6/+bug/1468804

Ubuntu
mysql-5.6 package

mysql stop/restart fails + breaking mysql + reboots
Bug #1468804 reported by David Favor on 2015-06-25

OK, so this issue has been around for about 1.5 years now. It hasn’t been resolved since the developer could not trivially reproduce it. That is, it is a pernicious bug that only shows up sometimes. Likely at about midnight when you were supposed to meet that tall blond at the bar…

Bug Description

[Status]

Cannot reproduce – see comment 20. Without steps that someone else can follow to reproduce the problem from a fresh system, we don’t expect to make any progress on this issue.

In other words: “Go away kid, you bother me.”

[Original Description]

This is a huge problem.

After several iterations of – service mysql {start|stop|restart} – the mysql service becomes broken + unrecoverable.

The problem becomes – service mysql stop – hangs forever, upon attempting…

exec systemctl stop mysql.service

What’s required then is one of these…

mysqladmin -uroot -p shutdown
pkill mysqld

Neither of these is acceptable.

The biggest problem is this breakage also breaks reboots.

The reboot sequence depends on services doing what’s expected, so when systemctl stop mysql.service hangs, reboot hangs too.

Because of the lunacy of the entire systemctl subsystem, sshd is correctly killed off while msqld hangs.

At this point, there’s no way to ssh into system + a power recycle must be done to complete a reboot sequence.

/bin/systemctl is an executable with no apparent way to debug it’s logic.

Please update this bug with details about how to debug /bin/systemctl + I’ll post output.

This is similar to that “send a blank message and hang the system” problem in an earlier posting. In this case, it is “hang mysql” and you can’t reboot your system… go pull the plug… (What do you do on those systems with interior batteries that can’t be removed… and no power cable… guess you just ought not to run mysql on a tablet or laptop with built in battery… or phone.. or…)
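
For what it’s worth, systemctl is mostly a thin client: it asks the systemd manager (PID 1) to do the work, so when a stop hangs, the job queue and the unit’s journal are usually more informative than the binary itself. A minimal sketch of what one might collect, assuming the unit is called mysql.service as in the report. The function name stop_diagnostics and the DRY_RUN switch are my own illustration, not anything from the bug thread:

```shell
#!/bin/sh
# Sketch only: print (or run) a few read-only commands that show what
# systemd thinks is happening to a unit whose stop job appears stuck.
# stop_diagnostics and DRY_RUN are illustrative names, not systemd features.
stop_diagnostics() {
    unit=$1
    for cmd in \
        "systemctl --no-pager status $unit" \
        "systemctl --no-pager list-jobs" \
        "journalctl --no-pager -b -u $unit"
    do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"      # default: only show what would be run
        else
            sh -c "$cmd"     # DRY_RUN=0: really run it (needs systemd)
        fi
    done
}
```

With the default DRY_RUN=1, stop_diagnostics mysql.service only prints the three commands; on a box where the stop really is stuck, DRY_RUN=0 stop_diagnostics mysql.service from a second terminal would run them.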

The “help” then goes on…

Thank you for taking the time to report this bug and helping to make Ubuntu better. Unfortunately, we cannot work on this bug because your description didn’t include enough information. You may find it helpful to read “How to report bugs effectively” […]

and on and on… in that usual “go read stuff and don’t bother me with your silly problem” way such things go…

From there, the comments continue with other things that fail to do anything useful…

Redneckery to allow reboots till this bug fixed don’t work because apparently systemctl is seriously brain dead.

So the normal shut magic won’t work…

echo 'mysqladmin --defaults-extra-file=/etc/mysql/debian.cnf shutdown' > /etc/rc6.d/K00-msyql-stop
chmod +x /etc/rc6.d/K00-msyql-stop

Will update when I determine a workaround.
[…]
apport-collect 1468804 does not seem to work correctly on headless (server) systems.

Gets into some infinite loop with lynx or whatever tool is running.

ovhfine# uname -a
Linux fine.bizcooker.com 3.19.0-21-generic #21-Ubuntu SMP Sun Jun 14 18:31:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ovhfine# lsb_release -a
No LSB modules are available.
[…]
This doesn’t work either…

echo 'mysqladmin --defaults-extra-file=/etc/mysql/debian.cnf shutdown' > /etc/init.d/mysql-hard-stop

ln -s /etc/init.d/mysql-hard-stop /etc/rc0.d/K01-aaa-mysql-hard-stop
ln -s /etc/init.d/mysql-hard-stop /etc/rc6.d/K01-aaa-mysql-hard-stop
[…]
dirwatch /etc | grep OPEN | egrep 'init.d|rc' shows that shutdown scripts are no longer run sensibly either.

The K01mysql script never runs, so likely there’s some other shutdown script with a dependency which is touching mysql + hanging before the K01mysql runs.
[…]

Changing the /etc/rc{016}.d/K01mysql scripts to do mysqladmin shutdown doesn’t work, because inotifywait shows these files are never touched at shutdown.

It is a very sad tale of woe…
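
My understanding, and it is only that, is that systemd does not walk /etc/rc0.d and /etc/rc6.d at shutdown the way the old SysV rc script did, which would fit with those K scripts never being touched. The systemd-native way to hang a command on shutdown is a unit with an ExecStop= line. Here is a hedged sketch that writes such a unit. The unit name mysql-hard-stop.service, the helper write_shutdown_hook_unit, the /usr/bin/mysqladmin path and the 60 second limit are all assumptions of mine, and nothing in the bug thread shows that this would have cured the hang:

```shell
#!/bin/sh
# Sketch only: write a oneshot unit whose ExecStop= runs at shutdown.
# The unit name, the mysqladmin path and the timeout are assumptions.
write_shutdown_hook_unit() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' \
        '[Unit]' \
        'Description=Ask mysqld to shut down before mysql.service stops (sketch)' \
        'After=mysql.service' \
        '' \
        '[Service]' \
        'Type=oneshot' \
        'RemainAfterExit=yes' \
        'ExecStart=/bin/true' \
        'ExecStop=/usr/bin/mysqladmin --defaults-extra-file=/etc/mysql/debian.cnf shutdown' \
        'TimeoutStopSec=60' \
        '' \
        '[Install]' \
        'WantedBy=multi-user.target' \
        > "$dir/mysql-hard-stop.service"
}
```

Units are stopped in the reverse of their start order, so After=mysql.service is what puts this ExecStop= ahead of the stop of mysql.service itself. To try it, one would run write_shutdown_hook_unit /etc/systemd/system as root, then systemctl daemon-reload, systemctl enable mysql-hard-stop.service and systemctl start mysql-hard-stop.service.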

So basically if you have a mysql service running because you want an SQL database, and you do some ill defined number of stops and starts of it, the whole thing gets bollixed up to the point where you can’t do a reboot and can’t recover. Um, tell me again why I want the machine deciding what state to be in instead of the Systems Admin with Root Privs? (That’s one of the design goals of systemD…)
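
The manual workaround from the report (try the service stop, then mysqladmin, then pkill) can at least be wrapped so that no single step gets to wait forever. A sketch using the coreutils timeout command; stop_with_fallback is my own hypothetical helper and the 60 second limit in the usage line is an arbitrary pick. It only helps while you still have a shell, not once a reboot is already hung:

```shell
#!/bin/sh
# Sketch only: run each command with a time limit and stop at the first
# one that succeeds. stop_with_fallback is an illustrative name.
# Usage: stop_with_fallback SECONDS 'cmd one' 'cmd two' ...
stop_with_fallback() {
    limit=$1
    shift
    for cmd in "$@"; do
        # timeout(1) kills the command if it runs longer than $limit seconds
        if timeout "$limit" sh -c "$cmd"; then
            echo "ok: $cmd"
            return 0
        fi
        echo "failed or timed out: $cmd" >&2
    done
    return 1
}
```

Something like stop_with_fallback 60 'systemctl stop mysql.service' 'mysqladmin --defaults-extra-file=/etc/mysql/debian.cnf shutdown' 'pkill mysqld' would walk the same ladder the reporter climbed by hand, from polite to rude.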

Sigh…

All I can hope is that enough folks eventually run into enough walls hard enough that they give up on it and roll back to {whatever} the rest of us are using… we, those hardy few, who have our heels dug in and are kicking and screaming…

End Note

Posted from my “prior to Arch Daily Driver”, a non-SystemD Debian with the Dirty Cow exposure still in the kernel. I think I can just swap kernels, but that will be for tomorrow… I’ve also found my prior Arch D.D. backup, and that chip copy is only from about 2 months back, so I’m not remembering any updates to the OS in that time. I think I can just “dd” it back onto the chip and be back where I started a couple of days ago. For now, I’m working fine and it will be about a 2 hour bit of copy and test to be recovered to running on Arch again. Then, once again, I can start working on washing the Dirty COW out of my Daily Drivers…
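
For the record, the “dd it back and test” step looks roughly like this. A sketch only, assuming GNU dd and cmp: restore_image is a name I made up, the image file and device in the usage line are placeholders, and pointing it at the wrong device will cheerfully destroy whatever is on it:

```shell
#!/bin/sh
# Sketch only: copy an image onto a target, then verify the copy.
# restore_image is an illustrative name; assumes GNU dd and cmp.
restore_image() {
    img=$1
    dev=$2
    # conv=fsync makes dd flush the data to the target before it exits
    dd if="$img" of="$dev" bs=4M conv=fsync 2>/dev/null || return 1
    # The target may be larger than the image, so compare only image-sized bytes
    size=$(($(wc -c < "$img")))
    cmp -n "$size" "$img" "$dev"
}
```

Usage would be along the lines of restore_image arch-backup.img /dev/sdX, run as root with the chip unmounted; both names are stand-ins for whatever the backup and the card reader are really called.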

On a related note: I found my Alpine Linux chip, and was reminded it is good for headless things needing extra security, like a DNS server; so my present plan for the old Pi B DNS Server is to migrate it to an Alpine Linux base and see what I think of it.

Ah, the joys of Systems Admin in a semi-complicated personal shop… when bugs are in the air and dirty cows are milling about…

At least this settles one thing for me, though: With TWO demonstrated bugs that cause the system to hang into non-recoverable land, one of them a normal and customary Systems Admin operation (start | stop services) and the other doable by any user: No way in hell I’m going to use a “systemd afflicted” system for anything important where I can’t just pull the power cord. (Shades of Microsoft standard fix…) I’ll tolerate Debian Jessie or Arch for a few weeks, months at most, while I “move on”, but that’s about it. (And only to get a non-Dirty COW kernel, and even then only if it cannot be retrofitted into the prior non-SystemD version.)

At present, the DNS (and similar headless servers) direction is going to be Alpine and / or Void. The Daily Driver Desktop is a bit harder to scope right now as it needs a whole lot more applications that work on it. Those tend to get tied to systemd… and the few non-systemd releases I’ve looked at often have painful X-Windows start-up processes and are nooby hostile. So that’s going to take a bit more to ‘make it go’ right. With luck, there will be ever more folks headed out into the OpenRC, or SystemV Init, or even rc.d BSD style init land, and I won’t be too alone.

Best candidates so far (in addition to Void and Alpine) are Slackware, which stayed with rc.d from way back in the “System V: consider it standard – and breaking things gratuitously” days, and potentially just going with straight BSD. I like BSD more for security and stability, but it’s a bear to get set up right (most applications come out first in Linux land these days… and X has always been a pain). I already have both running, without X, so I guess it’s time to go back to those ports and whack on X bring up for a while…

I also have a dozen or so ‘chips’ with varieties of Debian, Ubuntu, Fedora and similar that are just not interesting to me anymore. I’ll keep a couple of images so I can boot them if desired, but the old ones are Dirty COW afflicted and the new ones are systemD afflicted… so “blah” for all of them. Looks like about $100 of ‘chips’ that will be scrubbed and available for re-use shortly… (For a while I’d just ‘buy another chip, they are cheap’ so as to not waste my time, but now that they are cataloged, it’s short time cost to scrub the duds…)

I’m also going to explore using Puppy as more of a daily driver for basic tasks (postings, email, whatever). It doesn’t meet my needs for a professional environment (doesn’t have a lot of emphasis on things like compiler tool chains, database systems, distcc distributed build monster systems, etc.) but is good for normal user things like browsers and email. I do need to do a Dirty COW search on it, and then I’m pretty sure it’s avoiding systemD, but I need to verify that.

Essentially I’m ‘letting go’ of finding One Golden System to do it all, and instead doing a 3 way split. Alpine and Void both have the hardened kernel / musl / good for headless attitude suited to ‘small appliances’. Puppy is a ‘nice small and fast browser and email appliance’ sort of OS for those “only do a couple of things on this box” security isolated functions. Then there’s a harder search for an industrial strength OS for All Things Big and Complicated, since my usual “big three” for that (Debian, Ubuntu, Fedora / Red Hat / CentOS) have all run off to SystemD land.

At least, that’s the plan for now…

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present “hot buttons” are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in Tech Bits.

18 Responses to SystemD: Can’t shut down with it, or without it…

  1. pg sharrow says:

    The problem you describe seems to be one my DELL Studio running Win7 Ultimate is having. Runs fine for some time and then reboots, and seems to be locked into rebooting: as soon as the check is done and the OS starts to load, it shuts down and then reboots. We have checked everything on the board and it seems to be part of the sleep/hibernation/start/shutdown software of the OS. We have changed everything including the mother board and power supply. If it is left dead for a few days and then started, it runs fine for days until it hiccups and is back to rebooting. My son swears that the box is cursed ;-). I just tell him that Microsoft is a virus. Guess I will have to ask him about the OS installation and upgrades…pg

  2. Larry Ledwick says:

    EM you might have already found this but in case not, this might have some helpful info in it for your needs.

    http://without-systemd.org/wiki/index.php/Main_Page

    I just stumbled on it while searching, so have not looked in detail but it has a page on how to remove systemd from debian_jessie

    http://without-systemd.org/wiki/index.php/How_to_remove_systemd_from_a_Debian_jessie/sid_installation

  3. E.M.Smith says:

    @Larry:

    I think I’ve not seen those before. Thanks for pointing them out.

    I’ve seen something like the first one, but it was a while ago….

    I might try the debian removal process, but my experience on the Pi is that lesser explored paths tend not to be compiled or in the repositories… young ports are like that. Few hands, focused on just making the regular stuff work…

    It will take me a while, though. Between fighting a clogged drain, dishwasher and laundry paused until fixed so washing by hand… and part of the garage scattered around the living room, and the spouse just announced having a cold, and Dirty COW, and a blown upgrade… and… let’s just say I’m about two weeks behind where I was at the weekend… it’s been that kind of week.

    @Sera:

    Cute, very cute!

    @P.G.:

    Run a memtester on it, and either put it on a UPS or check power condition… Memtester exercises the cpu and memory but without the complicated OS in the way. Dirty power can kill you. One Pi that was sporadically rebooting was traced to a power supply that was marginal volts. After the boot, power demand would rise on using all 4 cores, volts sag, and… reboot…

  4. EM – though you are no doubt annoyed with the system problems, it at least makes me feel not so thick when I come up against them. Getting X up and running on a white box PC where the video card isn’t on the standard list was something I found difficult and it’s at least a bit of comfort that I’m not alone.

    There’s the standard knowledge that since C code is self-explanatory it doesn’t need comments added. Without those comments on what the code is intended to do, however, it seems to me that people lose track of the intention and write great blocks of code that don’t do what is intended. I wrote a lot of assembler, though, where each subroutine needed to have a description of what it was intended to do (and not do) in order to make it understandable and debuggable. I suspect that the SystemD code loses track of what is intended to happen because of the supposed self-documenting feature of C. Plus of course there are likely a lot of kludges and short-cuts which can get into a race since they aren’t well-documented and it’s just growed.

    These days I don’t want to spend my time getting recalcitrant software to run but I just want the box to work and do what I tell it to (I want to do physics, not systems work). As such I ran Ubuntu for a while (now Lubuntu since Ubuntu got too fat and slow) and probably heading for Mint fairly soon. I also have an Android netbook for when I’m not at the desk, and it’s the most sworn-at system I’ve ever had. It doesn’t do the same things each time I tell it, with the touchpad sometimes moving the pointer down the screen and sometimes moving the screen while leaving the cursor where it is. I can’t see what I’m doing different when it reacts differently. It seems that the system ethos says that all apps that are available will be loaded and run at start-up, then put into sleep until they are used. It’s a poxy system rather than an advance on Linux.

    For a long time now, systems have had the ability to save the current state and hibernate, then on wake-up load the image from disk/chip in a second or two and continue. It seems to me that this image should be the standard boot system, with a fall-back to file-by-file initialisation only when needed after a hardware change. Load the image into memory and you’re up.

    Maybe the root of the OS problem is that the layers of complexity have become too deep (look at Github) for the majority of people to be able to hold a picture of the whole in their mind at one time and thus get things right. I used to find that each subroutine needed to be limited to around a printed page (60 lines or so) in order to be able to avoid bugs. When it gets longer than that, it’s worth splitting it so that each function becomes clear and you don’t lose track of what variables are being modified. Monolithic multi-page code chunks tend to have logic errors somewhere that are hard to find, especially when they’ve been through a few specification-changes and had a number of people change it, maybe re-purposing some variables in the process and not documenting the change.

  5. philjourdan says:

    @p.g. – Microsoft IS the virus (kind of like skynet). ;-)

    Have you tried updating the BIOS (UEFI now I guess)? Had a similar issue and that took care of it.

  6. beng135 says:

    pg, I had trouble bringing my HP Pavilion desktop out of sleep at times — ended up a power supply problem.

  7. @Simon My best experience on an Agile project was where every contract method that was written couldn’t be coded until the Javadoc was first created describing it. Made you think about what you wanted to do. FWIW, they did everything right (pair programming with pairs switching daily to keep the knowledge spread, 3-week iterations, full Scrum adherence – resulting in 1.5+ million lines of code with one to three minor defects generally after releases).

  8. Someone Hiding and Using Bogus Email Address says:

    I’ve had similar issue with a process blocking shutdown on a BSD. in that case it was due to a driver bug, it is not possible to kill processes in the middle of actually-writing-to-hardware, but the driver had a bug so it hanged.
    process was stuck in tstile and unkillable. rest of the shut down operations were not held by one process, it was waiting for this one to complete, and it never did.
    it’s probably possible to force shutdown by command, but it’s not pretty.
    it wants to first complete writes, then sync disk. so you are not syncing to disk and get an unclean shutdown.

  9. pg sharrow says:

    @philjourdan: …Hmmmmmm………. interesting
    Seems to me that the last time I “fixed” it, I forced the BIOS up and reset it. Maybe I need to try that again. That seems to be the point that the reboot happens at. Something in the boot read conflicts with the BIOS setting. The power management seems to be the place that everything points to, It is the place that triggers the cycling…pg

  10. tom0mason says:

    @EM
    We seem to be in sync on OS thoughts at the moment… I was just looking at Puppy again yesterday. I’ve used it before (years ago), it seemed very basic but IIRC, was remarkably robust as I was a very new NOOB back then, and did lots of silly things to it which were easy to fix. It even worked fast on the old IBM Thinkpad 600x (now RIP).
    From there I moved to the other non-systemd distro of PCLinuxOS. A really great OS but not the lightest when it comes to required resources or raw speed. Unfortunately this doesn’t help you because as far as I can see, there is no PCLinuxOS port or support pages for Raspberry Pi.

    Good luck with all of your upgrades.

  11. E.M.Smith says:

    At Best Buy today, I was going to just buy an SD card for a pristine install of Alpine on the Pi DNS server. There was a pack of 5 x 8 GB USB drives ( EMTEC brand ) for $20.

    Having discovered that the 1 core HP Laptop didn’t like to boot from an SD card in the slot it has, but will boot from USB, and remembering that the WiFi router can be a file server if you stick a USB drive into it, and $4 / USB being very cheap… I bought it.

    Consequently my evening was wasted downloading and trying out OSs on it.

    First off, it’s an old 32 bit machine. Seems once again impatient developers are thinking every machine over 2 years old has been burned and buried and so are not making many 32 bit OS builds anymore. In particular, I was going to try PClinuxOS on it, but no, only 64 bit on their download page (and yes, I have old copies with 32 bit, but that Dirty COW bug means they are now junk…) Sigh. OK, I know I’m different for liking to “use things up”…

    Long story short, I’m typing this from a Debian Jessie build on the USB stick. It’s fairly nice in a typically Debian way. The slow nature of the USB does not seem to be much of an issue.

    I’m likely to put a couple more OSs on USB for it (that Void release on an SD card could be freed up, and it’s a nice fast 32 gig card…). The Pi M3 is busy downloading multiple variations of desktop versions for Void as I use this one for posting…

    Making this thing “more than Microsoft” seems to make it much nicer to use… So I’ll likely add it to the Daily Driver rotation with some OS or another. It will also let me test drive some (whatever 32 bit builds exist) candidate OSs for the R.Pi for example.

    Oh, and a Knoppix build with a new kernel complained.. My guess is that it was expecting 64 bit…

    Oh Well.

    Sidebar on How To Piss Off A Customer:

    Best Buy wanted $32.xx, so I handed them $40 cash, being in a bit of a hurry… The clerk asked me some irritating question (frankly I have already forgotten what it was… something like “Do you have our loyalty card?” (No…) “Do you want to sign up for our loyalty card?” (No. I want to pay cash and leave. Point at cash ->$)). Clerk then does something with the register and I’m directed to their POS (Point Of Sale or Piece Of Shit… hard to decide…) credit card terminal: “What kind of receipt do I want?” I say to the clerk “I want to pay cash and leave. Paper would be fine since you need that at the door to show you actually paid for stuff since you don’t give out bags any more” (local ordinance – BYOBag…). No, says the clerk, push the button. I now have to interact with the POS I was trying to avoid by using cash. Paper, eMail, something else… I push “paper”…

    Time Passes…

    “Do you want to be on our…” (No, I want to pay cash and leave.)… Now I need to re-face the POS terminal and find whatever button says “No, I want to pay cash and leave.”… as my readers are at home (not needed to drive), staring at a tiny POS is not making my day… and no I don’t want to give them my email address… I push the ‘go away button’… Time Passes…

    Now some OTHER bit of crap comes up on the POS terminal… I’m not sure what it was as I turned to the clerk and said “I want to pay cash and leave, can you make that go away?”… NO. Push the Red Button… Sigh… “No, I don’t want to swipe a card, wave my phone, etc. etc. Please tell the Manager that if they keep this up I’m not going to bother shopping here any more.” Clerk: “Would you like me to get a manager?” Me: “No, I want to pay cash and leave. Just TELL your manager this is a lousy customer experience.”…

    Time Passes…

    Eventually I get my change back, my paper receipt, and I leave. Overall transaction time WAY too long for someone who just wants to “Pay cash and leave”.

    I’ll not be going back to Best Buy any time soon. If I want to interact that much with a computer screen, I’ll just order from Amazon…

    Sigh. I was inside the “going to just pick up my money, set down the product, and leave” threshold and was contemplating just that when the clerk picked up the money and turned to the register…

  12. Gail Combs says:

    E.M.

    If you want to have a bit of fun with Best Buy and teach them a lesson too…. Have all of us go in and pick out some stuff we actually want. (I am shopping for a new computer) bring it up to the line WITH CASH and when the clerk pulls this intrusive crap we can throw a LOUD hissy fit leave all the stuff and stomp out of the store.

    It is coming up to Christmas so this is a great time to stage this small rebellion against the BORG.

    I did this to a Dodge dealer who refused to fix my BRAND NEW car and cleaned out their entire sales room.

    (Stores count on their customers being polite and putting up with this intrusive crap.)

  13. E.M.Smith says:

    @Gail:

    As Mum was a Brit, I’m prone to “excessive politeness”. I also know that this “crap” is being forced on the local stores (and clerks) by Borg Central. So my method is simply to “vote with my feet”.

    FWIW, I was only in there since it was right next to Walmart and that particular Walmart just can’t get enough folks hired (signs up trying to hire…). I was going to buy the SD card there, but with 4 people in line at the “Photo” counter just to get to the point where I could hand in the “chit” to get the actual device, I decided “Heck with waiting, I’ll just go to BestBuy on my way out”. (That as the clerk said to the person at the front of the line: “I’m new, so I’m going to go ask how to handle this one, I need to find out how to handle the unusual things.” and headed away from the desk…)

    Well, Best Buy was substantially empty. There were about as many “customers” as staff. There was ONE check out clerk on duty and NOBODY was in line…

    So basically a “fit” would get little attention from the nobody there… and the staff would just think me a crank.

    Easier to just “let folks know” what to expect there and then to let their falling sales deliver the Clue Stick. If they are too dumb to “get it”, they will go out of business or be taken over.

    For me, I’ll just be stopping at the OTHER Walmart for SD cards… the one that’s newer and has plenty of staff ;-)

    Shopping choice – a dish “best served cold”…

  14. cdquarles says:

    @EM, at my local Wal-mart, they’re hiring also. The nearest Best-buy is 30 miles away. There are two more Wal-marts between me and that Best-buy. Funny thing about Wal-mart, though, is that if you score “too honest” on their psychology profiles, you will not get a job. That happened to me and my sister. Wal-marts are always busy, even late-night out here in the sticks. That Best-buy was not busy at all.

    I was in the city (Birmingham) earlier on business and stopped by the Best-buy mentioned above. A storm knocked the power out. Their registers were not on emergency power. Ugh. I do have their loyalty card. When I lived in Huntsville, there was one in walking distance of the condo I was living in. That one, and the one in Tuscaloosa were always great places to shop. The one in Tuscaloosa used to be near a Sam’s Club and a Wal-mart, if I am remembering correctly.

  15. philjourdan says:

    BB is the last of a dying breed. The Internet is killing it. I rarely go in myself. I remember one time going in to buy a Blu-ray DVD for my wife. The guy tried hard to upsell me. What he did was lose a sale – not because of his pressure, but because I realized I had no clue about the latest in Blu-rays! So I went home, researched it, and then bought one on line (that is the best buy I have ever made!).

    I do not have to “feel” a DVD player, nor a TV. And so I rarely go into Best Buy unless I need a special cable ASAP (and the local mom and pop shop is not open).

  16. philjourdan says:

    @CD – Re: Hiring at Walmart – while I have not applied there (and do not plan to unless my retirement goes bad), I recently hit a milestone nearing retirement. As such, when my cousin asked if I could be a “Greeter” at his mother’s Mass of the Resurrection, I eagerly said yes. I told him I am now qualified to be a Walmart Greeter, so would be happy to pick up some experience!

  17. Pingback: More SystemD Bites On the A…? | Musings from the Chiefio
