Not exactly a bite on the butt, but an unexpected behaviour in any case.
So I’ve moved to Devuan to avoid “issues” with SystemD.
Why am I reporting an issue with SystemD?
I decided to make a backup copy of my working Devuan chip as my “Daily Driver”, having turned my prior Daily Driver into the Headend chip… so I need to clone that one and ‘backout’ some of the cluster specific things. Isolating the two sets of information. To do that, it can’t be running. Which means something else must be running… which was my old Arch Daily Driver as the fast system of choice.
OK, preparatory to the Chip archive / restore, I decided to clean up the disk space a little. I regularly have done this for years on all sorts of systems using a scriptlette I call DU.
Only the first line is active, the others are left in so you can see some of the evolution over time:
du -BMB -s * .[a-z,A-Z]* | sort -rn > 1DU_`date +%Y%b%d` &
#du -ks * .[a-z]* .[A-Z]* | sort -rn > 1DU_`date +%Y%b%d` &
#du -ks * | sort -rn > 1DU_`date +%Y%b%d%H%M%S` &
The -B option sets the block size to count in, and MB means millions of bytes in powers of 10 (so no need to deal with powers of 1024…). The -s gives one summary total per argument. The .[a-z,A-Z]* pattern is there so as to not miss files in the current working directory starting with a ‘.’, which usually means “don’t display and don’t let * grab them either”. Then sort it in reverse numeric order (so big is on top and 10 sorts before 2 instead of with 1…) and stuff it in a file named 1DU_{today’s date}, the ‘1’ causing it to sort to the head of most “ls” listings.
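For anyone wanting to try the same thing, here is a minimal sketch of the scriptlet wrapped up as a shell function. It assumes GNU du (for -B). The name du_report is made up for illustration, and the .[!.]* glob is swapped in for the original .[a-z,A-Z]* pattern so that dot files starting with a digit or underscore get counted too.

```shell
#!/bin/sh
# Sketch only: assumes GNU du (the -B option) and a POSIX shell.
# du_report is a hypothetical name, not part of the original scriptlet.
du_report() {
    dir=${1:-.}
    out="1DU_$(date +%Y%b%d)"
    (
        cd "$dir" || exit 1
        # "*" matches the visible names; ".[!.]*" matches dot files while
        # skipping "." and ".." (and, as a side effect, any name that
        # starts with two dots). A pattern that matches nothing is passed
        # through literally, so du's complaint about it is discarded.
        du -BMB -s -- * .[!.]* 2>/dev/null | sort -rn > "$out"
    ) || return 1
    # Print where the report went so the caller can find it.
    printf '%s\n' "$dir/$out"
}
```

Called as "du_report /some/dir &" it behaves much like the original: biggest entries on top, in a file that sorts to the head of the listing.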
Note that the & at the end launches it as a ‘background job’. A CTL-C will not kill it; you must note the process ID (PID) at launch and do a “kill -HUP PID” in a working terminal window to stop it. This, it seems, may have mattered…
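As a sketch of that PID bookkeeping: the shell keeps the PID of the most recent background job in $!, so it can be saved at launch instead of read off the screen. The "sleep 10" below is only a stand-in for the long-running du, and the variable names are illustrative.

```shell
#!/bin/sh
# Stand-in for the long-running background job (assumption: any command works here).
sleep 10 &
pid=$!                    # $! holds the PID of the most recent background job
echo "launched background job with PID $pid"

# Later, from any shell that knows the PID:
kill -HUP "$pid"

# Collect the exit status. A job ended by SIGHUP normally reports
# 128 + 1 = 129, unless HUP was being ignored (e.g. under nohup).
status=0
wait "$pid" || status=$?
echo "job $pid finished with status $status"
```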
Now I’ve done this for years. On all sorts of systems. Big ones. Little ones. TByte disks. Fast and slow. It bogs down any given disk but not the system. Ever. Disk I/O is always glacial compared to the system.
But not this time…
I launched it on the Pi, against 3 disks. 2 x 1 TB ext4, and 1 x 300 MB ext4 that also had a swap partition on it. A real disk swap partition makes your SD chip last longer and makes the system faster.
Well, seems all that disk I/O bogs down the Pi integrated I/O chip (used for all things USB and networking). O.K., I can live with that…
Then the system seemed to hang. No screen updates. No cursor movement. No nothin’. Had we another systemD ‘hang the system’? Perhaps, I thought.
So I looked over at the ‘top’ panel. It wasn’t updating. It showed the DU / du process in D Diskwait status. OK, to be expected. Scanning down, systemd was also D diskwait. As was the kswapd daemon.
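When a terminal does still respond, the same “who is in D state” view can be pulled without top. A minimal sketch, assuming a procps-style ps that accepts -eo pid,stat,comm; the function names are made up for illustration.

```shell
#!/bin/sh
# filter_diskwait: keep the header line plus any row whose STAT column
# starts with D (uninterruptible sleep, usually waiting on disk I/O).
filter_diskwait() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# list_diskwait: apply the filter to the live process table.
# Assumes a procps-style ps; other ps variants may need different options.
list_diskwait() {
    ps -eo pid,stat,comm | filter_diskwait
}
```

Running list_diskwait during a heavy du would be expected to show du itself, and anything else stuck behind the same disk.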
I have no screencapture of this as I couldn’t get a terminal window to respond to type ‘scrot’.
Well, were we in a lockout? SystemD unable to swap due to diskwait and diskwait unable to release due to lack of swapping for systemD? Who knows – I think. So off to get a cuppa’, think about it, and watch a bit of news…
Coming back about 10 minutes later, systemD still D diskwait. About 20 minutes later, it was not on the page of processes in top… Clicking in windows, hitting return, etc. still gave me nothing. Watching VERY closely, after a while, I could see a slow line of bits being re-written down the ‘top’ panel. It WAS running, just very very slowly.
I went off for lunch and more news…
Eventually, a few hours later, the machine is running normally again.
My Surmise
And here we get a leap off the cliff of conclusion: I surmise that the way SystemD is written is broken in that it is dependent on swap. Either it is so big, it isn’t memory locked, or it is so chatty it must have responsive disk. For “The Unix Way”, certain critical functions are memory locked and can not swap out, and typically don’t need disk to run. Some of those functions, I would speculate, are now in systemD and it is not being locked into memory (or writes into so many files and touches so much stuff it can’t effectively run without rapid disk access nearly instantly).
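One way to poke at this surmise, as a sketch: on Linux, /proc/PID/status carries a VmLck line giving how much of that process’s memory is locked and so cannot be swapped out. The helper name below is hypothetical. A value of 0 for PID 1 would only say that init’s memory is not locked; it would not by itself prove that this caused the hang.

```shell
#!/bin/sh
# locked_kb_from_status: print the VmLck value (in kB) from a status file
# in /proc/PID/status format. Hypothetical helper, for illustration only.
locked_kb_from_status() {
    awk '/^VmLck:/ { print $2 }' "$1"
}

# Example (Linux only): how much of PID 1's memory is locked?
#   locked_kb_from_status /proc/1/status
```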
Whichever way this falls out, it would imply that if swap is placed on very active disks, you will sporadically lock out systemD functions and that will then cause your system to run like molasses in January or potentially hang altogether (though honestly, had I not patience for hours and a keen eye and curiosity, I’d have assumed my system was hard hung in the first 3 minutes of “unresponsive nothing happening”.)
All those decades of system tuning in Unix / Linux land to find just what must be locked into memory and NOT dependent on disk I/O or swap tossed out in the rush to make SystemD the Swiss Army Knife of system functions… so now heavily loaded disk can cause your system to freeze up for the duration of the disk saturation.
OK, that’s my surmise. In any case, it was interesting. Once I’ve got my Daily Driver Devuan chip made, I’ll reboot the same hardware / disks and run the same command and see if it too locks up, or just has slow disk…
Sorry E.M. you are being hemispherist again. Here in Aussie land the viscosity of molasses is strongly influenced by the 35 C temperature in January. I know because I feed it to my chooks. Sorry, I know all the World loves a smart arse. :)
Your diagnosis of your issue sounds right to me. When I did it for a living, I always put swap space on the least used/fastest drive. Do you remember drums?
@Andysaurus:
Are you sure I’m not being Canio-polar Inverted? Head up my, er, south polar? ;-)
That’s the problem with old aphorisms you learned 1/2 a century ago used in a global internet world…
Drums? Yup. Had ’em at Amdahl. They were “special” and not everyone was allowed to use them…
In this case I have a 500 GB Seagate Hard Disk that is divided into something like 8 partitions.
/etc/fstab and not showing the 4th partition:
As you can see, 4 of them are used for particular OS images, so generally idle unless I’m playing with that particular release. Gentoo, Slackware, Linux From Scratch, Debian. Then there is a general purpose ext4 partition. I saved the contents of the disk ‘as shipped’ in a shrunken ntfs partition that is also generally idle.
So unless I’m fishing about in some archives in /SG/ext or running one of the other OSs (that I’m not at the moment), it is just swap.
That’s part of what got me. Launching a general “count the size of files” that only hits the inodes not the data blocks, over 3 disks, ought to only cause very sluggish I/O, not cause your system to seemingly lock up and have fully memory resident “top” and X display unable to display…
Basically it isn’t just the program waiting for swap that gets hung, EVERYTHING gets hung waiting for systemd who is waiting for swap… since systemd has fingers in every pie…
But we’ll see if that’s true for sure in about an hour when I’m ready to try it under Devuan or Wheezy…
Well, that pretty much proves it…
Running Devuan, first off, didn’t want to swap until I opened draw, calc, and image viewer along with GIMP and a browser with about a dozen tabs. FINALLY got swap up to 13084 blocks used and swapping, with 3 very large disks being DU’d (one of them the swap disk) and now I’m typing this message as it all is going on.
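For watching swap fill up during a test like this, a small sketch: on Linux, /proc/meminfo reports SwapTotal and SwapFree in kB, and the difference is the swap in use. The function name is made up for illustration.

```shell
#!/bin/sh
# swap_used_kb: print swap in use (kB) as SwapTotal minus SwapFree.
# Reads /proc/meminfo by default; a file in the same format can be
# passed as the first argument.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "${1:-/proc/meminfo}"
}
```

Run in a loop with a sleep between calls, it gives a rough trace of when the system actually starts swapping.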
Now THIS is the *Nix behaviour I expect (and remember). Those disk hog programs just chewing there way through the disks, not upsetting operations at all…
SystemD bites the big one, A-gain…
Just amazing, really. Someone launches some disk intensive activity and it can lock up the system until disk congestion abates… which can take a while as it is running sloooowly…
Well, with that, I’m back to closing all those applications that are in the way… but not doing anything bad as they are swapped out…
Any chance it is caching the result of the count and that is going into swap? Could explain the symptoms maybe?
By the way, I am quite careful about the language of my birth, and I am always impressed with your use of it. You got the wrong homonym in the following:
Sorry.
andysaurus says: “Sorry E.M. you are being hemispherist again. Here in Aussie land the viscosity of molasses is strongly influenced…”
It is the Eve before the anniversary of the Great Molasses Flood.
Great Molasses Flood of 1919: Why This Deluge of Goo Was So Deadly
Great post Gail, what an amazing coincidence.
Hi Chief. If putting all the various init files into systemD was such a good idea why not separate it into 3 system D files, D1 D2 D3. Where the D1 has all the memory lock applications. Just saying haha.
@Pearce:
IMHO none of it is a very good idea. But yes, having it only locking up all the time in 1/3 of the processes would be a clear improvement…
@Andysaurus:
Any use of file systems drives increases in memory use for “buffers” (pointers to stuff) and “cache” (the stuff itself). *Nix holds this dear as it is likely to be used again. So when it’s holding too much, it can drop the old or just swap a chunk out. (“swappiness” is a kernel tunable, vm.swappiness, that runs from “just don’t swap” at one end to “swap most and only dump if forced” at the other, with all points in between.) So “no” you can’t do what you proposed, though there is likely a ‘fix’ in forcing it to just not swap. But then you lose the great benefit of swap space to memory management…
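A quick sketch for looking at that setting: on Linux the current swappiness value is exposed in /proc/sys/vm/swappiness and can be changed at run time by root with sysctl. The function name is illustrative, and the value 10 in the comment is just an example, not a recommendation.

```shell
#!/bin/sh
# read_swappiness: print the current swappiness value.
# Reads /proc/sys/vm/swappiness by default; a file path can be passed
# as the first argument instead.
read_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Changing it needs root and lasts until reboot, for example:
#   sysctl vm.swappiness=10
```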
I often have an “issue” with there their they’re etc. etc. Being 1/2 deaf has made it worse, and then layering on a few more languages broke it altogether. When typing, I’m often going so fast I’m not thinking about spelling at all. Just the sounds in the mind and the fingers fly. Unfortunately, Otto (Otto-Pilot…) isn’t as good at diction as the parts above the brain stem. Usually I catch them when I proof read and those parts of the brain are involved. Sometimes not (worse now that vision is also suggesting glasses a good thing, but I’m often not using them…).
I first noticed it getting “worse” after being stone deaf for several months (prior to the eardrum reconstruction). I started having a ‘deaf accent’ both in speech and in writing. Leaving out all sorts of letters that no longer existed for me. After a few months of training myself to ‘speak tactilely’ I got my accent back to clean. Yet typing things you never hear stayed an issue. After hearing recovered to “only” 30 dB down (worse on high notes like some female voices) some of the problems with speaking and typing cleared too, but not all.
Note that NONE of this indicates any failure to understand the differences, failure to know the differences, failure to choose correctly. It’s just one big “going fast transcription error”.
Similarly, I’ll try to type “buy” and it comes out alternately “by” or “but”. Why? I type them more often and Otto “goes there”. Oh Well.
Don’t dare ask me to spell appartement apartemant appartmente whatever… French does it one way, English another, then Spanish and… really, what difference does it make? Olde English is one of my favorite languages too, so colours or is that colors? things…
In short, I refer you to my English and German teacher, Mark Twain…
http://www.mantex.co.uk/2009/10/26/spelling-reform/
Oddly, I can read that just fine, even on my first pass through years ago…
https://www.cs.utah.edu/~gback/awfgrmlg.html
Expresses my sentiments toward German well too (to two tu zu du 2 the echo goes off…)
So (sew soo zo) forgibt me mine pecadillows mit der Englash Ich habe spract it in seberal forms for yars und yars und somtimish I vorget yust vich von ist wat. Dat itsh only 1 per few hunert tousand voids I type in airor, ist un micracle.
The straight (strait strate) jacket of “approval” need not apply…
I have the same stream of consciousness transcription errors problem when typing. Occasionally I actually notice as it happens, like I am thinking “know” and what ends up on the screen is “now”. Some of that is due to different keyboards I use, and failure to fully stroke certain keys in certain key combinations; others are just muscle memory capture taking over.
I read a lot, was a bookworm, and had a huge spoken vocabulary, but never heard many of the words spoken. I was never very good at spelling in school; I have difficulty hearing (even as a young kid) the very subtle differences in vowel sounds. I grew up in a period when they were actively dropping all those handy memory riddles (I before E except after C, etc.); we were just expected to memorize long lists of word spellings (which drove me nuts and pissed me off as a huge waste of time and effort; I know where the dictionary is if I need to spell a word I rarely use).
I don’t know how many times I have sat at the keyboard trying to get close enough to the proper spelling of a word I want to use that spell check will show me the right spelling, or that I can find it in a dictionary (where I will recognize it instantly), but for the life of me I cannot recall how to spell the word from memory.
I hear the word in my mind but do not visualize it, and often literally cannot spell a word I use all the time in spoken exchanges. Sometimes I give up in disgust and use a synonym (often having to restructure the sentence to make it sound right). Other times I do a google search for a short phrase using the word I want to use (as close as I can get to it).
The Google search engine is very good at figuring out from context exactly what word I am trying to use.
Then once I see it properly spelled in the suggestion I realize I tried just about every possible combination of the letters except the right one. It is just a “how the brain works” thing, I guess, like how I can remember and recognize people I met years and years ago but have absolutely no clue what their names are.
I think I fit the 17th century English approach with no standard spelling for words, and you just take your best stab at the wording and spelling and as long as other people understand you, that is good enough. I frankly don’t give a crap if I affect someone or effect someone; in fact for years I thought affect was a misspelling of effect. (I must have been out of school the day they covered those words.)
Thank you for your comprehensive response to my minor cavil, E.M. If that is what is meant by ‘turning the other cheek’ then I can assure you it works. My typing issues are caused by my name. I find it hard to type the word “and” without adding a y to the end of it. My surname is Boyd, so I have a similar issue with the word “boy”.