Raspberry Pi / Devuan with OverlayFS Write Protection

Introduction

This ought to work equally well on Debian, Ubuntu and many other OS releases.

The purpose is to lock down the system against file changes by system crackers, intrusive scripts, and other programs or web sites, while providing a user account for web browsing that is also relatively immune from cookies, beacons, trackers and any other sort of intrusive “hack”.

We will be moving the various system directories (and all their programs and configuration files) into read only compressed form in a squashfs file system. Then will provide a writable “overlay” that is on a temporary file system (using tmpfs) that will simply evaporate when you shut down or reboot. Restoring the system to the original state.

You can still use a regular account in the regular way and even use the base system as a typical writable file system. You issue a command to move to the read-only versions when you are ready to lock things down. They system then stays that way until rebooted.

Essentially this makes your Raspberry Pi a lot like a “live CD” system, but without using a CD drive, a USB thumb drive or needing to boot a special operating system.

Differences

I’d already done this on the Odroid XU4 running Armbian. It uses aufs for the union or overlay file system. The Pi supports using overlayfs instead. There are differences between the two. Anyone wishing to see the prior example, that is a bit more scattered as I was learning while writing, can look here:
https://chiefio.wordpress.com/2019/08/13/mounting-read-only-compressed-system-spaces/

Well this has been sort of fun, I guess. The “syntax” of overlayFS is not as, um, “predictable” as one might like. Not too horrible, but just strange. In particular, one must have a “work directory” that can’t be in the same subtree as the mounted file system but must be under the same mount point…

In practice what that means is that say you wanted a tmpfs filesystem to overly /home so you named it /tmpfs/home (as we sort of did for aufs) and then try to mount that as the overlay file system you will get errors if you use either /tmpfs/work or /tmpfs/home/work as your “workdir” directory. The /tmpfs/work isn’t inside (under) the mountpoint of /tmpfs/home and /tmpfs/home/work is not in a different subtree from /tmpfs/home.

The fix, of course, is to add another level of names to the whole shebang and make all your stuff longer and more complex.
“Why? Don’t ask way. Down that path lies insanity and ruin. -E.M.Smith”

So just ignoring “why” and moving on… I’ve decided to use ./tmpfs/home/fs and ./tmpfs/home/wrk as the file system proper and the work directory naming convention.

This also means that for newly created tmpfs file systems you must ALWAYS create those directories. Adding another step to the mix. So far I’m liking the way aufs works better as it is simpler and more intuitive. But on Devuan on the Pi your choice is overlayfs unless you want to build your own kernel (near as I can tell – attempts to install aufs complain about needing kernel headers.)

So just accept that it will be different, a bit kludgy, and you will NOT ask why…

Make A Pristine USER ID

As the goal is to have a pristine base user id that gets squashed into a read only squashfs and has a volatile writable overlay file system that takes all the trackers, beacons, cookies, etc. with it when you reboot, you start with making it.

root@PiM3Devuan2:/sq# adduser ems
Adding user `ems' ...
Adding new group `ems' (1000) ...
Adding new user `ems' (1000) with group `ems' ...
Creating home directory `/home/ems' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for ems
Enter the new value, or press ENTER for the default
	Full Name []: E.M.Smith
	Room Number []: 666
	Work Phone []: 6666
	Home Phone []: 666-6666
	Other []: 
Is the information correct? [Y/n] y

At this point I strongly recommend that you login to the account and set any things you like to have a given way. Browser security and privacy settings. Background colors and images. Custom tools you use. Etc. Once you squash it, you will be stuck with exactly and only that stuff until you do log in native, change it, and re-squash it.

I won’t show those customization steps here as they will be different for each person.

Putting Squashed Images and Mountpoints Somewhere

It’s always good to have a system to your names. Feel free to create your own. Here’s what I chose to use. There isn’t much magic in these choices so if you con’t like them, choose something else.

I chose to use a disk partition named /sq short for “squash” as my place to put things. Here’s what’s in it:

root@PiM3Devuan2:/sq# ls /sq
lost+found  mnt  sq  tmpfs 

I’m using ext3 since both ext4 and xfs have had surprise incompatible uprgades that made moving disks between different OS levels a PITA. Thus you get the “lost+found” directory. I put squashfs file system images in /sq/sq and the temporary tmpfs file systems show up in /sq/tmpfs while any other odd mounts go into mnt. So when a squashfs image from /sq/sq gets mounted, it goes in /sq/mnt.

These tmpfs directories are made by the mounting script when the overlay is being done, but it’s worth seeing them:

root@PiM3Devuan2:/# ls /sq/tmpfs/*
/sq/tmpfs/home:

/sq/tmpfs/root:
fs  wrk

/sq/tmpfs/var:
fs  wrk

Here you can see I’ve hand made three directories in /sq/tmpfs for home, root, and var. For /home, we’ll do that one down below as an example.

For /root (the home directory of the superuser) I’ve already done it as the test case. In the process, the script created the sub-directories of “fs”, for the actual tmpfs file system space, and “wrk”for the overlayfs workdir. You will see that in the “protect” script below. The script that launches all this I’ve chosen to call “protect” as that’s what it does ;-)

That’s most of the name space. There are also some mount points for mounting squashfs images in /sq/mnt:

root@PiM3Devuan2:/# ls /sq/mnt
bin  boot  home  lib  root  sbin  var

So doing a “mkdir /sq/mnt/{foo}” for each of them is needed at some point.

Squashing Thngs

Some directories and file systems are just squashed and mounted read-only. They don’t need a writable overlay. Things like /bin /lib /usr /sbin where various system programs live. You just don’t want them to change, even temporarily with an “overlay”. So don’t.

Here’s the ones I’ve made so far in the /sq partition:

root@PiM3Devuan2:/# ls /sq/sq
bin.sqsh  boot.sqsh  home.sqsh	lib.sqsh  root.sqsh  sbin.sqsh	usr.sqsh  var.sqsh

If you change things in your home directory or in, say, /usr by doing a system update, you need to create a new squashed image. So let’s squash that home directory area again and show what it looks like:

root@PiM3Devuan2:/# ls home
chiefio  ems
root@PiM3Devuan2:/# ls -a home/ems
.  ..  .bash_logout  .bashrc  .profile

Pretty bare bones. And it goes fast:

root@PiM3Devuan2:/# mksquashfs ./home /sq/sq/home.sqsh
Parallel mksquashfs: Using 4 processors
Creating 4.0 filesystem on /sq/sq/home.sqsh, block size 131072.
[===================================================================|] 3/3 100%

Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
	compressed data, compressed metadata, compressed fragments, compressed xattrs
	duplicates are removed
Filesystem size 2.22 Kbytes (0.00 Mbytes)
	46.33% of uncompressed filesystem size (4.79 Kbytes)
Inode table size 96 bytes (0.09 Kbytes)
	49.48% of uncompressed inode table size (194 bytes)
Directory table size 80 bytes (0.08 Kbytes)
	77.67% of uncompressed directory table size (103 bytes)
Number of duplicate files found 0
Number of inodes 6
Number of files 3
Number of fragments 1
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 3
Number of ids (unique uids + gids) 2
Number of uids 2
	root (0)
	ems (1000)
Number of gids 2
	root (0)
	ems (1000)

That makes the squashfs home.sqsh file system in a blob of bits.

That same thing is done for every directory / filesystem that you wish to make read only as a squashfs mount and for everyone where you wish to do that but with a tmpfs “overlay” that is writable (like your home directory and the root home directory and /var )

Mounting It all In Place

Here’s the script named “protect” that ties all those squasfs and tempfs bits together.

The first few lines create the tmpfs file systems (compressed in memory but can roll out to swap space if needed) and creates the necessary directories in them.

Then there are a series of “mount” commands. Some of these, like /bin /lib and /sbin, just overlay the read/write versions on the u-SD card with read-only squashfs versions. Effectively protecting them from change while also reducing wear on the SD card.

Then there are some mounts to /sq/mnt for root.sqsh, var.sqsh, and home.sqsh that are all directories that need a writable overlay file system.

Finally, there are three “overlay” file system mounts that integrate the squashfs mounted stuff with a tmpfs writable overlay. This lets programs that demand to write to disk feel like they did so, while it also lets you toss it all away when done.

root@PiM3Devuan2:/# bcat protect
mount -t tmpfs -o size=20M,mode=0700  tmpfs /sq/tmpfs/root
mkdir /sq/tmpfs/root/fs /sq/tmpfs/root/wrk

mount -t tmpfs -o size=200M,mode=0755 tmpfs /sq/tmpfs/var
mkdir /sq/tmpfs/var/fs /sq/tmpfs/var/wrk

mount -t tmpfs -o size=100M,mode=0755 tmpfs /sq/tmpfs/home
mkdir /sq/tmpfs/home/fs /sq/tmpfs/home/wrk

mount -o loop -t squashfs /sq/sq/root.sqsh  /sq/mnt/root
mount -o loop -t squashfs /sq/sq/var.sqsh   /sq/mnt/var
mount -o loop -t squashfs /sq/sq/home.sqsh  /sq/mnt/home

mount -o loop -t squashfs /sq/sq/bin.sqsh  /bin
mount -o loop -t squashfs /sq/sq/sbin.sqsh /sbin
mount -o loop -t squashfs /sq/sq/lib.sqsh  /lib
mount -o loop -t squashfs /sq/sq/usr.sqsh  /usr

mount -t overlay -o lowerdir=/sq/mnt/root,upperdir=/sq/tmpfs/root/fs,workdir=/sq/tmpfs/root/wrk overlay /root

mount -t overlay -o lowerdir=/sq/mnt/var,upperdir=/sq/tmpfs/var/fs,workdir=/sq/tmpfs/var/wrk overlay /var

mount -t overlay -o lowerdir=/sq/mnt/home,upperdir=/sq/tmpfs/home/fs,workdir=/sq/tmpfs/home/wrk overlay /home

So how does it look when running?

I log in and su to root, here’s the before df:

root@PiM3Devuan2:/# df
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/root        4062912   2732164   1121036  71% /
devtmpfs          438328         0    438328   0% /dev
tmpfs              88764      1712     87052   2% /run
tmpfs               5120         4      5116   1% /run/lock
tmpfs            2068640         0   2068640   0% /run/shm
/dev/mmcblk0p1    130798     35754     95044  28% /boot
tmpfs            2097152         4   2097148   1% /tmp
/dev/sdb10       4062912   1687044   2166156  44% /var
/dev/sdb11       4062912   2204196   1649004  58% /usr/lib
[...]
/dev/sdc5        6127168   2965348   2847248  52% /sq
tmpfs             443800         0    443800   0% /sys/fs/cgroup
tmpfs              88760         8     88752   1% /run/user/1616

Notice that /tmp is also on a tmpfs file system (of 2 GB). As the Pi only has 1 GB of memory and with a browser open runs a few hundred MB to swap you will need to make sure you have a swap partition somewhere. I use a real USB disk for this so have an entry in /etc/fstab for it:

LABEL=WDTB_swap		swap		swap	sw,pri=256,defaults	0 0
LABEL=WDTB_ext3		/WDTB/ext	ext3	defaults,auto	0 2
LABEL=WDTB_sq		/sq		ext3	auto,defaults	0 2
LABEL=WDTB_swap2	swap		swap	sw,pri=32,defaults	0 0

This is the disk I’m using for the MariaDB database of GHCN temperature records, so will always be on this system. I’ve chosen to put the /sq partition on it, and to make 2 swap partitions at each end of the disk, with priority to the one nearest to the squashfs stuff. That way head seeks ought to be minimized much of the time, but there is a lot of swap space if needed. 4 GB worth ;-)

Then I run the “protect” command, and here’s the after:

root@PiM3Devuan2:/# protect
root@PiM3Devuan2:/# df
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/root        4062912   2732164   1121036  71% /
devtmpfs          438328         0    438328   0% /dev
tmpfs              88764      1728     87036   2% /run
tmpfs               5120         4      5116   1% /run/lock
tmpfs            2068640         0   2068640   0% /run/shm
/dev/mmcblk0p1    130798     35754     95044  28% /boot
tmpfs            2097152        12   2097140   1% /tmp
overlay           204800         0    204800   0% /var
/dev/loop6       1318016   1318016         0 100% /usr
/dev/sdc3      950949916 593002744 309635060  66% /WDTB/ext
/dev/sdc5        6127168   2965348   2847248  52% /sq
tmpfs             443800         0    443800   0% /sys/fs/cgroup
tmpfs              88760         8     88752   1% /run/user/1616
tmpfs              20480         0     20480   0% /sq/tmpfs/root
tmpfs             204800         0    204800   0% /sq/tmpfs/var
tmpfs             102400         0    102400   0% /sq/tmpfs/home
/dev/loop0         29440     29440         0 100% /sq/mnt/root
/dev/loop1       1537280   1537280         0 100% /sq/mnt/var
/dev/loop2           128       128         0 100% /sq/mnt/home
/dev/loop3          6016      6016         0 100% /bin
/dev/loop4          3840      3840         0 100% /sbin
/dev/loop5         34432     34432         0 100% /lib
overlay            20480         0     20480   0% /root
overlay           102400         0    102400   0% /home

Now you can see (at the bottom) how the “overlay” file systems are mounted on /root and /home (with /var up near the top) while the “/dev/loopX” mounts for /lib, /bin, /usr and /sbin are making those read only from the squashfs file system blobs.

I still have a few more bits to do, but it’s all working as it ought. In particular, during this development, /etc changes a lot (/etc/fstab as I change what is mounted, for example) so I need to do that one as well. Then there is a question of “do /boot or not”. Maybe just remount it as “ro” read only. But this is now all substantially locked down.

I can now login as “ems” and I’ll be in an overlay file system with my generic home directory as the base but with a writeable temporary overlay:

root@PiM3Devuan2:/# su ems
ems@PiM3Devuan2:/$ cd
ems@PiM3Devuan2:~$ pwd
/home/ems
ems@PiM3Devuan2:~$ df .
Filesystem     1K-blocks  Used Available Use% Mounted on
overlay           102400     0    102400   0% /home
ems@PiM3Devuan2:~$ ls
ems@PiM3Devuan2:~$ touch junk
ems@PiM3Devuan2:~$ ls
junk

So that’s how you do it!

Subscribe to feed

Advertisements

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons' are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in Tech Bits and tagged , , , , , . Bookmark the permalink.

25 Responses to Raspberry Pi / Devuan with OverlayFS Write Protection

  1. Nobody says:

    This one is useful. Will PDF it.

    I’ve used Live CDs and Windows with rollback software (school libraries) for years. Never did understand why more products weren’t read only or WORM (shaking head)

    This is my backup server: https://kb.syncplify.me/how-to-protect-your-backups-from-cryptolocker-ransomware/

    Yup, all Linux machines backup to Windows – hahahahaha.

    Always felt ridiculous I never could figure out a way to use a Linux FTP server as a WORM storage device :(

    Now, how about giving us the straight dope on how to setup bulletproof Zeronet while protecting ourselves from craps. Don’t wanna use or maintain the damn thing – just let it run as my middle finger to censorship.

    Have operated a Tor relay for more than 10 years for the same reasons.

  2. E.M.Smith says:

    It makes administration a little more inconvenient to use WORM or RO. For example, I have a /usr/local/bin where I make little scripts more or less on the fly while doing admin. Once that is read only, I can’t do that without a bit new re-squash and mount process.

    Similarly, doing and “update” and “upgrade” or and “apt-get install” have a precursor step of make things writable again and then re-make your WORM or RO version.

    Part of why I like this approach is that the base systems stays comfortable like that. So when working away from hazard, not on the internet, I can just leave it open.

    Then type “protect” and I’m “good to do” on surfing random spots and not being worried about what I pick up. Then a reboot and I’m back to normal.

    So only when I want to install some new software or change some config on the protected version do I need to worry about making a new squashed copy. Even then, it’s just “do what I want to do” and issue the squash command(s) again. All the rest of the process stays as is.

    The only bit that’s a bit tricky is that once you mount a squashfs of /etc by default it becomes hard to stop it as every boot bangs you into it based on config info is the base version. So you must interrupt the boot to get to the base version of things like /etc/fstab to fix it. Thus my having a “protect” script to start things in that mode instead of just a default boot that way.

    This isn’t a perfect protection. There are still attack vectors. Swap is not encrypted (yet…) and even though it is just random chunks of files and in compressed form as compressed tmpfs bits; there is a small potential for data leakage. But that’s for phase 2 ;-)

    Also things like device files and /boot not locked down.

    But, IMHO, it takes out the majority of the “crap” and “issues”. Especially just all the day to day tracker / beacon / cookie stuff. Then good luck to some virus trying to encrypt your “disk” and lock you out. It can’t mess with the top layer and the bottom layer is hidden. For writable user space, I tend to put copies offline too.

    I do feel a whole lot more comfortable about doing things like blog administration and writing articles where I’m browsing to all sorts of random places chasing stuff.

  3. E.M.Smith says:

    I’ve done the “disk shuffle” and moved /sq to a newly made and larger partition. Now /var and /usr are also protected. I’ve updated the posting accordingly.

    It all seems to be working just fine.

    For this particular system, my regular account home directory is on a hard disk partition that is not in the /home squashed partition. This way I can login as “just regular me” even when the system is locked down and have my regular writable space; OR I can login as “ems” in the protected space. This avoids needing to shut down as I change roles.

    The “downside” is simply that if I want that regular home directory space “protected” I need to manually unmount it or do a remount as ‘ro’ read only prior to switching over to ’ems’. A small penalty for the otherwise increased flexibility.

    At this point I’m basically done changing things in this posting. I still have /etc to convert, but it is a simple thing largely the same as the others, so no need to put it in the posting. It is just make a squashfs of it and add the mount command.

    FWIW, in use I don’t notice any performance impact. I’ve got a full browser open (chromium) with a dozen or two tabs, then a couple of terminal windows, and it’s at 177 MB on swap (typical with the browser open and lots of tabs) and a bit under 20% CPU load when I’m not typing (also typical). So it looks like this is a performance impact free action.

  4. E.M.Smith says:

    Oh, and one other minor note:

    Now that this is all set up, I’ve made the /sq partition mount to be ‘read only’. This puts one more barrier between someone trying to change things and my stable system blobs.

    ems@PiM3Devuan2:~$ grep WDTB /etc/fstab
    #  <<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>
    LABEL=WDTB_swap		swap		swap	sw,pri=256,defaults	0 0
    LABEL=WDTB_ext3		/WDTB/ext	ext3	defaults,auto	0 2
    #LABEL=WDTB_sq		/sq		ext3	auto,defaults	0 2
    LABEL=WDTB_sq		/sq		ext3	ro,auto,defaults	0 2
    LABEL=WDTB_swap2	swap		swap	sw,pri=32,defaults	0 0
    

    Notice that I just “commented out” the rw default line and added one with ‘ro’ in it.

    This lets me swap back easily just by changing what line is commented out. So I can “become root” and change /etc/fstab, then reboot, and then I can change the squashfs blobs with updates.

    Where this becomes complicated is when you use /etc as a squashfs file system too. In that case, you must change the /etc/fstab before you mount the squashfs version… So using a “protect” command to launch all this means there’s a non-protected window when you can do system admin work like that without “exotic” bypass of the process needed (like boot to maintenance mode).

    As a test:

    ems@PiM3Devuan2:~$ touch /sq/junk
    touch: cannot touch '/sq/junk': Read-only file system
    

    So at this point, the /sq/sq/*.sqsh images are all on a write protected partition, so can’t be changed. They are mounted as squashfs partitions over the regular system spaces, thus hiding the regular ones from change and being read-only themselves, preventing change. As they are “in use” they can’t be unmounted by an attacker. My login account has a tmpfs write-able overlay in front of it so it acts normally, but any “junk” it picks up is toss in the trash as soon as I reboot.

    That’s a pretty tight lock down. Not impossible to crack, but it won’t be either fast, nor easy. Also, with an htop panel open to monitor systems activity, anyone trying to crack it will tend to show up as significant system use / load.

    FWIW, the attack vector I would use is to try putting another different set of system file on any write-able disk, and then mount them over the squashfs versions. I’m not sure that would work (though I think it ought to be possible to layer Yet Another Mount on top); but the amount of disk activity needed to get that much data onto the system is going to be noticed. Then there’s a very simple fix: Just don’t have that big writable space mounted. I have /WDTB/ext mounted for the database, but that’s just a convenience for me. It would be just as possible to boot up with it not mounted. Then there’s not a lot of space to put those filesystems for the overlay of the overlay…

    Oh, and you have to already be root to make that happen anyway, so there’s a big chicken and egg issue… must be root to assure being root. So some other exotic hack would need to be done to “get root” and then you can make it easier to keep with the added overlay activity.

    Someone getting all that done without causing a noticeable rise in CPU activity in htop, without causing red D Disk Wait states in htop, and without the performance of the system becoming a bit sucky (it IS only a R. Pi after all…) approaches zero probability quite closely ;-)

    FWIW, I’m posting this from the “ems” disposable account with “protect” active. Working great. Only issue so far is that in Chromium, when I was cusomizing things, I forgot to make the default font size larger. So, ok, at some point I need to do that, do a new mksquashfs and fix it.

    For now, it’s OK. Normally the TV / Monitor is at the far back edge of the desk and I’m reading it from about 3 feet away. So I can just lean in a foot for a while ;-) Or scoot my chair up and put the keyboard on the desk instead of my lap ;-)

    Overall, I’m quite happy with the performance and the results.

  5. E.M.Smith says:

    OK, I’m only bout 3/4 done with the diagnosis / recovery cycle, so this will have some (much?) speculation in it. I’m typing this on the recovered R.Pi system, so that’s a good thing…

    I left the Pi running in “protected” mode overnight. It was fine. This morning, I rebooted.

    Boot seemed OK, then at the “launch opendesktop” it just gave me a black screen.

    WT? This launches the hidden / normal layer below the “protected” layer so there ought not be ANY changes in it during the overnight run. IMHO, this sent me down a few wrong rabbit holes.

    I’ll skip much of the dead ends and wrong “fixes”… Eventually I had it booting again just off the plain SD card image. By this time I’d figured out it was likely something wrong in the first layer of “overlay”… I’d set up this system to put a few of the active file systems on the hard disk so as to reduce SD card bit wear. I’d done that years ago as the first step. I also had not-enough space on this SD card as I’d made it “dual boot” with both armhf and arm64 images on it and it’s only 8 GB, so when I started adding software, well, the image got to more than 4 GB. For that reason I’d also moved /usr/lib to a disk partition. So that’s your context. /var and /usr/lib on hard disk.

    Then I’d done all that fancy overlay with squasfs and tmpfs stuff above. That’s a lot of “stuff” going on as I was starting from an already too complex base case. But it had worked.

    Until this morning. When it didn’t.

    Just to make it more fun, in the first build of this chip the armhf partition also had need to expand so I’d put some of it in a 4th ti on the card that was an xfs partition. So the actual layout was like:
    MS boot area | armhf base | arm64 base | armhf/usr
    then with the arm64 /var out on a disk as wias the arm64 /usr/lib.

    Yeah, a mess. Yeah, I ought to have just bought another 8 GB u_SD card and split them. But it worked. And when I had gone to 2 local stores nobody sold anything smaller than 16 GB… And it worked…

    I decided yesterday to put the two halves of the armhf in one partition and move the arm64 over to the other half of the card and that also eliminated that xfs partition.

    So now remember this means I’ve moved / changed the partitions on the u_SD card, I’ve dumped / reloaded the arm64 bag-o-bits, and all and then had also decided “what the heck” and shortened the name of that hard disk from SG500 to SG5 just so save some typing… All this involving a lot of editing of /etc/fstab and scripts with path names and more.

    But it HAD worked and it HAD run all night fine. So in theory all that was fine…

    So all that complexity and mess was in my grill as I was trying to figure out what was wrong. (Note that due to the SD card having /var, the disk having /var overlay it, and then the Squashfs overlay and then tmpfs overlay on that: There are 4 places with log files of various ages…

    So I was lazy and didn’t look at the log files much at first. To get it to boot at all, I’d moved the u_SD card to a USB holder on the XU4 and edited fstab to not mount anything but the base u_SD card bits. That worked, buy why?

    Cutting to the end:

    I had a look at the tail of messages and syslog. Had noticed that messages, syslog and kern.log (on the external disk partition) looked a bit big, but thought: “It was running protected with squashfs /var and tempfs overlay so this can’t matter”… That was an error…

    AFTER doing a bunch of pointless things, and having it again fail to boot with the hard disk /var, I FINALLY noticed that it was at 100% full.

    ALL this time is was just that /var was FULL? Yup.

    Blew away syslog, messages, and kern.log and it boots fine again. (And I’ve spent about an hour backing out / fixing things I’d “tired” before noticing what the real problem was).

    The error message that had filled the logs was not enlightening, and I didn’t save it. IF it returns I’ll add it here. Right now I’m just going to finish the cleaning up.

    The Speculative Bit:

    This worked before going “protected” so the logs were not full then.
    After going protected, writes out not go to that partition…

    So why did it fill up? I’d SPECULATE that something isn’t honoring the partition mount overlay of squashfs / tmpfs, but gets confused by it, and is writing ongoing error messages where you can’t see them until a reboot…when you can’t reboot… I’ll be testing that speculation later on today when the clean up is all done and I have Yet Another Pristine Copy. (I’d also restored my “one day old” arm64 partition dump from the chip reformat and that didn’t fix it… but did give me a couple of hours of “work done since then” to re-do…)

    So the point of this long complaint is 3 fold:

    1) IF you get a black screen at boot after all the stuff worked fine and it is launching your window manager: Check that /var is not full….

    2) It is possible as a working theory that too many levels of overlay leaves messages being written into the lower filesystem and not into the tempfs one. Testing Required (and maybe I need to simplify this whole stack anyway…)

    3) If you just spent all day renaming disk, moving partitions on the u_SD card, generally changing around the admin side of things: Make a back up of the changes before you let it run all night… ALL the changes. Don’t just say “Oh, I took a backup first, that’s enough”… Enough that you get to redo the changes that make it compatible with the new disk and partition names and structure… Yeah, it was “only” a couple of reboots and edits on the other computer and …. but…

    Then, as a closing note:

    I suspect I’d have observed disk:/var full faster had I not run into this as I was just starting my first morning coffee… and had I not ASSUMED that since squshfs:/var and tmpfs:/var were mounted over the disk:/var it could not possibly have been anything to do with the disk:/var …. as it had worked to boot the thing and launch “protect” mode so MUST be fine… Sigh.

    I’m putting my overalls on and going in to the garden to chop on things with sharp instruments…

  6. E.M.Smith says:

    Well, having chopped down a few bushes, one tree limb and “shrunk” a rose bush, I’m feeling much less stressed now ;-)

    Having booted from a fresh start empty log files, the message in question returned:

    Aug 21 19:34:43 PiM3Devuan2 kernel: [    3.990521] usb 1-1.3.4.1: Manufacturer: Seagate
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.000094] usb 1-1.3.4.1: SerialNumber: NA01N3J4
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.011235] usb-storage 1-1.3.4.1:1.0: USB Mass Storage device detected
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.023486] scsi host0: usb-storage 1-1.3.4.1:1.0
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.161394] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available(err=-2), device may have limited channels available
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.171900] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 23 2017 03:55:53 version 7.45.98.38 (r674442 CY) FWID 01-e58d219f
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.222794] usb 1-1.3.1: reset low-speed USB device number 5 using dwc_otg
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.510130] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.510130] 
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.529043] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.529043] 
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.548119] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.548119] 
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.567117] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.567117] 
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.585451] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 21 19:34:43 PiM3Devuan2 kernel: [    4.585451] 
    

    There’s only three things on the USB. KB, Mouse, and one Seagate disk… all via a powered hub.

    I took some screen shots of the log file time stamps while in protected mode and not, so I ought to be able to bracket those time stamps above and figure out if they are from when, in theory, the log directory was on tmpfs/squashfs… IF it is, that is, IMHO, a truly horrific behaviour (rather like how now /tmp doesn’t let Root User see all the files in it…). It breaks the expectation about a command. Essentially, making systems admin commands “fail to function, silently”. NOT nice.

    This is the 2nd of these I’ve run into, and it’s starting to look like a pattern. Someone deciding to do something for added “security” that breaks how things are expected to work, and does not flag it to the systems admin. (Yeah, I know, read EVERY release note on EVERY release of EVERY system / OS… only takes about a month every month…)

    But we’ll see. Still a bit speculative at this point. I’d hate to think that I must mount the squashfs/tmpfs /var at boot for this to work, just to bypass some design stupidity… (A quick look at the /var timestamps post garden chopping showed no change of time stamps despite running in protected mode that whole time… so suspicious – but not yet proven).

    So looks like 2 problems / issues still open:

    1) Why that error message and how to remove it? Then I need to remake the squashfs image without the historic almost full fie system message, syslog, and kern.log files…

    2) Verify that squashfs/tmpfs overlay isn’t working and find a fix. (Or if it is working, why did this still blow up? Was it 1 message away from hang and on the reboot that was enough?)

    Ah, the “joys” of admin / debugging…

  7. E.M.Smith says:

    Well, everything is back to working as it was.

    I’m running in “protect” mode.

    Went into /var and did a “touch” of a file, and copied bits into it. Everything looks like it is acting as expected (writing to the tmpfs).

    So now I let it run a while then reboot and see if the lower level file system was still getting writes, or not. If not, then it has to be that the /var was one hair away from full at boot, and then locked up on 100% full. If it IS getting log writes under the other mounts, that’s just wrong…

    Putting this stuff here becuase it is the easiest way to have it preserved and visible from all my various platforms:

    Before the “protect” was issued:

    root@PiM3Devuan2:/# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810776    976888  75% /
    [...]
    tmpfs            2097152        16   2097136   1% /tmp
    /dev/sda3        8125880   5855508   1850944  76% /sq
    /dev/sda5        3997376   1683752   2103912  45% /var
    /dev/sda8        3997376   2203940   1583724  59% /usr/lib
    [...]
    root@PiM3Devuan2:/# ls -l /var/log/messages /var/log/syslog
    -rw-r----- 1 root adm  91767 Aug 21 20:42 /var/log/messages
    -rw-r----- 1 root adm 119665 Aug 21 20:43 /var/log/syslog
    root@PiM3Devuan2:/# date
    Wed Aug 21 20:44:43 UTC 2019
    

    After the “protect’:

    root@PiM3Devuan2:/# protect
    root@PiM3Devuan2:/# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810776    976888  75% /
    [...]
    tmpfs            2097152        16   2097136   1% /tmp
    /dev/sda3        8125880   5855508   1850944  76% /sq
    overlay           204800         0    204800   0% /var
    /dev/loop6       1318016   1318016         0 100% /usr
    [...]
    tmpfs              20480         0     20480   0% /sq/tmpfs/root
    tmpfs             204800         0    204800   0% /sq/tmpfs/var
    tmpfs             102400         0    102400   0% /sq/tmpfs/home
    /dev/loop0         29440     29440         0 100% /sq/mnt/root
    /dev/loop1       1536640   1536640         0 100% /sq/mnt/var
    /dev/loop2         26624     26624         0 100% /sq/mnt/home
    /dev/loop3          6016      6016         0 100% /bin
    /dev/loop4          3840      3840         0 100% /sbin
    /dev/loop5         34432     34432         0 100% /lib
    overlay            20480         0     20480   0% /root
    overlay           102400         0    102400   0% /home
    
    root@PiM3Devuan2:/# cd /var
    root@PiM3Devuan2:/var# ls -l log/syslog log/messages
    -rw-r----- 1 root adm 61291 Aug 21 19:58 log/messages
    -rw-r----- 1 root adm 77740 Aug 21 20:30 log/syslog
    root@PiM3Devuan2:/var# df .
    Filesystem     1K-blocks  Used Available Use% Mounted on
    overlay           204800     0    204800   0% /var
    root@PiM3Devuan2:/var# ls
    backups  cache	lib  local  lock  log  lost+found  mail  opt  run  spool  tmp
    root@PiM3Devuan2:/var# touch JUNK
    root@PiM3Devuan2:/var# ls
    backups  cache	JUNK  lib  local  lock	log  lost+found  mail  opt  run  spool	tmp
    root@PiM3Devuan2:/var# cp log/messages JUNK
    root@PiM3Devuan2:/var# !df
    df /var
    Filesystem     1K-blocks  Used Available Use% Mounted on
    overlay           204800    60    204740   1% /var
    

    So here’s hopping it was all just the log files on the hairy edge of full and I didn’t notice…until the reboot put them over the top.

  8. p.g.sharrow says:

    @EMSmith, kind of looks like this thing could use a bit of pruning as well as your garden. 8-) I’m surprised with all the bits you have running around in that machine that it doesn’t get even more confused all the time…pg

  9. E.M.Smith says:

    @P.G.:

    It’s the “playground” machine where all sorts of stuff gets tied… I usually take a “reset” on it about every 6 months to a year with a full re-install from formal release. Last one was Devuan 2.0 ship.

    IF they had a 2.1 release, I’d have had a “do over” on it… but they haven’t.

    I have other “minimalist” chips for other uses…For example, the RockPro64 is a basic Ubuntu totally unadorned and unchanged in any way. Ditto the Odroid N2. The 2 Orange Pi One boards are a base case Armbian install.

    But it is important to have somewhere where you find The Wall… then don’t take any other board that far… I also have a few u_SD chips for this one that are very vanilla. One is a “Build builder” that is used ONLY to build other systems (so known clean install platform) all on one u_SD card of 64 GB and nothing else. There’s also a “daily driver” browser chip and a “headend” chip for the cluster that are pretty generic (though Headend has cluster software and a hard disk…)

    FWIW, I’ve left this run a few hours, then rebooted. Along the way it looks like…. Nothing Happened….

    root@PiM3Devuan2:/# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810776    976888  75% /
    [...]
    /dev/mmcblk0p1    130798     35754     95044  28% /boot
    tmpfs            2097152         4   2097148   1% /tmp
    /dev/sda3        8125880   5855508   1850944  76% /sq
    /dev/sda5        3997376   1683876   2103788  45% /var
    /dev/sda8        3997376   2203940   1583724  59% /usr/lib
    [...]
    root@PiM3Devuan2:/# ls -l /var/log/messages /var/log/syslog
    -rw-r----- 1 root adm 120675 Aug 21 23:44 /var/log/messages
    -rw-r----- 1 root adm 162258 Aug 21 23:44 /var/log/syslog
    root@PiM3Devuan2:/# date
    Wed Aug 21 23:45:08 UTC 2019
    

    The logs mostly grew by the reboot records and not much else.

    So at this point it is looking like a “My Bad” in that their was an error happening filling logs (that is not in them now… so who knows what, why or why it is gone). I didn’t notice. It was almost a full /var partition (of 4 GB!) when I did the reboot, and the added log entries filled the disk preventing the freedesktop.org code from running so login hangs.

    The rest was me figuring that out, wandering around doing wrong “fixes”, recovering from it all, thinking it was something Really Sly when it was really me being totally unaware of log file size…

    The “kicker” for me was when I checked the squashfs /var and found it too had BIG log files in it… so they were already growing when I’d done that squash…

    So, OK “Nevermind”…. and all is back to working right and I’m needing to spend more time actually looking at the fullness of file systems…

    Oh, and monitoring the logs when it was running “protect” mode showed them being written in the tmpfs area… So my rant / paranoia was entirely wrong. (though a little justified given the other things the recent crop of “developers” has perverted.. [cough /tmp, cough systemD, cough…)

  10. E.M.Smith says:

    Well, right after posting that the error messages returned, in force:

    root@PiM3Devuan2:/var/log# ls -l messages
    -rw-r----- 1 root adm 3884332 Aug 22 00:08 messages
    root@PiM3Devuan2:/var/log# ls -l syslog
    -rw-r----- 1 root adm 4045295 Aug 22 00:08 syslog
    root@PiM3Devuan2:/var/log# ls -l syslog messages
    -rw-r----- 1 root adm 7421437 Aug 22 00:10 messages
    -rw-r----- 1 root adm 7473383 Aug 22 00:10 syslog
    root@PiM3Devuan2:/var/log# 
    

    This is WITHOUT any of the “protect” stuff going on, so I think I’ve got a dirty USB connection somewhere, maybe…

  11. E.M.Smith says:

    Shortly after that prior note I shut down, moved the USB Drive to a different port on the USB Hub (from THE most used one to THE least used one) and moved the hub connector to a different port on the R. Pi. Usually I put it in the lower one nearest the ethernet connector (so all the cables are close). I Regularly plug / unplug it a few times / day as I move the KB / Mouse & Disks from one SBC to another and back… Over a few years, that’s been thousands of plugging events.

    Since that, no errors in the log files.

    It felt a tiny bit loose being unplugged and tighter in the new port…

    So I think I may have just managed to wear down the connector for one of the USB 2.0 slots…

    A bit more monitoring and testing will likely finger exactly which. IIRC I had disk read issues some months back on the hub port that I abandoned too, so it may be that the most used (1 of 7 ) connector on the hub is dodgy. I’ll need to see if it is the Pi or the Hub or both together that’s causing the errors.

    So I think this closes this event.

    Most likely dirty / worn USB connectors causing sporadic log file floods when moving from one SBC to the other (that I’ve been doing A Lot the last week…) interacting with my modest Professional Paranoia about all the changes I’d made and hiding behind my ASSUMPTION that the log file had to be FINE as it was fine at shutdown… having not noticed that it was a bit full, and completely missing that when the error flood comes it can be a huge torrent.

    OK, so I’ll be working on my “Pay attention to log files and file system usage %” skills …

  12. p.g.sharrow says:

    The Raspi is supposed to be disposable cheap!, your use must be ageing it at high speed, a feature? no doubt, test it until it breaks….pg

  13. E.M.Smith says:

    Yeah, I’m pushing it way past design goals…

    I’m thinking it’s the Pi that’s the problem. The same hub and disk are plugged into the XU4 and it has had zero issues.

    I think that pretty much says it is either the Pi physical connector, or something with loading up so much stuff, all on one disk (in something like a dozen partitions…) with a stack of overlays, is blowing through something in the USB / disk IO stack.

    We’ll see how it shakes out. Next thing I’m going to do is move the chip to my other R.Pi M3 (eliminating the hardware question). Then will be a reinstall of software from scratch on another u_SD card to eliminate software corruption / bit rot and SD card wear questions.

    IF at the end of that, it persists, then it has to be the layering of too many things in the configuration. I’m pretty certain that isn’t the problem as I’ve seen the errors manifest even when not in “protect” mode (i.e. when running plain Devuan 2.0 straight with only regular hard disk mounts).

    But “we’ll see”. It’s also been relatively hot in my office so I suppose running hot might be an issue, but the thermometer has not been showing up so I’m pretty sure that’s not it. But as a common mode of failure for older hardware, who knows…

    Oh, and the disk itself is my oldest and smallest one. I’m deliberately running it hard so that wear happens on it, not on my newer ones. It is a USB 2.0 disk while the newer ones are USB 3.0, so I might be pushing to too hard too.

    Who knows. We’ll see if I can regularly cause the problem, then regularly remove it. Then I ought to be able to finger the exact cause. Until then, that platform is a bit suspect.

    Oh, and the Pi is known to have “issues” with old SD cards as the card starts to fail and corrupt (why I try to put active partitions on real disks…) so it is also possible that even with that care, this several year old u-SD card is losing it’s bits…

    Anyway, it’s more a minor annoyance issue since I can just slide over to other platforms (and regularly do). Worst case is that I buy a new R. Pi M4, USB 3.0 hub, do a fresh software install and move on ;-)

    But with 4 or 5 SBCs not in service at any one time, I think I can just use one of them ;-)

  14. E.M.Smith says:

    Oh, one other minor note on log files:

    I’ve noticed that my use of SBCs is in conflict with an assumption by Unix / Linux about machine use. In particular, there are housekeeping routines that move log files along every night. So for messages and syslog, they age one day, then you get several that are compressed. This is done every night.

    But what happens when your computer is never up “at night” when this is supposed to happen?

    Then the “today” log file just keeps growing… Here’s the set on the XU4:

    root@odroidxu4:/var/log# ls -l syslog*
    -rw-r----- 1 root adm     597 Aug 22 06:35 syslog
    -rw-r----- 1 root adm 1277741 Aug 22 06:25 syslog.1
    -rw-r----- 1 root adm   23127 Aug 10 06:25 syslog.2.gz
    -rw-r----- 1 root adm  207417 Aug  9 06:25 syslog.3.gz
    -rw-r----- 1 root adm    4812 Jul  3 06:25 syslog.4.gz
    -rw-r----- 1 root adm   91619 Jul  2 06:25 syslog.5.gz
    -rw-r----- 1 root adm  629960 Jun 27 06:25 syslog.6.gz
    -rw-r----- 1 root adm  149575 Jun  5  2018 syslog.7.gz
    

    I deliberately left this machine running overnight just to let it move along that big log file. Note that there are 7 incremental copies, the last 6 compressed. This is supposed to be a one week daily series… In reality is is just shy 3 MONTHS of logs. So June 27 really has 22 days worth in it ( 27-5=22) and it looks like July 2, 3 I left it running over night.

    I think this is contributory to my /var/log filling up “issue” (though the biggest part was just that runaway error stream).

    I think maybe there needs to be a change of assumption on how the logs are handled… Maybe at boot time something that looks at the date stamps and “moves things along” if they are older than a week or two…

  15. E.M.Smith says:

    FWIW, I’ve cobbled together a little script to zero out the older log files “on command”. Named “zerologs” I can now dump the older log file copies with the command “zerologs”.

    ls -l /var/log/messages* 
    echo
    ls -l /var/log/syslog*
    echo
    ls -l /var/log/kern.log*
    echo
    echo first Messages:
    
    echo "" > /var/log/messages.1
    for i in 2 3 4 
    do
    	echo "" > /var/log/messages.$i.gz
    done
    
    echo Then syslog
    
    echo "" > /var/log/syslog.1
    for i in 2 3 4 5 6 7
    do
    	echo "" > /var/log/syslog.$i.gz
    done
    
    echo Finally kernel logs
    
    echo "" > /var/log/kern.log.1
    for i in 2 3 4 
    do
    	echo "" > /var/log/kern.log.$i.gz
    done
    
    echo Done with zeroing of logs
    echo
    ls -l /var/log/messages* 
    echo
    ls -l /var/log/syslog*
    echo
    ls -l /var/log/kern.log*
    echo
    

    Not exactly fancy, more like a crowbar to the log files, but better than doing it long hand every time the logs are old…

    After running it, now those log files described in the prior comment look like:

    Done with zeroing of logs
    
    -rw-r----- 1 root adm 642 Aug 22 07:13 /var/log/messages
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/messages.1
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/messages.2.gz
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/messages.3.gz
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/messages.4.gz
    
    -rw-r----- 1 root adm 8121 Aug 22 07:14 /var/log/syslog
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.1
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.2.gz
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.3.gz
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.4.gz
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.5.gz
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.6.gz
    -rw-r----- 1 root adm    1 Aug 22 07:14 /var/log/syslog.7.gz
    
    -rw-r----- 1 root adm 240 Aug 22 07:13 /var/log/kern.log
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/kern.log.1
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/kern.log.2.gz
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/kern.log.3.gz
    -rw-r----- 1 root adm   1 Aug 22 07:14 /var/log/kern.log.4.gz
    

    I don’t really have any use for log files from back in June and July, or even early August…

    I’ll likely need to run this about once / month. Maybe once every 2 or 3 months ;-)

    (And yes, I know the “standard” is to “cat /dev/null” not “echo” but most folks know echo and fewer know about /dev/null and it isn’t always available on some more exotic hardware / cut down OS builds).

  16. E.M.Smith says:

    FWIW, this morning I swapped the Pi M3 for my other Pi M3. Same u_SD card. Same software. Same Hub & Disk.

    So far NO errors. No problems. /var not growing. I’ve swapped over to “protect” mode after about an hour of running regularly, and it’s been no issues for several hours.

    I’m going to run this way for a day or two and if this keep ups, well, they we’ll know. It’s something about that other Pi hardware. IF we get to that point, I’ll need to inspect all the connectors and such..

    Given that it has been the main desktop device with several (often many) plug / unplug events every day for “a few” years (whenever I bought my first Pi M3…looks like about 3 years:
    https://chiefio.wordpress.com/2016/07/03/raspberry-pi-model-3-first-impressions/ )
    so over 1000 and up to 5000 plug / unplug events almost all to the same USB slot… Maybe I ought to clean the contacts ;-)

    But “we’ll see”. It needs a day or two and a couple of reboots on this hardware before I’m willing to declare victory and move on…

    THE important point for me is that the “protect” mode is, in fact, working as it ought.

    (I hate hardware problems. I always start out thinking they are software or configuration issues, then chase strange error messages for a long while, and only slowly accept that maybe the hardware is broken… Only when it’s a hard hardware fail, smokes, or the error calls out hardware; only then do I go to hardware testing first. I really need to change that…)

  17. E.M.Smith says:

    FWIW, found an interesting utility. Debsums. It validates every package on the system.

    apt-get install debsums
    

    Then just

    debsums
    

    or “debsums -a” to add config files to the mix.

    More details at:
    https://blog.sleeplessbeastie.eu/2015/03/02/how-to-verify-installed-packages/

    Ran it on the Pi. Everything passes as “OK”. So no software corruption which implies no n_SD card corruption.

    FWIW I did have one “hang / restart” event after a few hours of “protect” mode on this hardware. No idea if that’s a related issue, unrelated, whatever. I’m now running in “unprotected” mode for the rest of the day and we’ll see if anything returns, or not.

    I have had one case with {some disk I’ve forgotten which one… maybe Toyshiba?) where when swap was on it, and something important to the OS was swapped out, AND the disk had gone idle, it didn’t wake up from the sleep spin down state so the OS could load the part of it’s brain it had swapped out. That was a few years ago on Raspbian, so who knows now. Just something to keep in mind. It was also the case that the one hang / reboot was only during the hot part of the day. Basically there may be some additional instability issue… or not. Who knows….

    But at least whatever it is (or isn’t) it isn’t corrupted bits or a corrupted u_SD card.

    I’ve still not had any return of the error messages / var filling up; so I’m pretty sure that’s a too worn USB socket on the other Pi M3 board. If further testing confirms it, that board can go to the server farm where they don’t have anything in the USB ports (or I can just use one of the other 3 … ;-)

    With that, I now return you to The Trump Show… with a cast of millions of deranged zombies all chasing the one sane man on the planet, who for some odd reason is bright orange… as they are all shouting “Orange Man Bad!” over and over again…

  18. E.M.Smith says:

    Not directly related, but a little…

    I’d been using 2 to 3 disks on this particular system, with things balanced between them. Swap space on all of them. Database on one, system stuff on another.

    In this conversion, I’d moved it ALL onto one small old disk. A Seagate 500 MB USB 2.0 disk.

    Not really surprising, but it ends up in D disk wait states a lot now. (Partly I wanted to find out how much using tmpfs file systems would mitigate disk I/O so needed to concentrate some of the disk I/O …)

    Well, every time I launched Chromium there would be a gaggle of red D disk wait indicates in the htop status display showing processes waiting on disk. Chromium uses a lot of memory and stuff ends up on swap. A LOT of stuff. As tmpfs ends up using memory, some of it ends up on swap too.

    So today I just did a “partial” protect set. Just the squashfs overlays. Not the tempfs parts. I also added back that second swap disk and gave it priority (though I already had about 188 MB on the SG500 disk swap…). Here’s what swap looks like after adding that disk and launching chromium, then opening a few pages:

    swapon -s
    Filename      Type	Size	Used	Priority
    /dev/sda2     partition	4194300	185988	128
    /dev/sdb2     partition	1048572	364412	2048
    

    Yeah, over 600 MB on swap on a machine with 1 GB total memory / 867 MB after the graphics use is handed to the GPU. It is usually considered bad form to use more swap than the size of memory have or to allocate more than twice memory size. I’m approaching that usage level and have gone way past that in allocation. I’ve got almost 6 x memory available to the CPU allocated as swap space. I’ve used about 3/4 of my memory capacity in swap space. Another 226 MB in swap and I’m starting to be in the range where there’s a lot of swap happening and more memory is very important.

    (At least, per the “rules of thumb”… in reality, a lot of this is web pages that, if I don’t click on that tab, I don’t care. When I do click on it, something swaps in and I get a little wait time. Traditional swap rules of thumb are for running active programs computing over that space to function, not swapping it out because I may not look at that tab today… Which is why I’m willing to allocate 6 or even 10 x memory as swap space. It’s almost ALL my tendency to keep a few dozen tabs open in a browser….)

    That’s the set up. Here’s the kicker:

    After adding that swap and setting it as priority, launching the browser is faster, there are fewer D waits in the htop, and it just works better. There’s still some, but also I’ve not done a reboot to force all swap onto the new swap space (or just swapoff / swapon that lower priority partition) so some stuff may still be swapping in from it.

    What is clear is that the present configuration is very disk intensive and hitting I/O limits on it, some of it related to longer seeks between swap and non-swap partitions.

    On the XU4 I’d put the /sq space with the squashed file systems on the u_SD card precisely to avoid any head seek issues as the SD cards are random access. It is working very well. (Then again, it has 2 GB memory, 8 faster processors, and swap on a USB 3.0 disk…). I didn’t do that here simply because this chip is already stuffed over capacity ( I had to move /usr/lib to a hard disk partition to make it work at all…).

    So my necessary conclusion from this is just that trying to run it all from one hard disk on separate partitions in the “Old Style Unix Way” is not going to work well. The I/O speed is too low, the memory is too small, and the disks are not good at seeks over any distance.

    The present layout is:

    vfat boot partition
    swap – a 4 GB partition
    /sq – 8 GB. squasfs images to use for overlay in protect mode
    /var – a 4 GB /var partition
    6 GB of “other stuff not used right now”
    /usr/lib – 4 GB of the /usr/lib library space
    32 GB of “other stuff not used right now”
    a presently unused 2nd swap space of 4 GB for use in testing which swap location is better
    /SG/ext – 340 GB of active storage including my present home directory

    Very much a sub-optimal layout, especially if you will be hovering over swap a lot when using a browser AND needing to read / write cache files in the home directory a lot from that same browser but at the other end of the disk.

    So adding swap on a second spindle helps get rid of those long seeks. Swap can have a head sitting over it while the other disk has that head over my home dir cache area as the browser launches. No more see of D Disk waits at launch…

    I’d expected swap near /sq would work better for the squashfs file system mounts and the tmpfs use. Swap near intense system space is often A Good Thing. But when the browser is a much larger part of the swap load, well….

    I’d still like to get down to just one disk on the R. Pi (it seems odd to have a dinky system and several GB of disks on it ;-) but I also don’t want to move partitions around on that disk AGAIN. (It would be the 3rd time in about a week…) So I’m going to put it off for a while.

    When I get around to it, I’ll be building a new system on a new bigger u-SD card, putting /sq on the u-SD card (so zero seeks…) along with all of /var /lib /usr etc. Again, zero seeks either when live or when running with overlay squashfs system. Then have a nice sized swap right next to my home dir space on the hard disk… I’ll accept the added bit wear on the SD card from /var on it in exchange for the lower level of wear when a squashfs overlay is in place along with a tmpfs writeable overlay.

    So the net-net of it is just that running with squashfs / tmpfs overlay and lots of swap activity changes the way partitions are best laid out. You want more random-reads from u-SD card and less disk head seeks, especially to swap space if tmpfs has swapped any blocks out.

    I’ll probably get back to that next week…

    For now, I’m running with the added swap disk, and I’ve decided to layer on the “protect” mode one group at a time and see if (when?) it becomes prone to a hang again (in the hope that fingers the cause).

    I’ve got a new set of scripts with really informative names like p1, p0, p2… As I type this, I’m in p1 mode:

    root@PiM3Devuan2:/# bcat p1
    mount -o loop -t squashfs /sq/sq/bin.sqsh  /bin
    mount -o loop -t squashfs /sq/sq/sbin.sqsh /sbin
    mount -o loop -t squashfs /sq/sq/lib.sqsh  /lib
    mount -o loop -t squashfs /sq/sq/usr.sqsh  /usr
    

    So ONLY running with the non-writable file system overlays. Nothing ought to be writing into those spaces unless I’m doing a system update, so ought to be Just Fine to lock them down. I also did a re-creastion of all the squashfs images this morning (as I’d added some software and scripts) then ran “debsums” over it and all was OK.

    I’ll be running this way until I’m satisfied No Bad Thing is happening at all. Probably until tomorrow. Then, one at a time, I’ll add in the file systems that get a squashfs overlay and a tmpfs overlay over that.

    At this point it ought to me just a Q.A. cycle. There ought not be any more failure / lockup issues. But “we’ll see”. That’s what Q.A. is all about, isn’t it? Having a look see…

  19. E.M.Smith says:

    Well…. that was “interesting”…

    I decided to move off of the SG500 disk for swap, so did a “swapoff /dev/sda2” . After a minute or two all 189 or so MB was released. Some just deleted as not longer needed, others moved to the other disk swap area.

    All well and good. Somewhere over 600 MB of swap used.

    I clicked on the Chromium window to bring t to the front and activate it…

    The htop lit up with a dozen+ red D diskwait processes. Including “loop 3” that is /usr squashfs overlay. And it stayed that way for about 1/2 an hour…

    Unlike prior “hangs”, I had positioned the Chromium window where I could still see the htop display. It became clear the “hang” was just a swap thrash. I could still click on an open terminal window and type in commands. It would take 1/2 minute or so to register, but it would happen.

    Eventually I got tired of waiting and just issued a “halt” in that terminal window.

    So now I’m back.

    At this point I think I have a clear handle on what was wrong before.

    The critical problem was a faulty USB connection filling /var until things broke. The secondary problem, what I was calling a “hang”, was in fact a swap thrashing that I could not see as chromium was the top window.

    A swap of hardware looks to have fixed the /var filling up issue. Cutting back to just the squashfs mounts has let me see the swap thrashing issue.

    But why is swap thrashing?

    1) First off, I was approaching 700 MB of swap used when this happened. That’s almost at the “recommened max” of using 50% of total address space as swap and 50% real memory. That’s known to run into issues with needed system “stuff” being swapped out to make room for user “stuff”.

    2) Chromium is a memory hog. At some point my open tabs would be more memory than physical memory, just for Chromium. How will that be handled?

    3) At first Chromium launch (with ‘open old tabs’ setting) 180 MB or so rolls out to swap. This is NOT chromium stuff as it is just booting. It will include mostly system stuff as I’m running nothing else.

    4) Dumping some of that stuff in the “consolidate swap” activity will result in some of it, when needed, being loaded from /usr/lib and /var again. With /usr/lib on a squashfs /usr partition, this will require the squashfs code and memory space to be reloaded / returned to memory as well.

    5) At this point, the system has bits on swap that it needs to do its job. Squashfs has bits of program and data on swap that it needs to do its job of feeding the system the files (binaries) it wants. Chromium, coming to the foreground, wants all of its swapped out chunks back in memory.

    6) They all three end up taking turns swapping in, doing a bit, swapping back out, next one in, do a bit, swap out…. We end up running at disk transfer speed, not memory speed.1 second becomes an hour…

    At least, that’s my working theory on the “hangs”…

    Without squashfs, the system binaries can just be read in from disk / SD card and run. No need to oscillate between them, the squashfs code, and releasing bits from swap. My guess is that a few more tabs open in Chromium (to where it has 800 MB or so on swap) and this same thing (swap thrash) can happen without squashfs adding to the load. I don’t know that I’ll test that case any time soon ;-) Maybe next week…

    For now, I’m just going to restart the QA cycle without the /usr overlay in place. IF it is the loop3 / squashfs of /usr that’s the place having swap issues, this will eliminate that and show it. If it isn’t, then it’s just generally too much swap in use (as /bin /sbin and /lib are not showing up with D wait states on their loop devices…)

    But, as usual “We’ll see”…

    It may just be that what with having too small a memory (by “modern” browser writer standards…) and a known weak I/O channel on slow USB 2.0 disk, using a lot of swap is just a bad idea and using a Big Fat LOADS of tabs open browser even worse… on the Pi M3.

    So watch your swap usage and if it starts getting up to about 700 MB, think about what you are doing…

    Now, without /usr:

    root@PiM3Devuan2:/# bcat p0
    mount -o loop -t squashfs /sq/sq/bin.sqsh  /bin
    mount -o loop -t squashfs /sq/sq/sbin.sqsh /sbin
    mount -o loop -t squashfs /sq/sq/lib.sqsh  /lib
    root@PiM3Devuan2:/# p0
    root@PiM3Devuan2:/# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810984    976680  75% /
    devtmpfs          438328         0    438328   0% /dev
    tmpfs              88764      1612     87152   2% /run
    tmpfs               5120         4      5116   1% /run/lock
    tmpfs             177520      3288    174232   2% /run/shm
    /dev/mmcblk0p1    130798     35754     95044  28% /boot
    tmpfs            2097152         4   2097148   1% /tmp
    /dev/sdb3        8125880   5855936   1850516  76% /sq
    /dev/sdb5        3997376   1684000   2103664  45% /var
    /dev/sdb8        3997376   2203960   1583704  59% /usr/lib
    /dev/sdb14     350521980 233825092  98884204  71% /SG5/ext
    tmpfs             443800         0    443800   0% /sys/fs/cgroup
    tmpfs              88760        12     88748   1% /run/user/1666
    /dev/loop0          6016      6016         0 100% /bin
    /dev/loop1          3840      3840         0 100% /sbin
    /dev/loop2         34432     34432         0 100% /lib
    
  20. E.M.Smith says:

    And another test….

    On the off chance it might be something to do with squashfs, I copied the same three directories to some of those unused partitions and remonted them as “ro” read only. So now /bin /sbin and /lib are still Read Only, but from real disk not from a squashfs. Did a reboot, and again launched the browser.

    Clicking across tabs, one at a time, I’d see swap use rise as that tab loaded. At about 500 MB, once again the system had an apparent hang. This time even the cursor would not move. The disk with swap on it was still blinking, so I’m pretty sure it was running just that the code for “move mouse” was swapped out….

    So the same mode of “failure” I’ve seen for a couple of days now, but NOT while using squashfs, or tmpfs overlays. (There is still a tmpfs /tmp file system, but it isn’t an overlay, just an fstab defined filesysem).

    So at this point the “bad behaiour” is down to 2 possibles:

    1) Just doesn’t like swap all that much for some reason and Chroium uses a lot.

    2) System does not like one of /sbin /bin or /lib mounted Read Only.

    There is the ever-present possible of “Something I just don’t’understand”… But it doesn’t get a number (or, if it did, it would be element 0 on our list…)

    The really really good thing is that the test of ro disk partitions says it isn’t anything at all related to squashfs, tmpts, or overlays.

    One page I was opening is a “suspect” web page from an unknown possible spammer, so it is also possible that it is trying to hack my machine and write over one of those directories and that is what is causing the hang. I’d hope trying to write a ro file system would not hang things, but who knows… Here’s the disk layout:

    root@PiM3Devuan2:/# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810980    976684  75% /
    devtmpfs          438328         0    438328   0% /dev
    tmpfs              88764      1604     87160   2% /run
    tmpfs               5120         4      5116   1% /run/lock
    tmpfs             177520         0    177520   0% /run/shm
    /dev/mmcblk0p1    130798     35754     95044  28% /boot
    tmpfs            2097152         4   2097148   1% /tmp
    /dev/sdb3        8125880   5855936   1850516  76% /sq
    /dev/sdb6        3997376     17600   3770064   1% /sbin
    /dev/sdb5        3997376   1720464   2067200  46% /var
    /dev/sdb7        1998672     16116   1877700   1% /bin
    /dev/sdb8        3997376   2203960   1583704  59% /usr/lib
    /dev/sdb12       8125880    128508   7577944   2% /lib
    /dev/sdb14     350521980 233824964  98884332  71% /SG5/ext
    tmpfs             443800         0    443800   0% /sys/fs/cgroup
    tmpfs              88760         8     88752   1% /run/user/1666
    
  21. E.M.Smith says:

    With all file systems writable, at about 1 GB of swap in use, some of the same issues start to show up (lots of D disk wait on new tabs, cursor unresponsive for a few seconds) except the degree is less. It has not “hung”, and the cursor responds in seconds instead of minutes.

    So, OK, I’m pretty sure if I pushed it to 2 GB with even more tabs I’d eventually make it really sick.

    IMHO, this shows the “hang” on swap thrash is made worse by tmpfs / squashfs etc (that makes sense as now you must have CPU work to decompress those files and memory to do it in) but it is a matter of degree, not kind. About 500 MB swap used acts about like 1 GB with all disk writable (and much of that from the u_SD card so zero rotational latency to read and no head seeks).

    My conclusion, then, is that this “overlay” protect mode works fine and is not the cause of the “issue”, it just makes it show up at less memory usage. So OK, don’t open Chromium with 20 tabs open one of them a movie and the others heavy page weight blog editing pages…

    Or use a machine with 2 GB of memory…

    With that, I’m closing out this investigation of “odd behaviour” as it clearly isn’t related to this thread but is a workload limit issue of the Pi.

    I’ll continue running in “protect” mode, but in the original form where I’d only run the stripped down login for minimal tabs open at any one time.

    Any “big stuff” that needs it I’ll do on the XU4.

    Here’s the file system layout as tested with all the misc bits from the u_SD card (like /sbin /bin /lib …)

    chiefio@PiM3Devuan2:/$ df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/root        3997376   2810980    976684  75% /
    devtmpfs          438328         0    438328   0% /dev
    tmpfs              88764      1604     87160   2% /run
    tmpfs               5120         4      5116   1% /run/lock
    tmpfs            2064940         0   2064940   0% /run/shm
    /dev/mmcblk0p1    130798     35754     95044  28% /boot
    tmpfs            2097152         4   2097148   1% /tmp
    /dev/sdb3        8125880   5855936   1850516  76% /sq
    /dev/sdb5        3997376   1720624   2067040  46% /var
    /dev/sdb8        3997376   2203960   1583704  59% /usr/lib
    /dev/sdb14     350521980 233813524  98895772  71% /SG5/ext
    tmpfs             443800         0    443800   0% /sys/fs/cgroup
    tmpfs              88760         8     88752   1% /run/user/1666
    
  22. E.M.Smith says:

    Some issues just won’t let you go…

    I’m prone to “persistence”. Until something is 100% known, I fiddle with it. Nibble at it. Return to it. Try another approach.

    So while just using this particular Pi I’d try subsets of the full “protect” mode and I’d try just using partitions as “ro” Read Only real disk. What’s the minimum you can do, what’s the max, which is better.

    Well, that otg error returned.

    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979080] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979080] 
    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979155] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979155] 
    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979279] WARN::dwc_otg_handle_mode_mismatch_intr:68: Mode Mismatch Interrupt: currently in Host mode
    Aug 24 11:31:02 PiM3Devuan2 kernel: [   32.979279] 
    

    A lot of it. “tail -f /var/log/syslog” was a constant stream.

    -rw-r----- 1 root adm 51342319 Aug 24 11:31 /var/log/kern.log
    -rw-r----- 1 root adm 51326179 Aug 24 11:31 /var/log/messages
    -rw-r----- 1 root adm 51487640 Aug 24 11:40 /var/log/syslog
    Filesystem     1K-blocks    Used Available Use% Mounted on
    

    Yah, having just zeroed the logs yesterday, they were at 51 MB and rising fast.

    What mode was the system in? What had I done?

    I had mounted /usr/lib as read only. That was ALL.

    The same “real USB disk” partition I’ve used for months (years?) as “rw”. Just figured: Well, if not doing squshfs why not just make the partition read only? So changed fstab to “ro” mount. Basic file mount read only ought to be fine…

    So that error message is not due to a worn out USB connector on the other board. That was an error. Had I taken the bait and run with it, not realizing I’d changed two things at once? Hardware and NOT used a read only /usr/lib?

    OK, I made the /usr/lib mount rw and rebooted. Now everything is fine. I need to repeat this cycle a couple of times to confirm it. (Having once accepted the first correlation as causal I’m not doing that again…) At this point, though, there are only 2 choices. The ro mount of /usr/lib is causal and will regularly reproduce, OR, it’s a pernicious bug of some sort unrelated to what I’m doing as ALL I’m changing is one mount from rw to ro and back (and if that doesn’t turn it on and off, and I’m not changing anything else, who knows what’s causing it).

    After a few more “change and reboot” cycles, IF I have this reliably switching on and off, then I’ll try a writeable tmpfs overlay over a ro /usr/lib to see if just making it look writable is enough, or if any read only mount in the stack causes it to barf.

    I know, painful to watch someone thrashing on a debug process. But that is the nature of debugging. You think you have it. You run with it. You find no wagon attached to that handle. You trudge back to the starting point and grab another handle… Eventually you get a handle on IT.

  23. E.M.Smith says:

    Changed /usr/lib back to RO and rebooted. No Problem.

    OK, “it isn’t about me”…

    Nothing changed in the hardware config. Not even a wiggle of a wire. I have the power supply on a “clicker” where I push a button, it powers off. Push again, it powers on, so not even picking up the board or a plug / unplug of power. Not even touching the table to unplug the power brick.

    I’ve tested that I really did get a Read Only mount (so didn’t fumble the change in /etc/fstab)

    At this point I have to lean toward “some kind of pernicious bug in bring-up process”. Sometimes it comes up stable. Sometimes with runaway otg error messages. I’d say it could be PSU related, but the disks are on a powered hub so no load on the Pi PSU there, and it’s a factory standard Pi PSU that I’ve used for many boards for years without issue (and even with a loaded USB on the Pi so stuff taking power from it). Nor is there a low volts indication on the screen.

    I did learn one nice thing. Monitoring the log files I saw this pop up on launch of chromium:

    Aug 24 12:14:09 PiM3Devuan2 chrome[2923]: [2923:2923:0824/121409.012183:
    ERROR:browser_dm_token_storage_linux.cc(101)] 
    Error: /etc/machine-id contains 0 characters (32 were expected).
    

    Devuan, at least, is presently giving the middle finger to the folks wanting a machine ID tracker ;-)

    So now what?

    Well, I’ll just be launching a log file monitor script every time I boot up and have it running in a small terminal window. Watching to see when / if the bug shows up again. I probably ought to do a search for it as a known bug, but have avoided that. It can be as much time sink as help.

    Here’s my monitoring dinky shell script:

    root@PiM3Devuan2:/usr/lib# bcat logmon
    while true
    do
      echo
      ls -l /var/log/kern.log
      ls -l /var/log/messages
      ls -l /var/log/syslog
      df /var
      sleep 30
    done
    

    Then, in another panel, I have “tail -f /var/log/syslog” running. It gives you the last 10 lines of a file, and then any new lines that show up, as they show up.

    They are both happy right now. So unhappy 2 boot-ups ago. Changed mount to RW and rebooted. Happy. Changed it back to RO and rebooted. Happy. So first unhappy not related to change of mount type. Unknown causality. Sigh.

  24. E.M.Smith says:

    Full “protect” mode, Chromium browser, 16 tabs open (and viewed so the bits loaded), 662 MB on swap, and no problems. At this point I suspect the thrash lock before was in part due to whatever was driving the log files to massive writes / growth tying up the disk and making swap activity (away from that part of the disk) difficult.

    So there’s some unknown “bug” in something sporadically causing the otg error storm, and it has nothing to do with the “protect” stuff other than that it screws up system performance…

  25. E.M.Smith says:

    Well, it has been most of the day and into the night now in “protect” mode with about 750 MB on swap and everything is still fine.

    At this point I think the prior failures were when high log activity was happening, but I had no monitor running to see it. Clearly it isn’t anything to do with the squashfs or tmpfs mounts, given how well this run has been.

    So I have some pernicious error causing runaway log activity sporadically, and that’s about it.

    So once again I’m going to try to “close out this issue” and just hunt the cause of the log flood as it arises, and without comment until / unless an interesting cause shows up. So I’m just going to do “the usual” of system software update, check connections, etc. and not worry until it shows up again (but keep the monitors running all the time…)

Anything to say?

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.