f2fs – Flash File System – Is A Disk Hog

There is a file system that is more “Flash Memory Friendly” for use on things like SSDs and uSD cards. On Linux it is named f2fs (Flash-Friendly File System).

I won’t go into how it works other than to say it tries to reduce how often blocks are re-written on flash memory so that it doesn’t wear it out.

I installed it on Devuan (or Debian) with:

apt-get install f2fs-tools

Then I used gparted to create an f2fs partition on the uSD card in the Odroid XU4. (This is running the system I made by grafting a Devuan userland onto an Ubuntu kernel and boot process.)

I had saved the Ubuntu firmware and kernel modules into partitions mounted at:

Filesystem    1K-blocks    Used Available Use% Mounted on
/dev/mmcblk1p3   999320  531548    415344  57% /Ub/firmware
/dev/mmcblk1p5   999320   55932    890960   6% /Ub/modules

That let me copy them back into the Devuan image so that, when booting the Ubuntu kernel from /boot, it would find the expected modules and firmware and all would be good. It worked.

But now I thought “Hey, I don’t need those partitions as a staging / saving area anymore, I can play with them”. So I proceeded to make “big enough” (I thought) f2fs replacements.

Well, I made /f2/modules about 100 MB, almost double the 53 MB used, figuring that would be plenty. It wasn’t. During the copy, I got “out of space” errors.

OK… I deleted the f2fs partitions, remade /f2/modules at about double that size again, and did the copy again:

Filesystem    1K-blocks    Used Available Use% Mounted on
/dev/mmcblk1p5   999320   55932    890960   6% /Ub/modules
/dev/mmcblk1p7   210944  139616     71328  67% /f2/modules

So a measured 54 MB of data that takes 56 MB of ext3 space now occupies 139 MB of flash space. That’s more than double, about 2.5 times.

I’m going to copy /Ub/firmware next and see if it more than doubles too.

Files are stored in “blocks”, and a file of, say, 100 bytes will take one whole block. Originally these were 512 bytes minimum, and up to 4k. Over time, the minimum has moved to 1k, and in some cases 4k bytes. This makes it more efficient to retrieve large files (fewer blocks, seeks, reads) but less efficient at storing a lot of very small files.

It is possible that the f2fs file system just uses a Very Large Block size. Potentially it tries to match it to the Flash Block Size that IIRC is often 16k bytes. That would make sense since only whole blocks are read / written at once; so limiting it to one item of data would reduce “read / modify / write” operations on that block. (IF, for example, 4 Linux 4k blocks mapped onto 1 16k flash block, then change any one of those four and the whole 16k block gets read, modified, re-written).

In that case, really LARGE files will not show this “bloat” issue, but lots of little files will.
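
Rather than guess at the block size, it can be asked for directly. A minimal sketch, assuming GNU coreutils stat is installed; the function name fs_block_size is made up for illustration, and the loop does nothing on a machine that lacks the mount points used in this post:

```shell
#!/bin/sh
# Print the fundamental block size (in bytes) of the file system holding a path.
# Assumes GNU coreutils stat; fs_block_size is a name invented for this sketch.
fs_block_size() {
    stat -f -c '%S' "$1"
}

# Compare the ext3 original and the f2fs copy (mount points from this post).
# Guarded so nothing happens where those directories do not exist.
for dir in /Ub/modules /f2/modules; do
    if [ -d "$dir" ]; then
        printf '%s: %s bytes per block\n' "$dir" "$(fs_block_size "$dir")"
    fi
done
```

If both report the same number, the extra space has to be coming from something other than a larger data block.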

Note that file fragments can also cause some of this, though to a lower degree. Say you have a 12.8 K file and write it to 4 K blocks. 3 x 4 or 12 K of the file will fill blocks, but the 0.8 K will be left. It will be stuck, as a fragment, into a 4 K block, leaving 3.2 K of empty space in that block.

In prior decades folks would specify a ‘fragment’ size smaller than the block size (essentially 2 different block sizes) so that less space was wasted. You could have a 4 K block / 512 Fragment, and the leftover 0.8 K would be put in a 512 byte sized space wasting less. This has become less common as storage has become much larger and cheaper. I suspect this is not being done on Flash and it is aligning all writes to full flash blocks.
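
The rounding in those two paragraphs can be worked through in a few lines of shell. This is only a sketch of the arithmetic, not of how any particular file system allocates; alloc_bytes is a hypothetical helper, and 13107 bytes stands in for the "12.8 K" file:

```shell
#!/bin/sh
# Round a size up to a whole number of allocation units.
# alloc_bytes SIZE UNIT -> bytes actually occupied (illustrative name).
alloc_bytes() {
    size=$1
    unit=$2
    echo $(( (size + unit - 1) / unit * unit ))
}

file=13107     # roughly 12.8 KiB, the example file size used above

# Whole 4 KiB blocks only: the 0.8 K tail takes a full block.
whole=$(alloc_bytes "$file" 4096)
echo "4096-byte blocks only: $whole bytes, $(( whole - file )) wasted"

# 4 KiB blocks plus 512-byte fragments for the tail.
full=$(( file / 4096 * 4096 ))
tail=$(( file - full ))
frag=$(( full + $(alloc_bytes "$tail" 512) ))
echo "4096 / 512 fragments: $frag bytes, $(( frag - file )) wasted"
```

With whole blocks the file occupies 16384 bytes (3277 wasted, the "3.2 K" above); with 512-byte fragments it occupies 13312 bytes (205 wasted).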

The /lib/firmware contents (saved into /Ub/firmware) also grew when copied the same way, but not nearly as much. IIRC it has more files of large size in it:

Filesystem    1K-blocks    Used Available Use% Mounted on
/dev/mmcblk1p3   999320  531548    415344  57% /Ub/firmware
/dev/mmcblk1p8   831488  673620    157868  82% /f2/firmware

So “only” from 531 MB to 673 MB. A “mere” 27% larger.
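
For anyone who wants to check the percentages, here is a small sketch that takes the "Used" column from the two df listings. The helper name growth_pct is hypothetical, and awk is used only because plain shell arithmetic is integer-only:

```shell
#!/bin/sh
# Percentage growth from one "Used" figure to another (1K-blocks from df).
# growth_pct is a hypothetical helper name; awk does the floating point.
growth_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

echo "modules:  +$(growth_pct 55932 139616)%"
echo "firmware: +$(growth_pct 531548 673620)%"
```

That works out to about +149.6% for the modules tree and +26.7% for the firmware tree.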

Which I think confirms the notion that this is an issue of block size / fragment size being very large.

Conclusion

The key takeaway here is that if you are going to move some file system from ext to f2fs, do a trial copy first to see how much space expansion is going to happen, then size your final target file system.

Because “reasonable” guesses of “about the same” are not likely to succeed.

Does the “wasted space” matter? That depends on how big your Flash device is and how much you care about reducing wear on it. It also depends on how many small files your system / directory has in it, since that sets how much is wasted. Measuring is your friend here.
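
As a starting point for that measuring, here is a rough sketch that reports apparent size against allocated size for a tree and counts the files smaller than one 4 KiB block. It assumes GNU du (for --apparent-size); tree_usage and small_files are names invented here, and the 4096-byte threshold is an assumption about the block size:

```shell
#!/bin/sh
# Report apparent size versus allocated size for a directory tree, in KiB.
# Assumes GNU du (--apparent-size); tree_usage is a name invented here.
tree_usage() {
    apparent=$(du -sk --apparent-size "$1" | cut -f1)
    allocated=$(du -sk "$1" | cut -f1)
    echo "$1 apparent=${apparent}K allocated=${allocated}K"
}

# Count the files smaller than one 4 KiB block, the ones that pad the most.
small_files() {
    find "$1" -type f -size -4096c | wc -l
}

# Run against the source and the trial copy (mount points from this post).
for dir in /Ub/modules /f2/modules; do
    if [ -d "$dir" ]; then
        tree_usage "$dir"
        echo "files under 4 KiB: $(small_files "$dir")"
    fi
done
```

A tree with a large small-file count and a wide gap between the two sizes is the kind that needs the most headroom on the target partition.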

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in Tech Bits.

3 Responses to f2fs – Flash File System – Is A Disk Hog

  1. E.M.Smith says:

    Well there’s a clue right there…

    root@XU4uDevuan3:/f2/usrlib# ls -ld /usr/lib /f2/usrlib/
    drwxr-xr-x 84 root root 8192 Mar 14 06:24 /f2/usrlib/
    drwxr-xr-x 84 root root 4096 Mar 14 06:24 /usr/lib
    

    Directories with just a few entries do not fill a whole block, so their listed size is often the same as the block size / fragment size of the file system. /f2/usrlib is an f2fs copy of /usr/lib from an ext3 file system. This shows the expected 4 k block on the ext file system, and suggests that f2fs is using an 8 k block for the directory. If so, pretty much every directory and every file / fragment under 8 k is going to grow by about 4 k of unused space…

    FWIW, /usr/lib was about 1.6 GB and grew to 2 GB in the copy. About a 25 % gain in space used.

  2. E.M.Smith says:

    https://zonedstorage.io/linux/fs/

    says block sizes are 4 k

    Limitations
    f2fs uses 32 bits block number with a block size of 4 KB. This results in a maximum volume size of 16 TB. Any device with a total capacity larger than 16 TB cannot be used with f2fs.
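
The 16 TB figure in that quote is just the block count times the block size. A quick check, assuming a shell with 64-bit arithmetic (variable names are illustrative):

```shell
#!/bin/sh
# 32-bit block numbers times a 4 KiB block size gives the volume ceiling.
# Assumes the shell does 64-bit arithmetic.
blocks=4294967296          # 2^32 addressable blocks
block_size=4096            # bytes per block
max_bytes=$(( blocks * block_size ))
echo "$max_bytes bytes = $(( max_bytes / 1099511627776 )) TiB"
```

That prints 17592186044416 bytes, which is 16 TiB.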

    BUT it looks like there’s a fair amount of other “stuff” going on including duplicate copies of various meta-data blocks so it can just swap which one is being written (so reduce various issues with copy on write and garbage collection and…)

    https://lwn.net/Articles/518988/

    This is talking about the metadata and block tracking parts of the file system, not the data blocks stored. Different storage strategies are used for actual data vs metadata vs file system housekeeping data. LFS is Log-structured File System. FTL is Flash Translation Layer (the stuff in the uSD or SSD that is actually managing the flash memory space internally).

    There are three approaches to management of writes in this area. First, there is a small amount of read-only data (the superblock) which is never written once the filesystem has been created. Second, there are the segment summary blocks which have already been mentioned. These are simply updated in-place. This can lead to uncertainty as to the “correct” contents for the block after a crash, however for segment summaries this is not an actual problem. The information in it is checked for validity before it is used, and if there is any chance that information is missing, it will be recovered from other sources during the recovery process.

    The third approach involves allocating twice as much space as is required so that each block has two different locations it can exist in, a primary and a secondary. Only one of these is “live” at any time and the copy-on-write requirement of an LFS is met by simply writing to the non-live location and updating the record of which is live.
    This approach to metadata is the main impediment to providing snapshots. f2fs does a small amount of journaling of updates to this last group while creating a checkpoint, which might ease the task for the FTL somewhat.

    Looks like with all the cache, journal, and log buffering it can be pretty fast though:
    https://www.linux.org/threads/comparison-of-file-systems-for-an-ssd.28780/

    The ‘time’ command will run another command, in this case, the ‘cp’ command. The file ‘sample.txt’ is being copied from the current folder to the drive labeled ‘SSD’. The time is listed after the command is finished executing. The times, in seconds, given by the command are below for each filesystem checked.

    Filesystem  Time (s)  Throughput
    EXT2        1.662      602 MB/s
    EXT3        1.074      931 MB/s
    EXT4        0.772     1295 MB/s
    F2FS        1.060      943 MB/s
    FAT32       3.085      324 MB/s
    HFS+        0.946     1057 MB/s
    JFS         1.370      730 MB/s
    NTFS        7.637      131 MB/s
    ReiserFS    1.310      763 MB/s
    UDF         2.194      456 MB/s
    XFS         0.4935    2026 MB/s
    

    Not as fast as ext4, but faster than ext2, about the same as ext3, and much faster than FAT32 or NTFS. Even faster than ReiserFS for this sized write (at 5 GB things change).
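
The two number columns in that list are tied together by the size of the test file. The size is not stated in the part quoted here; multiplying each time by its rate gives very close to 1000 MB, so the sketch below assumes that figure. The helper name mb_per_s is hypothetical:

```shell
#!/bin/sh
# Throughput in MB/s from a file size (MB) and a copy time (seconds).
# The 1000 MB size is an assumption inferred from the listed figures.
mb_per_s() {
    awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", mb / secs }'
}

echo "EXT4: $(mb_per_s 1000 0.772) MB/s"
echo "F2FS: $(mb_per_s 1000 1.060) MB/s"
```

Under that assumption the results match the listed 1295 MB/s and 943 MB/s.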

    So, OK, saves wear on the uSD card and it’s fast. But uses about 25% more space on bigger file systems and may double space on lots of little files in a small file system. Got it.

    Overall conclusion is that I’m going to start using it for medium-active file systems of some size. I’ll still leave “mostly static” stuff on other file systems (for now…) and very active stuff (like home directory, swap and /var ) will go on a hard disk partition if possible. Extremely active stuff to zram if at all possible (swap, /tmp, log files, …)

    FWIW: 2 x 32 MB uSD cards that have been in frequent use for about 5 years have started to have occasional odd behaviours. (One has had a failed file system / loss of super blocks / bad fsck). So probably reaching EOL (or my card reader / USB adapter is about to die… I’ve had 2 SD/USB adapters bite the dust already in that time period… ). So I’m looking to reduce wear on my inventory of uSD cards until I get around to buying more ;-)

    I’ve made systems before where only /boot was on the uSD and everything else was disk partitions, but that starts to be a pain with a dozen SBCs as you would need a dozen HDDs and maybe even a few USB Hubs to provide power to the disks… (R.Pi can’t power most HDD directly in my experience, but using a powered USB Hub lets them work. Odroid and Rock products seem to have the power to not need a hub. That’s what the use of uUSB / USBc for power does… limits your power. Round Barrel Connectors let you pump through the juice…)

    So for many of my systems, it all still lives on a uSD card… Only the 3 main Daily Driver / Infrastructure SBCs get powered USB hubs.

    Well, back to the experimenting… Next step is to mount /usr/lib from f2fs and see if things still work, or not. Basically find out how much of the OS filesystem name space can run from f2fs. As I understand it, booting from f2fs requires special adaptations / care in construction… I’m unlikely to get there for a few months as I’m happy to let /boot be on FAT32 or ext.

  3. E.M.Smith says:

    The f2fs file system is a bit odd sometimes. The Odroid N2 doesn’t want to mount it even though gparted on it will format a file system to it just fine. f2fs-tools is there and installed. Maybe it’s a mount option thing… Using “auto” in fstab on the Odroid XU4 fails to mount at boot, but take it out and “defaults” works great.

    I’ve copied /var /usr and /lib onto f2fs partitions on the Odroid XU4 and it is running fine. It seems a touch “snappier” on opening programs and things. Other than that, not noticed any difference (yet).

    BTW, you can’t change an f2fs file system label with gparted after you made it, need to delete and remake it. Odd as it works on other file system types. Oh Well…

    I’ve just mounted /var /usr and /lib over the existing stuff on / (i.e. left the regular version in place under the mount so I can just NOT mount the f2fs versions and it’s all back to normal) for testing. It’s a neat little trick ;-)

    ems@XU4uDevuan3:~$ df
    Filesystem                  1K-blocks    Used Available Use% Mounted on
    /dev/mmcblk1p2               12192904 7357696   4192752  64% /
    /dev/mmcblk1p8                6187008 4076460   2110548  66% /usr
    /dev/mmcblk1p1                 130798   16742    114056  13% /boot
    tmpfs                         2097152   10776   2086376   1% /tmp
    /dev/mmcblk1p3                4192256  995284   3196972  24% /lib
    /dev/mmcblk1p9                4188160 2326472   1861688  56% /var
    /dev/sda7                     4062912 1323624   2529576  35% /home/ems
    /dev/sda6                     4062912 2019536   1833664  53% /media/ems/XU4_var
    

    In the last two lines, you can see my home directory is on a real HDD over USB and, just below it, the USB disk partition where I HAD been running /var so that all that “var”iable stuff kept its I/O wear off the uSD chip. But you do get rotational delay and occasional start up lag…

    /boot is a FAT partition (as the Ubuntu base from which I made this Devuan system does that)
    / is an ext4 partition again as that’s what Ubuntu base / boot expected.

    The rest of the uSD file systems are now f2fs. They don’t amount to much:

    root@XU4uDevuan3:/# du -ms * | sort -rn
    
    3003	usr
    1965	media
    1938	var
    1280	home
    656	lib
    17	boot
    10	sbin
    10	bin
    8	etc
    7	run
    

    That “media” entry is the old /var on USB HDD so can be ignored. usr and var are on f2fs. The “home” is my home directory on ext3 USB HDD so not relevant. Then lib is on f2fs, boot has to stay FAT32.

    So the stuff that maybe could be moved to f2fs is sbin, bin, etc (a total of 28 MB of stuff that ought not be changing very much anyway…) and not much else. (Everything else being either like /run and /proc that really ought to stay with / for boot time, or less than one MB so what’s the point?)

    Note that on the larger file systems total disk used starts to come back into line. /var is now 2.3 GB and on ext3 HDD was 2.0 GB. /usr also only went up a small amount, and /lib actually shrank just a tiny bit, IIRC.

    ems@XU4uDevuan3:~$ file -s /dev/mmcblk1p3
    /dev/mmcblk1p3: F2FS filesystem, UUID=69480876-170e-4033-90f7-8698e1438354, volume name "f2fs_lib"
    ems@XU4uDevuan3:~$ file -s /dev/mmcblk1p8
    /dev/mmcblk1p8: F2FS filesystem, UUID=508add0b-e5cf-4206-9834-576108284fe2, volume name "f2fs_u"
    ems@XU4uDevuan3:~$ file -s /dev/mmcblk1p9
    /dev/mmcblk1p9: F2FS filesystem, UUID=e7d3b055-199a-4413-8679-135f806b78c6, volume name "f2fs_var"
    ems@XU4uDevuan3:~$ 
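
For reference, the fstab entries behind a setup like this could look roughly like what the sketch below prints. It only prints lines and does not touch /etc/fstab. The labels come from the file -s output above and "defaults" is the option that worked on the XU4; mounting by LABEL= and the trailing "0 0" (no dump, no boot-time fsck) are assumptions, and fstab_line is a made-up helper name:

```shell
#!/bin/sh
# Print an fstab line that mounts a labelled f2fs partition on a directory.
# Mounting by LABEL= and the "0 0" dump/fsck fields are assumptions here.
fstab_line() {
    label=$1
    dir=$2
    echo "LABEL=$label $dir f2fs defaults 0 0"
}

fstab_line f2fs_lib /lib
fstab_line f2fs_u   /usr
fstab_line f2fs_var /var
```

Commenting out one of those lines and rebooting would leave the original directory under the mount point in use again.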
    

    I’m going to leave it running like this for a while and decide if I like it, or not. Eventually I MIGHT delete the old /usr /var /lib that’s lurking under the mount points… OTOH, I’ve got plenty of free space and they will never change / wear the chip under another mounted file system… And it’s kind of like a backup copy ;-)

    Anyway…that was the exploration for this afternoon / evening. Get the f2fs file system installed, and most of the OS moved onto it to do some evaluations.
