LVM Bits – Scraping Saga Continues

From the “Well that was fun, NOT!” department…

I’ve had a ‘site scrape’ of GHCN, NOAA/NCDC, and GIStemp running, to various degrees, for ‘a while’. The last increment downloaded a huge chunk of ‘superghcnd’ data and essentially ‘filled my disk’. So I added some dribs and drabs of other disks as a stop gap: a physically small 2 TB WD and a 1.5 TB Seagate that is in that “awkward stage” between full TB increments. That, then, filled to about 6 TB out of a 6.82 TB formatted capacity.

Thanks to generous donations (with a special h/t to P.G. Sharrow for a very generous donation) I’ve added another 4 TB (gross size, 3.64 TB formatted) disk.

So I decided to remove the 1.5 TB disk (1.36 TB formatted, and likely only about 900 GB of it actually in use at the moment, as it was added last so all the free space ought to be on it).

No Problem, I think. It is all built on LVM, and that’s the whole point of LVM: letting you add and remove physical volumes without disturbing the logical structure.

But why do this at all?

Well, the Orange Pi One I’ve been using as a scraping engine has one USB port, so you plug in a hub to get more. I have 2 hubs. A 4 port and a 7 port. The 7 port is used on my primary machine as it typically runs with mouse, keyboard, and a few disks. So with 3 disks already on the Orange Pi, adding a disk fills all the ports. This means no more disks AND that it can’t be run ‘headful’ with keyboard, mouse, and monitor. It’s better to keep a port or two open so at least you can boot with keyboard and monitor for command line recovery purposes.

Well, I plugged the new disk into the 4 port hub and added that disk. I did a shutdown / reboot prior to doing the data move, just to assure all things were fresh at boot time. It would not let me open a new SSH session after the reboot / power up. Well, something is not right… Skipping a lot, it seems I had left a ‘mount this disk’ entry in /etc/fstab from the initial set-up months ago, and at boot time, no longer having that disk, the system failed to finish the boot. Fixed that, and rebooted. It came up fine.
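
For anyone who trips over the same thing: marking non-essential data disks as ‘nofail’ in /etc/fstab keeps a missing disk from hanging the boot. A minimal sketch (the UUID and mount point here are made up for the example):

    # /etc/fstab -- example entry for a data disk that may not always be plugged in
    # 'nofail' lets the boot carry on if the device is absent at boot time
    UUID=1234-ABCD-EXAMPLE  /mnt/scrape  ext4  defaults,nofail  0  2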

By then I’d also moved the mouse, keyboard, monitor, large USB hub etc. over to the Orange Pi One so as to make maintenance quicker and avoid those “ssh denied” problems. Added the disk to the LVM and all is good, right? Instead of using pvcreate as in the docs, I used gparted to format the disk as mostly an LVM physical volume with a 1 GB Linux swap, then did a ‘vgextend’. It shows about 10 TB of disk in the volume group, ready to migrate off the old disk.
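
For reference, the ‘by the docs’ way to add the disk looks roughly like this (the partition name is illustrative; the volume group name is the one that shows up in the comments below):

    pvcreate /dev/sde1                    # tag the new partition as an LVM physical volume
    vgextend TemperatureData /dev/sde1    # add it to the existing volume group
    pvdisplay                             # sanity check: the new PV shows a pile of free extents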

Well, at that point one uses pvmove in a nearly trivial command. My old disk partition was named /dev/sdd1 so the command is “pvmove /dev/sdd1”.

I got the error message that the kernel module dm-mirror was not installed. What the?…

Seems that on Armbian, when you install LVM, it does not install all the dependencies. In particular the kernel module used by pvmove. That’s the kind of stuff you find in “young ports”. Over time, as folks run into those bugs, report them, and future builds are fixed, they “go away”. That doesn’t help me, now, though.
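
If you want to check up front whether a given kernel has the bits pvmove needs, something along these lines will tell you (module and target names per the error message):

    lsmod | grep dm_mirror       # is the device-mapper mirror module already loaded?
    modprobe dm_mirror           # try to load it; this errors out if it was never built
    dmsetup targets              # 'mirror' needs to show up in the list of targets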

OK, I can either get into the land of rebuilding kernels and modules and / or system updating and / or all the attendant QA and all, or I can “move on”. I chose to “move on”: simply export the whole LVM volume group, then import it on the Raspberry Pi running Devuan. Debian is a MUCH more robust and older project, so that kind of stuff is pretty much stamped out. Devuan is a special purpose build on top of Debian, so it benefits from all that.
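
The export / import dance itself is pretty short. A sketch, using the volume group, logical volume, and mount point names that show up later in this post:

    # On the Orange Pi: quiesce and export the volume group
    umount /LVM
    vgchange -an TemperatureData       # deactivate the logical volume(s)
    vgexport TemperatureData           # mark the VG as exported
    # ...move the disks / hub over to the Raspberry Pi, then:
    pvscan                             # should report the PVs as being in an exported VG
    vgimport TemperatureData
    vgchange -ay TemperatureData       # activate
    mount /dev/TemperatureData/NoaaCdiacData /LVM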

The move was flawless, and the file system came up nicely. Now the Orange Pi is free for other tasks, or I can later move the LVM back if desired. Time for that pvmove to empty the odd disk (and make room for larger disks later…).

I launched pvmove and it complained that there were no extents to allocate. What the? I just added 4 TB of extents! But pvdisplay said they were all spoken for. After some degree of head scratching, I found that I needed to do an lvreduce command to release some of the extents. Since I’d not expanded the actual file system built on top of the logical volume, I didn’t need to shrink that first; it was already at 6.82 TB. So I did an lvreduce to about 7.5 TB (leaving lots of room beyond the end of the file system, while also having plenty of room to move the 1.5 TB disk). Then the pvmove worked. Or rather “is working”.
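
In command terms, the fix boiled down to something like this (the exact figures for my volumes are in the comments below, and whether that ‘T’ means TB or TiB is its own little adventure, also covered there):

    # The logical volume had been handed all the extents, so pvmove had none free to work with.
    # The ext4 file system inside is still only ~6.82 TB, so shrinking the LV to ~7.5 TB
    # leaves the file system untouched while freeing extents for the move.
    lvreduce -L 7.5T /dev/TemperatureData/NoaaCdiacData
    pvmove /dev/sdd1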

The docs warn that “pvmove is slow”. That is a gross understatement. It Oh So Slowly makes a copy, the whole time not just moving data blocks but file system structures as well, then eventually releases the extents on the old disk when the new copy is ready to go. I launched it last night at about 10 PM, something like that. Today, after lunch, I’m at 68% moved… I figure it will be done at about the 24 hour mark.
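
If you would rather not leave a terminal tied up for a day, pvmove can be backgrounded and polled; a sketch (the flags are standard LVM2, the interval is just a choice):

    pvmove -b -i 600 /dev/sdd1    # run in the background, report progress every 10 minutes
    lvs -a -o+devices             # check later: the Cpy%Sync column shows how far it has gotten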

Then I can do a vgreduce and eventually physically remove the disk from the system. All told, it will be about 48 hours from start to finish to add the 4 TB disk and remove the 1.5 TB disk. “someday” I get to do this all over again for the 2 TB disk when it gets replaced by something bigger… or maybe I’ll just go buy another 7 port USB hub and forget about it…
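
That last stretch is short, at least in command count. A sketch of it:

    vgreduce TemperatureData /dev/sdd1    # drop the now-empty disk from the volume group
    pvremove /dev/sdd1                    # optional: wipe the LVM label if it is never coming back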

In Conclusion

So that’s the kind of adventure you end up in when using data center production techniques like LVM on a brand new $16 board with an operating system (Armbian) aimed mostly at small internet appliances.

For now, I’m planning to leave the Orange Pi One powered off. Eventually I’ll put a newer copy of Armbian on it (i.e. do an update cycle) and make sure it has dm-mirror installed. Until then, running the LVM stack on the Raspberry Pi is fine. When I get it slammed with a model run, I’ll want to move that scrape process somewhere else, but for now it’s fine. This also will let me explore some kind of case for the Orange Pi (that is currently loose on the desktop as it doesn’t fit Pi cases) and make a more lead-dressed and clean system out of the present “pile of parts” on the desktop. With luck, a re-scrape of CDIAC will not take more than the 2 x 4 TB big disks and I can recover that 2 TB Western Digital for other things. IFF it ends up spilling out to the whole stack, well, I’ll add some big disk and ‘move on’.

What I’d like to do is get the CDIAC files on the 2 TB disk and leave NOAA on the LVM group as it is big enough to need it. That way they can be moved between systems separately and without all the fuss involved in moving an LVM group. But we’ll see as the process sorts out and the final scrape is done. I think I’m not going to re-scrape NOAA, just because I have a clean scrape of it and the whole point is to avoid loss if the site gets ‘reduced’, so I do NOT want to stay in sync with it. But that is for the future… in a day or two when moving to the new disk finally, oh so slowly, completes…


24 Responses to LVM Bits – Scraping Saga Continues

  1. p.g.sharrow says:

    @EMSmith I apologize for causing you so much trouble, LoL. I am glad to see you assigned the task to a Pi.
    Query, just how much IO can you cram through a USB hub? …pg

  2. E.M.Smith says:

    @P.G.:

    The Pi limits at USB 2.0 and the hub is a 3.0, so the hub isn’t and cannot be the limit…

    https://en.wikipedia.org/wiki/USB_3.0

    USB 3.0 is the third major version of the Universal Serial Bus (USB) standard for interfacing computers and electronic devices. Among other improvements, USB 3.0 adds the new transfer rate referred to as SuperSpeed USB (SS) that can transfer data at up to 5 Gbit/s (625 MB/s), which is about ten times as fast as the USB 2.0 standard. Manufacturers are recommended to distinguish USB 3.0 connectors from their USB 2.0 counterparts by blue color-coding of the Standard-A receptacles and plugs, and by the initials SS.

    USB 3.1, released in July 2013, is the successor standard that replaces the USB 3.0 standard. USB 3.1 preserves the existing SuperSpeed USB transfer rate, now called USB 3.1 Gen 1, while defining a new transfer rate called SuperSpeed USB 10 Gbps, also called USB 3.1 Gen 2, which can transfer data at up to 10 Gbit/s (1.25 GB/s, twice the rate of USB 3.0), bringing its theoretical maximum speed on par with the first version of the Thunderbolt interface.

    So the hub just isn’t a limit until I buy some kind of USB 3.0 or 3.1 based board.

    All the disks I have are 3.0 as is the hub, and whenever I have an actual need for speed, I’ll get some kind of 3.0 based board ( I think the Cubie Truck is one?). For now, I’m fine with “launch it and go to bed” as nothing I’m doing is really time critical and I can multitask with other systems for the duration.

    5 Gbit ( 625 MByte) per second would move a 600 Gbyte disk in about 1000 seconds or about 20 minutes… give or take some I/O overlap and some buffer issues…
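
    Putting rough numbers on it (theoretical bus maximums only; protocol overhead and the disks themselves will eat into both):

    echo "USB 3.0 @ 625 MB/s: $(( 600000 / 625 )) seconds"   # 960 s, about 16 minutes
    echo "USB 2.0 @  60 MB/s: $(( 600000 / 60 )) seconds"    # 10,000 s, call it 3 hours
    # so on a USB 2.0 board like the Pi, the port, not the hub, sets the ceiling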

  3. gallopingcamel says:

    In the 1970s I worked at ICL (International Computers Limited). Some of our programs took so long to run that there was a high probability that the program would crash before completing its task.

    Forty years later computers are six orders of magnitude faster but it continues to amaze me how software can be so inefficient that processes such as the ones you are using can take days to complete.

  4. poitsplace says:

    For some reason today it struck me as amusing…all these recent “save the data” displays by scientists to “save” the data from Trump. But skeptics have been saving the data for years, having seen it get “adjusted” out of recognition or conveniently “lost” by those same scientists for so many years.

  5. E.M.Smith says:

    @G.C.:

    Well, I am slinging TeraBytes… and doing it with a $35 computer…

    @Poitsplace:

    Yeah… shoe, meet other foot….

    I’ve been “saving the data” since GHCN V1… now they are worried the v3.x+ modified adjusted homogenized crap will be lost. Well, I want it too, but more for evidence than for touting…

  6. E.M.Smith says:

    The saga continues…

    So to remove the 2 TB disk (1.82 TB usable) involves a few fancier moves. Seems that the file system, plus the excess allocated in the logical volume, was just a wee bit bigger than 2 x 3.64 TB, so I can’t just “pvmove” off of it onto the added disk. I must shrink the file system first, then shrink the logical volume, then shrink the volume group.

    This, then, runs into the thing nobody likes to talk about: There is no standard way to measure a megabyte.

    Now if you shrink things too much, since these things are in layers, you can cut off the toes of your data and screw up your file system. So first you must shrink your file system enough to fit on those 2 x 3.64 TB disks. (Then you can shrink the logical volume and then you can shrink the volume group and then you can remove the physical volume… after you move and consolidate the actual data…)

    BUT: How big is what and in which units? Is a Megabyte 1000 Kilobytes? And just how big is a Kilobyte?

    Due to folks basically being lazy and liking base 10, a kilobyte is sometimes 1000 bytes and often 1024 bytes. Lately, in an attempt to “fix that”, new terms have been created.
    https://en.wikipedia.org/wiki/Kibibyte
    https://en.wikipedia.org/wiki/Mebibyte

    So ubergeeks are now talking in kibiBytes and MebiBytes … yet loads and loads of software, boxes, and people are not. “Good luck with that”…

    So: 1000 x 1000 bytes is a MegaByte
    and 1024 x 1024 bytes is a MebiByte.

    So what’s 1000 x 1024? (A kilo kibiByte)? Um…

    This little trap is lying all over the place for you in a disk shrinking operation. It is complicated further by there being file system blocks that don’t show up as used in a ‘df’ report of how much disk is used… (things that make the actual file structure work).
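
    One quick way to see how far apart the two conventions drift at these sizes (numfmt is in GNU coreutils; --to=si counts in powers of 1000, --to=iec in powers of 1024):

    numfmt --to=si  6000000000000    # prints about 6.0T -- "6 terabytes" counted in base 10
    numfmt --to=iec 6000000000000    # prints about 5.5T -- the same bytes counted in base 2 (TiB)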

    So when I do a df, I get:

    /dev/mapper/TemperatureData-NoaaCdiacData 
    7207448472 5895069124  982746232  86% /LVM
    

    so you might think it was 5.89 TB (or is that TiB?) used, in which case a 6 TB (that’s 6,000,000 MB… or is that MiB?) file system would be fine.

    Well, I was going to shrink it, but decided to do a ‘dry run’ first. The command used (that itself calls ‘resize2fs’) is ‘fsadm’ and has a convenient ‘just pretend’ flag of “-n”. The “-v” means ‘be verbose in messages’.

    root@Headend:/# fsadm -n -v resize /dev/TemperatureData/NoaaCdiacData 6000000M
    fsadm: "ext4" filesystem found on "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Device "/dev/mapper/TemperatureData-NoaaCdiacData" size is 8375186227200 bytes
    fsadm: Parsing tune2fs -l "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Resizing filesystem on device "/dev/mapper/TemperatureData-NoaaCdiacData" to 6291456000000 bytes (1830620160 -> 1536000000 blocks of 4096 bytes)
    fsadm: Dry execution resize2fs /dev/mapper/TemperatureData-NoaaCdiacData 1536000000
    

    Hmmmm…. so my 7.2 TB file system is 8.3 TB? Or is that 7.2 TiB is 8.3 TB? See where this is going? (The 8.3 includes blocks allocated in the logical volume but not in the file system)

    AND my resize to 6,000,000M is really a resize to 6,291,456,000,000 bytes, so fsadm is treating that ‘M’ as MiB (1024 x 1024 bytes), not a base 10 MB. Well, OK, that OUGHT to be big enough, but I’m a bit worried so decide maybe to make it bigger? What the heck, as long as I’m under 7.28 TB (or TiB?) or so I’ll be ok…

    root@Headend:/# fsadm -v resize /dev/TemperatureData/NoaaCdiacData 6400000M
    fsadm: "ext4" filesystem found on "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Device "/dev/mapper/TemperatureData-NoaaCdiacData" size is 8375186227200 bytes
    fsadm: Parsing tune2fs -l "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Resizing filesystem on device "/dev/mapper/TemperatureData-NoaaCdiacData" to 6710886400000 bytes (1830620160 -> 1638400000 blocks of 4096 bytes)
    fsadm: Executing resize2fs /dev/mapper/TemperatureData-NoaaCdiacData 1638400000
    resize2fs 1.42.12 (29-Aug-2014)
    Please run 'e2fsck -f /dev/mapper/TemperatureData-NoaaCdiacData' first.
    
    fsadm: Resize ext4 failed
    

    Humphf. Even though cleanly unmounted it forces me to do an fsck (that itself calls e2fsck for ext2, ext3, and ext4 type file systems) and “check” the disk. Doing this is in fact a very good idea, and it does report some size information, so OK… I run e2fsck directly:

    root@Headend:/WD3/ext/chiefio# e2fsck -f /dev/mapper/TemperatureData-NoaaCdiacData 
    e2fsck 1.42.12 (29-Aug-2014)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    /lost+found not found.  Create? yes
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    
    /dev/mapper/TemperatureData-NoaaCdiacData: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/mapper/TemperatureData-NoaaCdiacData: 1466535/457662464 files (0.2% non-contiguous), 1502525324/1830620160 blocks
    

    Lost & Found is where files that are found but disconnected from the directory structure get placed so you can decide where to put them back. It is typically missing in ext4 file systems like this one as the journalling makes it unlikely to be needed. OK, I let it be created.

    Now, down at the bottom, notice it gives you real information? It is reporting “blocks” that are 4096 bytes each, which is what much of the rest of LVM uses. I’ve used 1,502,525,324 actual 4 KiB blocks out of 1,830,620,160 in the file system. That first number is 6,010,101,296 of the 1 KiB blocks *Nix traditionally reports (that ‘1 k’ block being 1024 bytes), which works out to roughly 6.15 TB (base 10) or 5.6 TiB (base 2) of actual data. So the 6,400,000M I asked fsadm for (about 6.7 TB) ought to cover it… assuming its ‘M’ really is MiB… maybe.
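
    If you want to check that arithmetic, a quick shell version of the conversions (the block count is the one e2fsck reported above):

    USED_BLOCKS=1502525324                                           # 4 KiB blocks in use, per e2fsck
    echo "bytes used  : $(( USED_BLOCKS * 4096 ))"                   # 6,154,343,727,104
    echo "1 KiB blocks: $(( USED_BLOCKS * 4 ))"                      # the 6,010,101,296 figure
    echo "MiB used    : $(( USED_BLOCKS * 4096 / 1048576 ))"         # what fsadm's 'M' appears to count
    echo "GiB used    : $(( USED_BLOCKS * 4096 / 1073741824 ))"      # 5,731 and change
    echo "TB used     : $(( USED_BLOCKS * 4096 / 1000000000000 ))"   # base 10: about 6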

    So I figure the file system pad I added, just in case, looks like enough … re-launch the command and several hours later:

    fsadm -v resize /dev/TemperatureData/NoaaCdiacData 6400000M
    fsadm: "ext4" filesystem found on "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Device "/dev/mapper/TemperatureData-NoaaCdiacData" size is 8375186227200 bytes
    fsadm: Parsing tune2fs -l "/dev/mapper/TemperatureData-NoaaCdiacData"
    fsadm: Resizing filesystem on device "/dev/mapper/TemperatureData-NoaaCdiacData" to 6710886400000 bytes (1830620160 -> 1638400000 blocks of 4096 bytes)
    fsadm: Executing resize2fs /dev/mapper/TemperatureData-NoaaCdiacData 1638400000
    resize2fs 1.42.12 (29-Aug-2014)
    Resizing the filesystem on /dev/mapper/TemperatureData-NoaaCdiacData to 1638400000 (4k) blocks.
    The filesystem on /dev/mapper/TemperatureData-NoaaCdiacData is now 1638400000 (4k) blocks long.
    

    Oh Joy. 1,638,400,000 is bigger than 1,502,525,324 so I’ve not trimmed the toes off my file system. Oh Joy.

    root@Headend:/WD3/ext/chiefio# df /LVM
    Filesystem                                 1K-blocks       Used Available Use% Mounted on
    /dev/mapper/TemperatureData-NoaaCdiacData 6450634432 5895073176 260538808  96% /LVM
    

    Oh Joy, I got it done with only about 260 GB (or is that GiB? or 260,538,808 KiB?) of free space in excess. (After shrinking the logical volume and volume group I can grow it back out to 100% of the real space without so much angst…)

    Now I’m going to take a tea break and contemplate the next step: shrinking the Logical Volume (that includes the unallocated space beyond the file system space) and the Volume Group (that includes all the stuff on all three disks, excluding any non-committed-to-the-LVM bits like swap). It’s a 3 Layer Cake and I’ve just shrunk the top layer…
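
    For reference, the whole three layer shrink, top down, in command form (the file system step is the one above; the rest get done in the follow-ups below):

    # 1. shrink the file system (fsadm wraps resize2fs for ext4); run e2fsck -f first if asked
    fsadm -v resize /dev/TemperatureData/NoaaCdiacData 6400000M
    # 2. shrink the logical volume -- it must stay at least as big as the file system
    lvreduce -l -100000 /dev/TemperatureData/NoaaCdiacData
    # 3. move the extents off the disk being retired (the slow part)
    pvmove /dev/sdd1
    # 4. drop the now-empty physical volume from the volume group
    vgreduce TemperatureData /dev/sdd1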

    At this point, my main advice is: If you are thinking “I’ll just toss these old cruddy spare disks into the LVM for now and buy a new good one to replace them later”, just go buy the damn disk now.

    So easy to add, so painful to remove…

  7. Larry Ledwick says:

    Been fighting with that sort of thing at work. We try to select files to back up which will fit on the tapes and almost fill them, so we monitor file size. We have one monitoring application that reports file size but its numbers don’t match the linux reported file size exactly. Close but not quite right.

    I had to reverse engineer the numbers to figure out what it thought a megabyte was. We ended up doing our own calculation based on df -k numbers so we knew exactly what we were dealing with, to avoid having a backup demand a 4th tape for a data set that should fit on 3 tapes, write to it for only 30 seconds before finishing, and basically waste an entire LTO-3 tape on a few kilobytes of data.

  8. E.M.Smith says:

    OK, after a certain amount of nail biting and checking things 4 times, I went ahead and reduced the logical volume size. In theory, I could have just done the pvmove without this step, as there were enough extents between the free ones and those allocated but not yet used by the file system, but I wanted to see a big enough chunk in ‘free’. I had about 1.9 M extents allocated and 1.6 M used by the file system, so taking 100,000 and putting them on the free list ought to be fine by a factor of 3…

    root@Headend:/WD3/ext/chiefio# lvdisplay
      --- Logical volume ---
      LV Path                /dev/TemperatureData/NoaaCdiacData
      LV Name                NoaaCdiacData
      VG Name                TemperatureData
      LV UUID                ynlyz4-9FN3-4mGP-4jiv-T5Mc-YmBf-OD2P1g
      LV Write Access        read/write
      LV Creation host, time orangepione, 2017-02-10 19:35:33 +0000
      LV Status              available
      # open                 1
      LV Size                7.62 TiB
      Current LE             1996800
      Segments               4
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           253:0
    

    So I launched the command:

    lvreduce -l -100000 /dev/TemperatureData/NoaaCdiacData 
      WARNING: Reducing active logical volume to 7.24 TiB
      THIS MAY DESTROY YOUR DATA (filesystem etc.)
    Do you really want to reduce NoaaCdiacData? [y/n]: y
      Size of logical volume TemperatureData/NoaaCdiacData changed from 7.62 TiB (1996800 extents) to 7.24 TiB (1896800 extents).
      Logical volume NoaaCdiacData successfully resized
    

    2 x 3.64 TiB ought to be 7.28 TiB of space, so a 7.24 TiB logical volume ought to fit fine.

    root@Headend:/WD3/ext/chiefio# lvdisplay
      --- Logical volume ---
      LV Path                /dev/TemperatureData/NoaaCdiacData
      LV Name                NoaaCdiacData
      VG Name                TemperatureData
      LV UUID                ynlyz4-9FN3-4mGP-4jiv-T5Mc-YmBf-OD2P1g
      LV Write Access        read/write
      LV Creation host, time orangepione, 2017-02-10 19:35:33 +0000
      LV Status              available
      # open                 0
      LV Size                7.24 TiB
      Current LE             1896800
      Segments               4
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           253:0
    

    Now how many extents (at 4 MiB each) show as free?

    root@Headend:/WD3/ext/chiefio# pvdisplay
      --- Physical volume ---
      PV Name               /dev/sdf1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               0
      Allocated PE          953605
      PV UUID               1TbQEs-kh6h-0NGd-1cnn-rw95-M2iy-7tnoW8
       
      --- Physical volume ---
      PV Name               /dev/sdd1
      VG Name               TemperatureData
      PV Size               1.82 TiB / not usable 4.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              476667
      Free PE               0
      Allocated PE          476667
      PV UUID               1rV3Dc-xTSU-3g1d-9D6k-Q1CR-zI7U-EBPuQh
       
      --- Physical volume ---
      PV Name               /dev/sde1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes 
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               487077
      Allocated PE          466528
      PV UUID               Sf3a98-80vp-6eFD-gNOc-ssUX-UX10-KJKi4l
    

    So even if the move can’t fit inside the roughly 1 TB of ‘allocated but not yet assigned to the file system’ extents, it WILL fit inside the 487077 free extents, as the smaller disk only has 476667 extents in total.

    Now, in theory, a pvmove ought to do it (in about 30 hours…)

    So I’m going to launch that and check back here in (hopefully) a day or so when something interesting happens.

  9. jim2 says:

    Dang. Too bad the data isn’t available over the onion network. You could have downloaded it all again by now.

  10. E.M.Smith says:

    A nice write up of LVM for beginners:

    https://www.howtoforge.com/linux_lvm

  11. gallopingcamel says:

    In 1993 I was building the DFELL (Duke University Free Electron Laser). This laser uses relativistic electrons as the working fluid, much as Newcomen used steam.

    The electrons circulated in a synchrotron with 104 magnets each weighing >1,000 pounds. In order to “tweak” the design of these magnets a FEA (Finite Element Analysis) program was needed but there was nothing that would complete a magnet analysis in less than a day.

    Then the BINP (Budker Institute of Nuclear Physics) in Novosibirsk, Russia gave me a copy of a 6,000 point FEA program which was so efficient that magnet analysis runs were completed in minutes on a “486” personal computer.

    This Russian FEA program enabled me to model the surface temperature of the moon with an RMS error of <0.3 Kelvin. I was able to show that the temperature of an airless earth is not 255 Kelvin as "Consensus" climate scientists claim.
    https://tallbloke.wordpress.com/2014/04/18/a-new-lunar-thermal-model-based-on-finite-element-analysis-of-regolith-physical-properties/

    My point is that if you need efficient software you might need friends in Novosibirsk.

  12. E.M.Smith says:

    @G.C.:

    It isn’t the software that’s slow. It is the block by block: Pick up block, make journal entry, write block, verify block, update journal entry, make entry for old block in journal, erase block, update journal entry, find next block, make journal entry, pick up block…

    There’s a whole lotta seekin’ goin’ on…

    Journaled file systems are nice in that they are less prone to failure, but their efficiency drops rather a lot since the journal is rarely right next to the sector where you are writing the next block of your TB file…

    Now you could make that more efficient by batching up a bunch of them at once, but then you are less secure, and if there is ANYTHING you want in disk maintenance programs that can run while your file system is live and in use it is robustness to failure…

    Basically, what I’m saying is that this is a design choice for ultra-careful ultra-paranoid ultra-reliable disk maintenance you can do live instead of “done quick and risky”… and I’m glad it is this way.

    Now, per Russians: Yes, they write very direct and efficient code. Yes, they are very good programmers. Same thing can be said of the fighter jet designs and their rockets. MANY contribute to Linux, so it is quite possible that LVM has some Russians on their team. Linux is Global. Linus is from Finland, IIRC. Much of Libre Office (Open Office) came out of South America? (after Sun made their more closed option). As did Mate IIRC. Encryption was almost 100% overseas from the USA back when our government was daft enough to make export a crime. Can’t export from the USA? OK, develop outside the USA and maybe let them import… Any given part of Linux may have a dozen folks working on it, most of them in different countries… (Yes, a lot comes from some places like Red Hat R&D Labs too..)

    I don’t think I mentioned it in the first grousing about how long this runs, but it is claimed by the manual that you can do all this disk adding, file system resizing, logical volume shrinking etc. etc. live… When doing things that difficult, in so tricky a way that the file system can be active while you are doing it, well, slower is better ;-)

    BTW, I’m typing this in a browser on the system that’s busy doing a pvmove of the LVM files off one disk onto another…. Just sayin’… it isn’t like it is bothering my work flow…

  13. gallopingcamel says:

    Chiefio,
    Thanks for your comments on Russian programming expertise. As a hardware person I am clueless about what makes software fast or slow. It annoys me that my original IBM PC in 1984 could do things my modern computer can’t. IMHO modern computers are more about glitz than performance.

    In 1984 my PC had one 360k floppy with the program while the other 360k floppy had the data. It did not take long to turn out 600 personalized job applications and addressed envelopes using “Mailmerge”………….something I can’t do today with a PC that is thousands of times more powerful.

    While anecdotal evidence is of limited value it does tend to stick in one’s memory. The Russian engineer who gave me a copy of what is now “Quickfield” set up a simple problem on my IBM 486 computer and then hit the “Enter” button while remarking that it would take at least 15 minutes to complete the run. We got the command prompt back in five seconds and the engineer apologized because he thought the program had crashed.

    In fact the FEA program had done its job about 100 times faster than in Novosibirsk where the fastest PC was a “286”. Back then it would probably have been a crime to sell a 486 PC to my esteemed Russian colleagues.

    The Duke University Free Electron Laser is still the world’s brightest gamma ray source in the >10 MeV spectrum:
    http://www.aps.anl.gov/asd/diagnostics/papers/publications/p4569_1.pdf

    Vladimir Litvinenko was the lead author of the above paper. The ethnicity of the co-authors may be of interest:
    Americans = 13
    Russians = 10
    Chinese = 2
    Koreans = 2
    Irish = 1
    Welsh = 1 (this camel)
    Australian = 1
    Iranian = 1
    German = 1

  14. gallopingcamel says:

    Totally “Off Topic”. Some of the most interesting people I have ever met are Russians and Iranians.

    While the rank and file in Russia and Iran love America their governments are hostile.

  15. Larry Ledwick says:

    Circa 1989
    Mail merge word perfect on 8088 = 45 hours
    same job on a 286 at work = 4.5 hours
    same job on our new 386 – 33 mhz = 45 minutes
    Processor speed makes a difference but only if the code is written efficiently.

    Faster processors have made coders lazy, they just slap things together until they work and only worry about efficiency if the result is too slow to tolerate.

    My work place works with data files (tables) with millions of rows (soon to be billion plus), and jobs that 7 years ago would run for 20+ hours now run for 2 hours on data files 10x larger. With processing work load increasing at 35% / year we are in a constant race with technology and coding efficiency to keep the job run times manageable.
    That is a combination of processor speed, multiple cores, multi-thread execution where possible, use of solid state drives rather than spinning spindle disks, data base optimizations and sql query optimization, elimination of unnecessary or redundant processing steps etc.

    In the old days you had to write efficient code or the jobs would run for several days. Now you can get away with crappy code for most jobs because the processor speed makes the delay tolerable.

    From a functional point of view a user needs to see 2-3x increase in processing speed to even notice jobs run quicker. The practical difference between a job step taking 0.3 second and 3 seconds is hardly noticeable if it only has to be executed once. If you stack a few million executions on top of each other that difference adds up and becomes a burden to the user and then it becomes practical for the designer to look at code efficiency.

  16. E.M.Smith says:

    OK, I’ve gotten to the end of the data migration. Here’s the “pvdisplay” showing the (current) three disks with the 2 TB WD itty bitty (physically) one as empty:

    root@Headend:/WD3/ext/chiefio# pvdisplay
      --- Physical volume ---
      PV Name               /dev/sdf1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               0
      Allocated PE          953605
      PV UUID               1TbQEs-kh6h-0NGd-1cnn-rw95-M2iy-7tnoW8
       
      --- Physical volume ---
      PV Name               /dev/sdd1
      VG Name               TemperatureData
      PV Size               1.82 TiB / not usable 4.00 MiB
      Allocatable           yes 
      PE Size               4.00 MiB
      Total PE              476667
      Free PE               476667 
      Allocated PE          0
      PV UUID               1rV3Dc-xTSU-3g1d-9D6k-Q1CR-zI7U-EBPuQh
       
      --- Physical volume ---
      PV Name               /dev/sde1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes 
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               10410
      Allocated PE          943195
      PV UUID               Sf3a98-80vp-6eFD-gNOc-ssUX-UX10-KJKi4l
    

    So now I need to do a ‘vgreduce’ to take the disk out of the volume group and then remove it (and if you don’t want it to be an LVM disk again, blank it one way or another).

    I’m going to leave it recognized as an LVM volume just because when I do the next round of scraping, if I “suddenly” need another TB or two, this is the standby disk…

    Oh, and after removing the disk I need to grow the LV to full size and then expand the file system into it so I can actually use it… Supposedly this is all easier than the alternative “old way”, but frankly, I’m not seeing it at the moment… I can see how in a normal production environment with a few TB “online and ready to go” being able to dispatch chunks as desired to any given file system would be a big advantage, but for slugging around TB lumps on ‘every few months’ schedules, not seeing much difference. Especially when it comes to decommissioning a little “laptop” sized disk (that doesn’t take prolonged steady use well) and replacing it with a full sized (3.5 inch) disk. (The “MyBook” series of WD disks have full sized disks in them, the “passport” series uses laptop dinky disks that expect to sleep a lot…)

    Well, on to the final lap… Isn’t it “fun” dealing with TB of data? /sarc;

    @G.C.:

    The Russians have developed a sardonic and /sarc; culture to a fine art over their centuries of Government Oppression… the survivors “have clue” and are skilled at expressing it in subtle ways… I suspect Iranians have a similar context / history… I’ve only known a few, but they, too, were clueful.

    @Larry:

    The last time I saw code efficiency even being talked about was the Macintosh ( I was at Apple then, and to get those graphics they hand coded VERY Tight ROMS ). Since the Mac moved off the 68000 family they went to “higher level languages” and kind of quit caring. PCs of that era had already “moved on”. Even the chip makers now make fast silicon and layer microcode on top of it with only modest care for the inefficiency of that abstraction layer, then an OS abstraction layer gets added, then a bunch of fat libraries (though the compiler guys do spend some attention on efficiency still, especially the guys working with embedded systems using Musl and similar), then we layer all that onto a “Virtual Machine” for administrative convenience and then run Java on it in a Java Virtual Machine layer and …

    But by then you have spent about 90% of your hardware on “convenience” and crap code.

    That’s what Moore’s Law has gotten the desktop user since 1980…

    And why my $35 Pi M3 is almost fast enough for what I need ( 90% of the time it’s fine) and the $45 Odroid is just dandy. I figure about another Moore’s Cycle (or about a year from now, given when I bought those boards) and I’ll be trying to figure out what to do with the other cores… As it stands, most of the time I’m using one core or 1.5 cores. It’s plenty just because I don’t use several of those “layers” from the PC world…

    When the day comes that we need to do something other than Moore’s Law Counting… we’ve got a couple of orders of magnitude of performance to recover from decades of crappy code and layers of abstraction. I suspect that day is “soonish” as cell size for switches and memory is approaching some quantum limits (we can’t cut an electron in half…)

    Oh Well.

    In the world of Big Iron there are still folks who care about efficiency. I kind of miss working in that world. The Cray FORTRAN compiler was a dream in the things it would do to actually optimize your code. To some extent the guys doing CUDA and other GPU coding are in the efficiency boat too… low level close to the hardware and using every clock and gate…

  17. E.M.Smith says:

    OK, I’ve removed the disk from the volume group. It’s still there physically, but not doing anything. I can unplug it, or finish the process of making it a non-LVM disk if desired:

    root@Headend:/WD3/ext/chiefio# vgreduce TemperatureData /dev/sdd1
      Removed "/dev/sdd1" from volume group "TemperatureData"
    root@Headend:/WD3/ext/chiefio# vgdisplay
      --- Volume group ---
      VG Name               TemperatureData
      System ID             
      Format                lvm2
      Metadata Areas        2
      Metadata Sequence No  21
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                1
      Open LV               0
      Max PV                0
      Cur PV                2
      Act PV                2
      VG Size               7.28 TiB
      PE Size               4.00 MiB
      Total PE              1907210
      Alloc PE / Size       1896800 / 7.24 TiB
      Free  PE / Size       10410 / 40.66 GiB
      VG UUID               tFENIC-cL7y-GPuf-jmim-nwTM-lNKb-Mpb3Hx
       
    root@Headend:/WD3/ext/chiefio# pvdisplay
      --- Physical volume ---
      PV Name               /dev/sdf1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               0
      Allocated PE          953605
      PV UUID               1TbQEs-kh6h-0NGd-1cnn-rw95-M2iy-7tnoW8
       
      --- Physical volume ---
      PV Name               /dev/sde1
      VG Name               TemperatureData
      PV Size               3.64 TiB / not usable 2.00 MiB
      Allocatable           yes 
      PE Size               4.00 MiB
      Total PE              953605
      Free PE               10410
      Allocated PE          943195
      PV UUID               Sf3a98-80vp-6eFD-gNOc-ssUX-UX10-KJKi4l
       
      "/dev/sdd1" is a new physical volume of "1.82 TiB"
      --- NEW Physical volume ---
      PV Name               /dev/sdd1
      VG Name               
      PV Size               1.82 TiB
      Allocatable           NO
      PE Size               0   
      Total PE              0
      Free PE               0
      Allocated PE          0
      PV UUID               1rV3Dc-xTSU-3g1d-9D6k-Q1CR-zI7U-EBPuQh
    

    http://tldp.org/HOWTO/LVM-HOWTO/removeadisk.html

    has an example.

  18. E.M.Smith says:

    And the home stretch…

    You will note in the above there were about 10k extents left ‘free’ for 40 GB. I decided to just leave them ‘as is’ since first off it isn’t much out of 7 TB and second I might want to play with building a second file system just for grins.

    Given that, all I needed to do was extend the file system. The resize2fs command complains and wants you to run e2fsck -f. As I’m usually not fond of forcing things, I decided to see if it would let me run it plain. Well, it won’t. Here are the “before and after” file system sizes via df, and the process in between, including my playing with being minimal on e2fsck, which it didn’t like:

    root@Headend:/WD3/ext/chiefio# mount /LVM
    root@Headend:/WD3/ext/chiefio# df /LVM
    Filesystem                                 1K-blocks       Used Available Use% Mounted on
    /dev/mapper/TemperatureData-NoaaCdiacData 6450634432 5895073176 260538808  96% /LVM
    root@Headend:/WD3/ext/chiefio# umount /LVM
    root@Headend:/WD3/ext/chiefio# resize2fs /dev/TemperatureData/NoaaCdiacData 
    resize2fs 1.42.12 (29-Aug-2014)
    Please run 'e2fsck -f /dev/TemperatureData/NoaaCdiacData' first.
    
    root@Headend:/WD3/ext/chiefio# e2fsck /dev/TemperatureData/NoaaCdiacData 
    e2fsck 1.42.12 (29-Aug-2014)
    /dev/TemperatureData/NoaaCdiacData: clean, 1466535/409600000 files, 1499509686/1638400000 blocks
    root@Headend:/WD3/ext/chiefio# !re
    resize2fs /dev/TemperatureData/NoaaCdiacData 
    resize2fs 1.42.12 (29-Aug-2014)
    Please run 'e2fsck -f /dev/TemperatureData/NoaaCdiacData' first.
    
    root@Headend:/WD3/ext/chiefio# e2fsck -f /dev/TemperatureData/NoaaCdiacData 
    e2fsck 1.42.12 (29-Aug-2014)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 3A: Optimizing directories
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    
    /dev/TemperatureData/NoaaCdiacData: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/TemperatureData/NoaaCdiacData: 1466535/409600000 files (0.2% non-contiguous), 1499509684/1638400000 blocks
    root@Headend:/WD3/ext/chiefio# !re
    resize2fs /dev/TemperatureData/NoaaCdiacData 
    resize2fs 1.42.12 (29-Aug-2014)
    Resizing the filesystem on /dev/TemperatureData/NoaaCdiacData to 1942323200 (4k) blocks.
    The filesystem on /dev/TemperatureData/NoaaCdiacData is now 1942323200 (4k) blocks long.
    
    root@Headend:/WD3/ext/chiefio# mount /LVM
    root@Headend:/WD3/ext/chiefio# df /LVM
    Filesystem                                 1K-blocks       Used  Available Use% Mounted on
    /dev/mapper/TemperatureData-NoaaCdiacData 7647249548 5895068984 1402434492  81% /LVM
    

    With that, I’ve now got 1.4 TB of free space in the file system. That ought to be enough for the rescrapes I have in mind. That is on just 2 disks (matched size and brand) that ought to be robust to either long running or storage for a few years.

    So with that, this resize operation comes to an end, and sometime (tomorrow…) I’ll review and start the re-scrape.

  19. jim2 says:

    I’m curious why you are doing this. Is it so you can do your own modeling of temperature? Or are you monitoring the data for unexplained changes?

  20. gallopingcamel says:

    If there is an award for slow software I would like to propose it be awarded to the HP 1500 laptop that I bought a couple of years ago. In spite of having a couple of gigahertz processors it could not keep up with my two finger typing speed.

    I sent the laptop back to HP but they could not fix it. After hours on the Internet it was apparent that plenty of people had the same problem but none of the proposed solutions worked for me.

    I solved the problem by donating the laptop to Goodwill.

  21. E.M.Smith says:

    @Jim2:

    Odds are good Trump will toss a lot of the Climate Change garbage. I’d like to assure quality data is preserved and crap data saved as evidence…

  22. p.g.sharrow says:

    @EMSmith; While ruminating over your works, as well as morning coffee, it occurs to me that most of the criteria for the beer-can computer system have been met. Also that a wise next project for this creation is a scrape of your blog, posts, comments and the contents of the links, as there is a rather vast amount of wisdom contained herein. It would be a LARGE shame if this were lost…pg

  23. E.M.Smith says:

    @P.G.:

    I’ve already scraped my own site. A couple of times… And posted how to do it… but if you want, I can post the actual command, though that seems a bit self serving…
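
    For anyone who wants a starting point anyway, a generic incantation of the sort that works on a WordPress site (a typical example, not necessarily my exact command):

    # Recursive mirror with page requisites and link rewriting, being polite about the pace
    wget --mirror --page-requisites --convert-links --adjust-extension \
         --wait=1 --no-parent https://chiefio.wordpress.com/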

  24. Larry Ledwick says:

    The other way to safeguard the content of this and other useful sites is to be sure that the Wayback Machine is scraping the site periodically. I have not checked recently but I think most popular sites get on their periodic visit list.

    Looks like they are getting snapshots fairly regularly (every 2-4 weeks )
    https://web.archive.org/web/*/https://chiefio.wordpress.com/
