March 2007 Archives

Rewriting root

Another root filesystem recovery HOWTO.

Recently a hard disk in one of my servers developed major faults, to the point that the root filesystem went read-only. Although rebooting brought it back up fine, a SMART self-test (smartctl -t long /dev/hda) showed that the disk was failing, so I asked the server operators to replace it (it's a leased server not under my direct control).

Since they could not fit both the old and new disks in the server at once, they installed a fresh copy of Fedora Core 4 on the new disk and copied the contents of the failing disk to a /backup directory on it.

Rather than try to upgrade the new installation to Fedora Core 6 and configure everything back to the way it was, I opted to swap the files in /backup/ with the current /, i.e. replace the whole new root filesystem with the old one so that everything was as it had been before the change. I also had to do this live over the network, since I have no physical access to the server.

If you try "mv /bin /OLD/bin; mv /backup/bin /bin" you will run into major problems: after the first mv, the mv command can no longer be found because it has moved to a different directory, and even if you work around that, you will eventually run into nasty problems related to /lib, since dynamically linked programs need the shared libraries and loader that live there.

Instead, I used mount --bind / /backup/mnt to make the real root filesystem visible under /backup, and then ran chroot /backup (making sure /backup/dev/ was populated with device files first). Inside the chroot, the old files appear at / and the real root filesystem appears at /mnt, so I was able to replace /mnt/{bin,boot,etc,lib,...} one directory at a time (i.e. mv /mnt/bin /mnt/old-bin; cp -a /bin /mnt/bin, etc.).
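The per-directory swap can be sketched as a small shell function. This is only an illustration, not the exact commands I ran: the helper name swap_dir and its arguments are made up for the example, and the mount and chroot steps appear only as comments because they need root and should be run by hand.

```shell
#!/bin/sh
# Sketch only: swap_dir is a hypothetical helper, not a standard tool.
#
# Preparation, run as root before entering the chroot (not executed here):
#   mount --bind / /backup/mnt
#   chroot /backup
#
# Inside the chroot the real root is at /mnt and the old copy is at /,
# so a call would look like: swap_dir /mnt / bin

swap_dir() {
    target_root=${1%/}   # where the live root is visible, e.g. /mnt
    source_root=${2%/}   # where the old copy lives, e.g. / (becomes empty)
    name=$3              # top-level directory name, e.g. bin

    # Refuse to clobber a directory saved by an earlier run
    if [ -e "$target_root/old-$name" ]; then
        echo "old-$name already exists under $target_root" >&2
        return 1
    fi
    # Keep the fresh install's directory around in case it is needed again
    mv "$target_root/$name" "$target_root/old-$name" || return 1
    # Copy the old directory into place, preserving owners, modes and symlinks
    cp -a "$source_root/$name" "$target_root/$name"
}
```

Copying with cp -a rather than moving means the old files in the chroot stay intact, so the shell and tools you are running from keep working while the real root is being changed.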

The only thing that went wrong with this approach was that rebooting failed, and I had to ask the server operators to restart the machine manually. I think this was probably due to a problem with the shutdown sequence.

If you have problems with files or resources being "busy", make sure you have stopped all non-essential services - i.e. anything other than the network interfaces and SSH - before you start the move. Also, note that you won't be able to move /dev, /proc, /sys, or any other directories that are mount points or that have mount points under them (though you could investigate mount --move).
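One way to find out in advance which directories cannot be moved is to check them against the kernel's mount table. The following is a rough sketch, assuming a Linux system with /proc/mounts; the function name has_mount_at_or_under is invented for this example, and the optional second argument exists only so that a different mounts table can be supplied.

```shell
#!/bin/sh
# Sketch only: has_mount_at_or_under is a hypothetical helper.
# Returns 0 if the directory is a mount point or has one beneath it,
# according to a mounts table in /proc/mounts format; returns 1 otherwise.
has_mount_at_or_under() {
    dir=${1%/}                   # strip a trailing slash, e.g. /dev/ -> /dev
    mounts=${2:-/proc/mounts}    # assumed default; pass another file to override
    # Field 2 of each line is the mount point. Match the directory itself,
    # or any mount point that starts with "<dir>/".
    awk -v d="$dir" '
        $2 == d || index($2, d "/") == 1 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts"
}
```

For example, running has_mount_at_or_under /dev && echo "skip /dev" for each top-level directory before the move gives a list of the ones to leave alone. Note that /proc/mounts escapes spaces in paths as \040, which this sketch does not decode.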