Data Recovery
Data recovery may be broadly divided into two categories: recovering from a failing disk, and recovering from a wrong disk-management command executed by mistake.
I personally prefer data backup over data recovery, because when a disk is failing we may or may not be able to fully recover the data. It is always better to have a backup of critical data; I have an automated Borg Backup setup that backs up all important data to another disk on a daily and weekly basis. However, there are many situations - a partition deleted by accident, an LV removed, and so on - when data recovery techniques come in handy for a fast recovery.
This post covers recovery steps specific to the GPT partition table, LVM and the ext4 filesystem - a frequently used stack on Linux. It also shows how to intentionally corrupt a test disk so that you get enough practice recovering it. The steps are more or less the same for other filesystem types and partition table schemes; for those, use the relevant utilities.
Practice Setup
For practice and testing, it is convenient to use a pen drive. The commands in this post use variables such as $TEST_DISK and $TEST_DEV instead of literal device names like /dev/sdc or /dev/sdc1; this avoids any harm to a device that may be a live disk in your system. Use lsblk to find the device name of the pen drive and replace /dev/sdx with it in the following exports,
export TEST_DISK=/dev/sdx; export TEST_DEV=/dev/sdx1 ; export TEST_DEV2=/dev/sdx2
The $TEST_DISK variable refers to the whole disk, whereas $TEST_DEV and $TEST_DEV2 refer to the first and second partitions of the disk.
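Before running anything destructive, it is worth double-checking that the variables point at the pen drive and not a system disk. A minimal sanity check using plain lsblk columns:
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT $TEST_DISK   # confirm the size and model match the pen drive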
Create Test Disk in CLI
We can use fdisk, parted or gparted to create the test disk. sgdisk and gdisk are similar partitioning tools meant specifically for GPT tables. We use sgdisk to create the test pen drive from the command line.
sgdisk is non-interactive and writes directly to the disk, so be careful when using it - no rollback! A safer alternative is the interactive tool gdisk, which makes changes in memory and writes them to disk only when you explicitly write at the end of the session. Export the variables as explained above before executing these commands.
umount $TEST_DEV $TEST_DEV2
sgdisk -og $TEST_DISK
partprobe $TEST_DISK
sgdisk -n 1:1M:201M $TEST_DISK     # partition 1 from 1 MiB to 201 MiB
sgdisk -n 2:202M:502M $TEST_DISK   # partition 2 from 202 MiB to 502 MiB
partprobe $TEST_DISK
sgdisk -p $TEST_DISK
wipefs -a $TEST_DEV $TEST_DEV2
mkfs.ext4 $TEST_DEV
mkfs.ext4 $TEST_DEV2
Disk Images
For data recovery of a failing disk, it is best to work on an image of the disk rather than the physical disk itself. GNU ddrescue is used to rescue failing disks: it copies data from a disk to an image or to another disk, trying hard to rescue data in case of read errors. It uses a log (map) file to speed up the recovery process across multiple runs.
Do not attempt a filesystem check on a failing drive, as this will likely make the problem worse. Mount the disk read-only.
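A read-only inspection mount might look like this; a sketch, assuming the data lives on the first partition (the ext4 noload option skips journal replay, which would otherwise write to the disk even with ro):
mount -o ro,noload $TEST_DEV /mnt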
Create Recovery Image
Create backup image
ddrescue -d $TEST_DISK backup.img backup.log      # first pass
ddrescue -d -r3 $TEST_DISK backup.img backup.log  # if it fails, retry bad sectors 3 times
Mount Recovery Image
We can mount and inspect the image,
losetup -f -P backup.img   # attach to a free loop device and scan partitions
losetup -l                 # list loop devices
mount /dev/loop16p1 /mnt   # the loop device number may differ on your system
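When done, unmount and detach the loop device; the loop number below is illustrative, use whatever losetup -l reported:
umount /mnt
losetup -d /dev/loop16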
Clone the Failing Disk
Directly clone the failing disk to another disk (here /dev/sdy stands for the destination disk),
ddrescue -d -r3 $TEST_DISK /dev/sdy backup.log
We can also clone from the image created by ddrescue,
ddrescue -f backup.img /dev/sdx clone.log     # copies the recovered image to /dev/sdx
dd if=backup.img of=/dev/sdx status=progress  # or do the same with dd
More info: Ubuntu Data Recovery.
Recover GPT
Backup and Restore GPT
Instead of relying on testdisk to recover the GPT, it is better to have a backup of the table. gdisk, the interactive GUID partition table (GPT) manipulator, can be used to back up and recover the partition table. gdisk holds changes in memory; the disk is updated only on write (the w option).
gdisk $TEST_DISK
Use option b to back up the GPT to a file. To recover the GPT from the file, use r -> l -> w.
b - backup table to file
r - recovery mode
l - load table from backup file
w - write changes
To re-read the partition table without rebooting,
partprobe $TEST_DISK
We can also use dd to back up the first 34 sectors (protective MBR plus primary GPT),
dd if=$TEST_DISK of=gpt.backup bs=512 count=34
dd if=gpt.backup of=$TEST_DISK bs=512 count=34   # restore
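Note that this dd copy covers only the structures at the start of the disk, not the backup table at the end. sgdisk can save and restore both in one step; a sketch using its --backup/--load-backup options:
sgdisk --backup=gpt.bin $TEST_DISK        # save MBR, main GPT and backup GPT to a file
sgdisk --load-backup=gpt.bin $TEST_DISK   # restore all of them from the file
partprobe $TEST_DISK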
Notes: For a bootable disk, a Protective MBR must be located at LBA 0 (i.e., the first logical block) of the disk if it is using the GPT disk layout. The Protective MBR precedes the GUID Partition Table Header to maintain compatibility with existing tools that do not understand GPT partition structures.
More info on GPT Partition Table Format.
How to corrupt GPT
To practice recovery, we can simulate the table corruption by zeroing certain sectors.
Most disks are divided into 512-byte sectors, and the first 34 sectors of the disk are used by GPT. The MBR is located in the first sector of the disk (LBA 0); GPT disks still have a "protective" MBR in this sector for backward compatibility. The GPT proper starts at the second sector (LBA 1), which holds the partition table header, normally followed by 32 sectors (16 KiB) containing the actual partition entries.
LBA 0 - Protective MBR (1 sector)
LBA 1 - Partition table header (1 sector)
LBA 2-33 - Partition entries (32 sectors)
The command below shows the GPT table; it starts with EFI PART in ASCII.
dd if=$TEST_DISK bs=512 skip=1 count=33 | hexdump -C
To dump the filesystem in partition 1, which starts at sector 2048 (1 MiB), use,
dd if=$TEST_DISK bs=512 count=3 skip=2048 | hexdump -C
If the disk was zeroed before the GPT was created, the following shows the MBR, the GPT and the filesystem of the first partition.
dd if=$TEST_DISK bs=512 count=2051 | hexdump -C
00000000 hex omitted |................|   # sector 0 - MBR
*
000001f0 |..............U.|
00000200 |EFI PART....\...|   # sectors 1-33 - GPT
00000210 |.e..............|
*
00100400 |........s...e...|   # sector 2048 - partition 1 FS
We can intentionally corrupt the GPT and MBR using,
dd if=/dev/zero of=$TEST_DISK bs=512 count=34          # zeros the MBR and GPT
dd if=/dev/zero of=$TEST_DISK bs=512 count=33 seek=1   # zeros only the GPT (seek=1 skips the MBR sector)
When the GPT is zeroed but the filesystem data is untouched, we can recover the disk by creating a new GPT and re-adding partitions of the same size at the same locations. The new GPT recognizes the existing filesystems without any error, as sketched below.
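A minimal sketch, assuming the layout created earlier in this post; note that mkfs must not be run here, since that would destroy the data:
sgdisk -og $TEST_DISK             # fresh, empty GPT
sgdisk -n 1:1M:201M $TEST_DISK    # recreate partition 1 at its old boundaries
sgdisk -n 2:202M:502M $TEST_DISK  # recreate partition 2
partprobe $TEST_DISK
mount $TEST_DEV /mnt              # the old filesystem should mount cleanly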
Points to Note:
Creating a fresh GPT zeros the GPT area but not the MBR. For testing, zero all of the first 34 sectors with dd if=/dev/zero of=$TEST_DISK bs=512 count=34.
Creating a partition alters the partition table at sector 1, but it puts nothing at the partition's own start sector; it is mkfs.ext4 that writes to the first sectors of the partition. A verification sketch follows.
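After recreating the table, sgdisk can check it for consistency:
sgdisk -v $TEST_DISK   # verify the GPT structures; should report no problems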
Recover the corrupted GPT
Corrupt the GPT as explained above. Now gdisk $TEST_DISK shows the following message,
****
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****
GPT maintains a backup table at the end of the disk. We can recover from it by running gdisk $TEST_DISK and using r (recovery mode) -> c (load backup partition table) -> w (write).
If you have taken a GPT backup with gdisk's b option, restore the table from it using gdisk's r (recovery) -> l (load backup) -> w options.
Recover Lost Partition
There are situations when we accidentally delete a partition and it is lost. We can simulate this by deleting a partition using gdisk or sgdisk. The backup table that GPT keeps at the end of the disk is not useful here, since the deleted partition is removed from the backup as well. We can restore the GPT if we have a backup file created with gdisk's b option, as explained earlier.
In case you don't have a GPT backup file, try the testdisk utility, which is a good tool for recovering lost partitions.
sgdisk -d 1 $TEST_DISK   # delete partition 1 to simulate the accident
To recover, run
testdisk $TEST_DISK
The steps to recover are: Proceed -> select EFI GPT -> Analyse -> Quick Search. The data partitions are shown in green. Select one and press Enter. Choose the Write option to write the GPT. It may create extra blank partitions; delete those with gdisk.
In case testdisk is unable to write the GPT, find the lost partition's first and last sectors with Analyse -> Quick Search. Then you can use them to re-create the partition with gdisk or sgdisk, as sketched below.
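A non-interactive sketch with sgdisk; the sector numbers are illustrative, use the ones Quick Search reported:
sgdisk -n 1:2048:411647 $TEST_DISK   # recreate the partition at the reported boundaries
partprobe $TEST_DISK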
If you have a lot of partitions on the disk, then it is better to keep a backup of the GPT as explained above; the backup comes in handy for recovering lost partitions.
Recover LVM
LVM backup files live in /etc/lvm/backup and /etc/lvm/archive. The files are plain text and show the LVM configuration after (backup) or before (archive) each LVM change command. We can use these files to restore corrupted or lost LVM metadata.
LVM Metadata
LVM maintains two types of metadata at the start of each PV: the PV label and the text metadata area. Sectors are zero-indexed, and sector 0 is not used by LVM.
pvcreate $TEST_DEV
pvck $TEST_DEV
Found label on /dev/sdc1, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=1044480
The PV label is in sector 1. Sectors 8 to 2047 (about 1 MiB, matching offset=4096 and size=1044480 above) are reserved for text metadata. The data normally starts from sector 2048.
pvcreate writes a physical volume label to mark the block device as an LVM PV. By default, pvcreate places the label in the second 512-byte sector (sector 1, as the pvck output above shows), but it can optionally be placed in any of the first four sectors, since the LVM tools scan all four when looking for it. The physical volume label begins with the string LABELONE. pvremove wipes the label by zeroing these sectors.
We can view the PV label using,
dd if=$TEST_DEV bs=512 skip=1 count=4 | hexdump -C
The "real" metadata appears when we do a vgcreate and lvcreate. The configuration details of a volume group are referred to as the metadata. By default, an identical copy of the metadata is maintained in every metadata area on every physical volume within the volume group. The metadata is stored as ASCII in a circular buffer. View it with,
dd if=$TEST_DEV bs=512 skip=8 count=4 | hexdump -C
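Newer lvm2 releases can print the same information without raw dd; a sketch, assuming your version supports the pvck --dump option:
pvck --dump headers $TEST_DEV    # print the label and metadata headers
pvck --dump metadata $TEST_DEV   # print the current text metadata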
More info on LVM Metadata.
Recover from missing PV
The PV may be missing because of PV label corruption or removed, by force, with command pvremove -ff $TEST_DEV
.
We can simulate a missing PV with either of the commands below; vgs will then show a missing-PV warning.
dd if=$TEST_DEV bs=512 skip=1 count=4 | hexdump -C   # view the PV label first (read-only)
dd if=/dev/zero of=$TEST_DEV bs=512 seek=1 count=1   # zero the PV label in sector 1
or
pvremove -ff $TEST_DEV
vgs
Now we can't restore it with vgcfgrestore -f /etc/lvm/backup/bk bk alone. To recover the PV, find the missing PV's UUID and device name using vgs. Deactivate any active LV in the VG, then recreate the PV using the LVM backup or archive file.
lvchange --activate n bk/bk
pvcreate --test --uuid "<uuid>" --restorefile /etc/lvm/backup/bk $TEST_DEV   # dry run
pvcreate --uuid "<uuid>" --restorefile /etc/lvm/backup/bk $TEST_DEV
lvchange --activate y bk/bk
pvck $TEST_DEV
Finally, restore the VG to fix the warning "VG bk is missing the used flag in PV header",
vgcfgrestore --test -f /etc/lvm/backup/bk bk   # dry run
vgcfgrestore -f /etc/lvm/backup/bk bk
Recover from missing LVM text metadata
Simulate the corruption by zeroing the text metadata area,
dd if=$TEST_DEV bs=512 skip=8 count=4 | hexdump -C   # view the text metadata first (read-only)
dd if=/dev/zero of=$TEST_DEV bs=512 seek=8 count=4   # zero the text metadata area
The missing text metadata will trigger warnings in lvs, vgs and pvs. Repair the first device,
pvck --repair -f /etc/lvm/backup/bk $TEST_DEV
Run lvs, vgs and pvs again. If they show a metadata mismatch on the second device, then do,
pvck --repair -f /etc/lvm/backup/bk $TEST_DEV2
Run lvs, vgs and pvs once more. Things should now be OK; if not, run
lvchange --activate n bk/bk
pvcreate --test --uuid "<uuid of pv>" --restorefile /etc/lvm/backup/bk $TEST_DEV   # dry run
pvcreate --uuid "<uuid of pv>" --restorefile /etc/lvm/backup/bk $TEST_DEV
lvchange --activate y bk/bk
pvck $TEST_DEV
Finally, check the filesystem,
fsck.ext4 -f /dev/mapper/bk-bk
Recover from LV and VG errors
We may remove an LV or VG by mistake. Use the backups in the /etc/lvm/archive folder to recover from this.
lvremove bk/bk   # LV removed accidentally
vgcfgrestore -l bk   # list archived backups for the VG
vgcfgrestore -f /etc/lvm/archive/bk_00042-1381804649.vg bk
lvs
vgremove bk
vgcfgrestore -l bk
vgcfgrestore -f /etc/lvm/archive/bk_00042-1381804649.vg bk
vgs
lvs
For more info: LVM Recovery
Recover Ext4
View Superblock and Backups
Backup superblocks are created as soon as the filesystem is created with mkfs.ext4; there is no need to mount it or write files. For partitions smaller than about 120 MB, no superblock backups are created! They appear when the partition size reaches roughly 130 MB.
dumpe2fs $TEST_DEV | grep -i superblock
Primary superblock at 0, Group descriptors at 1-13
Backup superblock at 32768, Group descriptors at 32769-32781
Backup superblock at 98304, Group descriptors at 98305-98317
The superblock locations are relative to the partition: the first backup superblock is at block 32768 (4 KiB blocks) within $TEST_DEV (/dev/sdx1), not within $TEST_DISK (/dev/sdx).
We can also use the following command, where -n means dry run. It is safe to answer Y to the overwrite warning, since with -n mkfs does not actually create or touch the filesystem.
mkfs.ext4 -n $TEST_DEV
Creating filesystem with 495616 4k blocks and 123904 inodes
Filesystem UUID: 3fef5d22-6bf8-493c-bbfd-3441128497e2
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912
Use dd to view the superblocks. The default ext4 block size is 4096 bytes. The following commands dump the primary superblock (at byte offset 1024 within the partition, hence skip=2 sectors) and the first and second backups at blocks 32768 and 98304. To convert an ext4 block number to a 512-byte disk sector, multiply the block number by 8; e.g. ext4 block 32768 is disk sector 262144.
dd if=$TEST_DEV bs=512 skip=2 count=3 | hexdump -C
dd if=$TEST_DEV bs=512 skip=262144 count=1 | hexdump -C
dd if=$TEST_DEV bs=512 skip=786432 count=1 | hexdump -C
Overwrite Superblock and Recover
Overwrite the primary superblock with zeros.
dd if=$TEST_DEV bs=1 skip=1024 count=512 | hexdump -C   # view the primary superblock first (read-only)
dd if=/dev/zero of=$TEST_DEV bs=1 seek=1024 count=256   # zero part of it
Recover it with fsck.ext4,
fsck.ext4 $TEST_DEV
fsck.ext4 -vf $TEST_DEV
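If fsck cannot locate the primary superblock on its own, point it at a backup explicitly; a sketch using the first backup block reported by dumpe2fs above:
fsck.ext4 -b 32768 -B 4096 $TEST_DEV   # -b backup superblock, -B filesystem block size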
Note: Run fsck on a partition such as /dev/sdx1. Don't run it on /dev/sdx, as that will corrupt the GPT, unless you have formatted the full disk as ext4 without any partition table. It may be possible to recover the GPT with gdisk afterwards, but I don't know whether that works in all cases.
Recover Deleted Files
Of the file-undelete tools - testdisk, photorec, ext4magic and foremost - photorec is able to recover the most file types. testdisk can't undelete files from ext4, while foremost recovers a limited set of file types. ext4magic is sometimes able to recover files, depending on the state of the ext4 journal.
photorec ignores the filesystem and goes after the underlying data, so it works even with re-formatted or severely damaged filesystems and partition tables. However, photorec is unable to recover txt and other text-based files such as .java sources.
Text files cannot be recovered by automatic "scanning" tools like foremost or photorec because a text file has no header, so there is no way to identify it amongst the other data.
To try out photorec, mount $TEST_DEV2, copy some files onto it and delete a few. Run photorec $TEST_DEV2 and use Proceed -> choose the partition (or unknown/whole disk) -> Search -> choose Ext4 -> select the destination folder (the current dir) -> C. The recovered files are saved in recup_dir folders.
To recover using ext4magic, first create a backup of the journal and then recover from it.
debugfs -R "dump <8> /tmp/backup.journal" $TEST_DEV2   # inode 8 is the ext4 journal
ddrescue -r3 -v $TEST_DEV2 backup.img backup.log
ext4magic backup.img -a "$(date -d "-1days" +%s)" -j /tmp/backup.journal -r
Or use the disk directly, without taking an image.
debugfs -R "dump <8> /tmp/test2.journal" $TEST_DEV2
ext4magic $TEST_DEV2 -a "$(date -d "-1days" +%s)" -j /tmp/test2.journal -r
Recovery depends on the state of the journal, and many times ext4magic is unable to recover any files. The -r flag recovers only complete files that were not overwritten. To recover broken files that were partially overwritten, use -R; this also restores non-deleted files and empty directories.
Note that it is not possible to view deleted files by mounting backup.img as a loop device with mount -o loop,ro backup.img /mnt.
Recover Txt Files
If some content of the txt file is known, then we can grep for the text and recover the file,
grep --line-buffered -a -b -o -F -e 'Some known text' $TEST_DEV2   # prints the byte offset of each match
dd status=none if=$TEST_DEV2 bs=4096 skip=$(( 136318474 / 4096 - 1 )) count=2 | less   # dump blocks around a reported offset
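A small sketch that feeds grep's byte offset straight into dd; the search string is illustrative:
OFFSET=$(grep -a -b -o -F -e 'Some known text' $TEST_DEV2 | head -n1 | cut -d: -f1)
dd status=none if=$TEST_DEV2 bs=4096 skip=$(( OFFSET / 4096 )) count=2 | less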
ext4magic may also be able to recover some of the txt files.
Mount with ext4 offset
If you have saved the partition layout with fdisk -l $TEST_DISK, then it is possible to mount the filesystems directly, without recreating the GPT.
Suppose the partition layout is,
Device Start End Sectors Size Type
/dev/sdc1 2048 1026048 1024001 500M Linux filesystem
/dev/sdc2 1028096 1437696 409601 200M Linux filesystem
In an ext4 filesystem, the first two sectors are zeros and the superblock starts at byte 1024 (hence count=3 below covers it). If offset 0x438 within the partition contains the ext magic number 0xEF53 (stored on disk as the bytes 53 ef), it is an ext superblock. Search for the magic number,
dd if=$TEST_DISK bs=512 skip=2048 count=3 | xxd -a      # start of /dev/sdc1
dd if=$TEST_DISK bs=512 skip=1028096 count=3 | xxd -a   # start of /dev/sdc2
Now we can mount each filesystem with an offset.
mount $TEST_DISK -o ro,offset=$((512*1028096)) /mnt   # second partition's fs
mount $TEST_DISK -o ro,offset=$((512*2048)) /mnt      # first partition's fs
When you run wipefs -a to erase the fs, it simply erases the magic number at offset 0x00000438 and prints a message like /dev/sdc1: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef.
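wipefs can also save what it erases; a sketch using its --backup option (it writes files named wipefs-<devname>-<offset>.bak into $HOME; the file name below is illustrative):
wipefs -a --backup $TEST_DEV   # erase signatures, saving a backup of each erased byte range
dd if=~/wipefs-sdx1-0x00000438.bak of=$TEST_DEV seek=$((0x438)) bs=1 conv=notrunc   # put the ext4 magic back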
Ext4 FS without GPT
Normally partitions are created on a disk, but it is also possible to create a filesystem spanning the entire disk, without a GPT.
mkfs.ext4 $TEST_DISK
parted $TEST_DISK print
Partition Table: loop.
Number Start End Size File system Flags
1 0.00B 2030MB 2030MB ext4
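Such a filesystem is mounted via the whole-disk device; a minimal example:
mount -o ro $TEST_DISK /mnt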