Data recovery after using ‘dd’ on the wrong partition
During my recent experiment with the Raspberry Pi, I accidentally pointed dd at the wrong device and wrote the Pi system image onto my 4TB hard drive, which was full of software, documents, and movies. Worse, I only found out about it a day later, when I noticed that my 4TB partition had disappeared and been replaced with the standard Pi partition structure. Although I had a backup of the 4TB hard drive, it was more than three months old, and recently downloaded software and movies were simply not in it. This was nothing too serious, since all my important files are backed up in several different places, but I still wanted to get back the recent downloads without having to restore the 4TB drive from an old backup and download those files again.
My first attempt was to use TestDisk. It has saved me several times when I accidentally ran fdisk on the wrong drive and destroyed the partition information. In those cases, a quick scan that took less than 15 minutes was all that was needed to restore the original partition structure. In this case, however, a quick scan by TestDisk returned no useful results and showed only duplicates of the Raspberry Pi partitions from the image that I had wrongly written. So what was so different now?
The answer is that the image I wrote with dd was more than 4GB in size and overwrote everything at the beginning of the drive, including the partition table and the start of the file system. In my experience, TestDisk will most likely succeed if you have simply repartitioned the wrong drive, because the file systems themselves are untouched and the original partition layout can be rebuilt by scanning the hard drive for the signatures of common file systems (NTFS, FAT16, FAT32, etc.). That was not the case here: the Raspberry Pi image had been written over the first 4GB of the hard drive, destroying the header of the only partition and nixing any chance of a quick recovery. Had the hard drive been split into two partitions, with the first being larger than 4GB, TestDisk would still have been able to recover the second partition by looking for its header.
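To make the idea of "scanning for file system signatures" more concrete, here is a minimal Python sketch. It is not how TestDisk is actually implemented; it only checks a few well-known boot sector markers (the `NTFS    ` OEM ID at offset 3, the `FAT32   ` and `FAT16   ` type strings, and the `55 AA` end marker). The function names, the 512-byte sector size, and the `max_sectors` limit are my own assumptions for illustration.

```python
from typing import Iterator, Optional, Tuple

SECTOR_SIZE = 512  # assumed logical sector size


def identify_boot_sector(sector: bytes) -> Optional[str]:
    """Return a file system name if the sector looks like a known boot sector."""
    # Every boot sector checked here ends with the 0x55 0xAA marker.
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        return None
    if sector[3:11] == b"NTFS    ":    # OEM ID field
        return "NTFS"
    if sector[82:90] == b"FAT32   ":   # file system type field (FAT32 layout)
        return "FAT32"
    if sector[54:62] == b"FAT16   ":   # file system type field (FAT12/16 layout)
        return "FAT16"
    return None


def scan_image(path: str, max_sectors: int = 1_000_000) -> Iterator[Tuple[int, str]]:
    """Yield (sector_number, fs_name) for each candidate boot sector in a disk image."""
    with open(path, "rb") as f:
        for n in range(max_sectors):
            sector = f.read(SECTOR_SIZE)
            if len(sector) < SECTOR_SIZE:
                break
            name = identify_boot_sector(sector)
            if name:
                yield n, name
```

A scan like this finds nothing useful when the boot sector it is looking for sat inside the overwritten region, which is exactly what happened to my single NTFS partition. It also reports boot sectors that belong to disk image files stored on the drive, not just to real partitions.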
Not giving up yet, I tried the deep scan option in TestDisk. My first attempt to run the deep scan on Ubuntu 16 seemed to cause the 4TB drive to disconnect repeatedly, resulting in tons of errors. I had to run it on Windows as an administrator before the deep scan could finish without errors. This took more than 12 hours and returned some apparently useful results.
Although a few partitions (FAT16, FAT32, NTFS and HFS) were found during the scan, they actually came from disk images that had once been stored on the overwritten partition, and did not belong to the partition structure of the 4TB hard drive itself. In particular, the Mac500M partition was an HFS disk image meant to be used with Mini vMac. TestDisk also failed to find any backup NTFS metadata. NTFS keeps a backup boot sector at the end of the volume and a partial mirror of the MFT (Master File Table) elsewhere on the volume, and data recovery tools can sometimes use these to reconstruct the original file system when the primary copies have been lost. So essentially I had wasted 12 hours on the deep scan for nothing.
At this point I could have tried PhotoRec, a forensic data recovery tool by CGSecurity. In my previous experience, PhotoRec can recover a lot of files, even those that were deleted a long time ago. The recovered files come back without their original names or directory structure, since PhotoRec ignores file system information. The software simply assigns new names to recovered files based on what it can find, for example the document metadata or the first few words of the content if the file is a document. Since PhotoRec is designed to be a forensic tool rather than a general data recovery tool, it is best used as a last-ditch attempt when you cannot find any other way to recover your data. As I was not that desperate, I decided to skip PhotoRec for now.
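Recovering files while ignoring the file system is usually called file carving. The sketch below shows the basic idea for JPEG files only, by looking for the JPEG start marker (`FF D8 FF`) and the next end marker (`FF D9`). It is a deliberately naive illustration and not PhotoRec's actual code; the function name `carve_jpegs` and the `max_size` limit are my own assumptions.

```python
from typing import List


def carve_jpegs(data: bytes, max_size: int = 20 * 1024 * 1024) -> List[bytes]:
    """Return byte runs that start at a JPEG start marker and end at the next end marker."""
    results = []
    pos = 0
    while True:
        start = data.find(b"\xff\xd8\xff", pos)   # start-of-image marker
        if start == -1:
            break
        end = data.find(b"\xff\xd9", start + 3)   # end-of-image marker
        if end == -1:
            break
        end += 2  # include the end marker itself
        if end - start <= max_size:
            results.append(data[start:end])
            pos = end
        else:
            # Implausibly large candidate: skip this start marker and keep looking.
            pos = start + 3
    return results
```

Because nothing here consults a file table, there is no way to know the original file name or folder, and a fragmented file would be carved incorrectly. A JPEG with an embedded thumbnail would also be cut short at the thumbnail's end marker, which is one reason real carving tools are far more careful than this.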
Next, I decided to use Microsoft DiskPart to recreate the original partition structure on the 4TB hard drive. This was simply a single NTFS partition covering the entire hard drive, using the GPT partition scheme. After that I ran Recuva on the newly created NTFS partition to see if it could find any files. The process took over three days to complete and found tons of recoverable files, albeit with no directory information attached. All files were simply assigned to the root folder. Regardless, I was able to locate the newly downloaded files, restore them to another drive, and copy them into the backup that I had. In that sense, the data recovery mission was successful, despite my not being able to restore the original file system on the overwritten disk. Even if I had been able to, the 4GB of data that was overwritten would be lost forever, and because of file fragmentation there would be no way to tell which files were affected. Restoring the newly downloaded files individually, and verifying that they were intact before merging them with the backup, is probably the better alternative.
The bottom line? Make sure that you disconnect all unused devices when running ‘dd’ (or similar tools such as Win32DiskImager) and check the target device name carefully before executing the command. ‘dd’ is jokingly said to stand for ‘Destroy Disk’, and a simple oversight is all it takes to do real damage. In my case, I was lucky enough to get my files back, but you may not be so lucky. So, just remember to be very careful when doing anything that involves dd.
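One extra safeguard is to check the size of the target before writing anything to it. An SD card for a Raspberry Pi is typically tens of gigabytes, so a 4TB target is an obvious red flag. The Python sketch below is only an illustration of that idea under my own assumptions: the function names and the 64 GiB default limit are made up, and it reads the size by seeking to the end of the device, which works on Linux block devices but may need a different approach on other systems.

```python
import os


def device_size(path: str) -> int:
    """Return the size in bytes of a file or block device by seeking to its end."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def check_target(image_path: str, target_path: str,
                 max_target_bytes: int = 64 * 1024**3) -> int:
    """Raise ValueError if the target looks wrong for this image; return its size otherwise."""
    image_bytes = os.path.getsize(image_path)
    target_bytes = device_size(target_path)
    if target_bytes < image_bytes:
        raise ValueError(f"{target_path} is smaller than the image")
    if target_bytes > max_target_bytes:
        # Assumed limit: anything bigger than this is probably not the SD card.
        raise ValueError(f"{target_path} is {target_bytes} bytes, "
                         "larger than expected for an SD card")
    return target_bytes
```

Calling `check_target` with the image file and the intended device before running dd would have stopped my mistake, since the 4TB drive is far above any sensible limit for an SD card.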