Linux Data Recovery

Recently I had a RAID5 array crash on me. The array was composed of 3 Western Digital 250GB disks controlled by a 3Ware 9550SX card. This array had been in continuous operation for nearly 4 years. Yet, about 12 days ago one of the drives appeared to have crashed. As luck would have it, though, the PSU was also failing in this box, so the +5V line stopped working and took another drive offline. That was the end of the array.

3Ware/LSI was a great help. They created a custom application that was able to recover the original RAID header information. After I attached a new PSU to the box, 2 of the 3 drives were online and the LSI tool brought the array back online too (though degraded).

That was day zero and I was still hopeful. I downloaded the R-Tools Linux recovery application and created a rescue CD. I stuck the CD into the failed system and started the recovery process. After about 4 days, R-Tools had consumed the entirety of a 500GB disk that I had attached to the system, and it was still not done. So I gave up on R-Tools and tried Disk Patch, but that couldn't even recognize the drive array (no driver for the 3Ware card). Then I found a forensic tool from Italy called CAINE. CAINE had TestDisk built into its operating environment, which was able to recognize the partition information on the LVM volumes and rewrite it successfully. But still, nothing could mount the file system.

So I downloaded Phoenix Linux Recovery. It has a fun interface and looks nice and pretty, but it did not discover any of the LVM volumes. I tried their quick scan and their deep scan. It wasn't until a couple days of support interaction that I was told it does not support LVM volumes.

I went back to R-Tools and gave it a regular expression to match the file names that I needed. Of the 300MB of files that were all named using the same method (a 32-character hash code), it found 1. That scan took another 3 days.
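For anyone trying the same kind of name-based filter, here is a minimal sketch of what such a pattern might look like. It is not the expression I gave R-Tools. It assumes the hash names are 32 hexadecimal digits (MD5 style) with an optional extension, and the names `HASH_NAME` and `is_hash_named` are made up for this example.

```python
import re

# Assumption: a "32-character hash code" means 32 hex digits (as in an MD5
# digest), optionally followed by a simple extension such as ".jpg".
HASH_NAME = re.compile(r"[0-9a-fA-F]{32}(\.[A-Za-z0-9]+)?")


def is_hash_named(filename: str) -> bool:
    """Return True if the bare file name looks like a 32-character hex hash."""
    # fullmatch requires the whole name to match, not just a prefix.
    return HASH_NAME.fullmatch(filename) is not None
```

If your names use a different alphabet (base32, for example), the character class needs to change to match.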

Nearly at the point of giving up, I installed CentOS 5 on that 500GB spare drive, attached it to the motherboard, and changed the BIOS to give it boot priority (higher up on the boot device list, above the 3Ware card). With CentOS installed, I was able to run LVM, get the list of volumes on the drive array, and see that its partition information was intact. So I ran e2fsck with the "-y" option on the array's volume and waited. Then I ran e2fsck about 4 more times before it finally finished fixing bad inode references and such.
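I did the repeated e2fsck runs by hand, but the loop is easy to sketch. The following is only an illustration under a few assumptions: the volume group has already been activated (for example with `vgchange -ay`) so the device node exists, the device path is hypothetical, and the optional "-f" flag (force a check even if the file system looks clean) is my addition, not something from my original runs. The function names are invented for this example.

```python
import subprocess
from typing import Callable, List


def run_fsck(device: str, force: bool = False) -> int:
    """Run `e2fsck -y` on the device and return its exit status."""
    cmd = ["e2fsck", "-y"]
    if force:
        cmd.append("-f")  # assumption: force a check even if marked clean
    cmd.append(device)
    return subprocess.run(cmd).returncode


def fsck_until_clean(device: str, max_passes: int = 6,
                     runner: Callable[[str], int] = run_fsck) -> List[int]:
    """Repeat the check until it exits 0, fails hard, or passes run out."""
    statuses: List[int] = []
    for _ in range(max_passes):
        status = runner(device)
        statuses.append(status)
        if status == 0:
            break  # nothing left to fix
        if status & ~0b111:
            break  # operational/usage error or cancel: retrying will not help
    return statuses
```

Usage would look something like `fsck_until_clean("/dev/VolGroup00/LogVol00")`, where the path is a placeholder for whatever LVM reports on your system. Run it as root and only on an unmounted volume.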

Now I was able to mount the root file system, but all of the files were in "lost+found." So I did a "du" on the directory to see where I was at and spotted my original directory structure during the du process output. Control-C, change du to a du with a pipe through grep, and I found my files! Then tar, gzip, and scp, and the files were safely tucked away on more secure hardware.
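The du-and-grep hunt followed by tar and gzip can be approximated in a few lines of Python. This is a rough sketch, not what I actually typed: `find_dirs` and `archive_dirs` are hypothetical helpers, the search matches on directory names only, and copying the archive off the box (the scp step) is left out.

```python
import os
import tarfile
from typing import Iterable, Iterator


def find_dirs(root: str, needle: str) -> Iterator[str]:
    """Yield directories under root whose name contains needle."""
    for dirpath, dirnames, _ in os.walk(root):
        for name in [d for d in dirnames if needle in d]:
            yield os.path.join(dirpath, name)
        # Do not descend into matched directories; they get archived whole.
        dirnames[:] = [d for d in dirnames if needle not in d]


def archive_dirs(paths: Iterable[str], root: str, dest: str) -> str:
    """Write the given directories into a gzip-compressed tar file."""
    with tarfile.open(dest, "w:gz") as tar:
        for path in paths:
            # Store paths relative to root so same-named directories
            # under different lost+found entries do not collide.
            tar.add(path, arcname=os.path.relpath(path, root))
    return dest
```

Write the archive to a different disk than the one being rescued, so the recovery does not put more load on the damaged array.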

I paid probably $700 for the various software products, all of which failed to do anything useful. The two tools that worked for me were CAINE and e2fsck, both of which are FREE. Quotes from Kroll-OnTrack put the recovery cost between $3,000 and $10,000. Every service wanted an upfront $300 fee to diagnose the RAID array.

Using LVM to partition your array makes future recovery from a crash more difficult. Make sure that you attach the crashed array to a new install of your original OS type and try to discover the extent of your damage. e2fsck can run in read-only mode (the "-n" option), which means it will report the errors on your volume but will not make any changes. In the end, the "-y" option answers "yes" to every repair prompt, so you can sit back and watch the magic.
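To judge the damage before committing to repairs, it helps to read the exit status of a report-only run. The sketch below builds an `e2fsck -n` command (the flag that opens the file system read-only and answers "no" to every question) and decodes the status bits listed in the e2fsck man page. The helper names are hypothetical and the device path is whatever your system reports.

```python
from typing import List

# Exit status bits as documented in the e2fsck man page.
E2FSCK_STATUS_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checker canceled by user request",
    128: "shared library error",
}


def report_only_command(device: str) -> List[str]:
    """Build an e2fsck command that reports problems without changing anything."""
    return ["e2fsck", "-n", device]


def describe_status(status: int) -> List[str]:
    """Translate an e2fsck exit status into human-readable messages."""
    if status == 0:
        return ["no errors"]
    return [msg for bit, msg in E2FSCK_STATUS_BITS.items() if status & bit]
```

A damaged volume checked this way will typically come back with the "left uncorrected" bit set, which is the cue to move on to a "-y" run once you have decided to go ahead.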
