On Mon, Dec 20, 2010 at 4:52 PM, John Stoffel
<john@stoffel.org> wrote:
>>>>> "Eric" == Eric Martin <eric.joshua.martin@gmail.com> writes:
Eric> Now, the main reason for my email. My mythbox / home file
Eric> server runs linux soft-raid in RAID5 with 3 500GB disks. One
Eric> disk went bad a while ago and the array never rebuilt itself.
So you only had two 500Gb disks left in the array?
Correct. I just received my new disk yesterday and used dd_rescue to copy the data from the bad disk onto a fresh one. The array won't start, so here's the info you asked for:
Eric> The other day, the second disk went bad. Am I hosed?
Possibly, it depends on how bad the second disk is. What I would do
is try to use dd_rescue to copy the 2nd bad disk onto a new disk
(possibly your original bad disk if you feel brave!) and then try to
re-assemble your raid 5 using that. You might or might not have
corruption in the filesystem, so make sure you run an fsck on it.
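Sketching those steps with hypothetical device names (here /dev/sdX is the failing disk, /dev/sdY its replacement, and /dev/sdW1 the surviving member partition; substitute whatever /proc/mdstat actually shows on your system):

```shell
# Copy as much as possible from the failing disk onto the new one,
# skipping unreadable sectors instead of aborting like plain dd would
dd_rescue /dev/sdX /dev/sdY

# Try to reassemble the degraded RAID5 from the surviving member and
# the rescued copy; --force accepts superblocks with stale event counts
mdadm --assemble --force /dev/md0 /dev/sdW1 /dev/sdY1

# Check the filesystem before mounting, since any sectors dd_rescue
# couldn't read will have come across as garbage or zeros
fsck -f /dev/md0
```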
Now, in the future, you should run a weekly check of the array for bad
blocks or other problems, so that you get notified if a disk dies
silently. I use a crontab entry that kicks off a periodic check.
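A typical system crontab entry for this (illustrative only; adjust the md device to match your /proc/mdstat) looks something like:

```shell
# /etc/crontab fragment: start a consistency check of md0
# every Sunday at 01:00; progress shows up in /proc/mdstat
0 1 * * 0  root  echo check > /sys/block/md0/md/sync_action
```

Pair it with "mdadm --monitor" (or a MAILADDR line in mdadm.conf) so mismatches and failures actually get emailed to you.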
BTW, this is what I was missing: there was no warning that my disk was bad! Like I said, I have backups, but the last one failed, so I want something tighter.
Eric> I've been googling for 'rebuild bad linux software raid' but all
Eric> I get is the rebuild command. Also, I don't see any tools that
Eric> will move bad data to another spot on the disk. This is my
Eric> first time using software raid so I'm in a bit over my head.
The first thing is to ask for help on the linux-raid mailing list,
which is hosted on vger.kernel.org.
But one thing you can do to help is give us more information, like:
cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : inactive sda3[0](S)
4000064 blocks
md3 : inactive sda4[0]
483315456 blocks
unused devices: <none>
mdadm -E /dev/sd...
or /dev/hd... depending on whether you have SATA or IDE drives.
Basically, use the devices you got from the /proc/mdstat output as
your basis.
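For instance, assuming the member partitions listed in /proc/mdstat are sda3, sda4, sdc3 and sdc4, you could dump every superblock in one pass:

```shell
# Examine the md superblock on each RAID member partition
for part in /dev/sda3 /dev/sda4 /dev/sdc3 /dev/sdc4; do
    echo "=== $part ==="
    mdadm -E "$part"
done
```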
Give us this output, and we should be able to help you more.
livecd / # mdadm -E /dev/sda3
/dev/sda3:
Magic : a92b4efc
Version : 00.90.00
UUID : 8374ea27:6e191996:e56f6693:e45468a9
Creation Time : Sat Jul 11 17:14:31 2009
Raid Level : raid5
Used Dev Size : 4000064 (3.81 GiB 4.10 GB)
Array Size : 8000128 (7.63 GiB 8.19 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Update Time : Mon Dec 13 03:59:27 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : fc28b820 - correct
Events : 0.496842
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 3 0 active sync /dev/sda3
0 0 8 3 0 active sync /dev/sda3
1 1 8 19 1 active sync
2 2 0 0 2 faulty removed
livecd / # mdadm -E /dev/sda4
/dev/sda4:
Magic : a92b4efc
Version : 00.90.00
UUID : c7b07c90:cbd50faf:bc824667:2504996b
Creation Time : Sat Jul 11 16:52:52 2009
Raid Level : raid5
Used Dev Size : 483315456 (460.93 GiB 494.92 GB)
Array Size : 966630912 (921.85 GiB 989.83 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 3
Update Time : Thu Dec 9 11:13:25 2010
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 2
Spare Devices : 0
Checksum : d43b9ad8 - correct
Events : 0.15550817
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 4 0 active sync /dev/sda4
0 0 8 4 0 active sync /dev/sda4
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
livecd / # mdadm -E /dev/sdb3
mdadm: cannot open /dev/sdb3: No such file or directory
livecd / # mdadm -E /dev/sdb4
mdadm: cannot open /dev/sdb4: No such file or directory
livecd / # mdadm -E /dev/sdc4
/dev/sdc4:
Magic : a92b4efc
Version : 00.90.00
UUID : c7b07c90:cbd50faf:bc824667:2504996b
Creation Time : Sat Jul 11 16:52:52 2009
Raid Level : raid5
Used Dev Size : 483315456 (460.93 GiB 494.92 GB)
Array Size : 966630912 (921.85 GiB 989.83 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 3
Update Time : Thu Dec 9 11:13:06 2010
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : d34e516a - correct
Events : 0.15550815
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 20 1 active sync
0 0 8 4 0 active sync /dev/sda4
1 1 8 20 1 active sync
2 2 0 0 2 faulty removed
livecd / # mdadm -E /dev/sdc3
/dev/sdc3:
Magic : a92b4efc
Version : 00.90.00
UUID : 8374ea27:6e191996:e56f6693:e45468a9
Creation Time : Sat Jul 11 17:14:31 2009
Raid Level : raid5
Used Dev Size : 4000064 (3.81 GiB 4.10 GB)
Array Size : 8000128 (7.63 GiB 8.19 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Update Time : Mon Dec 13 03:59:27 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : fc28b832 - correct
Events : 0.496842
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 19 1 active sync
0 0 8 3 0 active sync /dev/sda3
1 1 8 19 1 active sync
2 2 0 0 2 faulty removed
/dev/sda is a good disk, /dev/sdc is the bad disk, and /dev/sdb is the good disk that holds the clone of /dev/sdc. Curiously, mdadm -E doesn't work on /dev/sdb even though the partitions are set up correctly.
thanks!
John