mdadm RAID 6 - overheated. Have 6 of the 8 drives assembled

I force-assembled the array, and it seems to have worked up to a point.
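
For reference, the force-assemble was something along these lines (a rough sketch only; the member list is assumed from the lsblk output further down, so adjust to your system):

# stop any half-assembled array first, then force-assemble from the six surviving members
sudo mdadm --stop /dev/md0
sudo mdadm --assemble --force /dev/md0 /dev/sd[b-g]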

sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Thu Aug 16 09:52:47 2018
        Raid Level : raid6
        Array Size : 35162345472 (33533.43 GiB 36006.24 GB)
     Used Dev Size : 5860390912 (5588.90 GiB 6001.04 GB)
      Raid Devices : 8
     Total Devices : 6
       Persistence : Superblock is persistent

       Update Time : Wed Apr 21 12:26:21 2021
             State : clean, degraded
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : ARTSERVERZ:0
              UUID : ea7ff0be:03a20678:c211e74b:7b66d018
            Events : 137283

    Number   Major   Minor   RaidDevice State
       0       8       64        0      active sync   /dev/sde
       -       0        0        1      removed
       2       8       16        2      active sync   /dev/sdb
       3       8       32        3      active sync   /dev/sdc
       -       0        0        4      removed
       5       8       96        5      active sync   /dev/sdg
       6       8       48        6      active sync   /dev/sdd
       7       8       80        7      active sync   /dev/sdf

But when I try to mount it I get

sudo mount  /dev/md0 /media/EmpireDellRaid/
mount: /media/EmpireDellRaid: unknown filesystem type 'jbd'.

If I force a file system I get

sudo mount -t ext4 /dev/md0 /media/EmpireDellRaid/
mount: /media/EmpireDellRaid: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error.
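
For what it's worth, a couple of low-level probes can show what signature is actually sitting on the device (nothing assumed here beyond /dev/md0 itself):

sudo blkid -p /dev/md0   # low-level probe, bypasses the blkid cache
sudo file -s /dev/md0    # reads the first blocks and reports any recognised signature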

If I do an lsblk -f I get

sdb             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd                            5fdaf11a-39b5-4b78-adb5-84ed03033c47
sdc             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd                            5fdaf11a-39b5-4b78-adb5-84ed03033c47
sdd             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd                            5fdaf11a-39b5-4b78-adb5-84ed03033c47
sde             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd                            5fdaf11a-39b5-4b78-adb5-84ed03033c47
sdf             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd                            5fdaf11a-39b5-4b78-adb5-84ed03033c47
sdg             linux_raid_member ARTSERVERZ:0 ea7ff0be-03a2-0678-c211-e74b7b66d018
└─md0           jbd  

Any thoughts?

thanks

(this is more academic - we do have multiple backups but was hoping to figure this out…)

This seems to be the relevant bit in dmesg…

[26964881.001869] EXT4-fs (md0): bad s_want_extra_isize: 9630

Don't you need to format or import a filesystem on the block device?

Like md0p1?

Like the difference between an HDD and the partition on the HDD?

[edit]
Nope, system uses md0 for both apparently
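
A quick way to confirm that (i.e. that there is no partition table and the filesystem sits directly on md0) is something like:

lsblk /dev/md0           # no md0p1 child devices means no partitions on the array
sudo fdisk -l /dev/md0   # would list a partition table if one existed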

Spitballin’ here:
The missing drives may hold data necessary to complete the RAID, so clone the two missing ones to new drives.
Check each drive for physical damage and eliminate any damaged disk.
But the way this reads, one of the “good” disks got a hotspot that damaged a section just enough that it read fine initially but can no longer be read reliably.
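
If you did want to try bringing back the missing members, the usual approach is a block-level clone with ddrescue rather than plain dd. A sketch with placeholder device names (sdX = failing original, sdY = blank replacement of at least the same size):

# the mapfile lets ddrescue resume and retry the bad areas later
sudo ddrescue -f -n /dev/sdX /dev/sdY /root/sdX.map
# optional second pass: go back and retry the bad sectors a few times
sudo ddrescue -f -r3 /dev/sdX /dev/sdY /root/sdX.map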

mdadm is a great thing!

Find out which package provides the tools for the jbd file system (never heard of that, but what do I know :wink: ), install it and let fsck loose on the array.

Alternatively, given you’ve already ensured a backup of the data is available, reformat the drives into a known well-supported Linux fs, like ext4, JFS, XFS or BTRFS, complete with the fs tools it requires (like jfsutils, etc).
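
(The reformat part itself is just a mkfs on the array once you've picked a filesystem; an ext4 sketch, where the label is only an example:)

sudo mkfs.ext4 -L EmpireDellRaid /dev/md0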

HTH!

it was ext4… I am 99.9 % sure…

This was raid6 - I should be able to lose 2 drives and still have complete data recovery.

…but what of a 3rd drive being damaged…impersonating a good drive…
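
One way to sanity-check whether a surviving member is quietly stale or damaged is to compare the per-device metadata (device list assumed from the lsblk output above):

# the Events counts should match across the surviving members; a lagging count marks a stale/suspect member
sudo mdadm --examine /dev/sd[b-g] | grep -E '^/dev/|Events|Array State'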

As root, do:
fsck /dev/md0 && mount -a
If it fails, post the exact error message.
(this does a fs check on the array, then tells the kernel to mount all filesystems in /etc/fstab)

If /dev/md0 fails as a filesystem, then check the individual drives this way:
fsck /dev/sd[b-g]
Again, post the exact error if it fails.

If I fsck any of the drives, including md0, all I get is

fsck from util-linux 2.31.1

sam

Add the -C and/or -V flag before the device list. The -C flag shows progress (it can be a while for large disks to complete!) and the -V flag gives a “running commentary”. Not sure if these are mutually exclusive or can be combined. The -N flag makes fsck do nothing, except telling you what would happen if it wasn’t issued. (dry run)
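
For what it's worth, they can be combined, and -N lets you see the plan before anything runs (a sketch against the same device as above):

# -N: dry run, only prints the fsck.* command that would be executed
sudo fsck -N -C -V /dev/md0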

same…

fsck -C -V /dev/md0
fsck from util-linux 2.31.1

So as far as my Googling found, this error means it cannot find the ext4 superblock where it is expected.

The “jbd” type is Journal Block Device and is the ext3 or ext4 block journal. Besides damaged superblocks it is also possible to see this if you have used a device for a separate journal partition.
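
(If the primary superblock were readable, you could check whether the filesystem was set up with an external journal, e.g.:)

sudo dumpe2fs -h /dev/md0 | grep -i journal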

I found some suggestions on how to hint fsck to find the backup ext4 superblocks. It's also possible that the superblock can't be found because you're missing a required RAID stripe.
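
The usual trick is to ask mke2fs (in dry-run mode) where the backup superblocks would live, then point e2fsck at one of them. A sketch; the block numbers are only examples, and the reported locations are only right if mke2fs guesses the same block size the filesystem was created with:

# -n: dry run, prints the backup superblock locations without writing anything
sudo mke2fs -n /dev/md0
# then try fsck against one of the reported backups (32768 is typical for a 4K-block ext4)
sudo e2fsck -b 32768 -B 4096 /dev/md0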

I got it fsck’ing - I think it is cooked. Lots of input/output errors…

sam

Note to self - Check case fans…

Did you scrub it?

SMART temp alerts would also help.
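
e.g. a quick manual check plus an smartd rule (thresholds are just examples):

# spot-check the current drive temperature
sudo smartctl -A /dev/sdb | grep -i temperature
# in /etc/smartd.conf: monitor all drives, mail root, warn if temp jumps 4C, passes 45C, or hits 55C
DEVICESCAN -a -m root -W 4,45,55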

I am letting fsck run for s&g’s… Nothing to lose - won’t be reusing these drives anyway…

sam

For posterity, I think you’d want to run a scrub before the fsck.

echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat
cat /sys/block/md0/md/mismatch_cnt

But yeah, in your case it isn’t going to make any difference.