Software RAID issue

Hi everyone. I ran into trouble after converting a regular install on /dev/sda to software RAID.

The system was already installed on a single disk, /dev/sda. I then added a second disk (on an HPE ML30 with no RAID card) and converted the setup to RAID using mdadm.

In the end everything looked OK, but the reboot was a drama. GRUB is fine, but the second stage never works out correctly which of mdadm or LVM comes first, and it fails whatever I try, ending in an endless run of dracut timeout scripts.

Here is the (maybe wrong) path to RAID that I followed.

Step 1) Install AlmaLinux from the ISO

Install on the disk /dev/sda

Step 2) Upgrade to the latest packages

dnf upgrade

Step 3) Install RAID packages

dnf install mdadm rsync 

Step 4) Now we need to copy the partition table from sda to sdb


sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb

Step 5) Convert the partitions on /dev/sdb to RAID disks

sgdisk -t 1:fd00 -t 2:fd00 -t 3:fd00 /dev/sdb

Step 6) Initialize the RAID

mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb1 --metadata=0.90
mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb2 --metadata=0.90
mdadm --create /dev/md2 --level=1 --raid-disks=2 missing /dev/sdb3
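At this point each array only has its /dev/sdb member, so it should show up as degraded. As a quick sanity check, here is a small sketch that lists the arrays with a missing member. The function name degraded_arrays is made up for illustration, and it assumes the usual /proc/mdstat layout where a missing member is shown as "_" in the status brackets, for example [_U].

```shell
# List md arrays that have a missing member, based on /proc/mdstat-style text.
# Optional argument: a file to read instead of /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/        { name = $1 }    # remember the current array name
        /\[[U_]*_[U_]*\]/    { print name }   # status such as [_U] means a member is missing
    ' "${1:-/proc/mdstat}"
}

# Usage: degraded_arrays
# Right after step 6 this should print md0, md1 and md2 in some order.
```

Once the /dev/sda partitions are added and synced in the later steps, the same call should print nothing.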

Step 7) Copy /boot to /dev/md1 (with /boot/efi on /dev/md0)

mkfs.vfat /dev/md0
mkfs.xfs /dev/md1

mkdir /mnt/md1

mount /dev/md1 /mnt/md1
mkdir /mnt/md1/efi

mount /dev/md0 /mnt/md1/efi

rsync -av /boot/. /mnt/md1

Now we need to edit the /etc/fstab file. In the line containing the UUID,

  Ex.   UUID=c8ac59b8-ce54-4fca-8107-1b04aaa0194d /boot ext3 defaults 0 1
we replace the UUID part with /dev/md1, or set the right UUID in its place (use blkid):

/dev/md1 /boot      xfs   defaults 0 1
/dev/md0 /boot/efi  vfat  umask=0077,shortname=winnt 0 2
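If you prefer the UUID form over /dev/mdX names, something like the following sketch can print the lines for you. The helper name fstab_entry is hypothetical; the UUID itself would come from blkid, as mentioned above.

```shell
# Print one fstab line that mounts a filesystem by UUID.
# $1 = filesystem UUID, $2 = mount point, $3 = fs type, $4 = mount options, $5 = fsck pass
fstab_entry() {
    printf 'UUID=%s %s %s %s 0 %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Usage (as root, after mkfs on the arrays):
#   fstab_entry "$(blkid -s UUID -o value /dev/md1)" /boot xfs defaults 1
#   fstab_entry "$(blkid -s UUID -o value /dev/md0)" /boot/efi vfat umask=0077,shortname=winnt 2
```

The output is only printed, so you can check it before pasting it into /etc/fstab.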

echo raid1 >> /etc/modules-load.d/raid.conf 

cat /etc/modules-load.d/raid.conf 
raid1

dnf reinstall kernel-<current>
dracut -f 

Step 8) Reboot the system

shutdown -r now
After the reboot we can check the result of this command:

mount | grep boot
It should show a line similar to this one:

/dev/md1 on /boot type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)

Now we can tell GRUB that we want to boot from the RAID:

echo 'GRUB_PRELOAD_MODULES="raid dmraid"' >> /etc/default/grub

cd /etc;grub2-mkconfig -o $(readlink /etc/grub2-efi.cfg)

dnf reinstall kernel-<current>
dracut -f 

Step 9) Add /dev/sdaX to RAID /dev/mdY
 
sgdisk -t 2:fd00 /dev/sda 

mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md1 /dev/sda2

Step 10) Move LVM to /dev/md2

The installation created an LVM physical volume on /dev/sda3, so we need to move it to /dev/md2 and then remove it from /dev/sda3.

pvcreate /dev/md2
vgextend almalinux /dev/md2

swapoff -a
lvremove /dev/almalinux/swap

pvmove /dev/sda3 /dev/md2
The pvmove step will take some time depending on the CPU and the disk size; it took me a few hours.

vgreduce almalinux /dev/sda3

pvremove /dev/sda3

lvcreate -l 100%FREE -n swap almalinux
mkswap /dev/almalinux/swap

Step 11) Add /dev/sda3 to RAID

sgdisk -t 3:fd00 /dev/sda

mdadm --add /dev/md2 /dev/sda3
Now, running the command

cat /proc/mdstat
we can see that the RAID is syncing the disks.

When it is done, reboot.
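To avoid rebooting in the middle of a sync, a check along these lines can help. It is only a sketch: sync_running is a made-up name, and it assumes that /proc/mdstat reports an ongoing or queued sync with a "resync" or "recovery" keyword followed by "=".

```shell
# Return success (0) while any array is still resyncing or recovering.
# Optional argument: a file to read instead of /proc/mdstat.
sync_running() {
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# Usage: poll once a minute until nothing is syncing any more.
#   while sync_running; do sleep 60; done
```

As far as I know, "mdadm --wait /dev/md2" does much the same for a single array.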

I then tried playing with dracut -f and plenty of options like --mdadmconf, but nothing got me past “dracut-initqueue timeout - starting timeout scripts”, after which it drops into dracut emergency mode.

It ends with “Warning: /dev/almalinux/root does not exist”.

However, booting SystemRescueCD 8 (and choosing to boot an installed Linux) brings up my almalinux/root filesystem very cleanly, with everything in sync and the RAID OK.

So I guess it may be a single option that I missed in the process, but I can’t see which one. Please send me tips or directions if you can.

It looks like I was missing an “rd.md.uuid=” entry in the GRUB_CMDLINE_LINUX= line of /etc/default/grub.

So at step 8, add rd.md.uuid=<UUID> to GRUB_CMDLINE_LINUX, taking the UUID of /dev/md2 from “mdadm --detail /dev/md2”.
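For anyone who wants to avoid copying the UUID by hand, here is a rough sketch that turns the output of “mdadm --detail --scan” into rd.md.uuid= arguments, one per array. The function name md_uuid_args is hypothetical, and it assumes the scan output has the usual "ARRAY ... UUID=..." lines.

```shell
# Read "mdadm --detail --scan" output on stdin and print one
# rd.md.uuid=<UUID> argument per array, all on a single line.
md_uuid_args() {
    sed -n 's/^ARRAY .*UUID=\([0-9a-fA-F:]*\).*/rd.md.uuid=\1/p' | paste -sd ' ' -
}

# Usage (as root): print the arguments, then add them to
# GRUB_CMDLINE_LINUX in /etc/default/grub and regenerate the GRUB config.
#   mdadm --detail --scan | md_uuid_args
```

This prints an entry for every array, not just /dev/md2. I only needed the one for /dev/md2, so trim the list if you want to stay close to what worked for me.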
