How can I recover my Red Hat 8 or CentOS 8 instance that is failing to boot due to issues with the GRUB2 BLS configuration file?

Last updated: 2020-02-03

I'm running a Red Hat 8 or CentOS 8 Amazon Elastic Compute Cloud (Amazon EC2) instance. How can I recover the BLS configuration (blscfg) file found under /boot/loader/entries/ if it is corrupted or deleted?

Short description

GRUB2 in RHEL 8 and CentOS 8 uses blscfg files and entries in /boot/loader for the boot configuration, as opposed to the previous grub.cfg format. The grubby tool is recommended for managing the blscfg files and retrieving information from /boot/loader/entries/. If the blscfg files are missing from this location or are corrupted, grubby doesn't show any results, and you must regenerate the files to recover functionality. To regenerate the blscfg files, create a temporary rescue instance, and then attach and mount your Amazon Elastic Block Store (Amazon EBS) volume on the rescue instance. From the rescue instance, regenerate the blscfg files for any installed kernels.

Important: Don't perform this procedure on an instance store-backed instance. This recovery procedure requires a stop and start of your instance, which means that any data stored on instance store volumes is lost. For more information, see Determining the root device type of your instance.


Attach the root volume to a rescue EC2 instance

1.    Create an EBS snapshot of the root volume. For more information, see Creating an Amazon EBS snapshot.

2.    Open the Amazon EC2 console.

Note: Be sure that you are in the correct Region. The Region appears in the Amazon EC2 console to the right of your account information. You can choose a different Region from the dropdown list, if needed.

3.    Select Instances from the navigation pane, and then choose the impaired instance.

4.    Select Actions, select Instance State, and then choose Stop.

5.    In the Description tab, under Root device, choose /dev/sda1, and then choose the EBS ID.

6.    Select Actions, select Detach Volume, and then select Yes, Detach. Note the Availability Zone.

7.    Launch a similar rescue EC2 instance in the same Availability Zone. This instance becomes your rescue instance.

8.    After the rescue instance has launched, select Volumes from the navigation pane, and then choose the detached root volume of the impaired instance.

9.    Select Actions, and then select Attach Volume.

10.    Choose the rescue instance ID (i-xxxxx), and then specify an unused device name. In this example, the unused device is /dev/sdf.
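
If you prefer the AWS CLI to the console, the steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive script: the function names run and move_root_volume and the DRY_RUN variable are made up for illustration, and the instance and volume IDs are placeholders that you must replace. By default, the commands are only printed so that you can review them first.

```shell
#!/usr/bin/env bash
# Sketch: AWS CLI equivalents of the console steps above.
# With DRY_RUN=1 (the default), commands are printed and not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: snapshot the root volume, stop the impaired instance,
# then move the volume to the rescue instance.
# Arguments: impaired instance ID, rescue instance ID, volume ID, device name.
move_root_volume() {
  local impaired="$1" rescue="$2" volume="$3" device="${4:-/dev/sdf}"

  run aws ec2 create-snapshot --volume-id "$volume" --description pre-blscfg-recovery
  run aws ec2 stop-instances --instance-ids "$impaired"
  run aws ec2 wait instance-stopped --instance-ids "$impaired"
  run aws ec2 detach-volume --volume-id "$volume"
  run aws ec2 wait volume-available --volume-ids "$volume"
  run aws ec2 attach-volume --volume-id "$volume" --instance-id "$rescue" --device "$device"
}
```

For example, move_root_volume i-xxxxx i-yyyyy vol-zzzzz prints the six commands with those placeholder IDs. Set DRY_RUN=0 only after you confirm the IDs and that the rescue instance is in the same Availability Zone as the volume.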

Mount the volume of the impaired instance

1.    Use SSH to connect to the rescue instance.

2.    Run the lsblk command to view your available disk devices.

[ec2-user@ip-10-10-1-111 /]$ lsblk
xvda    202:0    0  10G  0 disk
├─xvda1 202:1    0   1M  0 part
└─xvda2 202:2    0  10G  0 part /
xvdf    202:80   0  10G  0 disk
├─xvdf1 202:81   0   1M  0 part
└─xvdf2 202:82   0  10G  0 part 

Note: Nitro-based instances expose EBS volumes as NVMe block devices. The output generated by the lsblk command on Nitro-based instances shows the disk names as nvme[0-26]n1. For more information, see Amazon EBS and NVMe on Linux instances.

3.    Create a mount directory, and then mount the root partition of the attached volume to this new directory. In the preceding example, /dev/xvdf2 is the root partition of the attached volume. For more information, see Making an Amazon EBS volume available for use on Linux.

sudo mkdir /mount
sudo mount /dev/xvdf2 /mount

4.    Bind mount /dev, /run, /proc, and /sys of the rescue instance to the same paths under the newly mounted volume.

sudo mount -o bind /dev /mount/dev
sudo mount -o bind /run /mount/run
sudo mount -o bind /proc /mount/proc 
sudo mount -o bind /sys /mount/sys

5.    Start the chroot environment.

sudo chroot /mount
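
The mount steps above can be sketched as two small shell helpers. This is an illustrative sketch only: unmounted_partitions and print_mount_plan are hypothetical names, and the plan is printed rather than run so that you can check the device name first. The first helper expects the output of lsblk -rno NAME,SIZE,TYPE,MOUNTPOINT on standard input.

```shell
#!/usr/bin/env bash
# Sketch: find candidate partitions and print the mount sequence.

# Reads `lsblk -rno NAME,SIZE,TYPE,MOUNTPOINT` output on stdin and prints
# the device path of every partition that has no mount point.
unmounted_partitions() {
  awk '$3 == "part" && $4 == "" { print "/dev/" $1 }'
}

# Prints (does not run) the commands from steps 3 through 5.
# Arguments: root partition of the attached volume, mount directory.
print_mount_plan() {
  local dev="$1" target="${2:-/mount}"
  local fs

  echo "sudo mkdir -p $target"
  echo "sudo mount $dev $target"
  for fs in dev run proc sys; do
    echo "sudo mount -o bind /$fs $target/$fs"
  done
  echo "sudo chroot $target"
}
```

On the rescue instance, lsblk -rno NAME,SIZE,TYPE,MOUNTPOINT | unmounted_partitions lists every unmounted partition, including the small 1M partitions. In the preceding lsblk example, the root partition is /dev/xvdf2, so print_mount_plan /dev/xvdf2 prints the same commands shown in steps 3 through 5.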

Regenerate the blscfg files

1.    Run the following rpm command, and note the kernels that are installed on your instance.

[root@ip-10-10-1-111 ~]# rpm -q --last kernel
kernel-4.18.0-147.3.1.el8_1.x86_64 Tue 21 Jan 2020 05:11:16 PM UTC
kernel-4.18.0-80.4.2.el8_0.x86_64 Tue 18 Jun 2019 05:06:11 PM UTC

2.    To recreate the blscfg file, run the kernel-install command.

Note: The kernel-install binary is provided by the systemd-udev RPM package.

sudo kernel-install add 4.18.0-147.3.1.el8_1.x86_64 /lib/modules/4.18.0-147.3.1.el8_1.x86_64/vmlinuz 

Replace 4.18.0-147.3.1.el8_1.x86_64 with your kernel version number.

The blscfg file for the designated kernel is regenerated under /boot/loader/entries/.

[root@ip-10-10-1-111 ~]# ls /boot/loader/entries/

3.    If needed, repeat step 2 for other installed kernels on the instance. The latest kernel is set as the default kernel.

4.    Run the grubby --default-kernel command to see the current default kernel.

sudo grubby --default-kernel

5.    Exit from chroot, and then unmount the /dev, /run, /proc, and /sys bind mounts and the root partition.

exit
sudo umount /mount/dev
sudo umount /mount/run
sudo umount /mount/proc
sudo umount /mount/sys
sudo umount /mount

6.    Detach the volume from the rescue instance, and then attach it to the original instance with the correct block device mapping (/dev/sda1 in this example). After you start the original instance, it boots with the default kernel.
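
When several kernels are installed, repeating step 2 of this section by hand can be error prone. The following is a minimal sketch of how the kernel-install commands could be generated from the rpm output in step 1. The function name kernel_install_commands is hypothetical, and the sketch assumes that each package line starts with kernel- followed by the version string, as in the example output. It prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: build one kernel-install command per installed kernel.

# Reads `rpm -q --last kernel` output on stdin. For every line whose first
# field looks like kernel-<version>, prints the matching kernel-install
# command. Other lines (for example, error messages) are skipped.
kernel_install_commands() {
  local pkg rest kver
  while read -r pkg rest; do
    case "$pkg" in
      kernel-*) kver="${pkg#kernel-}" ;;   # strip the "kernel-" prefix
      *) continue ;;
    esac
    echo "kernel-install add $kver /lib/modules/$kver/vmlinuz"
  done
}
```

Inside the chroot environment, rpm -q --last kernel | kernel_install_commands prints the commands for review. After you run them, check the result with ls /boot/loader/entries/ and grubby --default-kernel as described in the steps above.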