How do I recover my Red Hat 8 or CentOS 8 instance that fails to boot because of issues with the GRUB2 BLS configuration file?


I have a Red Hat 8 or CentOS 8 Amazon Elastic Compute Cloud (Amazon EC2) instance. I want to recover a corrupted or deleted BLS configuration (blscfg) file found under “/boot/loader/entries/”.

Short description

GRUB2 in RHEL 8 and CentOS 8 uses blscfg files and entries in /boot/loader for the boot configuration, as opposed to the previous grub.cfg format. It's a best practice to use the grubby tool to manage the blscfg files and retrieve information from /boot/loader/entries/. If the blscfg files are corrupted or missing from this location, then grubby doesn't show any results, and you must regenerate the files to recover functionality. To regenerate the blscfg files, create a temporary rescue instance, and then attach and mount your Amazon Elastic Block Store (Amazon EBS) root volume on the rescue instance. From the rescue instance, regenerate the blscfg files for any installed kernels.
Important: Don't perform this procedure on an instance store-backed instance. This recovery procedure requires a stop and start of your instance, which means that you lose any data on the instance. For more information, see Determine the root device type of your instance.
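
The missing-entries symptom described above can be checked directly. The following is a minimal sketch, not part of the official procedure: `bls_summary` is a hypothetical helper name, and it assumes that each entry file records the kernel release in a `version` key, as the Boot Loader Specification describes.

```shell
#!/bin/sh
# List BLS entry files in a directory and the kernel version each one declares.
# The directory defaults to /boot/loader/entries; pass another path (for
# example /mount/boot/loader/entries on a rescue instance) as the first argument.
bls_summary() {
    dir="${1:-/boot/loader/entries}"
    found=0
    for entry in "$dir"/*.conf; do
        # If the glob matches nothing, the literal pattern comes back; skip it.
        [ -f "$entry" ] || continue
        found=1
        # Assumption: the "version" key holds the kernel release for the entry.
        version=$(awk '$1 == "version" { print $2; exit }' "$entry")
        printf '%s %s\n' "$(basename "$entry")" "${version:-MISSING-VERSION}"
    done
    if [ "$found" -eq 0 ]; then
        echo "no BLS entries found in $dir"
        return 1
    fi
}

# Example call (commented out so that sourcing this file has no side effects):
# bls_summary /boot/loader/entries
```

A nonzero return status with the "no BLS entries found" message suggests that the files need to be regenerated as described in the Resolution section.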

Resolution

Attach the root volume to a rescue EC2 instance

Create an EBS snapshot of the root volume. For more information, see Create Amazon EBS snapshots.
Open the Amazon EC2 console.
Note: Be sure that you are in the correct Region. The Region appears in the Amazon EC2 console to the right of your account information. You can choose a different Region from the drop down menu, if needed.
Choose Instances from the navigation pane, and then choose the impaired instance.
Choose Actions, select Instance State, and then choose Stop.
In the Description tab, under Root device, choose /dev/sda1, and then choose the EBS ID.
Choose Actions, Detach Volume, and then choose Yes, Detach. Note the Availability Zone.
Launch a similar rescue EC2 instance in the same Availability Zone. This instance becomes your rescue instance.
After the rescue instance launches, choose Volumes from the navigation pane, and then choose the detached root volume of the impaired instance.
Choose Actions, and then choose Attach Volume.
Choose the rescue instance ID (id-xxxxx), and then set an unused device. In this example, the unused device is /dev/sdf.
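
The console steps above can also be approximated with the AWS CLI. This is a sketch under stated assumptions: the instance and volume IDs are hypothetical placeholders, `run` and `move_root_volume_to_rescue` are illustrative helper names, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical placeholder IDs; replace them with your own values.
IMPAIRED_INSTANCE="i-0123456789abcdef0"
RESCUE_INSTANCE="i-0fedcba9876543210"
ROOT_VOLUME="vol-0123456789abcdef0"

# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Snapshot the root volume, stop the impaired instance, then move the
# volume to the rescue instance as /dev/sdf.
move_root_volume_to_rescue() {
    run aws ec2 create-snapshot --volume-id "$ROOT_VOLUME" --description pre-blscfg-recovery || return 1
    run aws ec2 stop-instances --instance-ids "$IMPAIRED_INSTANCE" || return 1
    run aws ec2 wait instance-stopped --instance-ids "$IMPAIRED_INSTANCE" || return 1
    run aws ec2 detach-volume --volume-id "$ROOT_VOLUME" || return 1
    run aws ec2 wait volume-available --volume-ids "$ROOT_VOLUME" || return 1
    run aws ec2 attach-volume --volume-id "$ROOT_VOLUME" --instance-id "$RESCUE_INSTANCE" --device /dev/sdf || return 1
}

# Example call (commented out): prints the planned commands.
# move_root_volume_to_rescue
```

Reviewing the printed commands first is a deliberate choice here, because stopping an instance and detaching its root volume are disruptive operations.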

Mount the volume of the impaired instance

Use SSH to connect to the rescue instance.

Run the lsblk command to view your available disk devices:

[ec2-user@ip-10-10-1-111 /]$ lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  10G  0 disk
├─xvda1 202:1    0   1M  0 part
└─xvda2 202:2    0  10G  0 part /
xvdf    202:80   0  10G  0 disk
├─xvdf1 202:81   0   1M  0 part
└─xvdf2 202:82   0  10G  0 part

Note: Nitro-based instances expose EBS volumes as NVMe block devices. The output that the lsblk command generates on Nitro-based instances shows the disk names as nvme[0-26]n1. For more information, see Amazon EBS and NVMe on Linux instances.
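
Because device names differ between Xen-based and Nitro-based instances, it can help to list the unmounted partitions instead of reading the names by eye. The following sketch assumes the raw lsblk output format (`-r`) with the columns NAME, SIZE, TYPE, and MOUNTPOINT; `unmounted_partitions` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `lsblk -rno NAME,SIZE,TYPE,MOUNTPOINT` output on stdin and print the
# device path and size of every partition that has no mount point.
unmounted_partitions() {
    awk '$3 == "part" && $4 == "" { print "/dev/" $1, $2 }'
}

# Example call (commented out):
# lsblk -rno NAME,SIZE,TYPE,MOUNTPOINT | unmounted_partitions
```

The output also includes small unmounted boot partitions such as the 1M partitions in the example above, so the root partition of the attached volume is typically the largest candidate in the list.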

Create a mount directory, and then mount the root partition of the attached volume to this new directory. In the example from step 2, /dev/xvdf2 is the root partition of the attached volume. For more information, see Make an Amazon EBS volume available for use on Linux:
```
sudo mkdir /mount
sudo mount /dev/xvdf2 /mount
```

Mount /dev, /run, /proc, and /sys of the rescue instance to the same paths as the newly mounted volume:

sudo mount -o bind /dev /mount/dev
sudo mount -o bind /run /mount/run
sudo mount -o bind /proc /mount/proc
sudo mount -o bind /sys /mount/sys

Start the chroot environment:
```
sudo chroot /mount
```
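
The mount steps in this section can be collected into one function. This is a sketch rather than an official script: `run` and `mount_rescue_root` are hypothetical helper names, and commands are only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the root partition, then bind-mount /dev, /run, /proc, and /sys of
# the rescue instance into it.
# Arguments: root partition (for example /dev/xvdf2), mount directory.
mount_rescue_root() {
    part="$1"
    target="${2:-/mount}"
    run sudo mkdir -p "$target" || return 1
    run sudo mount "$part" "$target" || return 1
    for fs in dev run proc sys; do
        run sudo mount -o bind "/$fs" "$target/$fs" || return 1
    done
}

# Example call (commented out); follow it with: sudo chroot /mount
# DRY_RUN=0 mount_rescue_root /dev/xvdf2 /mount
```

Stopping at the first failed command keeps a later bind mount from landing on the rescue instance's own directories when the root partition did not mount.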

Regenerate the blscfg files

Run the rpm command, and note the available kernels on your instance:

[root@ip-10-10-1-111 ~]# rpm -q --last kernel
kernel-4.18.0-147.3.1.el8_1.x86_64 Tue 21 Jan 2020 05:11:16 PM UTC
kernel-4.18.0-80.4.2.el8_0.x86_64 Tue 18 Jun 2019 05:06:11 PM UTC

To recreate the blscfg file, run the kernel-install command. The systemd-udev RPM package provides the kernel-install binary:

sudo kernel-install add 4.18.0-147.3.1.el8_1.x86_64 /lib/modules/4.18.0-147.3.1.el8_1.x86_64/vmlinuz

Replace 4.18.0-147.3.1.el8_1.x86_64 with your kernel version number. The blscfg file for the designated kernel regenerates under /boot/loader/entries/:

[root@ip-10-10-1-111 ~]# ls /boot/loader/entries/
2bb67fbca2394ed494dc348993fb9b94-4.18.0-147.3.1.el8_1.x86_64.conf

Repeat step 2 for other installed kernels on the instance, as needed. The latest kernel that you set becomes the default kernel.
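
Repeating the kernel-install step for several kernels can be scripted. The sketch below runs inside the chroot as root and assumes that the newest kernel should end up as the default, so it processes the kernels oldest first. `kernel_versions_oldest_first` and `regenerate_all_entries` are hypothetical helper names.

```shell
#!/bin/sh
# Read `rpm -q --last kernel` output on stdin (newest package first) and print
# the kernel version strings oldest first, one per line.
kernel_versions_oldest_first() {
    awk '{ sub(/^kernel-/, "", $1); v[NR] = $1 }
         END { for (i = NR; i >= 1; i--) print v[i] }'
}

# Regenerate the BLS entry for every installed kernel. The newest kernel is
# handled last, so it is the last one set.
regenerate_all_entries() {
    rpm -q --last kernel | kernel_versions_oldest_first | while read -r ver; do
        # Redirect stdin so kernel-install cannot consume the version list.
        kernel-install add "$ver" "/lib/modules/$ver/vmlinuz" < /dev/null
    done
}

# Example call inside the chroot (commented out):
# regenerate_all_entries
```

With the two kernels from the rpm output above, the function handles 4.18.0-80.4.2.el8_0.x86_64 first and 4.18.0-147.3.1.el8_1.x86_64 last.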

To see the current default kernel, run the grubby command with the --default-kernel option:

sudo grubby --default-kernel

Exit from chroot, and unmount the /dev, /run, /proc, and /sys mounts:

exit
sudo umount /mount/dev
sudo umount /mount/run
sudo umount /mount/proc
sudo umount /mount/sys
sudo umount /mount

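Detach the volume from the rescue instance, and then attach it to the original instance with the correct block device mapping (/dev/sda1 in this example). Start the original instance. The instance now boots with the default kernel.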
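
Before you move the volume back to the original instance, it can help to confirm that nothing is still mounted under the mount directory. The following sketch reads a mount table in the /proc/mounts format; `mounts_under` is a hypothetical helper name.

```shell
#!/bin/sh
# Read a mount table in /proc/mounts format on stdin and print every mount
# point at or below the given directory (default /mount).
mounts_under() {
    target="${1:-/mount}"
    awk -v t="$target" '$2 == t || index($2, t "/") == 1 { print $2 }'
}

# Example call (commented out); empty output means nothing is left mounted:
# mounts_under /mount < /proc/mounts
```

Matching on the directory plus a trailing slash avoids false positives from unrelated paths that only share the prefix, such as a hypothetical /mountain.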

Related information

How do I revert to a known stable kernel after an update prevents my Amazon EC2 instance from rebooting successfully?
