This blog post by Oracle Linux engineers Daniel Kiper, Alexandr Burmashev, John Haxby and Jan Setje-Eilers tells the inside story of how the "BootHole" GRUB2 vulnerability was reported and resolved. Daniel and Alexandr are maintainers for GRUB2 and are responsible for that code across all platforms. Oracle customers can find information about the impact of CVE-2020-10713 at this link.
As GRUB2 upstream maintainers, Oracle developers took the lead on both the disclosure coordination and the technical solutions. In their role as community maintainers for GRUB2, Daniel and Alexandr were notified of the security vulnerability and immediately began analyzing its impact, coordinating the cross-vendor industry response, and ensuring that it would be fixed swiftly. In the end, this coordination effort would involve around 100 individuals from 18 companies.
CVE-2020-10713, the "BootHole" vulnerability, affects systems using UEFI Secure Boot signed operating systems and has a CVSS Base Score of 8.2.
GRUB2, the GRand Unified Bootloader version 2, is the most popular bootloader for Linux and is used by many other Operating Systems. It offers a uniform, system independent pre-boot environment, and is used to load the OS kernel into memory from persistent storage as part of the boot process. GRUB2 provides a menu interface allowing users to select between multiple OS versions and OS configuration options at boot time. GRUB2 is shipped as part of Oracle Linux and Oracle Solaris.
Although most boot-time vulnerabilities cannot be exploited over the network and can only be exploited while the system is booting, a boot-time vulnerability could allow an attacker who already has control of a system to install APT-like malware that would be very hard to detect. This is what Secure Boot is designed to prevent.
Customers enable Secure Boot to permit only cryptographically signed operating systems to boot. This feature is optional on most platforms, which otherwise allow both signed and unsigned operating systems to be executed. Users who choose to allow only signed binaries expect that guarantee to be enforced by the hardware and software on their system. This guarantee is accomplished via a signed chain of trust. Secure Boot doesn't just validate an on-disk signature of the operating system; it provides a framework to validate each subsequent component of the operating system as well. Secure Boot ensures that each trusted component validates the next trusted component prior to handing off control. Starting with firmware, the signature on each boot component is verified as it is loaded into memory and before execution control is handed over to it. Without that mechanism, an untrusted running kernel could lie about a signature verification of its on-disk binary.
The most widely used boot protection technology is UEFI Secure Boot (alternatives include SRTM and DRTM). UEFI Secure Boot relies on a chain of trust where firmware uses a built-in system key, usually a certificate owned and managed by Microsoft, to verify the validity and trustworthiness of the initial boot component. In the case of GRUB2 the initial component is a binary referred to as the "shim". This shim uses additional built-in keys to validate the next component and so on. A common boot sequence is: UEFI firmware, then the shim, then GRUB2, then the Linux kernel.
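To make the hand-off rule concrete, here is a minimal sketch of a chain of trust in Python. It is a toy model, not how UEFI firmware or the shim is implemented: an HMAC stands in for the certificate-based signatures that Secure Boot really uses, and the stage names, key values and the `make_chain` and `boot` helpers are all hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, image: bytes) -> bytes:
    # Stand-in for a real signature: an HMAC-SHA256 tag over the image.
    return hmac.new(key, image, hashlib.sha256).digest()


def verify(key: bytes, image: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, image), signature)


def make_chain(firmware_key: bytes) -> list:
    # Hypothetical keys: each stage embeds the key used to check the next one.
    shim_key, grub_key = b"shim-embedded-key", b"grub-embedded-key"
    layout = [
        ("shim", b"shim image", firmware_key, shim_key),
        ("grub2", b"grub2 image", shim_key, grub_key),
        ("kernel", b"kernel image", grub_key, None),
    ]
    return [
        {"name": n, "image": img, "signature": sign(k, img), "next_key": nk}
        for n, img, k, nk in layout
    ]


def boot(stages: list, firmware_key: bytes) -> list:
    # Return the names of the stages that were verified and allowed to run.
    ran, key = [], firmware_key
    for stage in stages:
        if not verify(key, stage["image"], stage["signature"]):
            break  # refuse to hand control to an unverified component
        ran.append(stage["name"])
        key = stage["next_key"]
    return ran
```

With an untouched chain, `boot` returns all three stage names. If the GRUB2 image is modified after signing, the walk stops after the shim, because verification happens before control is handed over.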
Scrutiny of the GRUB2 source code led to the discovery of the BootHole vulnerability which can be used to boot untrusted operating systems.
In early April 2020, we, the GRUB2 maintainers, were approached by security researchers from Eclypsium. The researchers had discovered an issue with a CVSS Base Score of 8.2 ("High") in the GRUB2 script parser. This vulnerability could be used to bypass UEFI Secure Boot and to load an unsigned operating system. Our analysis of this vulnerability revealed that fixes were required in multiple layers of the boot-time chain of trust. In addition to the GRUB2 fixes, we would also require fixes to the shim layer, the Linux kernel, fwupd and the entire UEFI Secure Boot signing process. OS and hardware vendors would need to revoke the security certificates of older, unpatched programs.
Daniel and Alex hosted a weekly call with Eclypsium, Microsoft, Oracle and Red Hat. This allowed the discussion of various signing and revocation scenarios and other issues that required coordination.
Not only was there a technical challenge to ensure we had all the right fixes, but equally important would be the notification and coordination of fixes among multiple vendors and organizations throughout the industry.
Once a vulnerability is found in signed code, that code has to be considered 'untrusted', which means the certificate must be revoked. In our case, the old GRUB2 binaries all had valid signatures which would have to be revoked; otherwise an attacker could just revert to an older version of the package (with a valid signature and with the exploitable vulnerability) and use that to bootstrap an unsigned OS onto the hardware. The team invalidated the existing shims, which would have allowed vulnerable GRUB2 binaries to be loaded, by adding hashes of the current shims into the UEFI Secure Boot revocation list signature store (dbx). This updated dbx then needed to be distributed to, and installed on, every system in the field in order to maintain the chain of trust.
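The rollback problem and the dbx fix can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: a real dbx entry for a binary is an Authenticode hash of the PE image rather than a plain SHA-256 of the file, and the `may_load` and `rollback_demo` helpers and the image contents are hypothetical.

```python
import hashlib


def image_hash(image: bytes) -> str:
    # Simplification: plain SHA-256 of the bytes stands in for the real image hash.
    return hashlib.sha256(image).hexdigest()


def may_load(image: bytes, signature_ok: bool, dbx: set) -> bool:
    # An image loads only if its signature verifies AND its hash is not revoked.
    return signature_ok and image_hash(image) not in dbx


def rollback_demo() -> tuple:
    old_shim = b"shim that still trusts vulnerable GRUB2"
    dbx = set()
    before = may_load(old_shim, True, dbx)  # valid signature, not yet revoked
    dbx.add(image_hash(old_shim))           # the dbx update reaches the system
    after = may_load(old_shim, True, dbx)   # same valid signature, now refused
    return before, after
```

The signature on the old shim never stops being valid; only the presence of its hash in dbx blocks it. That is why the updated dbx has to reach every system before the rollback path is closed.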
We discovered that the shim's signing key was also being used to sign other parts of the boot chain. Invalidating the key which was used to sign the shim would invalidate the key for any step of the boot process and would require vendors to re-sign and re-release fwupd, the Linux kernel, and other signed software packages as well. These packages all had to be released at the same time that the original certificate was revoked, which was timed to coincide with the official publication of the security vulnerability. The team decided to use this opportunity to change the signing model for GRUB2 to recommend separate certificates for separate artifacts: from this release forward, a vulnerability in one component would not require a re-release of unaffected components. This new signing scheme makes it easier to manage the chain of trust if a given artifact is compromised and has to be fixed. It also has the additional benefit of being easy for vendors to integrate, as they could choose to implement all or part of the scheme by disclosure time and follow up with additional releases in the future.
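The difference between the two signing models can be shown with a small Python sketch. The artifact names come from the text above; the key names and the `still_trusted` helper are hypothetical and only model which artifacts survive a revocation, not any real signing tool.

```python
def still_trusted(artifact_keys: dict, revoked_keys: set) -> list:
    # Artifacts whose signing key has not been revoked remain loadable.
    return sorted(
        name for name, key in artifact_keys.items() if key not in revoked_keys
    )


# Old model: one vendor key signs everything in the boot chain.
SHARED = {"grub2": "vendor-key", "kernel": "vendor-key", "fwupd": "vendor-key"}

# Recommended model: a separate certificate per artifact.
SPLIT = {"grub2": "grub2-key", "kernel": "kernel-key", "fwupd": "fwupd-key"}
```

Revoking the key that signed GRUB2 leaves nothing trusted under the shared model, so every artifact must be re-signed and re-released. Under the split model, `still_trusted(SPLIT, {"grub2-key"})` still lists `fwupd` and `kernel`.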
Adding to the challenge was the need to operate in secrecy. Fortunately the security researchers from Eclypsium disclosed the vulnerabilities they had found through secure channels directly to the maintainers responsible for the code. This responsible disclosure gave the GRUB2 team time to prepare optimal solutions for all the issues, to coordinate across all the affected vendors, and to have the fixes and updated certificates available to customers at the time of public disclosure.
With a complex issue like this security vulnerability, the team had to consider whether otherwise innocuous code changes might reveal the existence of a vulnerability before the whole ecosystem was patched and ready. For example, the shim update itself did not give any direct indication that GRUB2 contained a bug; however, if a large number of new shims with new certificates popped up out of the blue, it would be a hint for attackers to start probing the shim and GRUB2, where they might find and exploit the vulnerability. It was therefore important to keep the process out of the public eye, and the team relied on Keybase for its encrypted chat and git features.
The Eclypsium finding prompted Canonical and Oracle developers to look further into GRUB2 code security. Canonical developers found various arithmetic overflow and underflow errors, and Oracle developers used static analysis tools to detect potential security issues.
Though the math was not particularly difficult, older compilers did not provide built-in operations for detecting arithmetic overflow and underflow. The team resolved this for upstream GRUB2 by requiring a more modern compiler to compile upstream GRUB2. OS distribution vendors, who cannot easily move compiler versions, were able to leverage arithmetic functions under a GRUB2-compatible license from the Linux kernel itself.
Static analysis also proved fruitful: using the free cloud-based Coverity scans that Synopsys provides for open source projects, the team found additional bugs and wrote patches to fix them. All in all, these efforts led to 28 patches, as seen here: GRUB2 upstream. Variants of this patch set landed in the various Linux distributions and products using GRUB2 as their bootloader.
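As an illustration of what overflow-checked arithmetic buys, the Python sketch below models fixed-width unsigned operations that report overflow instead of silently wrapping, in the style of the GCC and Clang `__builtin_add_overflow` family. That these specific builtins are what the team relied on is an assumption, as are the 32-bit width, the function names and the `alloc_size` example; inputs are assumed to be non-negative and already within range.

```python
SIZE_BITS = 32                    # assumed width of the modelled unsigned type
SIZE_MAX = (1 << SIZE_BITS) - 1


def checked_add(a: int, b: int) -> tuple:
    # Returns (overflowed, wrapped_result), like __builtin_add_overflow.
    total = a + b
    return total > SIZE_MAX, total & SIZE_MAX


def checked_sub(a: int, b: int) -> tuple:
    # For unsigned values, a negative difference is an underflow.
    diff = a - b
    return diff < 0, diff & SIZE_MAX


def checked_mul(a: int, b: int) -> tuple:
    product = a * b
    return product > SIZE_MAX, product & SIZE_MAX


def alloc_size(count: int, item_size: int, header: int):
    # Compute count * item_size + header, or None if it does not fit.
    overflowed, body = checked_mul(count, item_size)
    if overflowed:
        return None
    overflowed, total = checked_add(body, header)
    if overflowed:
        return None
    return total
```

Without the checks, `0x40000000 * 4` wraps to 0 in 32 bits, so a buffer sized from that product would be far too small for the data later copied into it. With the checks, `alloc_size` returns `None` and the caller can fail cleanly.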
Maintaining a high level of product security takes work, and this was no exception! The team was ultimately successful due, in no small part, to the close cooperation among the various companies and organizations involved. All in all, around one hundred people from eighteen companies and organizations worked together to mitigate these vulnerabilities: AMI, Bitdefender, Canonical, Cisco, Citrix, Debian, Dell, Eclypsium, Google, HP, HPE, Juniper, Microsoft, Oracle, Red Hat, SUSE, UEFI Security Response Team, VMware (presented in alphabetical order). The Oracle team truly appreciates the industry-wide commitment to the security of information systems and would like to thank the Eclypsium researchers for their responsible disclosure of these vulnerabilities to the GRUB2 maintainers.