Using AMD Secure Memory Encryption with Oracle Linux

Oracle Linux kernel developer Boris Ostrovsky wrote this explanation of AMD's memory encryption technologies. 

AMD SME and SEV

Introduction

Disk encryption has become standard practice for protecting information from an intruder who has physical access to the system but is not able, for example, to log in. However, the other system component used for storing data, system memory, remains largely vulnerable. It is true that extracting data from memory is typically more difficult, but techniques like cold-boot attacks show that it is not impossible. To make things worse, with non-volatile memory an attacker can physically remove the NVDIMMs from the system and examine their contents at some later time, making the data there as easy to access as it would be on an unencrypted hard drive.

To protect system memory from such attacks, hardware manufacturers have been adding support for memory encryption. For example, when AMD recently introduced their EPYC processors, one of the new features was support for Secure Memory Encryption. (Some of the desktop variants, such as Ryzen Pro, also include it.)

Secure Memory Encryption (SME)

With SME, the data that the processor writes to memory passes through an encryption engine that scrambles it before it is committed. Conversely, when the data is read, the encryption engine unscrambles it and presents it to the processor in its original form. All this is done without any software intervention.

The encryption engine implements the AES algorithm with a 128-bit encryption key. The key is managed by the on-chip AMD Secure Processor (AMD-SP) and is generated anew after each reset. The key is not accessible to software.

There are a couple of ways SME can be used.

The first is Transparent SME (TSME). In this mode, any software (operating system or hypervisor) has its memory encrypted without needing any special SME support. This mode is enabled by a BIOS setting (if the BIOS vendor decides to expose it). While TSME is the easiest to use, it has some limitations. The biggest one was originally stated here to be that it does not allow the use of SEV, which we will discuss in a moment; however, as Tom Lendacky points out in the comments below, TSME and SEV are in fact compatible.

The other way of using SME is more flexible in that, in addition to enabling SEV, it also allows encrypting only certain memory regions (with page granularity). This is achieved by setting (typically) bit 47 of the physical address, and therefore requires OS/hypervisor support: for pages that should be encrypted, bit 47 (known as the C bit) needs to be set in the page table entry.
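
As a rough illustration of what "setting the C bit" means, here is a minimal sketch that treats a page table entry as a plain 64-bit value. The helper names are hypothetical (they are not kernel APIs), and the bit position is passed in as a parameter because bit 47 is only the typical value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Typical C-bit position mentioned in the text; an assumption, not a constant
 * guaranteed by the hardware. */
#define SME_C_BIT_POS 47

/* Mask with only the C bit set. */
static inline uint64_t sme_c_bit_mask(unsigned int pos)
{
    return UINT64_C(1) << pos;
}

/* Mark a PTE (or physical address) as encrypted by setting the C bit. */
static inline uint64_t sme_set_encrypted(uint64_t pte, unsigned int pos)
{
    return pte | sme_c_bit_mask(pos);
}

/* Mark a PTE as unencrypted by clearing the C bit. */
static inline uint64_t sme_set_decrypted(uint64_t pte, unsigned int pos)
{
    return pte & ~sme_c_bit_mask(pos);
}

/* Report whether the C bit is currently set. */
static inline bool sme_is_encrypted(uint64_t pte, unsigned int pos)
{
    return (pte & sme_c_bit_mask(pos)) != 0;
}
```

In a real kernel the page tables are manipulated through dedicated helpers rather than by hand; the sketch only shows the bit arithmetic involved.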

Secure Encrypted Virtualization (SEV)

When a guest is executing on a hypervisor, the latter has access to all the resources used by the guest, including the guest's memory. This is obviously not an ideal situation: the guest may be running a highly sensitive application and does not want anyone to see its data. If the hypervisor is compromised, then the guest's secrets can be too. That's where SEV comes to help.

With SEV, each guest is assigned (by AMD-SP) an encryption key and can encrypt its pages using the same technique as is used for SME on bare metal (the PTE's C bit). The most important part to keep in mind here is that the key is not available to the hypervisor, and therefore the hypervisor cannot snoop on the guest's data (unless the guest decides not to encrypt specific pages, for example, those shared with the hypervisor, such as DMA buffers).

Software support

SME

UEK support for SME is enabled by setting the CONFIG_CRYPTO_DEV_SP_PSP and CONFIG_AMD_MEM_ENCRYPT build options. After that, specifying mem_encrypt=on on the kernel boot command line will activate SME. Alternatively, if CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is set in the kernel's .config file, then SME is active by default. To verify that SME is on:

[root@host ~]# dmesg | grep SME
[ 0.000000] AMD Secure Memory Encryption (SME) active
[root@host ~]#

Keep in mind that SME needs to be enabled by system firmware, and some BIOSes may have it turned off by default. You can check whether it is on by first making sure that the feature is present in the hardware by looking at CPUID Fn8000_001F[EAX].[0]:

[root@host ~]# cpuid -r -1 -l 0x8000001f
CPU:
 0x8000001f 0x00: eax=0x0000000f ebx=0x0000016f ecx=0x0000000f edx=0x00000001
[root@host ~]#

and then see if it is enabled by verifying that bit 23 of MSR 0xC0010010 is set:

[root@host ~]# rdmsr 0xC0010010
f40000
[root@host ~]#
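
The two checks above can be expressed as a small decoding sketch. The struct and function names are hypothetical. EAX bit 0 (SME), EAX bit 1 (SEV) and SYSCFG bit 23 come from the text; the EBX fields (C-bit position in bits 5:0, physical address bit reduction in bits 11:6) are taken from AMD's CPUID documentation rather than from this post, so treat them as an assumption to verify against the manual for your processor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of CPUID Fn8000_001F (hypothetical helper type). */
struct sme_cpuid_info {
    bool sme_supported;               /* EAX bit 0 */
    bool sev_supported;               /* EAX bit 1 */
    unsigned int c_bit_pos;           /* EBX bits 5:0 */
    unsigned int phys_addr_reduction; /* EBX bits 11:6 */
};

/* Decode the raw EAX/EBX values printed by "cpuid -r -1 -l 0x8000001f". */
static struct sme_cpuid_info decode_cpuid_8000001f(uint32_t eax, uint32_t ebx)
{
    struct sme_cpuid_info info;

    info.sme_supported = (eax >> 0) & 1;
    info.sev_supported = (eax >> 1) & 1;
    info.c_bit_pos = ebx & 0x3f;
    info.phys_addr_reduction = (ebx >> 6) & 0x3f;
    return info;
}

/* Bit 23 of MSR 0xC0010010 indicates that firmware has enabled SME. */
#define SYSCFG_MEM_ENCRYPT_BIT 23

static bool syscfg_sme_enabled(uint64_t syscfg)
{
    return (syscfg >> SYSCFG_MEM_ENCRYPT_BIT) & 1;
}
```

For the values shown above (eax=0x0000000f, ebx=0x0000016f, MSR value f40000), this decoding reports SME as supported and enabled, with the C bit at position 47.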

For a quick demo of SME functionality we can use smetest.c, which is provided at the end of this blog post. The driver allocates a page where a secret string is stored and then prints the contents of that page (as stored in DRAM) either with SME enabled on that page (i.e. bit C set on the PTE) or when the page is accessed as unencrypted (bit C is cleared). Since the data was originally stored in memory in encrypted form, trying to access it with encryption disabled should be unsuccessful.

The relevant part of the driver is the ioctl routine:

static long smetest_ioctl(struct file *file, unsigned int cmd,
                         unsigned long arg)
{
        int ret = 0;
        /* Fixed-size buffer; sizeof() includes the terminating NUL */
        char buf[sizeof(SECRET_DATA)];

        if (!mem_encrypt_active())
                return -ENXIO;

        switch (cmd) {
        case 1:
                /* Clear the C bit so the page is read as raw ciphertext */
                ret = set_memory_decrypted((unsigned long)secret, 1);
                break;
        case 0:
                /* Keep the C bit set; reads are decrypted transparently */
                break;
        default:
                return -EINVAL;
        }
        if (ret)
                return ret;

        memcpy(buf, secret, sizeof(buf));
        if (cmd == 1) {
                /* Re-encrypt memory */
                ret = set_memory_encrypted((unsigned long)secret, 1);

                /* Make sure string is terminated */
                buf[sizeof(buf) - 1] = 0;
        }
        printk("Secret data is: %s\n", buf);

        return ret;
}

When cmd is 0, the C bit on the PTE is kept and therefore the data is decrypted before it is copied into buf. When cmd is 1, set_memory_decrypted() clears the bit (and also flushes caches and TLBs), so the processor reads the contents of the memory without passing them through the encryption engine. (Notice that we need to terminate the string in this case, since the terminating NUL character will be scrambled as well.)

Userspace code is:

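#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>

int main(void)
{
    int f;

    f = open("/dev/smetest", O_RDONLY);
    if (f == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    /* cmd 0: read the secret with the C bit set (decrypted view) */
    if (ioctl(f, 0))
        perror("ioctl(0)");

    /* cmd 1: read the secret with the C bit cleared (raw ciphertext) */
    if (ioctl(f, 1))
        perror("ioctl(1)");

    close(f);
    return 0;
}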

Here are the results:

[root@host ~]# insmod ./smetest.ko
[root@host ~]# ./a.out 
[root@host ~]# dmesg 
[ 1129.283633] secret is my secret
[ 1133.687482] Secret data is: my secret
[ 1133.696322] Secret data is: \xffffff81\xffffff83\xffffff93\xffffffa8\xffffffe6\xffffffc\xffffff84\xfffffffc\xffffffb7
[root@host ~]#

SEV

To enable SEV, CONFIG_KVM_AMD_SEV needs to be set in the Linux configuration file. A newer qemu (such as qemu-3.0.0-4.el7) and OVMF are also required.

Start the guest by specifying a new qemu object, sev-guest, and setting the machine's memory-encryption attribute. For example:

[root@host ~]# qemu-system-x86_64 -enable-kvm -cpu EPYC -machine q35 -smp 1 -m 1G \
    -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.pure-efi.fd,readonly \
    -drive if=pflash,format=raw,unit=1,file=/usr/share/OVMF/OVMF_VARS.fd \
    -drive file=./ol76-uefi.qcow2,if=none,id=disk0,format=qcow2 \
    -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true \
    -device scsi-hd,drive=disk0 -nographic -s \
    -device virtio-rng-pci,disable-legacy=on,iommu_platform=true \
    -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \
    -machine memory-encryption=sev0

To see whether SEV is available, check CPUID Fn8000_001F[EAX].[1]:

[root@guest ~]# cpuid -r -1 -l 0x8000001f
CPU:
0x8000001f 0x00: eax=0x00000002 ebx=0x0000006f ecx=0x00000000 edx=0x00000000
[root@guest ~]#

And to verify that it is active, look at bit 0 of MSR 0xc0010131:

[root@guest ~]# rdmsr 0xc0010131
1
[root@guest ~]#

You can also verify this by looking at dmesg output to see whether SEV is on:

[root@guest ~]# dmesg | grep SEV
[ 0.001000] AMD Secure Encrypted Virtualization (SEV) active
[ 1.727193] SEV is active and system is using DMA bounce buffers
[root@guest ~]#
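
The value printed by rdmsr 0xc0010131 above can be interpreted with a few lines of C. This is only a sketch with hypothetical names: bit 0 as the "SEV enabled" flag matches the rdmsr output of 1 shown earlier, while bit 1 as the "SEV-ES enabled" flag is taken from AMD's documentation of this MSR rather than from this post.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit assignments in the SEV status MSR (0xc0010131). */
#define SEV_STATUS_SEV_ENABLED    (UINT64_C(1) << 0)
#define SEV_STATUS_SEV_ES_ENABLED (UINT64_C(1) << 1) /* assumption, see lead-in */

/* True if the guest is running with SEV active. */
static bool sev_active(uint64_t sev_status)
{
    return (sev_status & SEV_STATUS_SEV_ENABLED) != 0;
}

/* True if the guest is additionally running with SEV-ES active. */
static bool sev_es_active(uint64_t sev_status)
{
    return (sev_status & SEV_STATUS_SEV_ES_ENABLED) != 0;
}
```

With the value 1 read in the guest above, sev_active() returns true and sev_es_active() returns false.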

Recall that the main reason behind SEV is to protect guest's memory from being snooped on by the hypervisor. Here is a small example that demonstrates this:

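#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char str[32];
    int secret = -1;

    if (argc > 1)
        secret = atoi(argv[1]);

    /* Build the secret string in the process's (guest) memory */
    snprintf(str, sizeof(str), "My secret is %d\n", secret);

    /* Stay alive so the string is still in memory when the guest is dumped */
    sleep(10000);

    return 0;
}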

We run the above code as:

[root@guest ~]# ./a.out 123 &
[1] 3698
[root@guest ~]#

We then drop to the qemu monitor (Ctrl-A C) and save the guest's memory into a file:

(qemu) dump-guest-memory /tmp/encrypted 
(qemu)

Now start the guest without SEV (by dropping '-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 -machine memory-encryption=sev0' options) and save its memory in /tmp/unencrypted.

Let's first search unencrypted guest's memory:

[root@host ~]# strings /tmp/unencrypted | grep "My secret"
My secret is 123
My secret is %d
My secret is %d
[root@host ~]#

and then

[root@host ~]# strings /tmp/encrypted | grep "My secret"
My secret is %d
[root@host ~]#

The secret string cannot be discovered when SEV is turned on.

(Note that we still see "My secret is %d" string. This is because when the executable was fetched from the disk it was first placed into a buffer shared between the hypervisor (host) and the guest. Since the hypervisor cannot access the guest's encrypted memory, those shared buffers are not encrypted.)
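
Conceptually, the host-side search above amounts to scanning the raw dump for a known byte sequence. A minimal sketch of such a scan is shown below; find_bytes is a hypothetical helper, and unlike strings it does no filtering of printable characters, it simply looks for an exact match in a buffer that has already been read into memory.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the first occurrence of the NUL-terminated string
 * "needle" in buf[0..len), or -1 if it does not occur.
 */
static long find_bytes(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    size_t i;

    if (n == 0 || n > len)
        return -1;

    /* Naive scan: compare the needle at every possible starting offset. */
    for (i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}
```

On a dump of an unencrypted guest, a scan like this would be expected to find the formatted secret; on a dump taken with SEV enabled, the same bytes are stored as ciphertext and the scan comes back empty.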

Limitations

While SEV allows guests to hide the contents of their memory, another component that a guest may wish to hide from the host is the guest's registers. For example, various encryption keys (such as ssh keys, pgp keys, etc.) are often stored in floating-point registers such as %xmm and %ymm, and therefore it is important that no entity outside the guest can access that information. Currently it is not possible to limit the hypervisor's visibility into this state, although AMD promises that future processors will support SEV-ES (Encrypted State) to address this issue.

Another limitation of running guests with SEV is that at the moment live migration (and save/restore in general) is not properly supported.

Sample kernel module smetest.c

/*
* Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/
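/*
 * The header names were lost from the original listing; the headers below
 * are the ones the code in this module relies on.
 */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/string.h>
#include <linux/mem_encrypt.h>
#include <linux/set_memory.h>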

#define SECRET_DATA	"my secret\n"

static int smetest_major;
static struct class *smetest_class;
static char *secret;

static long smetest_ioctl(struct file *file, unsigned int cmd,
			  unsigned long arg)
{
	int ret = 0;
	/* Fixed-size buffer; sizeof() includes the terminating NUL */
	char buf[sizeof(SECRET_DATA)];

	if (!mem_encrypt_active())
		return -ENXIO;

	switch (cmd) {
	case 1:
		/* Clear the C bit so the page is read as raw ciphertext */
		ret = set_memory_decrypted((unsigned long)secret, 1);
		break;
	case 0:
		/* Keep the C bit set; reads are decrypted transparently */
		break;
	default:
		return -EINVAL;
	}
	if (ret)
		return ret;

	memcpy(buf, secret, sizeof(buf));
	if (cmd == 1) {
		/* Re-encrypt memory */
		ret = set_memory_encrypted((unsigned long)secret, 1);

		/* Make sure string is terminated */
		buf[sizeof(buf) - 1] = 0;
	}
	printk("Secret data is: %s\n", buf);

	return ret;
}


static const struct file_operations smetest_ops = {
	.owner		= THIS_MODULE,
	.unlocked_ioctl	= smetest_ioctl,
};

static void smetest_cleanup(void)
{
	/* smetest_class is NULL or an ERR_PTR if setup failed part-way */
	if (!IS_ERR_OR_NULL(smetest_class)) {
		device_destroy(smetest_class, MKDEV(smetest_major, 0));
		class_destroy(smetest_class);
	}
	/* Only unregister if a major number was actually obtained */
	if (smetest_major > 0)
		__unregister_chrdev(smetest_major, 0, 1, "smetest");
	free_page((unsigned long)secret);	/* no-op if secret is NULL */
}

static int smetest_init(void)
{
	int err = 0;

	/* Passing major 0 asks the kernel to allocate one dynamically */
	smetest_major = __register_chrdev(smetest_major, 0, 1,
					  "smetest", &smetest_ops);
	if (smetest_major < 0) {
		pr_err("smetest: unable to register char device: %d\n",
		       smetest_major);
		err = smetest_major;
		goto errout;
	}
	smetest_class = class_create(THIS_MODULE, "smetest");
	if (IS_ERR(smetest_class)) {
		err = PTR_ERR(smetest_class);
		goto errout;
	}

	device_create(smetest_class, NULL, MKDEV(smetest_major, 0), NULL, "smetest");

	secret = (char *)__get_free_page(GFP_KERNEL);
	if (!secret) {
		printk("Can't allocate page for smetest\n");
		err = -ENOMEM;
		goto errout;
	}
	strcpy(secret, SECRET_DATA);
	printk("secret is %s\n", secret);

	return 0;

errout:
	smetest_cleanup();
	return err;
}
   
static void smetest_exit(void)
{
	smetest_cleanup();
}
   
module_init(smetest_init);
module_exit(smetest_exit);

MODULE_LICENSE("GPL");

Comments ( 2 )
  • Tom Lendacky Monday, June 10, 2019
    Boris, thank you for the article on SME and SEV. One thing I wish to correct is that TSME and SEV are compatible, so SEV does work when TSME is enabled.
  • Maurice Wednesday, July 17, 2019
    Thank you for this tutorial, it helps a lot with SME.

    Though, on my Thinkpad machine I get these results for cpuid and rdmsr:

    0x8000001f 0x00: eax=0x0000000f ebx=0x0000016f ecx=0x0000000f edx=0x00000000

    and for rdmsr
    f40000

    The only difference to you is, edx is zero on my machine. Actually I can't use mem_encrypt=on, it stalls the machine.

    What do my values suggest concerning SME?
    cat /proc/cpuid | grep sme gives sme and smep.

    Thanks!