[Paper] HyperEnclave: An Open and Cross-platform Trusted Execution Environment

Outline of the paper's ideas

Threat Model

page-table-based attacks
enclave malware attacks
memory mapping attacks
controlled-channel attacks

Close reading of the paper

Abstract

A number of trusted execution environments (TEEs) have been proposed by both academia and industry. However, most of them require specific hardware or firmware changes and are bound to specific hardware vendors (such as Intel, AMD, ARM, and IBM). In this paper, we propose HyperEnclave, an open and cross-platform process-based TEE that relies on the widely-available virtualization extension to create the isolated execution environment. In particular, HyperEnclave is designed to support the flexible enclave operation modes to fulfill the security and performance demands under various enclave workloads. We provide the enclave SDK to run existing SGX programs on HyperEnclave with little or no source code changes. We have implemented HyperEnclave on commodity AMD servers and deployed the system in a world-leading FinTech company to support real-world privacy-preserving computations. The evaluation on both micro-benchmarks and application benchmarks shows the design of HyperEnclave introduces only a small overhead.

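Existing TEEs require specific hardware or firmware changes and are bound to particular hardware vendors

The paper proposes the HyperEnclave design, with the following features:
1. Open, cross-platform, process-based, and built on the widely available virtualization extension
2. Supports flexible enclave operation modes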

1 Introduction

In recent years, trusted execution environments (TEEs) are emerging as a new form of computing paradigm, known as confidential computing, due to the high demand for privacy-preserving data processing technologies that can handle massive data samples. TEEs provide hardware-enforced memory partitions where sensitive data can be securely processed. Existing TEE designs support different levels of TEE abstractions, such as process-based (Intel’s Software Guard eXtensions (SGX) [55]), VM-based (AMD SEV [45]), separate worlds (ARM TrustZone [16]), and hybrid (Keystone [49]). Currently, the most prominent example of TEEs is Intel SGX, which is widely available in commercial off-the-shelf (COTS) desktop and server processors.

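Demand for TEEs keeps growing; they provide hardware-enforced isolation for sensitive data
Existing TEEs offer different levels of abstraction: process-based, VM-based, separate worlds, hybrid, and so on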

Motivations.

Most of today’s TEE technologies are closed-source and require specific hardware or firmware changes that are difficult to audit, slow to evolve, and thus are inferior to cryptographic alternatives (such as homomorphic encryption), which are based upon public algorithms and widely available hardware. Moreover, most existing TEE designs restrict the enclaves (i.e., the protected TEE regions) to run only in fixed mode. It is difficult to support the performance and security requirements of various types of applications that need to be protected by TEEs. For example, Intel SGX enclaves run in the user mode and cannot access privileged resources (such as the file system, the IDT, and page tables) and process privileged events (interrupt and exceptions). As a result, running I/O-intensive and memory-demanding tasks leads to significant performance degradation.

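Most existing TEEs are closed-source, require hardware or firmware changes, and evolve slowly
They also restrict enclaves to a single fixed mode, which does not fit the variety of confidential computing workloads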

To fill the gap, in this paper we propose the design of HyperEnclave to support confidential cloud computing that can run securely on both legacy servers readily available in the cloud, and on the rising ARM (or RISC-V in the future) servers, without requiring specific hardware features. For this purpose, our design provides a process-based TEE abstraction using the widely available virtualization extension (for isolation) and TPM (for root of trust and randomness etc.). To better fulfill the needs for specific enclave workloads, HyperEnclave supports the flexible enclave operation modes, i.e., the enclaves can run at different privilege levels and can have access to certain privileged resources (see Sec. 4 for more details).

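Target scenario: confidential cloud computing
Builds a process-based TEE abstraction from the virtualization extension (for isolation) and the TPM (for root of trust and randomness)
Enclaves can run at different privilege levels and access certain privileged resources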

Design details.

In our design, the system runs in three modes. A trusted software layer, called RustMonitor (security monitor written in Rust), runs in the monitor mode, which is mapped to the VMX root mode. RustMonitor is responsible for enforcing the isolation and is part of the trusted computing base (TCB). The untrusted OS (referred to as the primary OS) provides an execution environment for the untrusted part of applications; the untrusted OS and application parts run in the normal mode, which is mapped to the VMX non-root mode. The trusted part of application (i.e., enclave) runs in the secure mode, which can be mapped flexibly to ring-3 or ring-0 of the VMX non-root mode, or ring-3 of the VMX root mode.

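HyperEnclave runs in three modes:
1. monitor mode: RustMonitor, the trusted software layer (part of the TCB)
2. normal mode: the primary OS and the untrusted parts of applications
3. secure mode: the enclave (the trusted part of an application)

monitor mode maps to VMX root, ring 0
normal mode maps to VMX non-root, ring 0 (primary OS) and ring 3 (applications)
secure mode maps to VMX non-root ring 3, VMX non-root ring 0, or VMX root ring 3 (one mapping per enclave operation mode)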
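
The mode mapping above can be written down as a small lookup table. This is only a sketch for keeping the three modes straight: the class names and the dictionary keys (`guest_user`, `guest_privileged`, `host_user`) are labels invented here, not identifiers from the paper or from RustMonitor.

```python
from dataclasses import dataclass
from enum import Enum


class VmxMode(Enum):
    ROOT = "VMX root"
    NON_ROOT = "VMX non-root"


@dataclass(frozen=True)
class PrivilegeLevel:
    vmx: VmxMode
    ring: int


# Monitor mode: RustMonitor runs in VMX root, ring 0.
MONITOR = PrivilegeLevel(VmxMode.ROOT, 0)

# Normal mode: primary OS in ring 0, untrusted application parts in ring 3.
NORMAL = {
    "primary_os": PrivilegeLevel(VmxMode.NON_ROOT, 0),
    "application": PrivilegeLevel(VmxMode.NON_ROOT, 3),
}

# Secure mode: one possible mapping per enclave operation mode.
# The keys are hypothetical labels chosen for this sketch.
SECURE = {
    "guest_user": PrivilegeLevel(VmxMode.NON_ROOT, 3),
    "guest_privileged": PrivilegeLevel(VmxMode.NON_ROOT, 0),
    "host_user": PrivilegeLevel(VmxMode.ROOT, 3),
}
```

Note that no secure-mode mapping coincides with the monitor's own level, which is what keeps RustMonitor the most privileged component in every operation mode.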

Memory isolation is enforced with hardware-based memory protection of the memory-management unit (MMU). As we observe that existing process-based TEEs (e.g., Inktag [38] and Intel SGX [55]) are vulnerable to page-table-based attacks [74], our memory isolation scheme chooses to manage the enclave’s page table and page fault events entirely by the trusted code, removing the involvement of the primary OS. The design also prevents certain types of enclave malware attacks (Sec. 3.2).

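Address translation and isolation are enforced by the hardware MMU. In existing process-based TEEs the enclave's page table and page faults are handled by the OS, which exposes them to page-table-based attacks. HyperEnclave hands the enclave's page table and page faults entirely to trusted code, removing the OS from the loop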

To minimize the attack surface, we adopt an approach called measured late launch: the primary OS kernel is first booted; then a chunk of special kernel code, implemented as a kernel module in the primary OS, runs to initiate RustMonitor in the most privileged level (i.e., the monitor mode) and demotes the primary OS to the normal mode. All booted components during the booting process are measured and extended to the TPM Platform Configuration Registers (PCRs). Since the TPM attestation guarantees that PCRs cannot be rolled back, the design ensures that RustMonitor is securely launched; otherwise, a violation of the TPM quote would be detected during remote attestation.

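To minimize the attack surface, HyperEnclave adopts a measured late launch strategy
Boot order: primary OS kernel -> a special kernel module (launches RustMonitor in monitor mode and demotes the OS to normal mode)

Every component loaded during boot is measured and extended into the TPM PCRs (extend-only), which is what remote attestation later checks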

We have implemented HyperEnclave on commodity AMD servers. In total RustMonitor consists of about 7,500 lines of Rust code. The APIs of our enclave SDK are compatible with the official SGX SDK. As a result, code written for SGX could be easily ported to run on HyperEnclave by recompiling the code with little (or no) source code changes. We have ported a number of SGX applications, as well as the Rust SGX SDK [71] and the Occlum library OS [64] to HyperEnclave. The micro-benchmarks show that the overheads for ECALLs and OCALLs are < 9,700 and < 5,260 cycles respectively (14,432 and 12,432 cycles respectively on Intel SGX). The evaluation on a suite of real-world applications shows that the overhead is small (e.g., the overhead on SQLite is only 5%).

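Implemented on commodity AMD servers (about 7,500 lines of Rust for RustMonitor); the SDK is API-compatible with the SGX SDK, and the measured overheads are small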

Contributions.

In summary, the paper proposes the design of HyperEnclave, with the following contributions:

An open and cross-platform process-based TEE with minimum hardware requirements (virtualization extensions and TPM) that can run existing SGX programs with little or no source code changes, which enables the reuse of the rich toolchains and ecosystem for Intel SGX.
Supporting the flexible enclave operation modes to fulfill the diverse security and performance requirements of enclave applications without hardware or firmware changes.
A memory isolation scheme that the enclave’s page table and page fault are managed entirely by the trusted code, which mitigates the page-table-based attacks and the enclave malware attacks.
A measured late launch approach, combined with the TPM-based attestation to reduce the attack surface.
An implementation on commodity servers (mostly) using the memory safe language Rust, and an evaluation on real hardware and applications, demonstrating that the proposed design is practical and only has a small overhead.

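Contributions of the paper:

An open, cross-platform, process-based TEE
Support for flexible enclave operation modes
A memory isolation scheme in which the page table and page faults are managed entirely by trusted code
A measured late launch approach
An implementation in Rust on commodity servers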

2 Background

2.1 Trusted Execution Environment

A Trusted Execution Environment (TEE) is designed to ensure that sensitive data is stored, processed, and protected in an isolated and trusted environment. The isolated area could be a separate system apart from the normal operating system (such as the TrustZone [16] secure world), a part of a process address space (such as an Intel SGX [55] enclave), or a stand-alone VM (such as a virtual machine protected by AMD SEV [45] or Intel TDX [41]). To resist the privileged attacker, TEE needs to thwart not only the OS-level adversary but also the malicious party who has physical access to the platform. To this end, it offers hardware-enforced security features including isolated execution, integrity, and confidentiality protection of the enclave, along with the ability to authenticate the code running inside a trusted platform through remote attestation.

A TEE must guarantee:

Isolated execution, integrity protection, and confidentiality protection
Remote attestation

Isolation.

At the core of a TEE is the memory isolation scheme, which guarantees that code, data, and the runtime state of the enclave cannot be accessed or tampered with by untrusted parties. For Intel SGX, the protected memory (i.e., the enclave) is mapped to a special physical memory area called Enclave Page Cache (EPC), which is encrypted and cannot be directly accessed by other software, firmware, BIOS, and direct memory access (DMA).

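The core of a TEE is its memory isolation scheme
In Intel SGX, protected memory is mapped to a special physical memory area (the EPC), which is encrypted and cannot be directly accessed by other software, firmware, the BIOS, or DMA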

Attestation.

The goal of remote attestation is to generate an attestation quote, which includes the measurement of the software state, signed with the attestation key embedded in the hardware. The remote user verifies the validity of the quote by checking the signature (which reflects the hardware identity) and the measurement (which proves the software state).

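The goal of remote attestation is to generate an attestation quote
The quote contains the measurement of the software state, signed with the attestation key embedded in the hardware; the remote user checks the signature (hardware identity) and the measurement (software state)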

2.2 Trusted Platform Module

Trusted Platform Module (TPM) is both an industry standard [36] and an ISO/IEC standard [4] for a secure cryptoprocessor. It is used by nearly all PC and server manufacturers. Firmware TPMs (fTPMs) are firmware-based (e.g. UEFI) TPM implementations. At the time of this writing, Intel, AMD, and Qualcomm all have implemented fTPMs.

The TPM is a secure cryptoprocessor, present in nearly all current PCs and servers

TPM has a set of Platform Configuration Registers (PCRs), which can be used for the measurement of the booted code during the boot process. PCRs are reset to zero on system reboot or power on-off. During every boot process, the PCRs can only be extended with the new measurement (called PCR extend), and thus cannot be set to arbitrary values.

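The TPM contains PCRs, which hold the measurements of booted code; they reset to zero at power-on or reboot and can only be extended, never overwritten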
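
A minimal sketch of the extend-only behavior, assuming a SHA-256 PCR bank. The component byte strings are placeholders, and `measure` and `pcr_extend` are names chosen here; a real TPM performs the extend inside the chip.

```python
import hashlib

PCR_SIZE = 32  # SHA-256 bank; the bank choice is an assumption of this sketch


def measure(component: bytes) -> bytes:
    """Measure a boot component by hashing its image."""
    return hashlib.sha256(component).digest()


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value: H(old_pcr || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


# PCRs start at all zeros after reset.
pcr = bytes(PCR_SIZE)
for image in (b"bios", b"grub", b"kernel"):  # placeholder images
    pcr = pcr_extend(pcr, measure(image))

# The same components in a different order give a different value,
# so the final PCR commits to both the contents and the order.
reordered = bytes(PCR_SIZE)
for image in (b"grub", b"bios", b"kernel"):
    reordered = pcr_extend(reordered, measure(image))
```

Because each new value hashes the previous one, there is no input that sets a PCR to a chosen value or rolls it back; the only way to reproduce a given value is to replay the same measurements from reset.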

Every TPM ships with a unique asymmetric key, called the Endorsement Key (EK), embedded by the manufacturer as the root of trust. The TPM can generate a quote of the PCR values, signed using the TPM Attestation Identity Keys (AIK), while the AIK is generated inside TPM and certified using EK. Any modifications of the booted code would be reflected in the quote. Upon receiving the quote, the remote party can validate the signing key comes from an authentic TPM and can be assured that the PCR digest report has not been altered.

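The TPM ships with a key embedded by the manufacturer, the EK
The AIK is generated inside the TPM and certified using the EK
The PCR values are signed with the AIK and sent to the remote party, who validates them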

2.3 Threat Model

Like the other TEE proposals [23, 49], we trust the underlying hardware, including the processor establishing the virtualization-based isolation, the System Management Mode (SMM) code, as well as the TPM. We assume that the Core Root of Trust for Measurement (CRTM) is trusted and immutable. HyperEnclave mitigates certain physical memory attacks, such as cold boot attacks and bus snooping attacks with the hardware support for memory encryption. We don’t fully trust the operator and assume the attacker cannot mount physical attacks during the boot process, i.e., we assume that the system is initially benign (during system boot), and the early OS during the boot stage is part of the TCB. This can be achieved in two ways.

Firstly, the power-on event can be secured with a hardware device, such as an HSM (i.e., hardware security module). The platform enters the boot process only with the engagement and supervision of a trusted party, who owns the HSM. After that, the operators for maintenance are not trusted.
Secondly, the boot process can be enhanced to defend against adversaries with physical accesses. To prevent I/O attacks, we can harden the OS to remove unnecessary devices and disable the DMA capability of peripherals before IOMMU is enabled. We can enable memory encryption at an early stage (e.g., in the BIOS, before any off-chip memory is used) to prevent physical memory attacks.

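Trusted hardware and software:

The processor that provides virtualization-based isolation
System Management Mode (SMM) code
The TPM and the CRTM

Additional assumptions:

The attacker cannot mount physical attacks during system boot
The system is initially benign, and the early OS during the boot stage is part of the TCB

The paper outlines two ways to meet these assumptions (an HSM-supervised power-on, or a hardened boot process); it describes them rather than proving them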

3 Design

HyperEnclave is designed to support confidential cloud computing without requiring specific hardware features. Therefore, HyperEnclave is built upon the widely available virtualization extension. In particular, HyperEnclave is designed to support the process-based TEE model (similar to Intel SGX) for the following reasons.

Minimized TCB. To protect an application using the process-based TEE, the TCB includes only the protected code itself, while in the other forms of TEE, much more code must be included, such as the guest operating system for VM-based TEEs.
Established ecosystem. Since Intel SGX is currently the most prevalent TEE supported in the cloud (major CSPs, including GCP, Azure, and Aliyun, provide SGX-based instances [9, 62]), a rich set of toolchains and applications have been developed. Supporting the SGX model reduces the porting effort and makes it easy to deploy confidential computing tasks in the cloud.
Cloud computing trends. We have witnessed a clear trend towards running container-based serverless applications in the cloud. Protecting these applications against untrusted clouds using TEEs is important. Considering that such computing tasks are typically short-lived, and favor a short startup time, maintaining a VM seems to be too heavy-weight.

In this section, we introduce HyperEnclave using x86 notations, as we prototyped HyperEnclave on AMD servers.

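Why build a process-based TEE modeled on SGX:
Minimized TCB + SGX's established ecosystem can be reused + cloud workloads are trending toward short-lived tasks (maintaining a VM is too heavy-weight)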

3.1 System Overview

HyperEnclave supports the following modes: the monitor mode, i.e., VMX root operation mode; the normal mode for the primary OS and untrusted part of applications, i.e., ring-0 and ring-3 of the VMX non-root operation mode respectively; and the secure mode for the enclave, which could be ring-3 and ring-0 of the VMX non-root operation mode, or ring-3 of the VMX root operation mode, depending on the enclave operation mode. We will introduce the flexible operation mode supported by HyperEnclave in Sec. 4. As illustrated in Figure 1, HyperEnclave consists of the following components:

RustMonitor is a lightweight hypervisor running in the monitor mode that manages the enclave memory, enforces the memory isolation, and controls the enclave state transitions. It works as a resource monitor, while complicated tasks are offloaded to the primary OS.
RustMonitor creates a unique guest VM (referred to as the normal VM) that runs the primary OS (such as Linux) and hosts the untrusted part of applications in the normal mode. The primary OS is still in charge of process scheduling and I/O devices management, but it is not trusted by the RustMonitor and enclaves.
Application is the untrusted part of the application which runs in the primary OS.
The kernel module. We provide a kernel module in the primary OS to load, measure, and launch RustMonitor, as well as to invoke the emulated privileged operations.
To ease development, HyperEnclave provides an enclave SDK with APIs compatible with the official Intel SGX SDK [12], including both the untrusted runtime and trusted runtime (i.e., SDK uRTS and SDK tRTS). As such, most SGX programs can run on HyperEnclave with little or no source code changes.
Enclave is the trusted part of the application running in the secure mode.

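HyperEnclave consists of six components:

1. RustMonitor
2. Normal VM (the guest VM that runs the primary OS)
3. Application
4. Kernel module
5. Enclave SDK
6. Enclave

Example call flow for an application:
The application creates the enclave through the SDK API (the kernel module forwards the emulated privileged operations to RustMonitor through hypercalls) -> the application enters the enclave with the EENTER hypercall -> RustMonitor checks the request and switches to secure mode, where the enclave runs -> the enclave exits and the result is returned to the application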

3.2 Memory Management and Protection

Challenges.

For process-based TEEs, the enclave runs in the user mode and is not able to manage its own page table. Existing designs (e.g., Intel SGX, TrustVisor [54]) allow the untrusted OS to manage the enclave’s page table. To prevent memory mapping attacks (i.e., attacks by manipulating the enclave’s address mappings, as shown in Figure 9, Appendix A.1), the design of SGX extends the Page Missing Handler (PMH) and introduces a new metadata called EPCM for additional security checks on TLB misses [32]. Without secure hardware support, a prevalent software solution [19, 54, 75] is to make the page tables write-protected by setting the page table entries (PTEs) for pages holding the page tables, i.e., any update to the page table traps to the hypervisor and is then verified. However, on x86 platforms the updates of access and dirty bits of the PTEs also trap into the hypervisor, leading to non-negligible overhead. Even worse, since the enclave page fault is also processed by the OS, the above designs are still vulnerable to the page-table-based attacks, such as the controlled-channel attacks [74].

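In SGX the enclave's page table is managed by the OS; SGX blocks memory mapping attacks in hardware, through the extended PMH and the EPCM checks on TLB misses. Without such hardware, the common software approach write-protects the page tables so that every update traps to the hypervisor for verification, but on x86 this also traps access-bit and dirty-bit updates, so the overhead is non-negligible
Page faults are still handled by the OS, so these designs remain vulnerable to controlled-channel attacks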

The design becomes more challenging to support enclave dynamic memory management (i.e., EDMM on SGX2 platforms [34]), i.e., dynamically adding or removing enclave pages, or changing the enclave page attributes or types after the enclave is initialized. Without EDMM, all physical memory that the enclave might ever use must be committed before enclave initialization. Therefore, EDMM reduces enclave build time and enables new enclave features, such as on-demand stack and heap growth, and on-demand creation of code pages to support just-in-time (JIT) compilation. On SGX2 platforms, the enclaves need to send the EDMM request to the SGX driver through OCALLs, who then makes the requested changes. Since the driver is untrusted by the enclaves, the changes need to be explicitly checked and accepted by the enclaves to take effect, which involves heavy enclave mode switches.

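Another major challenge is enclave dynamic memory management (EDMM): adding or removing enclave pages, or changing page attributes or types, after the enclave is initialized
On SGX2 the enclave sends the EDMM request to the SGX driver through an OCALL and the driver makes the change. Because the driver is untrusted, the enclave must explicitly check and accept the change before it takes effect, which involves heavy enclave mode switches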

HyperEnclave memory management.

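Root cause of the problems above: the enclave's page table and page faults are managed by the OS
HyperEnclave gives each enclave a separate page table, and RustMonitor manages that page table and the enclave's page faults entirely. This raises the problem of keeping the enclave's page table in sync with the ordinary process page table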

To eliminate the overhead for synchronization, we preallocate a marshalling buffer in the application’s address space, which is shared with the enclave. The mappings of the marshalling buffer are fixed during the entire enclave life cycle by pre-populating the physical memory and pinning it in the memory. All data exchanged between the enclave and the application must be passed through the marshalling buffer. The application’s memory mappings (except those for the marshalling buffer) are not needed by the enclave and are not included in the enclave’s page table. Such a design also mitigates the known enclave malware attacks [63], as the enclave cannot access the application’s address space but the marshalling buffer (Sec. 6 for more details). We remind the attacker may manipulate the marshalling buffer, however it does not cause additional security issues, since the buffer is untrusted by design where the developer is responsible to ensure that the data transmitted through the buffer is authentic and protected (same as the SGX model).

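To eliminate synchronization between the enclave's page table and the process page table, HyperEnclave introduces a marshalling buffer for exchanging data between the application's address space and the enclave. The buffer's mappings are fixed (pre-populated and pinned), and the developer is responsible for making sure the data passed through the buffer is authentic and protected
The enclave's page table does not hold the application's other mappings, so the enclave can reach nothing in the application's address space except the marshalling buffer, which also mitigates enclave malware attacks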
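
The following sketch illustrates the marshalling-buffer discipline described above: all exchanged data goes through one fixed, bounds-checked region, and the enclave side copies its arguments out before using them. `MarshallingBuffer` and `ecall_echo_upper` are hypothetical names for illustration, not part of the HyperEnclave SDK, and a `bytearray` stands in for the pinned physical pages.

```python
class MarshallingBuffer:
    """Fixed-size region shared between the application and the enclave."""

    def __init__(self, size: int) -> None:
        self._mem = bytearray(size)  # stands in for pinned physical pages

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self._mem):
            raise ValueError("write outside the marshalling buffer")
        self._mem[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or length < 0 or offset + length > len(self._mem):
            raise ValueError("read outside the marshalling buffer")
        return bytes(self._mem[offset:offset + length])


def ecall_echo_upper(buf: MarshallingBuffer, offset: int, length: int) -> bytes:
    """Hypothetical enclave entry point.

    It copies the argument out of the untrusted buffer first, so later
    changes to the buffer cannot affect the computation.
    """
    private_copy = buf.read(offset, length)
    return private_copy.upper()


buf = MarshallingBuffer(4096)
buf.write(0, b"hello")
result = ecall_echo_upper(buf, 0, 5)
buf.write(0, b"XXXXX")  # the untrusted side may overwrite the buffer afterwards
```

The copy-before-use step reflects the point that the buffer is untrusted by design: the untrusted side can change it at any time, so validation has to happen on a private copy.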

When the enclave accesses a virtual address that is not committed with a physical page (e.g., due to page swapping or EDMM), a page fault is raised and the enclave traps to RustMonitor. RustMonitor picks up a free page from the enclave memory pool, inserts a new mapping to the enclave’s page table, and resumes the enclave’s execution. When the enclave requests changing the page permissions, the enclave issues a hypercall to RustMonitor to update the permissions in the enclave’s page table and clear the corresponding TLB entries.

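On a page fault, RustMonitor handles it: it takes a free page from the enclave memory pool, inserts the mapping into the enclave's page table, and resumes the enclave. When the enclave wants to change page permissions, it issues a hypercall; RustMonitor updates the permissions in the enclave's page table and clears the corresponding TLB entries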
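
A toy model of the two paths just described (demand paging on a fault, and a permission change that flushes the stale TLB entry). Dictionaries stand in for the page table and the TLB, and all names are invented for this sketch; the real RustMonitor operates on hardware page tables.

```python
class EnclaveMemory:
    """Toy model of an enclave page table owned by the monitor."""

    def __init__(self, free_frames):
        self.free_frames = list(free_frames)  # enclave memory pool
        self.page_table = {}                  # virtual page -> (frame, perms)
        self.tlb = {}                         # cached translations

    def handle_page_fault(self, vpage, perms="rw"):
        """Commit a free frame for vpage and record the mapping."""
        if not self.free_frames:
            raise MemoryError("enclave memory pool exhausted")
        frame = self.free_frames.pop()
        self.page_table[vpage] = (frame, perms)
        return frame

    def access(self, vpage):
        """Translate vpage, faulting in a frame on first touch."""
        if vpage not in self.tlb:
            if vpage not in self.page_table:
                self.handle_page_fault(vpage)
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage][0]

    def change_permissions(self, vpage, perms):
        """Hypercall path: update the mapping and drop the stale TLB entry."""
        frame, _ = self.page_table[vpage]
        self.page_table[vpage] = (frame, perms)
        self.tlb.pop(vpage, None)


mem = EnclaveMemory(free_frames=[10, 11])
frame = mem.access(0x1000)           # first touch faults and maps a frame
mem.change_permissions(0x1000, "r")  # permissions change, TLB entry flushed
```

Neither path involves the primary OS, which is the property the paper relies on to rule out page-table-based observation of enclave accesses.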

HyperEnclave memory isolation.

Figure 2 shows the memory mappings of the applications within the normal VM and the enclaves. The application’s memory within the normal VM is managed with nested paging, while the enclave’s memory could be managed through nested paging or through normal 1-level address translation, determined by the corresponding operation mode (Sec. 4). As a result, HyperEnclave enforces the following security requirements.

R-1: The primary OS and applications are not allowed to access the physical memory belonging to RustMonitor and the enclaves.
R-2: The enclave is not allowed to access physical memory belonging to RustMonitor and other enclaves. It is designed to have access to only a specific memory region shared with the untrusted application for parameter passing (i.e., the marshalling buffer).
R-3: DMA accesses from malicious peripherals to the physical memory belonging to RustMonitor and the enclaves are not allowed. In order to prevent such attacks, HyperEnclave restricts the physical memory used by the peripherals with the support of the Input-Output Memory Management Unit (IOMMU) in modern processors.

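The OS and untrusted applications run inside the normal VM and use nested paging (guest virtual address -> guest physical address -> host physical address), while the enclave uses either nested paging or ordinary 1-level address translation, depending on its operation mode
HyperEnclave's memory isolation rules:

R-1: The OS and untrusted applications cannot access memory belonging to RustMonitor or the enclaves
R-2: An enclave cannot access memory belonging to RustMonitor or other enclaves
R-3: DMA from peripherals cannot target memory belonging to RustMonitor or the enclaves; this is enforced with the IOMMU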
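
R-1 to R-3 can be read as an access-control predicate over (subject, memory region) pairs. The sketch below encodes them that way. The tuple encoding, the `may_access` name, and two choices marked in the comments (the monitor can reach all memory; devices cannot reach the marshalling buffer) are assumptions of this illustration, not statements from the paper.

```python
def may_access(subject: tuple, region: tuple) -> bool:
    """Decide whether subject may touch region.

    Both arguments are (kind, id) pairs, e.g. ("enclave", 1) or
    ("normal", None). Kinds: "monitor", "normal", "enclave", "device"
    for subjects, and "monitor", "normal", "enclave", "marshalling"
    for regions.
    """
    s_kind, s_id = subject
    r_kind, r_id = region
    if s_kind == "monitor":
        return True  # assumption: the monitor can reach all memory
    if r_kind == "monitor":
        return False  # R-1, R-2, R-3: nobody else touches RustMonitor
    if r_kind == "enclave":
        # R-1 and R-3 exclude the normal VM and devices; R-2 excludes
        # every enclave except the owner.
        return s_kind == "enclave" and s_id == r_id
    if r_kind == "marshalling":
        # Shared between the application and the one enclave it belongs to.
        # Assumption: devices get no access to it in this model.
        return s_kind == "normal" or (s_kind == "enclave" and s_id == r_id)
    if r_kind == "normal":
        # Enclaves see only the marshalling buffer of application memory.
        return s_kind in ("normal", "device")
    return False
```

For example, `may_access(("enclave", 1), ("enclave", 2))` is `False` under R-2, while `may_access(("enclave", 1), ("marshalling", 1))` is `True`.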

Memory encryption.

To thwart physical memory attacks, such as cold boot and bus snooping attacks, HyperEnclave may leverage hardware memory encryption (such as AMD SME [44] and Intel MKTME [42]) to encrypt partial physical memory at the page granularity. If the platform does not support hardware memory encryption, HyperEnclave may consider to apply software approaches [76] to encrypt the isolated memory. This approach, however, may impose substantial overhead compared with hardware based solutions.

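HyperEnclave can use hardware memory encryption (such as AMD SME or Intel MKTME) against physical memory attacks; where hardware support is missing, software encryption is an option, at substantially higher overhead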

3.3 Trusted Boot, Attestation and Sealing

Measured Late Launch.

The boot process of HyperEnclave is shown in Figure 3. On system boot, a static and immutable piece of code, known as the Core Root of Trust for Measurement (CRTM), executes first to bootstrap the process of building a measurement chain for subsequent firmware and software, including the BIOS, grub, the primary OS kernel, and initramfs. The measurements are stored to TPM PCRs for each boot component, so that any modification will be reflected in the attestation quote.

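On system boot, a static and immutable piece of code, the CRTM, runs first and starts a measurement chain covering the BIOS, grub, the primary OS kernel, and initramfs. The measurements are stored in the TPM PCRs, so any modification shows up in the attestation quote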

To reduce the attack surface from the primary OS, we put the RustMonitor image into the initramfs. The kernel measures the RustMonitor image and extends the value to TPM PCRs, then it launches RustMonitor in early userspace, i.e., before any userspace program that relies on the disk file system starts to run. Along with the measured boot, it ensures that the software state when RustMonitor is loaded is trusted.

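initramfs: a temporary root file system, a compressed CPIO archive loaded into memory early in kernel boot. After the kernel finishes hardware initialization, it provides the tools and drivers needed to mount the real root file system (/). The RustMonitor image is placed in the initramfs, measured by the kernel, and launched in early userspace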

After RustMonitor is loaded, the execution continues at the pre-defined entry. RustMonitor sets up its own running context (such as the stack, page table, IDT, etc.) and prepares the virtual CPU (vCPU) configurations for each CPU. Then RustMonitor launches the normal VM and demotes the primary OS to the normal mode. Returning to the kernel module, the kernel continues to boot in the normal mode and is unaware of the existence of RustMonitor.

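After RustMonitor is loaded: set up its own context -> prepare the vCPUs -> launch the normal VM -> demote the primary OS to normal mode so that it runs inside the normal VM -> return to the kernel module and continue booting; from here on the kernel is unaware of RustMonitor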

HyperEnclave applies the above approach (referred to as measured late launch) so that RustMonitor is loaded as a type-2 hypervisor (like KVM) while running as a type-1 hypervisor (like Xen). In this way, RustMonitor does not need to trust the primary OS anymore after the primary OS is demoted to the normal mode.

RustMonitor is loaded like a type-2 hypervisor and runs like a type-1 hypervisor
Type-1 hypervisor: runs on bare metal, without a host OS
Type-2 hypervisor: runs on top of a host OS

Remote Attestation.

With the measured late launch, all booted components are measured and extended to the TPM. After RustMonitor is booted, it needs to extend the trust to the enclaves. For this purpose, RustMonitor derives an attestation key pair which is used to sign the enclave measurement. Then RustMonitor extends the derived public key to the TPM PCR, and the private key never leaves RustMonitor which is protected by memory isolation and encryption.

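RustMonitor must extend trust to the enclaves (so that users can trust them). It derives an attestation key pair, signs the enclave measurement with the private key (which never leaves RustMonitor), and extends the public key into a TPM PCR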

During enclave creation, all pages added to the enclave (including the corresponding page content, page type, and RWX permissions) are measured by RustMonitor to generate the enclave measurement. The (intermediate) measurement is stored in RustMonitor’s memory, which is invisible to the enclaves and the primary OS.

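During enclave creation, RustMonitor measures every page added to the enclave (page content, page type, and RWX permissions) and keeps the measurement in memory that neither the OS nor the enclaves can see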
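
A sketch of such a measurement, assuming SHA-256 and a simple length-prefixed encoding of the three measured fields. The paper does not give the exact encoding or hash here, so `measure_enclave`, `_field`, and the page tuples are illustrative only.

```python
import hashlib


def _field(data: bytes) -> bytes:
    """Length-prefix a field so adjacent fields cannot be confused."""
    return len(data).to_bytes(4, "little") + data


def measure_enclave(pages) -> bytes:
    """Hash every added page: its type, its permissions, and its content.

    pages is an iterable of (page_type, perms, content) tuples in the
    order the pages were added.
    """
    h = hashlib.sha256()
    for page_type, perms, content in pages:
        h.update(_field(page_type.encode()))
        h.update(_field(perms.encode()))
        h.update(_field(content))
    return h.digest()


pages = [
    ("TCS", "rw", b"\x00" * 16),      # placeholder thread control page
    ("REG", "rx", b"enclave code"),   # placeholder code page
    ("REG", "rw", b"enclave data"),   # placeholder data page
]
baseline = measure_enclave(pages)

# Flipping the permissions of one page changes the measurement even
# though every page's content is unchanged.
relaxed = list(pages)
relaxed[1] = ("REG", "rwx", b"enclave code")
changed = measure_enclave(relaxed)
```

Covering the permissions as well as the content is what lets a verifier notice, for instance, a code page that was silently made writable.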

Similar to TPM and Intel SGX, HyperEnclave adopts a SIGn-and-MAc (SIGMA) attestation protocol for the remote attestation flow. As shown in Figure 4, we denote the public key of RustMonitor’s attestation key by the hypervisor attestation public key (hapk). The enclave measurement is signed using RustMonitor’s attestation key to form the enclave measurement signature (ems). The TPM quote TPM_Quote, which is signed using the TPM attestation key, includes the PCRs for the measurement of all booted code, and the measurement of hapk. Upon receiving the attestation report, the remote user can verify the report by comparing the measurement of booted code (including the CRTM, BIOS, grub, kernel, initramfs, and hypervisor) and the enclave, as well as verifying the certificate chain for generating the signature.

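HyperEnclave follows a SIGMA (SIGn-and-MAc) remote attestation protocol

Term	Role	Security property
SIGMA protocol	Authentication protocol that assures the identities of the communicating parties and the integrity of the data	Resists man-in-the-middle attacks and data tampering
hapk	RustMonitor's attestation public key; verifies the signature over the enclave measurement	The matching private key is protected by memory isolation and encryption
ems	Signature over the enclave measurement (produced with the private key matching hapk)	Shows the enclave has not been tampered with
TPM_Quote	Report signed by the TPM; contains the PCR values and the measurement of hapk	Hardware-rooted; shows the boot chain is trusted
PCR (Platform Configuration Register)	Register in the TPM that accumulates the hashes of each boot stage	Tampering with any component changes the PCR value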
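
The verifier's side of the flow can be sketched as four checks. Two loud assumptions: real attestation uses asymmetric signatures and a certificate chain, whereas this sketch substitutes an HMAC (`toy_sign`, `toy_verify`) in which the same bytes act as signing and verification key, purely to stay within the standard library; and the report layout (a dict with `pcr`, `tpm_quote_sig`, `hapk`, `enclave_measurement`, `ems`) is invented here. Certificate-chain validation is omitted.

```python
import hashlib
import hmac


def toy_sign(key: bytes, msg: bytes) -> bytes:
    # Stand-in for an asymmetric signature. NOT a real signature scheme.
    return hmac.new(key, msg, hashlib.sha256).digest()


def toy_verify(key: bytes, msg: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(toy_sign(key, msg), sig)


def replay_pcr(measurements) -> bytes:
    """Recompute the PCR value a given measurement sequence would produce."""
    pcr = bytes(32)
    for m in measurements:
        pcr = hashlib.sha256(pcr + m).digest()
    return pcr


def verify_report(report, tpm_key, expected_boot, expected_enclave) -> bool:
    hapk_digest = hashlib.sha256(report["hapk"]).digest()
    expected_pcr = replay_pcr(list(expected_boot) + [hapk_digest])
    return (
        # 1. TPM_Quote is signed by the TPM attestation key.
        toy_verify(tpm_key, report["pcr"], report["tpm_quote_sig"])
        # 2. The quoted PCR matches the expected boot chain plus hapk.
        and hmac.compare_digest(report["pcr"], expected_pcr)
        # 3. ems is a valid signature over the enclave measurement.
        and toy_verify(report["hapk"], report["enclave_measurement"], report["ems"])
        # 4. The enclave measurement is the one the verifier expects.
        and hmac.compare_digest(report["enclave_measurement"], expected_enclave)
    )


tpm_key = b"toy-tpm-attestation-key"
hapk = b"toy-rustmonitor-attestation-key"
boot = [hashlib.sha256(c).digest()
        for c in (b"crtm", b"bios", b"grub", b"kernel", b"initramfs", b"rustmonitor")]
enclave_measurement = hashlib.sha256(b"enclave pages").digest()
pcr = replay_pcr(boot + [hashlib.sha256(hapk).digest()])
report = {
    "pcr": pcr,
    "tpm_quote_sig": toy_sign(tpm_key, pcr),
    "hapk": hapk,
    "enclave_measurement": enclave_measurement,
    "ems": toy_sign(hapk, enclave_measurement),
}
ok = verify_report(report, tpm_key, boot, enclave_measurement)
```

Check 2 is what binds hapk to the measured boot: a key generated by anything other than the measured RustMonitor would not appear in a PCR value that replays correctly.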

Secret key generation.

When RustMonitor is initialized for the first time, it generates a root key Kroot from the random number generator (RNG) module of the TPM. Kroot is stored outside the TPM using TPM’s seal operation. During the booting process on system reset, RustMonitor decrypts Kroot using TPM’s unseal operation, which guarantees that Kroot can only be unsealed with the exactly same TPM chip with matching PCR configurations. Furthermore, RustMonitor floods the PCRs with a constant before transferring control to the primary OS to prevent it from retrieving Kroot. All other key materials, including the enclave’s sealing key and report key are derived from both Kroot and the enclave’s measurement.

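On first initialization RustMonitor generates the root key Kroot from the TPM's RNG and seals it outside the TPM; on later boots it can be unsealed only on the same TPM chip with matching PCR values
Before handing control to the primary OS, RustMonitor floods the PCRs with a constant so that the OS cannot unseal Kroot
The enclave's sealing key and report key are derived from Kroot and the enclave's measurement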
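
The paper states that the sealing key and report key are derived from both Kroot and the enclave's measurement, but this excerpt does not name the derivation function. The sketch below assumes HMAC-SHA256 with a purpose label, which is one common way to build such a derivation; `derive_key` and the labels are hypothetical, and `os.urandom` stands in for the TPM's RNG.

```python
import hashlib
import hmac
import os


def derive_key(k_root: bytes, enclave_measurement: bytes, label: bytes) -> bytes:
    """Derive a per-enclave, per-purpose key from the root key.

    Assumption: HMAC-SHA256 keyed with k_root over label || 0x00 ||
    measurement. The actual KDF used by RustMonitor may differ.
    """
    return hmac.new(k_root, label + b"\x00" + enclave_measurement,
                    hashlib.sha256).digest()


k_root = os.urandom(32)  # stands in for the TPM RNG output
enclave_a = hashlib.sha256(b"enclave A pages").digest()
enclave_b = hashlib.sha256(b"enclave B pages").digest()

seal_a = derive_key(k_root, enclave_a, b"seal")
report_a = derive_key(k_root, enclave_a, b"report")
seal_b = derive_key(k_root, enclave_b, b"seal")
```

Mixing the measurement into the derivation means a modified enclave (different measurement) gets a different sealing key and cannot unseal data sealed by the original.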

3.4 The Enclave SDK

Porting existing applications to the enclaves can be cumbersome since TEEs usually expose limited hardware and software interfaces and provide additional security services (e.g., attestation and sealing). For process-based TEEs, the applications need to be partitioned into the trusted and untrusted parts, and the interfaces need to be carefully designed to avoid various security pitfalls [27, 46, 69]. A lot of effort has been spent and many tools have been developed for Intel SGX, due to its dominant position in the market, including library OSes [64, 67], containers [18], automatic partition and protection tools [50, 68], WebAssembly Micro Runtime [57], and interface protection [65]. Consequently, Intel SGX has supported securely running applications written in C/C++, Rust, Java, Python, etc., without expensive code refactoring.

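A process-based TEE requires splitting the application into trusted and untrusted parts, so porting existing applications to enclaves can be cumbersome; the tooling around SGX, however, is extensive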

We provide the enclave SDK with APIs compatible with the official Intel SGX SDK to ease the development of applications on HyperEnclave. The enclave SDK is retrofitting the official SGX SDK. By replacing the SGX user leaf functions (e.g., EENTER, EEXIT, and ERESUME) with hypercalls, SGX programs can run on HyperEnclave with little or no source code changes. Once the enclave executes these user leaf functions, it traps to RustMonitor and RustMonitor emulates the functionalities of the corresponding SGX instructions.

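The paper provides an enclave SDK whose APIs are compatible with the SGX SDK. The SGX user leaf functions (EENTER, EEXIT, ERESUME) are replaced with hypercalls; when one is executed, it traps to RustMonitor, which emulates the corresponding SGX instruction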

The enclave is compiled as a trusted library of the application, while the application itself runs in the primary OS. The enclave life cycle is managed through the emulation of a set of privileged SGX instructions (i.e., ECREATE, EADD, EINIT, etc.). To this end, the kernel module running in the primary OS provides similar functionalities by invoking RustMonitor through hypercalls, and exposes the functionalities to the applications by the ioctl() interfaces. By emulating the privileged SGX instructions, RustMonitor is responsible for the management of the enclave’s life cycle (Sec. 4).

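In SGX the enclave life cycle is managed through a set of privileged instructions; RustMonitor manages the life cycle by emulating them, reached from the application through the kernel module's ioctl() interface and hypercalls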
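
The emulation idea can be pictured as a dispatch table from SGX leaf names to handlers inside the monitor. This is a sketch only: `MonitorModel` and its handlers are invented for illustration, the state kept per enclave is minimal, and the ordering rules it enforces (no EADD after EINIT, no EENTER before EINIT) are modeled on SGX semantics rather than taken from RustMonitor's code.

```python
class MonitorModel:
    """Toy monitor that emulates a few SGX leaves as hypercall handlers."""

    def __init__(self):
        self._enclaves = {}
        self._next_id = 1
        self._leaves = {
            "ECREATE": self._ecreate,
            "EADD": self._eadd,
            "EINIT": self._einit,
            "EENTER": self._eenter,
        }

    def hypercall(self, leaf, *args):
        """Entry point: look up the emulated leaf and run its handler."""
        handler = self._leaves.get(leaf)
        if handler is None:
            raise ValueError(f"unsupported leaf: {leaf}")
        return handler(*args)

    def _ecreate(self):
        eid = self._next_id
        self._next_id += 1
        self._enclaves[eid] = {"pages": [], "initialized": False}
        return eid

    def _eadd(self, eid, page):
        enclave = self._enclaves[eid]
        if enclave["initialized"]:
            raise RuntimeError("EADD is not allowed after EINIT")
        enclave["pages"].append(page)

    def _einit(self, eid):
        self._enclaves[eid]["initialized"] = True

    def _eenter(self, eid):
        if not self._enclaves[eid]["initialized"]:
            raise RuntimeError("EENTER requires an initialized enclave")
        return f"running enclave {eid}"


monitor = MonitorModel()
eid = monitor.hypercall("ECREATE")
monitor.hypercall("EADD", eid, b"code page")
monitor.hypercall("EINIT", eid)
status = monitor.hypercall("EENTER", eid)
```

Keeping the leaf names identical to SGX's is what allows the SDK to swap instructions for hypercalls without changing the code that calls them.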

To be compatible with the official Intel SGX SDK, most data structures involved in HyperEnclave (such as the SIGSTRUCT structure, the SECS page, and the TCS page) are similar to that of SGX. With the HyperEnclave design, it is straightforward to support dynamic enclave management in an enclave, since the enclave memory and page fault are all managed by RustMonitor. Multi-threading within the enclave is supported by associating one TCS page for each enclave thread within the enclave. Exception handling within the enclave is supported by setting more than 1 SSA page for each TCS. The details are omitted due to space constraints and we refer the readers to the SGX manual [11] for more details.

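To stay compatible with SGX, many of its mechanisms and data structures are reused:

SIGSTRUCT: the enclave's signed identity
SECS: the enclave's metadata
TCS: the thread control structure of an enclave thread
SSA: saves the enclave's state when an exception occurs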

4 Flexible Enclave Operation Mode

A wide range of existing applications can be offloaded to the TEEs, such as computing-intensive tasks (machine learning [60]), input and output (IO)-intensive tasks (such as the Apache and Nginx web server [18]), memory-intensive tasks (Redis and Memcached [18]), and tasks which favor in-enclave exception handling and privilege separation [21]. Most TEEs support running the enclaves only in fixed mode, Intel SGX (also TrustVisor [54] and Secage [51]) enclaves in particular, as part of the application address space, run in user mode. As a result, the user mode enclave is not allowed to access the privileged resources (such as the IDT and page tables) and process the privileged events (interrupt and exceptions). It must switch to the untrusted code to gain access to privileged resources and handle the events. The I/O-intensive and memory-intensive tasks essentially involve the frequent world switches which are expensive and introduce non-negligible performance losses, even though both software and hardware optimizations have been proposed trying to reduce the context switch latencies [61, 66, 73]. In this section, we introduce the three enclave operation modes supported by HyperEnclave, as shown in Figure 5. The world switches in different enclave operation modes are shown in Figure 6.

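Many kinds of application workloads run inside TEEs, including computing-intensive, I/O-intensive, and memory-intensive tasks.
Existing enclaves run only in a fixed mode. SGX enclaves, for example, run in user mode, so an I/O-intensive task needs a world switch (to the untrusted part) to access privileged resources, which costs significant performance.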

4.1 Guest User Enclaves

Guest user enclave (GU-Enclave) is the basic enclave operation mode, which typically runs computing-intensive tasks. The enclave runs in the guest user mode (i.e., guest ring-3 of the VMX non-root operation mode). During the enclave creation, RustMonitor prepares a vCPU structure which contains a guest page table (GPT) and a nested page table (NPT) for GU-Enclave. On entry and exit between the normal VM and the enclave VM, RustMonitor switches the vCPU states (e.g. the instruction pointer, thread pointer, NPT, and GPT) accordingly. To handle the interrupts and exceptions while the enclave is running, RustMonitor configures the vCPU to trap all interrupts and exceptions to the monitor mode. RustMonitor then saves the enclave’s context and forwards the interrupt or exception to the normal VM. After the primary OS completes handling the interrupt or exception, the application invokes the ERESUME hypercall, which traps to RustMonitor to restore the enclave’s context and resume the execution of the enclave.

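GU-Enclave is the most basic enclave operation mode and is used for computing-intensive tasks. The enclave runs in guest ring 3 of VMX non-root operation.
When the enclave is created, RustMonitor prepares a vCPU structure containing the enclave's GPT and NPT. On switches between the normal VM and the enclave VM, RustMonitor switches the corresponding vCPU state.
When an interrupt or exception occurs in the enclave, the vCPU traps it to RustMonitor, which saves the enclave context and forwards the event to the primary OS. Once handling completes, the application issues the ERESUME hypercall, and RustMonitor restores the enclave context and resumes execution.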
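
The interrupt path described above (trap to the monitor, save the enclave context, forward to the normal VM, then ERESUME) can be modeled as a tiny state machine. This is a toy sketch under stated assumptions: `VcpuState`, `Monitor`, and the string identifiers for page tables are hypothetical names invented for the example and do not correspond to RustMonitor's real data structures.

```python
from dataclasses import dataclass


@dataclass
class VcpuState:
    """Hypothetical subset of the per-world state named in the text."""
    rip: int   # instruction pointer
    gpt: str   # guest page table identifier
    npt: str   # nested page table identifier


class Monitor:
    """Toy stand-in for the monitor's world-switch bookkeeping."""

    def __init__(self, normal: VcpuState, enclave: VcpuState):
        self.normal = normal
        self.enclave = enclave
        self.current = "normal"
        self.forwarded = []   # interrupt vectors handed to the primary OS

    def enter(self) -> None:
        self.current = "enclave"

    def on_interrupt(self, vector: int, enclave_rip: int) -> None:
        # An interrupt in the enclave traps to the monitor, which saves the
        # enclave context and forwards the event to the normal VM.
        if self.current != "enclave":
            raise RuntimeError("not running in the enclave")
        self.enclave.rip = enclave_rip
        self.forwarded.append(vector)
        self.current = "normal"

    def eresume(self) -> VcpuState:
        # The ERESUME hypercall restores the saved enclave context.
        if self.current != "normal":
            raise RuntimeError("already running in the enclave")
        self.current = "enclave"
        return self.enclave


m = Monitor(VcpuState(0x400000, "gpt-os", "npt-normal"),
            VcpuState(0x1000, "gpt-enclave", "npt-enclave"))
m.enter()
m.on_interrupt(vector=32, enclave_rip=0x1234)
resumed = m.eresume()
```

The point of the model is that the primary OS only ever sees the forwarded vector, while the enclave's saved state stays with the monitor until ERESUME.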

4.2 Host User Enclaves

Host user enclave (HU-Enclave) runs in host user mode. It delivers the optimal world switch efficiency by substituting the mode switch (hypercalls: ∼ 880 CPU cycles on our platform) with the ring switch (syscalls: ∼ 120 CPU cycles on our platform) (Figure 6). It further eliminates the extra virtualization overhead (e.g. vCPU context switching and two-dimensional page walking) in GU-Enclave. HU-Enclave may benefit I/O-intensive workloads according to our evaluation in Sec. 7. By comparison, running enclaves in the guest user mode provides more defense in depth. When loading the HU-Enclave, RustMonitor prepares a process context, e.g. creates a level-1 page table. On enclave entry, RustMonitor updates the CPU state and invokes the system call return instruction (i.e., SYSRET on x86 platforms) to enter the HU-Enclave. Correspondingly, on enclave exit, HU-Enclave invokes the system call instruction (i.e., SYSCALL on x86 platforms) and traps into RustMonitor. The ENCLU leaf instructions (e.g., EGETKEY, EREPORT) are emulated as system calls. Interrupts and exceptions within the HU-Enclaves also trap into RustMonitor. The procedures are similar to those for the GU-Enclaves described in Sec. 4.1.

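HU-Enclave targets I/O-intensive tasks. The enclave runs in host ring 3 (VMX root operation).
When the enclave is loaded, RustMonitor prepares a level-1 page table for it, and the enclave is entered and exited through system calls. Compared with GU-Enclave, this gives cheaper world switches but less defense in depth.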
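
A quick back-of-the-envelope calculation shows why this matters for switch-heavy workloads. The two per-switch costs are the approximate figures quoted from the paper; the number of switches is a hypothetical workload chosen for illustration, and the estimate ignores everything except the raw transition cost (for example, it does not include the vCPU context switching or two-dimensional page walks that HU-Enclave also avoids).

```python
HYPERCALL_CYCLES = 880   # approximate mode-switch cost quoted in the text
SYSCALL_CYCLES = 120     # approximate ring-switch cost quoted in the text


def switch_cost(num_switches: int, cycles_per_switch: int) -> int:
    """Total cycles spent on world switches alone."""
    return num_switches * cycles_per_switch


switches = 1_000_000     # hypothetical number of world switches in a workload
gu_cycles = switch_cost(switches, HYPERCALL_CYCLES)
hu_cycles = switch_cost(switches, SYSCALL_CYCLES)
saved = gu_cycles - hu_cycles
ratio = HYPERCALL_CYCLES / SYSCALL_CYCLES

print(f"GU-Enclave: {gu_cycles:,} cycles")
print(f"HU-Enclave: {hu_cycles:,} cycles")
print(f"saved: {saved:,} cycles ({ratio:.1f}x cheaper per switch)")
```

Under these assumptions, a million switches cost 880 million cycles with hypercalls and 120 million with syscalls, a difference of 760 million cycles.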

4.3 Privileged Enclaves

Inspired by the VM-based TEEs, such as AMD SEV [45], HyperEnclave supports privileged enclaves (P-Enclaves) which run in guest privileged mode. A P-Enclave is permitted to access the GDT, IDT, and level-1 page table, which benefits a wide variety of applications, as demonstrated by Dune [21]. One such example is the garbage collector, an essential feature for Java applications (existing works port the JVM to enclaves [26, 43]). The garbage collector frequently changes page permissions to trigger page faults in order to track the page status. A user mode enclave (e.g., a GU-Enclave or HU-Enclave) has to involve the primary OS to update the page table and handle the page fault, which suffers a huge performance loss due to world switches. P-Enclaves eliminate the world switch by supporting in-enclave exception handling and level-1 page table management. More specifically, a P-Enclave configures its own exception handler to handle certain exceptions (such as page faults). RustMonitor passes the white-listed exceptions through to the P-Enclave and forwards others to the primary OS. Furthermore, P-Enclaves can also support page-table-based in-enclave isolation schemes, e.g., sandboxing untrusted third-party libraries. With the ability to receive interrupts within the enclaves, P-Enclaves may also detect abnormal interrupt events by counting the frequency, before requesting RustMonitor to route them to the primary OS. As such, existing interrupt-based side channel attacks [24, 37, 40, 58, 59, 70] could be detected and mitigated. We leave further exploration in this direction to future work due to space constraints.

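P-Enclave runs in guest ring 0 of VMX non-root operation and suits special workloads such as garbage collectors.
A P-Enclave can manage its own level-1 page table and handle white-listed exceptions itself. It can also count interrupts internally to detect and mitigate side-channel attacks that rely on abnormal interrupts.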
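
The two ideas in this section, white-list based exception routing and interrupt frequency counting, can be sketched as follows. The paper does not give the white-list contents or the detection algorithm, so the choice of the page fault vector as the only white-listed exception, the sliding-window counter, and the names `route_exception` and `InterruptRateDetector` are assumptions for illustration only.

```python
from collections import deque

PAGE_FAULT = 14              # x86 #PF vector
WHITELIST = {PAGE_FAULT}     # hypothetical white-list of in-enclave exceptions


def route_exception(vector: int) -> str:
    """Decide who handles an exception raised while a P-Enclave is running."""
    return "p-enclave" if vector in WHITELIST else "primary-os"


class InterruptRateDetector:
    """Flags an abnormal interrupt rate using a sliding time window."""

    def __init__(self, window: float, threshold: int):
        self.window = window        # window length in seconds
        self.threshold = threshold  # max interrupts tolerated per window
        self.times = deque()

    def observe(self, now: float) -> bool:
        """Record an interrupt at time `now`; return True if the rate looks abnormal."""
        self.times.append(now)
        # Drop interrupts that fell out of the window.
        while self.times and now - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) > self.threshold


detector = InterruptRateDetector(window=1.0, threshold=3)
flags = [detector.observe(t) for t in (0.0, 0.1, 0.2, 0.3, 5.0)]
```

In this run the fourth interrupt within one second exceeds the threshold and is flagged, while the isolated interrupt at `t = 5.0` is not.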

5 Implementations

We report our implementation of HyperEnclave on an AMD platform that supports hardware virtualization technology and memory encryption. In the current implementation, RustMonitor consists of about 7,500 lines of code written mostly in Rust, and the kernel module for the primary OS has about 3,500 lines of C code. Also, we made about 2,000 lines of code changes to the official Intel SGX SDK (version 2.13).

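RustMonitor consists of about 7,500 lines of code, mostly Rust; the kernel module for the primary OS has about 3,500 lines of C; and about 2,000 lines of the official Intel SGX SDK (version 2.13) were changed.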

5.1 RustMonitor

RustMonitor runs at the highest privilege level and enforces the isolation for the enclaves. To reduce the risks caused by memory corruption or concurrency bugs, we implemented RustMonitor mostly in Rust, a memory-safe language, with only a few lines of assembly code used for context switches. Compared with existing hypervisors such as KVM [47] and Xen [29], RustMonitor is much smaller and thus easier to be formally verified. We are working on the formal verification of RustMonitor and plan to release the result as a separate report. When the platform is booted, we configure the kernel command line parameters in the grub to reserve regions of physical memory, which are exclusively used by RustMonitor and the enclaves. RustMonitor manages the reserved physical memory by maintaining a list of free pages. When an enclave page is needed, e.g., when adding an enclave page during enclave creation, a free page is retrieved from the pool; when the enclave page is freed, the page is attached to the list again. Moreover, RustMonitor also manages the enclave’s page tables and processes the page fault.

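RustMonitor runs at the highest privilege level, and its small code base makes it easier to verify.
RustMonitor manages the reserved physical memory by maintaining a list of free pages: when an enclave page is needed, a free page is taken from the pool, and when an enclave page is freed, it is put back on the list. RustMonitor also manages the enclave page tables and handles page faults.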
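
The free-page-list scheme can be illustrated with a minimal allocator. This is a sketch in Python rather than RustMonitor's Rust code, and the class name `FreePageList`, the 4 KiB page size, and the base address are assumptions chosen for the example.

```python
PAGE_SIZE = 4096   # assumed page size


class FreePageList:
    """Toy allocator over a reserved physical region, tracked as a free list."""

    def __init__(self, base: int, size: int):
        # Every page-aligned address in the reserved region starts out free.
        self.free = [base + off for off in range(0, size, PAGE_SIZE)]
        self.in_use = set()

    def alloc(self) -> int:
        """Take one free page, e.g. when a page is added to an enclave."""
        if not self.free:
            raise MemoryError("reserved region exhausted")
        page = self.free.pop()
        self.in_use.add(page)
        return page

    def release(self, page: int) -> None:
        """Return a page to the free list when the enclave page is freed."""
        if page not in self.in_use:
            raise ValueError("page is not allocated")
        self.in_use.remove(page)
        self.free.append(page)


pool = FreePageList(base=0x1_0000_0000, size=4 * PAGE_SIZE)
a = pool.alloc()
b = pool.alloc()
pool.release(a)
c = pool.alloc()   # the page that was just released is handed out again
```

Tracking `in_use` separately lets the sketch reject double frees; a real monitor would also need to scrub a page before reusing it for another enclave, which is omitted here.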

5.2 The Kernel Module

The kernel module is loaded by the primary OS during the booting process. Then it loads, measures, and launches RustMonitor, with the measurement extended to the TPM PCR as part of the TPM quote. When the kernel module is loaded, a device file is created and mounted at /dev/hyper_enclave. The application can open it and issue the ioctl() to invoke the emulated privileged operations.

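The kernel module is loaded by the primary OS during boot. It loads, measures, and launches RustMonitor, and the measurement is extended into a TPM PCR so that it is reflected in the TPM quote.
When the kernel module is loaded, a device file is created at /dev/hyper_enclave. An application can open it and issue ioctl() to invoke the emulated privileged operations.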

5.3 The Enclave SDK

HyperEnclave retrofits the official SGX SDK as follows.

Supporting the SGX SDK APIs.

We replace the SGX user leaf functions (e.g. EENTER, EEXIT, ERESUME, etc.) in the SGX SDK with hypercalls or system calls. Our implementation retains the same parameter semantics and orders as SGX for compatibility purposes.

Parameters passing with the marshalling buffer.

In HyperEnclave, the enclave can only access its own address space and the marshalling buffer shared with the application. The size of the marshalling buffer can be configured in the enclave’s configuration file, with a default size. The data needs to be transmitted to the marshalling buffer before invoking edge calls. We modified SGX SDK to handle the transitions, which are thus transparent to the developer.

TODO

We modified the untrusted runtime library in the SDK (i.e., libsgx_urts.so), such that during enclave initialization a marshalling buffer is allocated using mmap() with MAP_POPULATE flags set. As a result, the GPAs for the marshalling buffers are pre-populated. Then an ioctl() is issued to request the primary OS not to compact or swap out the physical pages of the marshalling buffers during the enclave’s lifetime. When the application invokes the emulated EINIT instruction to mark the initialization of the enclave, the base address and the size of the marshalling buffer are passed to RustMonitor, who will add the mapping of the marshalling buffer in the enclave’s page table. In this way, the marshalling buffer is now shared between the enclave and the untrusted application. The base address and the size of the marshalling buffer are also passed to the trusted runtime library to transmit data from the marshalling buffer to the enclave.
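
The pre-population step can be tried out from user space. The sketch below uses Python's `mmap` module as a stand-in for the C `mmap()` call in the modified runtime; the buffer size and the private anonymous mapping are assumptions, and the subsequent pinning ioctl() and the emulated EINIT call are not shown because their interfaces are not given in the text. `MAP_POPULATE` is Linux-specific and only exposed by Python 3.10 and later, so the code falls back to a plain mapping when it is unavailable.

```python
import mmap

BUF_SIZE = 4 * mmap.PAGESIZE   # hypothetical marshalling buffer size

# MAP_POPULATE asks the kernel to fault the pages in at mmap() time, so the
# physical pages behind the buffer exist before they are shared.
populate = getattr(mmap, "MAP_POPULATE", 0)   # 0 when the flag is unavailable
flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | populate

buf = mmap.mmap(-1, BUF_SIZE, flags=flags,
                prot=mmap.PROT_READ | mmap.PROT_WRITE)

# The application side writes arguments into the buffer.
buf[:5] = b"hello"
first = bytes(buf[:5])
```

Pre-populating alone does not stop the OS from later moving or swapping the pages, which is why the text adds a separate request to keep them in place for the enclave's lifetime.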

TODO

The current OCALL’s implementation in the SGX SDK invokes the sgx_ocalloc() within the enclave to allocate a buffer on the stack area of the untrusted application, which is then used for cross-enclave data transmission. As such, we only need to modify the sgx_ocalloc() function to allocate a memory area in the marshalling buffer. To support parameter passing through the marshalling buffer for ECALLs, we modified SGX’s Edger8r tool to automatically generate code that copies the transmitted data into the marshalling buffer. The SGX programming model supports passing parameters with the user_check attribute. For such parameters, the SDK tool will not generate code to check the address range or perform data movement. Since the enclave code could access the entire process’s address space, some enclave programs may use a pointer with the user_check attribute to manipulate the data buffer outside the enclave directly, without accounting for the overhead for copying the data across the enclave boundary. To deal with it, we added an interface for the developer to allocate the buffer within the marshalling buffer, in the cases when the developer may use parameters with the user_check attribute.
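
The idea of routing all cross-boundary data through a fixed shared region can be sketched with a simple bump allocator. This is an illustrative model, not the SDK's code: `MarshallingBuffer`, `copy_in`, and `read` are hypothetical names, `alloc` only loosely plays the role of the modified sgx_ocalloc(), and the range check in `read` stands in for the checks the generated edge routines would perform.

```python
class MarshallingBuffer:
    """Toy model of a fixed-size region shared by the application and the enclave."""

    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.top = 0   # bump pointer; everything below it is allocated

    def alloc(self, n: int) -> int:
        """Reserve n bytes inside the buffer and return their offset."""
        if n < 0 or self.top + n > len(self.mem):
            raise MemoryError("marshalling buffer exhausted")
        off = self.top
        self.top += n
        return off

    def copy_in(self, data: bytes) -> tuple:
        """Copy call arguments into the buffer before an edge call."""
        off = self.alloc(len(data))
        self.mem[off:off + len(data)] = data
        return off, len(data)

    def read(self, off: int, n: int) -> bytes:
        """Read arguments on the other side, rejecting out-of-range requests."""
        if off < 0 or n < 0 or off + n > self.top:
            raise ValueError("range outside the allocated marshalling area")
        return bytes(self.mem[off:off + n])

    def reset(self) -> None:
        """Release everything once the edge call has returned."""
        self.top = 0


mb = MarshallingBuffer(64)
off, n = mb.copy_in(b"ecall-args")
received = mb.read(off, n)

try:
    mb.alloc(100)          # larger than the remaining space
    overflowed = False
except MemoryError:
    overflowed = True
```

A pointer passed with the user_check attribute bypasses this kind of generated copy, which is why the text adds an explicit interface for allocating such buffers inside the marshalling region.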

TODO

The remote attestation flow is similar to SGX, following the same SIGn-and-MAc (SIGMA) protocol. We extended the sgx_quote_t structure in the SDK to include the HyperEnclave quote, and the modification is transparent to the enclave code. With the above design, most SGX programs could run on HyperEnclave without source code changes. Furthermore, to ease the development of HyperEnclave applications, we have also ported the Rust SGX SDK [71] and the Occlum library OS [64] to HyperEnclave.

TODO

6 Security Analysis

https://dmx20070206.github.io/2025/02/17/【论文】HyperEnclave：An Open and Cross-platform Trusted Execution Environment/

Author

DM-X~X~X

Posted on

February 17, 2025
