[Paper] Heterogeneous Isolated Execution for Commodity GPUs

Summary of the Paper's Approach

Threat Model

HIX Architecture

Architecture Overview

Goal: protect the data path between the host and the GPU. The GPU driver runs inside an enclave, and the OS cannot modify the MMIO mappings or the PCIe routing.

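Required hardware/software changes:

  1. Isolated GPU management with GPU enclave: the GPU driver runs inside the GPU enclave, and only the GPU enclave can access the MMIO region.
  2. Secure hardware I/O path: SGX is extended in hardware so that the OS can modify neither the GPU MMIO region nor the PCIe routing, which protects the commands, code, and data transferred over MMIO/DMA.
  3. Trusted application-to-GPU communication: GPU commands issued by a user enclave (U-enclave) first travel to the GPU enclave (G-enclave) over a channel protected by a symmetric key, and the G-enclave then submits them to the GPU on the U-enclave's behalf.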

GPU Enclave

Goals of the G-enclave: hold sole control over the GPU and act as the only interface through which users reach the GPU.

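HIX moves the core functions of the GPU driver into the G-enclave, while the remaining benign kernel functionality stays in the kernel. SGX has to be extended so that the G-enclave, and only the G-enclave, can access the GPU MMIO region.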

GPU MMIO Registration

HIX adds two SGX instructions, EGCREATE and EGADD, for registering GPU MMIO regions.

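The extended hardware needs to know:

  1. Which part of the MMIO region should be protected
  2. Which part of the G-enclave's address space that MMIO region is mapped to
  3. Which G-enclave is allowed to access it

HIX adds two data structures, both stored in the EPC: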

  1. GPU enclave control structure (GECS): the control information regarding the GPU enclave including the hardware GPU number and GPU enclave ID.
  2. GPU MMIO region table (TGMR): the virtual and physical address mapping information of the GPU MMIO region, which is used to verify the address mapping for the MMIO region.

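(1) EGCREATE creates the G-enclave: the pair <GPU-BDF, G-enclave-ID> is stored in the GECS, which guarantees that a G-enclave corresponds to exactly one GPU.
(2) EGADD registers the GPU MMIO region: each <VA, PA> pair is stored in the TGMR, and every later access to the MMIO region is verified against it.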
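
The registration flow can be sketched in software as follows. This is a toy model, not the hardware design: the class names, the dictionary layout, and the rejection rules for duplicate registrations are assumptions made for illustration, and only the GECS/TGMR contents come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Gecs:
    """GPU enclave control structure: one <GPU-BDF, G-enclave-ID> binding."""
    gpu_bdf: str
    enclave_id: int


@dataclass
class HixState:
    """Toy stand-in for the HIX structures kept in the EPC (hypothetical layout)."""
    gecs: dict = field(default_factory=dict)  # gpu_bdf -> Gecs
    tgmr: dict = field(default_factory=dict)  # enclave_id -> {va: pa}

    def egcreate(self, gpu_bdf: str, enclave_id: int) -> None:
        # Assumed policy: a GPU has at most one G-enclave and vice versa.
        if gpu_bdf in self.gecs:
            raise PermissionError(f"GPU {gpu_bdf} already has a G-enclave")
        if any(g.enclave_id == enclave_id for g in self.gecs.values()):
            raise PermissionError(f"enclave {enclave_id} already owns a GPU")
        self.gecs[gpu_bdf] = Gecs(gpu_bdf, enclave_id)
        self.tgmr[enclave_id] = {}

    def egadd(self, enclave_id: int, va: int, pa: int) -> None:
        # Only an enclave registered through egcreate may add MMIO mappings.
        if enclave_id not in self.tgmr:
            raise PermissionError("EGADD from an unregistered enclave")
        self.tgmr[enclave_id][va] = pa


state = HixState()
state.egcreate("01:00.0", enclave_id=7)
state.egadd(7, va=0x7F00_0000_0000, pa=0xF000_0000)
```

A second `egcreate` for the same BDF raises `PermissionError`, which is how this sketch expresses the one-enclave-per-GPU guarantee.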

GPU Initialization and Measurement

Once the G-enclave has been created and loaded, it first verifies the integrity of the GPU BIOS and then completely resets the GPU state.

GPU Protection on GPU Enclave Termination

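If the OS forcibly kills the G-enclave process, the killed G-enclave still holds exclusive ownership of the GPU's resources. No software can access those resources until the system is power-cycled. On reboot, the GECS and TGMR are reset and the GPU data is cleared.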

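If the OS terminates the G-enclave process gracefully, the G-enclave first clears the GPU data and hands the GPU back to the OS, and it notifies the U-enclaves that the GPU is no longer trusted.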
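
The two termination paths can be contrasted with a small state model. Everything here (the `Owner` states, the `Gpu` class, the function names) is hypothetical and only mirrors the behaviour described above.

```python
from enum import Enum


class Owner(Enum):
    G_ENCLAVE = "g-enclave"
    OS = "os"
    LOCKED = "locked-until-reboot"


class Gpu:
    """Toy GPU holding some sensitive device memory."""

    def __init__(self) -> None:
        self.owner = Owner.G_ENCLAVE
        self.memory = b"secret"

    def scrub(self) -> None:
        self.memory = b""


def graceful_exit(gpu: Gpu, notify) -> None:
    # The G-enclave cleans up, returns the GPU, and warns its clients.
    gpu.scrub()
    gpu.owner = Owner.OS
    notify("GPU is no longer trusted")


def forced_kill(gpu: Gpu) -> None:
    # Nothing is scrubbed, so the registration stays in place and
    # the GPU remains inaccessible to all software.
    gpu.owner = Owner.LOCKED


def cold_reboot(gpu: Gpu) -> None:
    # GECS and TGMR are reset and the GPU data is cleared.
    gpu.scrub()
    gpu.owner = Owner.OS


notices = []
graceful_gpu = Gpu()
graceful_exit(graceful_gpu, notices.append)

killed_gpu = Gpu()
forced_kill(killed_gpu)
```

In both paths the OS never regains the GPU while sensitive data is still on it: either the data is scrubbed first, or the GPU stays locked until a reboot scrubs it.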

Securing I/O Path: MMIO and DMA

Goal: protect the commands and data sent to the GPU, whether they travel over MMIO or DMA.

MMIO Access Protection

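After a TLB miss and before a new TLB entry is installed, the hardware page table walker performs four checks:

  1. Consult the GECS to confirm that the current process is the G-enclave
  2. The VA corresponds to the G-enclave's request
  3. The VA matches an entry in the TGMR
  4. The PA matches that same TGMR entry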
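
The four checks can be written as a single predicate. This is a software sketch under stated assumptions: the second check is interpreted here as "the VA lies inside the address range registered for the G-enclave", and the names `Access`, `may_fill_tlb`, and `g_range` are hypothetical.

```python
from dataclasses import dataclass

PAGE_SIZE = 0x1000


def page_of(addr: int) -> int:
    """Round an address down to its page base."""
    return addr & ~(PAGE_SIZE - 1)


@dataclass(frozen=True)
class Access:
    enclave_id: int  # identity of the context issuing the access
    va: int          # virtual address being translated
    pa: int          # physical address taken from the OS-controlled page table


def may_fill_tlb(access: Access, g_enclave_id: int, g_range, tgmr) -> bool:
    """Return True only if the translation may be installed in the TLB."""
    lo, hi = g_range
    if access.enclave_id != g_enclave_id:      # check 1: caller is the G-enclave
        return False
    if not lo <= access.va < hi:               # check 2: VA belongs to the G-enclave
        return False
    expected_pa = tgmr.get(page_of(access.va))
    if expected_pa is None:                    # check 3: VA is registered in the TGMR
        return False
    return expected_pa == page_of(access.pa)   # check 4: PA matches the TGMR entry


TGMR = {0x7000_0000: 0xE000_0000}
G_RANGE = (0x7000_0000, 0x7001_0000)
ok = may_fill_tlb(Access(7, 0x7000_0010, 0xE000_0010), 7, G_RANGE, TGMR)
remapped = may_fill_tlb(Access(7, 0x7000_0010, 0xDEAD_0010), 7, G_RANGE, TGMR)
```

The `remapped` case models an OS that rewrites the page table entry to point somewhere else: the VA is still registered, but the PA no longer matches, so the entry is rejected.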

MMIO Lockdown and Securing PCIe Routing

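HIX provides an MMIO lockdown mechanism: when the PCIe root complex (RC) receives a transaction packet from the host, it checks whether the packet would modify the MMIO mapping or the PCIe routing, and if so it simply drops the packet.

The lockdown is enabled when EGCREATE executes.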
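
A minimal sketch of the filtering rule, assuming a heavily simplified packet format. The `Tlp` fields, the packet kinds, and the set of protected register names are illustrative choices, not the actual PCIe packet layout or the paper's register list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tlp:
    """Simplified transaction packet (hypothetical fields)."""
    kind: str           # "mem_read", "mem_write", or "cfg_write"
    target: str         # bus/device/function of the destination
    register: str = ""  # configuration register name, for cfg_write only


# Assumed set of registers that determine MMIO mapping and routing.
PROTECTED_REGISTERS = {"BAR0", "BAR1", "MEMORY_BASE", "MEMORY_LIMIT", "BUS_NUMBERS"}


class RootComplex:
    def __init__(self) -> None:
        self.locked = False
        self.forwarded = []

    def lockdown(self) -> None:
        # In HIX this is triggered by EGCREATE.
        self.locked = True

    def submit(self, tlp: Tlp) -> bool:
        """Forward the packet, or drop it if it would alter mapping/routing."""
        if self.locked and tlp.kind == "cfg_write" and tlp.register in PROTECTED_REGISTERS:
            return False  # dropped
        self.forwarded.append(tlp)
        return True


rc = RootComplex()
before = rc.submit(Tlp("cfg_write", "01:00.0", "BAR0"))  # boot-time setup
rc.lockdown()
after = rc.submit(Tlp("cfg_write", "01:00.0", "BAR0"))   # attempted remap
data = rc.submit(Tlp("mem_write", "01:00.0"))            # ordinary MMIO write
```

Ordinary reads and writes still pass after lockdown. Only packets that would change where the MMIO region points, or how packets are routed, are discarded.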

Trusted DMA

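By maliciously tampering with the IOMMU page tables, the OS can redirect DMA data to any page. HIX therefore requires DMA data to be encrypted and verified with a message authentication code (MAC).

With this mechanism, only encrypted DMA data ever appears in the insecure buffer, and its integrity is checked through the MAC. Because the keys are exchanged securely (described later), the DMA data is protected.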
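
The encrypt-then-MAC idea can be illustrated with the Python standard library. The cipher below is a toy SHA-256 keystream chosen only to keep the example self-contained. It is an assumption of this sketch, not the scheme used by HIX, and it must not be used for real data. The function names `seal` and `unseal` are also hypothetical.

```python
import hashlib
import hmac


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes):
    """Encrypt, then compute a MAC over the nonce and ciphertext."""
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def unseal(enc_key: bytes, mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the MAC first, and decrypt only if it matches."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC check failed: DMA buffer was modified")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


enc_key, mac_key, nonce = b"E" * 32, b"M" * 32, b"\x00" * 12
ciphertext, tag = seal(enc_key, mac_key, nonce, b"matrix A")
recovered = unseal(enc_key, mac_key, nonce, ciphertext, tag)
```

Only `ciphertext` and `tag` would sit in the DMA buffer that the OS can reach. If the OS flips a bit there, `unseal` raises instead of returning corrupted data.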

Application-to-GPU Communication

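The G-enclave has exclusive control over the GPU, so it has to offer U-enclaves an interface for obtaining GPU services. Communication between the GPU and each U-enclave uses a different key.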

Trusted Runtime User Library

HIX gives U-enclaves a set of secure APIs. U-enclaves interact with the GPU through these APIs, which hide the hardware details of HIX.

Secure Inter-Enclave Communication

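HIX uses symmetric encryption to secure the three-way communication among the G-enclave, the U-enclave, and the GPU, and it uses SGX local attestation so the parties can authenticate one another.

Two channels are set up between the U-enclave and the G-enclave: a message queue and shared memory. The U-enclave places the data destined for the G-enclave in shared memory and then posts a request to the message queue. The G-enclave receives the request and processes the data.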

Secure Communication between the GPU Enclave and GPU

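The GPU command buffer resides in the MMIO region protected by HIX. The G-enclave forwards the U-enclave's GPU requests to the GPU over MMIO.

Data path: U-enclave -> G-enclave -> GPU. Done naively, this involves encryption and decryption under two pairs of keys, which is expensive. HIX instead takes a single-copy approach that uses the same key: the G-enclave issues commands that copy GPU data into shared memory, or shared-memory data into the GPU, and the copy can be carried out over MMIO or DMA.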

Communication Example

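How a U-enclave passes data/commands to the GPU:

  1. Post a cuMemcpyHtoD request to the message queue, together with its metadata
  2. Place the encrypted payload in shared memory
  3. The G-enclave copies the encrypted data directly to the GPU
  4. The G-enclave launches a decryption kernel on the GPU to decrypt the data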
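
The flow above can be simulated end to end in a few lines. All names are hypothetical, the single-byte XOR is a placeholder for real authenticated encryption, and the sketch writes the payload before posting the request, following the order given in the Secure Inter-Enclave Communication section.

```python
from collections import deque
from dataclasses import dataclass, field


def xor_cipher(key: int, data: bytes) -> bytes:
    """Placeholder for the real cipher (NOT secure, illustration only)."""
    return bytes(b ^ key for b in data)


@dataclass
class Channel:
    queue: deque = field(default_factory=deque)  # message queue
    shared: dict = field(default_factory=dict)   # shared memory: slot -> bytes


@dataclass
class FakeGpu:
    memory: dict = field(default_factory=dict)   # device pointer -> bytes

    def run_decrypt_kernel(self, key: int, dptr: int) -> None:
        # Stands in for the decryption kernel launched on the GPU.
        self.memory[dptr] = xor_cipher(key, self.memory[dptr])


def u_enclave_memcpy_htod(ch: Channel, key: int, dptr: int, data: bytes) -> None:
    slot = max(ch.shared, default=-1) + 1
    ch.shared[slot] = xor_cipher(key, data)      # encrypted payload
    ch.queue.append({"op": "cuMemcpyHtoD", "dptr": dptr, "slot": slot, "size": len(data)})


def g_enclave_serve_one(ch: Channel, gpu: FakeGpu, key: int) -> None:
    request = ch.queue.popleft()
    if request["op"] != "cuMemcpyHtoD":
        raise ValueError(f"unsupported request: {request['op']}")
    # Single copy: the ciphertext goes to the GPU as-is, with no re-encryption.
    gpu.memory[request["dptr"]] = ch.shared.pop(request["slot"])
    gpu.run_decrypt_kernel(key, request["dptr"])


ch, gpu, key = Channel(), FakeGpu(), 0x5A
u_enclave_memcpy_htod(ch, key, dptr=0x1000, data=b"hello gpu")
staged = ch.shared[0]  # what the untrusted shared memory holds
g_enclave_serve_one(ch, gpu, key)
```

The G-enclave never decrypts the payload itself. It only moves ciphertext and asks the GPU to decrypt, which is what makes the single-copy path cheaper than decrypting and re-encrypting in the middle.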

Close Reading of the Paper

Abstract

Traditional CPUs and cloud systems based on them have embraced the hardware-based trusted execution environments to securely isolate computation from malicious OS or hardware attacks. However, GPUs and their cloud deployments have yet to include such support for hardware-based trusted computing. As large amounts of sensitive data are offloaded to GPU acceleration in cloud environments, ensuring the security of the data is a current and pressing need. As deployed today, the outsourced GPU model is vulnerable to attacks from compromised privileged software. To support isolated remote execution on GPUs even under vulnerable operating systems, this paper proposes a novel hardware and software architecture, called HIX (Heterogeneous Isolated eXecution). HIX does not require modifications to the GPU architecture to offer protections: Instead, it offers security by modifying the I/O interconnect between the CPU and GPU, and by refactoring the GPU device driver to work from within the CPU trusted environment. A result of the architectural choices behind HIX is that the concept can be applied to other offload accelerators besides GPUs. This work implements the proposed HIX architecture on an emulated machine with KVM and QEMU. Experimental results from the emulated security support with a real GPU show that the performance overhead for security is curtailed to 26% on average for the Rodinia benchmark, while providing secure isolated GPU computing.
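Background (2019): TEEs had not yet been realized on the GPU side, while large amounts of sensitive data were being offloaded to GPUs (for example under the outsourced GPU model), so bringing a TEE to GPUs was an urgent need.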

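The paper proposes a hardware/software architecture called HIX. HIX does not require changes to the GPU hardware, but it does modify the I/O interconnect between the CPU and the GPU, and it runs the GPU driver inside the CPU's trusted execution environment.
The HIX architecture applies to a range of accelerators and is not limited to GPUs.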

1 Introduction

In conventional CPU-based computation, hardware-based trusted execution environments (TEE) such as Intel SGX and ARM TrustZone have been providing trusted and isolated computing environments to user applications. Such hardware-based TEEs reduce the trusted computing base (TCB) of the computation to the processor and critical code running in TEE. With the TEE support, security-critical applications can be protected from compromised privileged software as well as hardware-based attacks to the memory and system buses, to provide secure computation running on untrusted remote cloud servers.
Today's hardware-based TEEs shrink the TCB to the processor and the trusted code, which enables trusted computing on untrusted cloud servers.
With increasing use of general purpose GPU computing from traditional high performance computing to data center acceleration and machine learning applications, securing the GPU computation has become critical to protect security sensitive data [34, 45, 56, 57]. However, although even more and more critical data are processed in GPUs, trusted computing is yet to be supported in GPU computation. In the current system architecture, high performance discrete GPUs communicate with CPUs through I/O interconnects such as PCI Express (PCIe) buses, and the GPU driver which is part of the operating system controls the GPUs [25]. As the privileged operating system can fully control the hardware I/O interconnects and GPU driver, computing in GPUs is vulnerable to potential attacks on the operating system [8]. Beyond the GPU-based computing, the proliferation of various accelerator-based computing models has been increasing the demands for higher-level of security supports for accelerators under the vulnerable privileged software.
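The GPU and the CPU communicate over the PCIe bus, and the GPU driver, which is part of the OS, controls the GPU. Because the OS fully controls both the driver and the PCIe bus, data on the GPU side is easily exposed to attack.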
In existing architectures, both of the code and data in GPUs can be compromised by a privileged adversary. Recent work has demonstrated that the integrity of GPU code can be subverted by disrupting and replacing the code at runtime with an off-the-shelf reverse engineering tool [13]. In addition to code, data in GPU can potentially be uncovered and leaked [45]. GPU data vulnerable to confidentiality attacks comprises both the communication data being transferred to and from a GPU, and the data being processed within a GPU. The susceptibility of GPUs to confidentiality and integrity attacks stems from the lack of access control to their interfaces such as the I/O interconnects and memory-mapped I/O addresses. To support secure computing in GPUs, this paper proposes a novel hardware and software architecture for isolating GPUs even from the potentially malicious privileged software (OS and hypervisor). The proposed architecture, called Heterogeneous Isolated eXecution (HIX), requires minor extensions to the current PCIe interconnect implementation and the TEE support in CPUs. The goal of HIX is to extend the security guarantees, namely confidentiality and integrity of user data, of TEE technologies to heterogeneous computing environments. At the time of writing, none of these technologies protect accelerators in heterogeneous systems from privileged software attacks; they only protect the code and data in trusted “enclaves” running on the processors. In this work, we expand the scope of a widely used trusted isolation technology, Intel SGX, to secure general purpose accelerators, in particular GPUs.
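Goal: protect the integrity and confidentiality of GPU data, both in transit on the communication path and while being processed inside the GPU.
Root cause of the attack surface: the GPU's interfaces, namely the I/O interconnect and the MMIO addresses, lack access control.
HIX requires extensions to the PCIe interconnect and to the CPU-side TEE (SGX).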
Our proposed architecture consists of four main hardware and software changes. First, key functions of the GPU driver are removed from the operating system (OS) and relocated in a separate process in its own GPU enclave. The GPU enclave is an extension of the current SGX enclave, designed to exclusively manage the GPU. Second, the PCIe interconnect architecture is slightly modified to prevent the OS from changing the routing configuration of the interconnect, once the GPU enclave is completely initialized. Third, the memory management unit (MMU) is augmented to protect the memory mapped GPU I/O region from unauthorized accesses. Fourth, the CPU counterpart process of a GPU application runs on an SGX enclave, and the SGX enclave sets up a trusted communication path to the GPU enclave, which is robust even against privileged adversaries. To support the secure execution environments for GPUs without any GPU modification, HIX does not provide the protection against direct hardware-based attacks, as PCIe buses and the memory of GPUs are exposed to such hardware attacks in the current architecture. Although the security level is lower compared to the hardware TEEs for CPUs, HIX can be extended to other accelerators without requiring any modification of the accelerators themselves, if the accelerator is connected via I/O interconnects.
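HIX consists of four main hardware/software changes:
  • The core functions of the GPU driver move out of kernel space into a separate GPU enclave
  • The PCIe interconnect is modified to keep the OS from maliciously changing its routing configuration
  • The MMU is strengthened to protect the GPU MMIO region from unauthorized access
  • The host-side process that corresponds to the GPU application runs inside an enclave
HIX needs no GPU hardware changes and can be extended to other accelerators, but its protection is weaker than that of a CPU TEE because it does not defend against direct hardware attacks.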
We evaluate the proposed architecture in terms of security and performance. We have implemented a prototype for HIX on KVM and QEMU, adding extra instructions for the GPU enclave and separating the GPU driver from the operating system. The prototype using the emulation connected to a real GPU shows that the performance degradation introduced by HIX secure GPU computation is 26% compared to the conventional unsecure GPU computation for the benchmarks from the Rodinia suite. We summarize the main contributions of this work as follows:
  • We provide an attack surface assessment of GPU computation. We identify key GPU components that can be attacked from privileged software: PCIe interconnect, memory mapped I/O region, and GPU driver.
  • We augment the design of the PCIe interconnect to block any routing change after the GPU initialization, and to further guarantee the address mapping immutability of the memory mapped I/O region to the GPU.
  • We extend the current SGX interface to support the GPU enclave, which runs the GPU driver in a secure way. The MMU design is extended to protect the GPU memory mapped I/O region from unauthorized accesses.
  • We implement a prototype on an emulated system with KVM and QEMU to evaluate the performance overhead of HIX. Although it is implemented in the emulated system due to the required changes in hardware, it faithfully reflects necessary changes in hardware interfaces and software architectures.
The rest of the paper is organized as follows. Section 2 describes the current architecture of SGX, PCIe, and GPU driver. Section 3 discusses the threat model. Section 4 presents the proposed architecture. Section 5 discusses the security analysis and shows performance results. Section 6 presents the prior work and Section 7 concludes the paper.
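The four contributions of HIX:
  • An attack surface assessment that identifies the three GPU components privileged software can attack: the PCIe interconnect, the MMIO region, and the GPU driver
  • A strengthened PCIe interconnect that keeps the MMIO mapping and the PCIe RC routing from being illegitimately modified
  • An SGX extension that supports the GPU enclave, so the key functions of the GPU driver run securely, plus a strengthened MMU that blocks unauthorized access to the GPU MMIO region
  • A prototype on an emulated system with a performance evaluation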

2 Background

HIX is designed on top of Intel SGX architecture and the PCI Express standard. We provide a brief overview of these technologies in this section.
This section gives a brief overview of SGX and PCIe.

2.1 Intel Software Guard Extensions (SGX)

Intel SGX is a hardware-based protection technology that provides a trusted execution environment (TEE) called an enclave, protected even from the privileged software and direct hardware attacks. SGX protects the enclave memory and execution contexts to support the strong isolated execution. The SGX hardware-based isolated execution is augmented by an attestation service that verifies the integrity of the code running on the enclave [1, 35]. The main memory is untrusted under the SGX threat model, and thus, SGX provides memory encryption and access restriction mechanisms to protect a small region of main memory for enclaves, called the enclave page cache (EPC). Although SGX uses the virtual memory support provided by the untrusted OS, it protects EPC pages from unauthorized accesses with hardware-based verification. Figure 1 illustrates the structure of SGX address space. In the figure, ELRANGE (Enclave Linear Address Range) is the protected virtual address range in the enclave, and the pages in the range are guaranteed to be mapped to EPC pages. When an enclave is created, the system software registers the virtual address and corresponding EPC physical address of a page in the protected memory using EADD SGX instruction. During handling of the EADD instruction, the hardware stores the mapping information in the enclave page cache map (EPCM) to verify future accesses to the page during address translation in MMU [9].
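In the SGX architecture:
  • Enclave memory is encrypted and confined to the EPC region of main memory
  • The user portion of a process's virtual address space is split into trusted and untrusted parts, and the contiguous virtual address range of the trusted part is called the ELRANGE
  • The mapping information for enclave pages is recorded by the hardware in the EPCM, and the MMU uses it to verify enclave accesses during address translation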

2.2 PCI Express Architecture

Modern GPUs are connected to the system via the PCI Express (PCIe) interface. The PCIe interface facilitates memory-mapped I/O (MMIO) access to PCIe devices for software. Since the MMIO mechanism maps the hardware registers and memory of a device to the system memory address space for software, this enables the software to transparently access the PCIe devices using regular memory addresses. Figure 2 illustrates how the system routes device access requests to the device by using the system memory address map [49]. CPU is responsible for distinguishing accesses to the MMIO regions from main memory accesses. It uses its internal hardware registers which are initialized by BIOS at system boot time, to route access requests for MMIO appropriately [19]. When the address of a memory access is for the MMIO region, the PCIe root complex takes the request. As PCIe devices are attached to the system as a tree, where the PCIe root complex is its root, the root complex creates a PCIe transaction packet and routes it to the desired device, using the hardware routing registers [5, 43]. These registers are also initialized by the BIOS at system boot time to cover the entire physical address ranges of attached devices. Modern PCIe devices use direct memory access (DMA) to directly read or write the main memory without CPU intervention. The DMA arrows in Figure 2 show how the system routes the DMA request. An input/output memory management unit (IOMMU) can be used to translate device addresses to physical addresses for DMAs [42].
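With MMIO, a host process can access device registers just as it accesses ordinary memory. The CPU uses internal registers (initialized by the BIOS) to decide whether an accessed address belongs to main memory or to an MMIO region.
When an MMIO region is accessed, the PCIe RC takes the request, creates a PCIe transaction packet, and routes it to the target device using the hardware routing registers.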
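
The address decoding described above can be sketched as a range lookup. The address map below (`DRAM_TOP`, the two windows, the device names) is entirely made up for illustration. On a real machine these ranges come from registers programmed by the BIOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One routing entry: a physical address range claimed by a device."""
    base: int
    size: int
    device: str

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


DRAM_TOP = 0x8000_0000  # hypothetical end of main memory
ROUTES = [
    Window(0xE000_0000, 0x1000_0000, "gpu"),
    Window(0xF000_0000, 0x0010_0000, "nic"),
]


def route(addr: int) -> str:
    """Decide where a physical address access is sent."""
    if addr < DRAM_TOP:
        return "dram"            # ordinary memory access
    for window in ROUTES:        # the root complex consults its routing registers
        if window.contains(addr):
            return window.device
    return "unclaimed"


targets = [route(a) for a in (0x1000, 0xE000_1000, 0xF000_0000, 0xD000_0000)]
```

Whoever can rewrite `ROUTES` decides which device an MMIO access reaches, and that is exactly the capability HIX takes away from the OS.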

https://dmx20070206.github.io/2025/02/18/【论文】Heterogeneous Isolated Execution for Commodity GPUs/
Author
DM-X~X~X
Posted on
February 18, 2025