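HIX extends SGX with two new instructions, EGCREATE and EGADD, which register GPU MMIO regions.
The extended hardware needs to know:
which part of the MMIO region must be protected;
which part of the G-enclave address space that MMIO region is mapped to;
which G-enclave is allowed to access it.
HIX adds two new data structures, both stored in the EPC: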
GPU enclave control structure (GECS): the control information regarding the GPU enclave including the hardware GPU number and GPU enclave ID.
GPU MMIO region table (TGMR): the virtual and physical address mapping information of the GPU MMIO region, which is used to verify the address mapping for the MMIO region.
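The three questions above, and the two structures that answer them, can be sketched as a small software model. This is only an illustration, not the paper's hardware design: the class `HixModel`, its method names, the field layout, and the 4 KiB page size are all assumptions, and the exact semantics of EGCREATE and EGADD are not spelled out in these notes.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page granularity for this sketch


def page_base(addr):
    """Round an address down to the start of its page."""
    return addr - addr % PAGE_SIZE


@dataclass
class GECS:
    """Toy GPU enclave control structure: which enclave owns which GPU."""
    gpu_number: int
    enclave_id: int


@dataclass
class TGMR:
    """Toy GPU MMIO region table: virtual page base -> physical page base."""
    entries: dict = field(default_factory=dict)


class HixModel:
    """Software stand-in for the state the extended hardware keeps in the EPC."""

    def __init__(self):
        self.gecs = None
        self.tgmr = TGMR()

    def egcreate(self, gpu_number, enclave_id):
        # Assumed behavior: bind one GPU to exactly one GPU enclave.
        if self.gecs is not None:
            raise PermissionError("GPU is already owned by a GPU enclave")
        self.gecs = GECS(gpu_number, enclave_id)

    def egadd(self, enclave_id, vaddr, paddr):
        # Assumed behavior: record one MMIO page mapping for the owning enclave.
        if self.gecs is None or self.gecs.enclave_id != enclave_id:
            raise PermissionError("caller is not the registered GPU enclave")
        if vaddr % PAGE_SIZE or paddr % PAGE_SIZE:
            raise ValueError("addresses must be page-aligned")
        self.tgmr.entries[vaddr] = paddr

    def check_access(self, enclave_id, vaddr, paddr):
        """Return True if the translation vaddr -> paddr may proceed."""
        vpage, ppage = page_base(vaddr), page_base(paddr)
        if ppage not in self.tgmr.entries.values():
            return True  # not a protected MMIO page; outside this model
        if self.gecs is None or self.gecs.enclave_id != enclave_id:
            return False  # only the owning GPU enclave may touch the region
        return self.tgmr.entries.get(vpage) == ppage  # mapping must match
```

In this model a protected MMIO page is reachable only by the registered enclave and only through the virtual page that was registered for it, so an OS that remaps the page elsewhere fails the final comparison.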
Traditional CPUs and cloud systems based on them have embraced the hardware-based trusted execution environments to securely isolate computation from malicious OS or hardware attacks. However, GPUs and their cloud deployments have yet to include such support for hardware-based trusted computing. As large amounts of sensitive data are offloaded to GPU acceleration in cloud environments, ensuring the security of the data is a current and pressing need. As deployed today, the outsourced GPU model is vulnerable to attacks from compromised privileged software. To support isolated remote execution on GPUs even under vulnerable operating systems, this paper proposes a novel hardware and software architecture, called HIX (Heterogeneous Isolated eXecution). HIX does not require modifications to the GPU architecture to offer protections: Instead, it offers security by modifying the I/O interconnect between the CPU and GPU, and by refactoring the GPU device driver to work from within the CPU trusted environment. A result of the architectural choices behind HIX is that the concept can be applied to other offload accelerators besides GPUs. This work implements the proposed HIX architecture on an emulated machine with KVM and QEMU. Experimental results from the emulated security support with a real GPU show that the performance overhead for security is curtailed to 26% on average for the Rodinia benchmark, while providing secure isolated GPU computing.
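Background (2019): TEEs had not yet been implemented on the GPU side, while large amounts of sensitive data were being offloaded to GPUs (for example, under the outsourced GPU model), so TEE support for GPUs was urgently needed.
The paper proposes a hardware and software architecture, HIX. HIX does not require changes to the GPU hardware, but it modifies the I/O interconnect between the CPU and GPU and runs the GPU driver inside the CPU's trusted execution environment.
The HIX architecture applies to other accelerators as well, not only GPUs.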
1 Introduction
In conventional CPU-based computation, hardware-based trusted execution environments (TEE) such as Intel SGX and ARM
TrustZone have been providing trusted and isolated computing environments to user applications. Such hardware-based TEEs
reduce the trusted computing base (TCB) of the computation to the processor and critical code running in TEE. With the
TEE support, security-critical applications can be protected from compromised privileged software as well as
hardware-based attacks to the memory and system buses, to provide secure computation running on untrusted remote cloud
servers.
Today's hardware-based TEEs shrink the TCB to the processor and the trusted code, providing trusted computing on untrusted cloud servers.
With increasing use of general purpose GPU computing from traditional high performance computing to data center
acceleration and machine learning applications, securing the GPU computation has become critical to protect security
sensitive data [34, 45, 56, 57]. However, although even more and more critical data are processed in GPUs, trusted
computing is yet to be supported in GPU computation. In the current system architecture, high performance discrete GPUs
communicate with CPUs through I/O interconnects such as PCI Express (PCIe) buses, and the GPU driver which is part of
the operating system controls the GPUs [25]. As the privileged operating system can fully control the hardware I/O
interconnects and GPU driver, computing in GPUs is vulnerable to potential attacks on the operating system [8]. Beyond
GPU-based computing, the proliferation of various accelerator-based computing models has increased the demand for stronger security support for accelerators running under vulnerable privileged software.
In existing architectures, both the code and the data in GPUs can be compromised by a privileged adversary. Recent work
has demonstrated that the integrity of GPU code can be subverted by disrupting and replacing the code at runtime with an
off-the-shelf reverse engineering tool [13]. In addition to code, data in GPU can potentially be uncovered and leaked
[45]. GPU data vulnerable to confidentiality attacks comprises both the communication data being transferred to and from
a GPU, and the data being processed within a GPU. The susceptibility of GPUs to confidentiality and integrity attacks
stems from the lack of access control to their interfaces such as the I/O interconnects and memory-mapped I/O addresses.
To support secure computing in GPUs, this paper proposes a novel hardware and software architecture for isolating GPUs
even from the potentially malicious privileged software (OS and hypervisor). The proposed architecture, called
Heterogeneous Isolated eXecution (HIX), requires minor extensions to the current PCIe interconnect implementation and
the TEE support in CPUs. The goal of HIX is to extend the security guarantees, namely confidentiality and integrity of
user data, of TEE technologies to heterogeneous computing environments. At the time of writing, none of these
technologies protect accelerators in heterogeneous systems from privileged software attacks; they only protect the code
and data in trusted “enclaves” running on the processors. In this work, we expand the scope of a widely used trusted
isolation technology, Intel SGX, to secure general purpose accelerators, in particular GPUs.
Our proposed architecture consists of four main hardware and software changes. First, key functions of the GPU driver
are removed from the operating system (OS) and relocated in a separate process in its own GPU enclave. The GPU enclave
is an extension of the current SGX enclave, designed to exclusively manage the GPU. Second, the PCIe interconnect
architecture is slightly modified to prevent the OS from changing the routing configuration of the interconnect, once
the GPU enclave is completely initialized. Third, the memory management unit (MMU) is augmented to protect the memory
mapped GPU I/O region from unauthorized accesses. Fourth, the CPU counterpart process of a GPU application runs on an
SGX enclave, and the SGX enclave sets up a trusted communication path to the GPU enclave, which is robust even against
privileged adversaries.
To support the secure execution environments for GPUs without any GPU modification, HIX does not provide the protection
against direct hardware-based attacks, as PCIe buses and the memory of GPUs are exposed to such hardware attacks in the
current architecture. Although the security level is lower compared to the hardware TEEs for CPUs, HIX can be extended
to other accelerators without requiring any modification of the accelerators themselves, if the accelerator is connected via I/O interconnects.
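The core of HIX requires four main hardware/software changes:
Key functions of the GPU driver are moved out of the kernel into a separate GPU enclave.
The PCIe interconnect is modified to prevent the OS from maliciously changing its routing configuration.
The MMU is hardened to protect the GPU MMIO region from unauthorized access.
The host-side process that corresponds to the GPU application runs in an enclave.
HIX does not require GPU hardware changes and can be extended to other accelerators, but its protection is weaker than that of a CPU TEE.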
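The PCIe change (blocking routing reconfiguration once the GPU enclave is initialized) can be pictured as a write-once lock on the routing registers. The sketch below is a toy model under that assumption; `RoutingRegisters`, its fields, and the `lock` trigger are hypothetical names, and the excerpt does not say how or when the real hardware releases the lock.

```python
class RoutingRegisters:
    """Toy model of the PCIe routing registers on the path to the GPU."""

    def __init__(self, mmio_base, mmio_limit):
        self.mmio_base = mmio_base
        self.mmio_limit = mmio_limit
        self._locked = False

    def lock(self):
        # Assumed to be triggered once the GPU enclave is fully initialized.
        self._locked = True

    def write(self, mmio_base, mmio_limit):
        # Before lockdown the OS configures routing as usual; afterwards
        # every reconfiguration attempt is rejected.
        if self._locked:
            raise PermissionError("routing configuration is locked")
        self.mmio_base = mmio_base
        self.mmio_limit = mmio_limit
```

The point of the design is ordering: the OS keeps its normal role during boot and device setup, and loses the ability to redirect GPU traffic only after the enclave has taken ownership.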
We evaluate the proposed architecture in terms of security and performance. We have implemented a prototype for HIX on
KVM and QEMU, adding extra instructions for the GPU enclave and separating the GPU driver from the operating system. The
prototype using the emulation connected to a real GPU shows that the performance degradation introduced by HIX secure
GPU computation is 26% compared to the conventional unsecure GPU computation for the benchmarks from the Rodinia suite.
We summarize the main contributions of this work as follows:
We provide an attack surface assessment of GPU computation. We identify key GPU components that can be attacked from
privileged software: PCIe interconnect, memory mapped I/O region, and GPU driver.
We augment the design of the PCIe interconnect to block any routing change after the GPU initialization, and to further
guarantee the address mapping immutability of the memory mapped I/O region to the GPU.
We extend the current SGX interface to support the GPU enclave, which runs the GPU driver in a secure way. The MMU
design is extended to protect the GPU memory mapped I/O region from unauthorized accesses.
We implement a prototype on an emulated system with KVM and QEMU to evaluate the performance overhead of HIX. Although
it is implemented in the emulated system due to the required changes in hardware, it faithfully reflects necessary
changes in hardware interfaces and software architectures.
The rest of the paper is organized as follows. Section 2 describes the current architecture of SGX, PCIe, and GPU
driver. Section 3 discusses the threat model. Section 4 presents the proposed architecture. Section 5 discusses the
security analysis and shows performance results. Section 6 presents the prior work and Section 7 concludes the paper.
HIX is designed on top of Intel SGX architecture and the PCI Express standard. We provide a brief overview of these
technologies in this section.
This section gives a brief overview of SGX and PCIe.
2.1 Intel Software Guard Extensions (SGX)
Intel SGX is a hardware-based protection technology that provides a trusted execution environment (TEE) called an
enclave, protected even from the privileged software and direct hardware attacks. SGX protects the enclave memory and
execution contexts to support the strong isolated execution. The SGX hardware-based isolated execution is augmented by an attestation service that verifies the integrity of the code running on the enclave [1, 35].
The main memory is untrusted under the SGX threat model, and thus, SGX provides memory encryption and access restriction
mechanisms to protect a small region of main memory for enclaves, called the enclave page cache (EPC). Although SGX uses
the virtual memory support provided by the untrusted OS, it protects EPC pages from unauthorized accesses with
hardware-based verification. Figure 1 illustrates the structure of SGX address space. In the figure, ELRANGE (Enclave
Linear Address Range) is the protected virtual address range in the enclave, and the pages in the range are guaranteed
to be mapped to EPC pages. When an enclave is created, the system software registers the virtual address and
corresponding EPC physical address of a page in the protected memory using EADD SGX instruction. During handling of the
EADD instruction, the hardware stores the mapping information in the enclave page cache map (EPCM) to verify future
accesses to the page during address translation in MMU [9].
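A minimal sketch of the EADD and EPCM check described above, assuming an EPCM entry records only the owning enclave and the registered virtual page. Real EPCM entries carry more state (validity, page type, permissions), and the class and method names here are illustrative, not SGX interfaces.

```python
PAGE_SIZE = 4096  # assumed page granularity


class Epcm:
    """Toy enclave page cache map: EPC page -> (enclave id, virtual page)."""

    def __init__(self):
        self._entries = {}

    def eadd(self, enclave_id, vaddr, epc_paddr):
        # Modeled on EADD: record which enclave owns the EPC page and at
        # which virtual address it is expected to appear.
        if vaddr % PAGE_SIZE or epc_paddr % PAGE_SIZE:
            raise ValueError("addresses must be page-aligned")
        if epc_paddr in self._entries:
            raise PermissionError("EPC page is already in use")
        self._entries[epc_paddr] = (enclave_id, vaddr)

    def verify(self, enclave_id, vaddr, paddr):
        """Check a translation that targets an EPC page."""
        entry = self._entries.get(paddr - paddr % PAGE_SIZE)
        # Unregistered pages yield None, which never equals the tuple.
        return entry == (enclave_id, vaddr - vaddr % PAGE_SIZE)
```

Because the check runs on the result of the OS-controlled page walk, the OS can still manage page tables, but a translation that points an enclave address at the wrong EPC page, or another enclave at this page, is rejected.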
Modern GPUs are connected to the system via the PCI Express (PCIe) interface. The PCIe interface facilitates
memory-mapped I/O (MMIO) access to PCIe devices for software. Since the MMIO mechanism maps the hardware registers and
memory of a device to the system memory address space for software, this enables the software to transparently access
the PCIe devices using regular memory addresses. Figure 2 illustrates how the system routes device access requests to
the device by using the system memory address map [49]. The CPU is responsible for distinguishing accesses to the MMIO
regions from main memory accesses. It uses its internal hardware registers which are initialized by BIOS at system boot
time, to route access requests for MMIO appropriately [19]. When the address of a memory access is for the MMIO region,
the PCIe root complex takes the request. As PCIe devices are attached to the system as a tree, where the PCIe root
complex is its root, the root complex creates a PCIe transaction packet and routes it to the desired device, using the
hardware routing registers [5, 43]. These registers are also initialized by the BIOS at system boot time to cover the
entire physical address ranges of attached devices.
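The two-stage routing described above (the CPU separates MMIO from main memory, then the root complex walks the device tree by address window) can be sketched as follows. The address map, device names, and the `PcieNode` type are hypothetical, and real routing registers are per-port base/limit pairs rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class PcieNode:
    """A root port, switch, or endpoint with an inclusive address window."""
    name: str
    base: int
    limit: int
    children: list = field(default_factory=list)

    def route(self, addr):
        """Return the endpoint name that claims addr, or None."""
        if not self.base <= addr <= self.limit:
            return None
        if not self.children:
            return self.name  # leaf: this endpoint claims the address
        for child in self.children:
            target = child.route(addr)
            if target is not None:
                return target
        return None  # inside a bridge window, but no device below claims it


def route_access(addr, mmio_base, mmio_limit, root):
    """Model the CPU's DRAM-versus-MMIO decision, then the PCIe tree walk."""
    if not mmio_base <= addr <= mmio_limit:
        return "DRAM"
    return root.route(addr) or "unclaimed"


# Hypothetical address map, standing in for what the BIOS programs at boot.
gpu = PcieNode("gpu", 0xE000_0000, 0xE0FF_FFFF)
nic = PcieNode("nic", 0xE100_0000, 0xE100_FFFF)
switch = PcieNode("switch", 0xE000_0000, 0xE1FF_FFFF, [gpu, nic])
root = PcieNode("root complex", 0xE000_0000, 0xFFFF_FFFF, [switch])
```

Since every hop decides purely from its window registers, whoever can rewrite those registers can redirect traffic meant for the GPU, which is why HIX treats the routing configuration as part of the attack surface.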
Modern PCIe devices use direct memory access (DMA) to directly read or write the main memory without CPU intervention.
The DMA arrows in Figure 2 show how the system routes the DMA request. An input/output memory management unit (IOMMU) can be used to translate device addresses to physical addresses for DMAs [42].
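As a rough illustration of IOMMU translation for DMA, the sketch below keeps one page-granular table per device and rejects any device address without a mapping. The `Iommu` class, the string device IDs, and the 4 KiB page size are assumptions made for the example; real IOMMUs use multi-level tables and carry permission bits.

```python
PAGE_SIZE = 4096  # assumed page granularity


class Iommu:
    """Toy IOMMU: one translation table per device, keyed by page number."""

    def __init__(self):
        self._tables = {}

    def map_page(self, device, dev_addr, phys_addr):
        # Install a device-page -> physical-frame mapping for one device.
        if dev_addr % PAGE_SIZE or phys_addr % PAGE_SIZE:
            raise ValueError("addresses must be page-aligned")
        table = self._tables.setdefault(device, {})
        table[dev_addr // PAGE_SIZE] = phys_addr // PAGE_SIZE

    def translate(self, device, dev_addr):
        # Split into page number and offset, look up the frame, recombine.
        page, offset = divmod(dev_addr, PAGE_SIZE)
        frame = self._tables.get(device, {}).get(page)
        if frame is None:
            raise PermissionError("DMA to unmapped device address")
        return frame * PAGE_SIZE + offset
```

For example, after `map_page("gpu", 0x4000, 0x9000_0000)`, a DMA from the GPU to device address `0x4010` lands at physical address `0x9000_0010`, while the same address issued by a device with no table is refused.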