Exposure of Sensitive Information caused by Incorrect Data Forwarding during Transient Execution

Description

A processor event or prediction may allow incorrect or stale data to be forwarded to transient operations, potentially exposing data over a covert channel.

Extended Description

Software may use a variety of techniques to preserve the confidentiality of private data that is accessible within the current processor context. For example, the memory safety and type safety properties of some high-level programming languages help to prevent software written in those languages from exposing private data. As a second example, software sandboxes may co-locate multiple users' software within a single process. The processor's Instruction Set Architecture (ISA) may permit one user's software to access another user's data (because the software shares the same address space), but the sandbox prevents these accesses by using software techniques such as bounds checking.

If incorrect or stale data can be forwarded (for example, from a cache) to transient operations, then the operations' microarchitectural side effects may correspond to the data. If an attacker can trigger these transient operations and observe their side effects through a covert channel, then the attacker may be able to infer the data. For example, an attacker process may induce transient execution in a victim process that causes the victim to inadvertently access and then expose its private data via a covert channel. In the software sandbox example, an attacker sandbox may induce transient execution in its own code, allowing it to transiently access and expose data in a victim sandbox that shares the same address space.

Consequently, weaknesses that arise from incorrect/stale data forwarding might violate users' expectations of software-based memory safety and isolation techniques. If the data forwarding behavior is not properly documented by the hardware vendor, this might violate the software vendor's expectation of how the hardware should behave.

Common Consequences

Scope: Confidentiality

Impact: Read Memory

Detection Methods

Automated Static Analysis

Effectiveness: Moderate

A variety of automated static analysis tools can identify potentially exploitable code sequences in software. These tools may perform the analysis on source code, on binary code, or on an intermediate code representation (for example, during compilation).

Manual Analysis

Effectiveness: Moderate

This weakness can be detected in hardware by manually inspecting processor specifications. Features that exhibit this weakness may include microarchitectural predictors, access control checks that occur out-of-order, or any other features that can allow operations to execute without committing to architectural state.

Hardware designers can also scrutinize aspects of the instruction set architecture that have undefined behavior; these can become a focal point when applying other detection methods.

Automated Analysis

Effectiveness: High

Software vendors can release tools that detect the presence of known weaknesses on a processor. For example, some of these tools can attempt to transiently execute a vulnerable code sequence and detect whether the code successfully leaks data in a manner consistent with the weakness under test. Alternatively, some hardware vendors provide enumeration for the presence of a weakness (or lack of a weakness). These enumeration bits can be checked and reported by system software. For example, Linux supports these checks for many commodity processors:

$ cat /proc/cpuinfo | grep bugs | head -n 1
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit srbds mmio_stale_data retbleed
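A tool that consumes the Linux "bugs" line needs to match whole flag names rather than substrings. The following is a minimal sketch of such a check; the function name has_bug_flag is hypothetical, and the sketch assumes the caller has already read the line from /proc/cpuinfo in the "bugs : flag flag ..." shape shown in this section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if `flag` appears as a whole
 * whitespace-separated token after the ':' in `bugs_line`.
 * Assumes a line shaped like "bugs : spectre_v1 spectre_v2 ...". */
bool has_bug_flag(const char *bugs_line, const char *flag)
{
    const char *p = strchr(bugs_line, ':');
    size_t flag_len = strlen(flag);

    if (p == NULL || flag_len == 0)
        return false;
    p++; /* step past the ':' separator */

    while (*p != '\0') {
        p += strspn(p, " \t\n");               /* skip whitespace */
        size_t tok_len = strcspn(p, " \t\n");  /* length of next token */
        if (tok_len == flag_len && strncmp(p, flag, flag_len) == 0)
            return true;
        p += tok_len;
    }
    return false;
}
```

Matching on whole tokens means that a query for "spectre" does not accidentally match "spectre_v1". A reported flag indicates only that the kernel considers the processor affected; it does not say whether a mitigation is active.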

Potential Mitigations

Phase: Architecture and Design

The hardware designer can attempt to prevent transient execution from causing observable discrepancies in specific covert channels.

Effectiveness: Limited

Phase: Requirements

Processor designers, system software vendors, or other agents may choose to restrict the ability of unprivileged software to access high-resolution timers that are commonly used to monitor covert channels.

Effectiveness: Defense in Depth
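One software-level way to restrict timer resolution is to round timestamps down to a coarse granularity before handing them to unprivileged code. The sketch below illustrates that idea only; the function name and the choice of granularity are assumptions, and this entry does not prescribe a specific mechanism.

```c
#include <stdint.h>

/* Hypothetical helper: round a nanosecond timestamp down to a multiple
 * of `granularity_ns`, hiding differences smaller than the granularity.
 * A granularity of 0 is treated as "no coarsening". */
uint64_t coarsen_timestamp_ns(uint64_t t_ns, uint64_t granularity_ns)
{
    if (granularity_ns == 0)
        return t_ns;
    return t_ns - (t_ns % granularity_ns);
}
```

For example, with a granularity of 100000 ns, the timestamps 1200001 and 1299999 both become 1200000. Coarsening alone raises the cost of a timing measurement rather than eliminating it, which is consistent with the Defense in Depth rating.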

Phase: Requirements

Processor designers may expose instructions or other architectural features that allow software to mitigate the effects of transient execution, but without disabling predictors. These features may also help to limit opportunities for data exposure.

Effectiveness: Moderate

Phase: Requirements

Processor designers may expose registers (for example, control registers or model-specific registers) that allow privileged and/or user software to disable specific predictors or other hardware features that can cause confidential data to be exposed during transient execution.

Effectiveness: Limited

Phase: Build and Compilation

Use software techniques (including the use of serialization instructions) that are intended to reduce the number of instructions that can be executed transiently after a processor event or misprediction.

Effectiveness: Incidental

Phase: Build and Compilation

Isolate sandboxes or managed runtimes in separate address spaces (separate processes).

Effectiveness: High

Phase: Build and Compilation

Include serialization instructions (for example, LFENCE) that prevent processor events or mis-predictions prior to the serialization instruction from causing transient execution after the serialization instruction. For some weaknesses, a serialization instruction can also prevent a processor event or a mis-prediction from occurring after the serialization instruction (for example, CVE-2018-3639 can allow a processor to predict that a load will not depend on an older store; a serialization instruction between the store and the load may allow the store to update memory and prevent the mis-prediction from happening at all).

Effectiveness: Moderate
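The store-then-load case described for CVE-2018-3639 can be sketched as follows. This is a minimal illustration, assuming a GCC- or Clang-compatible compiler on x86; the macro SPECULATION_BARRIER and the function store_then_load are hypothetical names. On other architectures the macro falls back to a compiler-only barrier, which does not serialize the processor.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* LFENCE: later instructions do not execute until earlier ones complete locally. */
#define SPECULATION_BARRIER() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Assumption: placeholder only. This stops compiler reordering but is NOT
 * a processor serialization instruction. */
#define SPECULATION_BARRIER() __asm__ __volatile__("" ::: "memory")
#endif

/* Hypothetical helper: perform a store, then a barrier, then a load.
 * The barrier sits between the older store and the younger load so the
 * load is not issued while the store is still unresolved. */
uint32_t store_then_load(volatile uint32_t *store_addr, uint32_t value,
                         const volatile uint32_t *load_addr)
{
    *store_addr = value;
    SPECULATION_BARRIER();
    return *load_addr;
}
```

Whether a given serialization instruction is sufficient for a particular weakness depends on the processor, so vendor guidance should be consulted before relying on this pattern.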

Phase: Build and Compilation

Use software techniques that can mitigate the consequences of transient execution. For example, address masking can be used in some circumstances to prevent out-of-bounds transient reads.

Effectiveness: Limited
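Address masking can be sketched for a table whose size is a power of two: the index is ANDed with size minus one, so even a transiently executed load stays inside the table. The names below are hypothetical, and the empty inline assembly statement assumes a GCC- or Clang-compatible compiler; it is there only to stop the optimizer from deleting the mask as redundant after the bounds check.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 256u /* assumption: must be a power of two */

static uint8_t table[TABLE_SIZE];

/* Hypothetical helper: bounds-checked read whose index is also masked,
 * so a mispredicted bounds check cannot steer the load out of `table`. */
uint8_t masked_read(size_t untrusted_index)
{
    if (untrusted_index >= TABLE_SIZE)
        return 0;

    size_t idx = untrusted_index;
    /* Hide the known value range from the optimizer (GCC/Clang syntax). */
    __asm__ __volatile__("" : "+r"(idx));
    idx &= (size_t)TABLE_SIZE - 1u; /* architecturally a no-op here */

    return table[idx];
}
```

Production code typically relies on vetted helpers for this, such as array_index_nospec in the Linux kernel, rather than hand-written masking.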

Phase: Build and Compilation

If the weakness is exposed by a single instruction (or a small set of instructions), then the compiler (or JIT, etc.) can be configured to prevent the affected instruction(s) from being generated, and instead generate an alternate sequence of instructions that is not affected by the weakness.

Effectiveness: Limited

Phase: Documentation

If a hardware feature can allow incorrect or stale data to be forwarded to transient operations, the hardware designer may opt to disclose this behavior in architecture documentation. This documentation can inform users about potential consequences and effective mitigations.

Effectiveness: High

Demonstrative Examples

Faulting loads in a victim domain may trigger incorrect transient forwarding, which leaves secret-dependent traces in the microarchitectural state. Consider this code sequence example from [REF-1391].

Code Example:

Bad

void call_victim(size_t untrusted_arg) {

A processor with this weakness will store the value of untrusted_arg (which may be provided by an attacker) to the stack, which is trusted memory. Additionally, this store operation will save this value in some microarchitectural buffer, for example, the store buffer. In this code sequence, trusted_ptr is dereferenced while the attacker forces a page fault. The faulting load causes the processor to mis-speculate by forwarding untrusted_arg as the (transient) load result. The processor then uses untrusted_arg for the pointer dereference. After the fault has been handled and the load has been re-issued with the correct argument, secret-dependent information stored at the address of trusted_ptr remains in microarchitectural state and can be extracted by an attacker using a vulnerable code sequence.
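The listing for this example is cut off in this copy. The following is a minimal sketch reconstructed from the prose, not a verbatim copy of the code in [REF-1391]. Only call_victim, untrusted_arg, and trusted_ptr come from the text; arg_copy, arg_slot, array, sink, and the backing variables are assumptions introduced so that the sketch compiles. Architecturally the function is harmless; the weakness concerns what the processor may do transiently.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions: stand-ins for the trusted stack slot and the trusted pointer. */
static size_t arg_slot;
static size_t *arg_copy = &arg_slot;

static uint8_t secret_value = 3;              /* hypothetical secret byte */
static uint8_t *secret_ptr = &secret_value;
static uint8_t **trusted_ptr = &secret_ptr;

static uint8_t array[256 * 4096];             /* hypothetical probe array */
static volatile uint8_t sink;                 /* keeps the load from being elided */

void call_victim(size_t untrusted_arg)
{
    /* The attacker-influenced value is stored to trusted memory and may
     * also linger in a microarchitectural buffer such as the store buffer. */
    *arg_copy = untrusted_arg;

    /* If the load of trusted_ptr faults, an affected processor may
     * transiently forward untrusted_arg as the load result. */
    sink = array[**trusted_ptr * 4096];
}
```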

Some processors try to predict when a store will forward data to a subsequent load, even when the address of the store or the load is not yet known. For example, on Intel processors this feature is called a Fast Store Forwarding Predictor [REF-1392], and on AMD processors the feature is called Predictive Store Forwarding [REF-1393]. A misprediction can cause incorrect or stale data to be forwarded from a store to a load, as illustrated in the following code snippet from [REF-1393]:

Code Example:

Bad

void fn(int idx) {

In this example, assume that the parameter idx can only be 0 or 1, and assume that idx_array initially contains all 0s. Observe that the assignment to v in line 4 will be array[0], regardless of whether idx=0 or idx=1. Now suppose that an attacker repeatedly invokes fn with idx=0 to train the store forwarding predictor to predict that the store in line 3 will forward the data 4096 to the load idx_array[idx] in line 4. Then, when the attacker invokes fn with idx=1 the predictor may cause idx_array[idx] to transiently produce the incorrect value 4096, and therefore v will transiently be assigned the value array[4096], which otherwise would not have been accessible in line 4. Although this toy example is benign (it doesn't transmit array[4096] over a covert channel), an attacker may be able to use similar techniques to craft and train malicious code sequences to, for example, read data beyond a software sandbox boundary.
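The listing for this example is cut off in this copy. The following is a minimal sketch reconstructed from the description above, not a verbatim copy of the code in [REF-1393]. The array sizes and the file-scope placement of v are assumptions made so that the sketch compiles, and the comments mark the statements that the prose calls line 3 and line 4.

```c
#include <stddef.h>

static int idx_array[2];                /* initially all zeros, as the text assumes */
static unsigned char array[2 * 4096];   /* assumption: large enough to hold array[4096] */
static volatile unsigned char v;        /* assumption: file scope so the load is kept */

void fn(int idx)
{
    /* "Line 3" in the prose: the store the predictor is trained on. */
    idx_array[0] = 4096;

    /* "Line 4" in the prose: architecturally this always reads array[0],
     * because idx = 0 gives 4096 * 0 and idx = 1 gives 0 * 1. If 4096 is
     * incorrectly forwarded when idx = 1, the transient index is 4096. */
    v = array[(size_t)idx_array[idx] * (size_t)idx];
}
```

Calling fn with idx = 0 or idx = 1 leaves v equal to array[0] architecturally; any access to array[4096] would occur only transiently on an affected processor.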

Observed Examples

CVE-2020-0551: A fault, microcode assist, or abort may allow transient load operations to forward malicious stale data to dependent operations executed by a victim, causing the victim to unintentionally access and potentially expose its own data over a covert channel.

CVE-2020-8698: A fast store forwarding predictor may allow store operations to forward incorrect data to transient load operations, potentially exposing data over a covert channel.

References

You Cannot Always Win the Race: Analyzing the LFENCE/JMP Mitigation for Branch Target Injection

Alyssa Milburn, Ke Sun, and Henrique Kawakami

08-03-2022

https://arxiv.org/abs/2203.04277 (2024-02-22)

ID: REF-1389

Speculation

The kernel development community

16-08-2020

https://docs.kernel.org/6.6/staging/speculation.html (2024-02-04)

ID: REF-1390

LVI: Hijacking Transient Execution through Microarchitectural Load Value Injection