subscribe to arXiv mailings

TME-Box: Scalable In-Process Isolation through Intel TME-MK Memory Encryption

Authors: Martin Unterguggenberger, Lukas Lamster, David Schrammel, Martin Schwarzl, Stefan Mangard

Abstract: Efficient cloud computing relies on in-process isolation to optimize performance by running workloads within a single process. Without heavy-weight process isolation, memory safety errors pose a significant security threat by allowing an adversary to extract or corrupt the private data of other co-located tenants. Existing in-process isolation mechanisms are not suitable for modern cloud requireme… ▽ More Efficient cloud computing relies on in-process isolation to optimize performance by running workloads within a single process. Without heavy-weight process isolation, memory safety errors pose a significant security threat by allowing an adversary to extract or corrupt the private data of other co-located tenants. Existing in-process isolation mechanisms are not suitable for modern cloud requirements, e.g., MPK's 16 protection domains are insufficient to isolate thousands of cloud workers per process. Consequently, cloud service providers have a strong need for lightweight in-process isolation on commodity x86 machines. This paper presents TME-Box, a novel isolation technique that enables fine-grained and scalable sandboxing on commodity x86 CPUs. By repurposing Intel TME-MK, which is intended for the encryption of virtual machines, TME-Box offers lightweight and efficient in-process isolation. TME-Box enforces that sandboxes use their designated encryption keys for memory interactions through compiler instrumentation. This cryptographic isolation enables fine-grained access control, from single cache lines to full pages, and supports flexible data relocation. In addition, the design of TME-Box allows the efficient isolation of up to 32K concurrent sandboxes. We present a performance-optimized TME-Box prototype, utilizing x86 segment-based addressing, that showcases geomean performance overheads of 5.2 % for data isolation and 9.7 % for code and data isolation, evaluated with the SPEC CPU2017 benchmark suite. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2303.03711 [pdf, other]

SCRAMBLE-CFI: Mitigating Fault-Induced Control-Flow Attacks on OpenTitan

Authors: Pascal Nasahl, Stefan Mangard

Abstract: Secure elements physically exposed to adversaries are frequently targeted by fault attacks. These attacks can be utilized to hijack the control-flow of software allowing the attacker to bypass security measures, extract sensitive data, or gain full code execution. In this paper, we systematically analyze the threat vector of fault-induced control-flow manipulations on the open-source OpenTitan sec… ▽ More Secure elements physically exposed to adversaries are frequently targeted by fault attacks. These attacks can be utilized to hijack the control-flow of software allowing the attacker to bypass security measures, extract sensitive data, or gain full code execution. In this paper, we systematically analyze the threat vector of fault-induced control-flow manipulations on the open-source OpenTitan secure element. Our thorough analysis reveals that current countermeasures of this chip either induce large area overheads or still cannot prevent the attacker from exploiting the identified threats. In this context, we introduce SCRAMBLE-CFI, an encryption-based control-flow integrity scheme utilizing existing hardware features of OpenTitan. SCRAMBLE-CFI confines, with minimal hardware overhead, the impact of fault-induced control-flow attacks by encrypting each function with a different encryption tweak at load-time. At runtime, code only can be successfully decrypted when the correct decryption tweak is active. We open-source our hardware changes and release our LLVM toolchain automatically protecting programs. Our analysis shows that SCRAMBLE-CFI complementarily enhances security guarantees of OpenTitan with a negligible hardware overhead of less than 3.97 % and a runtime overhead of 7.02 % for the Embench-IoT benchmarks. △ Less

Submitted 24 March, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: Accepted at GLSVLSI'23

arXiv:2301.13760 [pdf, other]

EC-CFI: Control-Flow Integrity via Code Encryption Counteracting Fault Attacks

Authors: Pascal Nasahl, Salmin Sultana, Hans Liljestrand, Karanvir Grewal, Michael LeMay, David M. Durham, David Schrammel, Stefan Mangard

Abstract: Fault attacks enable adversaries to manipulate the control-flow of security-critical applications. By inducing targeted faults into the CPU, the software's call graph can be escaped and the control-flow can be redirected to arbitrary functions inside the program. To protect the control-flow from these attacks, dedicated fault control-flow integrity (CFI) countermeasures are commonly deployed. Howe… ▽ More Fault attacks enable adversaries to manipulate the control-flow of security-critical applications. By inducing targeted faults into the CPU, the software's call graph can be escaped and the control-flow can be redirected to arbitrary functions inside the program. To protect the control-flow from these attacks, dedicated fault control-flow integrity (CFI) countermeasures are commonly deployed. However, these schemes either have high detection latencies or require intrusive hardware changes. In this paper, we present EC-CFI, a software-based cryptographically enforced CFI scheme with no detection latency utilizing hardware features of recent Intel platforms. Our EC-CFI prototype is designed to prevent an adversary from escaping the program's call graph using faults by encrypting each function with a different key before execution. At runtime, the instrumented program dynamically derives the decryption key, ensuring that the code only can be successfully decrypted when the program follows the intended call graph. To enable this level of protection on Intel commodity systems, we introduce extended page table (EPT) aliasing allowing us to achieve function-granular encryption by combing Intel's TME-MK and virtualization technology. We open-source our custom LLVM-based toolchain automatically protecting arbitrary programs with EC-CFI. Furthermore, we evaluate our EPT aliasing approach with the SPEC CPU2017 and Embench-IoT benchmarks and discuss and evaluate potential TME-MK hardware changes minimizing runtime overheads. △ Less

Submitted 24 March, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: Accepted at HOST'23

arXiv:2301.02915 [pdf, other]

SFP: Providing System Call Flow Protection against Software and Fault Attacks

Authors: Robert Schilling, Pascal Nasahl, Martin Unterguggenberger, Stefan Mangard

Abstract: With the improvements in computing technologies, edge devices in the Internet-of-Things have become more complex. The enabler technology for these complex systems are powerful application core processors with operating system support, such as Linux. While the isolation of applications through the operating system increases the security, the interface to the kernel poses a new threat. Different att… ▽ More With the improvements in computing technologies, edge devices in the Internet-of-Things have become more complex. The enabler technology for these complex systems are powerful application core processors with operating system support, such as Linux. While the isolation of applications through the operating system increases the security, the interface to the kernel poses a new threat. Different attack vectors, including fault attacks and memory vulnerabilities, exploit the kernel interface to escalate privileges and take over the system. In this work, we present SFP, a mechanism to protect the execution of system calls against software and fault attacks providing integrity to user-kernel transitions. SFP provides system call flow integrity by a two-step linking approach, which links the system call and its origin to the state of control-flow integrity. A second linking step within the kernel ensures that the right system call is executed in the kernel. Combining both linking steps ensures that only the correct system call is executed at the right location in the program and cannot be skipped. Furthermore, SFP provides dynamic CFI instrumentation and a new CFI checking policy at the edge of the kernel to verify the control-flow state of user programs before entering the kernel. We integrated SFP into FIPAC, a CFI protection scheme exploiting ARM pointer authentication. Our prototype is based on a custom LLVM-based toolchain with an instrumented runtime library combined with a custom Linux kernel to protect system calls. The evaluation of micro- and macrobenchmarks based on SPEC 2017 show an average runtime overhead of 1.9 % and 20.6 %, which is only an increase of 1.8 % over plain control-flow protection. This small impact on the performance shows the efficiency of SFP for protecting all system calls and providing integrity for the user-kernel transitions. △ Less

Submitted 12 January, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

Comments: Published at HASP22

arXiv:2208.01356 [pdf, other]

SCFI: State Machine Control-Flow Hardening Against Fault Attacks

Authors: Pascal Nasahl, Martin Unterguggenberger, Rishub Nagpal, Robert Schilling, David Schrammel, Stefan Mangard

Abstract: Fault injection (FI) is a powerful attack methodology allowing an adversary to entirely break the security of a target device. As finite-state machines (FSMs) are fundamental hardware building blocks responsible for controlling systems, inducing faults into these controllers enables an adversary to hijack the execution of the integrated circuit. A common defense strategy mitigating these attacks i… ▽ More Fault injection (FI) is a powerful attack methodology allowing an adversary to entirely break the security of a target device. As finite-state machines (FSMs) are fundamental hardware building blocks responsible for controlling systems, inducing faults into these controllers enables an adversary to hijack the execution of the integrated circuit. A common defense strategy mitigating these attacks is to manually instantiate FSMs multiple times and detect faults using a majority voting logic. However, as each additional FSM instance only provides security against one additional induced fault, this approach scales poorly in a multi-fault attack scenario. In this paper, we present SCFI: a strong, probabilistic FSM protection mechanism ensuring that control-flow deviations from the intended control-flow are detected even in the presence of multiple faults. At its core, SCFI consists of a hardened next-state function absorbing the execution history as well as the FSM's control signals to derive the next state. When either the absorbed inputs, the state registers, or the function itself are affected by faults, SCFI triggers an error with no detection latency. We integrate SCFI into a synthesis tool capable of automatically hardening arbitrary unprotected FSMs without user interaction and open-source the tool. Our evaluation shows that SCFI provides strong protection guarantees with a better area-time product than FSMs protected using classical redundancy-based approaches. Finally, we formally verify the resilience of the protected state machines using a pre-silicon fault analysis tool. △ Less

Submitted 2 August, 2022; originally announced August 2022.

arXiv:2205.04775 [pdf, other]

SYNFI: Pre-Silicon Fault Analysis of an Open-Source Secure Element

Authors: Pascal Nasahl, Miguel Osorio, Pirmin Vogel, Michael Schaffner, Timothy Trippel, Dominic Rizzo, Stefan Mangard

Abstract: Fault attacks are active, physical attacks that an adversary can leverage to alter the control-flow of embedded devices to gain access to sensitive information or bypass protection mechanisms. Due to the severity of these attacks, manufacturers deploy hardware-based fault defenses into security-critical systems, such as secure elements. The development of these countermeasures is a challenging tas… ▽ More Fault attacks are active, physical attacks that an adversary can leverage to alter the control-flow of embedded devices to gain access to sensitive information or bypass protection mechanisms. Due to the severity of these attacks, manufacturers deploy hardware-based fault defenses into security-critical systems, such as secure elements. The development of these countermeasures is a challenging task due to the complex interplay of circuit components and because contemporary design automation tools tend to optimize inserted structures away, thereby defeating their purpose. Hence, it is critical that such countermeasures are rigorously verified post-synthesis. As classical functional verification techniques fall short of assessing the effectiveness of countermeasures, developers have to resort to methods capable of injecting faults in a simulation testbench or into a physical chip. However, developing test sequences to inject faults in simulation is an error-prone task and performing fault attacks on a chip requires specialized equipment and is incredibly time-consuming. To that end, this paper introduces SYNFI, a formal pre-silicon fault verification framework that operates on synthesized netlists. SYNFI can be used to analyze the general effect of faults on the input-output relationship in a circuit and its fault countermeasures, and thus enables hardware designers to assess and verify the effectiveness of embedded countermeasures in a systematic and semi-automatic way. To demonstrate that SYNFI is capable of handling unmodified, industry-grade netlists synthesized with commercial and open tools, we analyze OpenTitan, the first open-source secure element. In our analysis, we identified critical security weaknesses in the unprotected AES block, developed targeted countermeasures, reassessed their security, and contributed these countermeasures back to the OpenTitan repository. △ Less

Submitted 7 July, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2105.03395 [pdf, other]

SERVAS! Secure Enclaves via RISC-V Authenticryption Shield

Authors: Stefan Steinegger, David Schrammel, Samuel Weiser, Pascal Nasahl, Stefan Mangard

Abstract: Isolation is a long-standing challenge of software security. Traditional privilege rings and virtual memory are more and more augmented with concepts such as capabilities, protection keys, and powerful enclaves. At the same time, we are evidencing an increased need for physical protection, shifting towards full memory encryption schemes. This results in a complex interplay of various security mech… ▽ More Isolation is a long-standing challenge of software security. Traditional privilege rings and virtual memory are more and more augmented with concepts such as capabilities, protection keys, and powerful enclaves. At the same time, we are evidencing an increased need for physical protection, shifting towards full memory encryption schemes. This results in a complex interplay of various security mechanisms, increasing the burden for system architects and security analysts. In this work, we tackle the isolation challenge with a new isolation primitive called authenticryption shield that unifies both traditional and advanced isolation policies while offering the potential for future extensibility. At the core, we build upon an authenticated memory encryption scheme that gives cryptographic isolation guarantees and, thus, streamlines the security reasoning. We showcase the versatility of our approach by designing and prototyping SERVAS -- an innovative enclave architecture for RISC-V. Unlike current enclave systems, SERVAS facilitates efficient and secure enclave memory sharing. While the memory encryption constitutes the main overhead, entering or exiting a SERVAS enclave requires only 3.5x of a simple syscall, instead of 71x for Intel SGX. △ Less

Submitted 7 May, 2021; originally announced May 2021.

arXiv:2104.14993 [pdf, other]

FIPAC: Thwarting Fault- and Software-Induced Control-Flow Attacks with ARM Pointer Authentication

Authors: Robert Schilling, Pascal Nasahl, Stefan Mangard

Abstract: With the improvements of computing technology, more and more applications embed powerful ARM processors into their devices. These systems can be attacked by redirecting the control-flow of a program to bypass critical pieces of code such as privilege checks or signature verifications. Control-flow hijacks can be performed using classical software vulnerabilities, physical fault attacks, or softwar… ▽ More With the improvements of computing technology, more and more applications embed powerful ARM processors into their devices. These systems can be attacked by redirecting the control-flow of a program to bypass critical pieces of code such as privilege checks or signature verifications. Control-flow hijacks can be performed using classical software vulnerabilities, physical fault attacks, or software-induced fault attacks. To cope with this threat and to protect the control-flow, dedicated countermeasures are needed. To counteract control-flow hijacks, control-flow integrity~(CFI) aims to be a generic solution. However, software-based CFI typically either protects against software or fault attacks, but not against both. While hardware-assisted CFI can mitigate both types of attacks, they require extensive hardware modifications. As hardware changes are unrealistic for existing ARM architectures, a wide range of systems remains unprotected and vulnerable to control-flow attacks. In this work, we present FIPAC, an efficient software-based CFI scheme protecting the execution at basic block granularity of ARM-based devices against software and fault attacks. FIPAC exploits ARM pointer authentication of ARMv8.6-A to implement a cryptographically signed control-flow graph. We cryptographically link the correct sequence of executed basic blocks to enforce CFI at this granularity. We use an LLVM-based toolchain to automatically instrument programs. The evaluation on SPEC2017 with different security policies shows a code overhead between 54-97\% and a runtime overhead between 35-105%. While these overheads are higher than for countermeasures against software attacks, FIPAC outperforms related work protecting the control-flow against fault attacks. FIPAC is an efficient solution to provide protection against software- and fault-based CFI attacks on basic block level on modern ARM devices. △ Less

Submitted 30 April, 2021; originally announced April 2021.

arXiv:2012.06761 [pdf, other]

doi 10.1145/3433210.3453684

CrypTag: Thwarting Physical and Logical Memory Vulnerabilities using Cryptographically Colored Memory

Authors: Pascal Nasahl, Robert Schilling, Mario Werner, Jan Hoogerbrugge, Marcel Medwed, Stefan Mangard

Abstract: Memory vulnerabilities are a major threat to many computing systems. To effectively thwart spatial and temporal memory vulnerabilities, full logical memory safety is required. However, current mitigation techniques for memory safety are either too expensive or trade security against efficiency. One promising attempt to detect memory safety vulnerabilities in hardware is memory coloring, a security… ▽ More Memory vulnerabilities are a major threat to many computing systems. To effectively thwart spatial and temporal memory vulnerabilities, full logical memory safety is required. However, current mitigation techniques for memory safety are either too expensive or trade security against efficiency. One promising attempt to detect memory safety vulnerabilities in hardware is memory coloring, a security policy deployed on top of tagged memory architectures. However, due to the memory storage and bandwidth overhead of large tags, commodity tagged memory architectures usually only provide small tag sizes, thus limiting their use for security applications. Irrespective of logical memory safety, physical memory safety is a necessity in hostile environments prevalent for modern cloud computing and IoT devices. Architectures from Intel and AMD already implement transparent memory encryption to maintain confidentiality and integrity of all off-chip data. Surprisingly, the combination of both, logical and physical memory safety, has not yet been extensively studied in previous research, and a naive combination of both security strategies would accumulate both overheads. In this paper, we propose CrypTag, an efficient hardware/software co-design mitigating a large class of logical memory safety issues and providing full physical memory safety. At its core, CrypTag utilizes a transparent memory encryption engine not only for physical memory safety, but also for memory coloring at hardly any additional costs. The design avoids any overhead for tag storage by embedding memory colors in the upper bits of a pointer and using these bits as an additional input for the memory encryption. A custom compiler extension automatically leverages CrypTag to detect logical memory safety issues for commodity programs and is fully backward compatible. △ Less

Submitted 9 March, 2021; v1 submitted 12 December, 2020; originally announced December 2020.

arXiv:2009.05262 [pdf, other]

doi 10.1145/3433210.3453112

HECTOR-V: A Heterogeneous CPU Architecture for a Secure RISC-V Execution Environment

Authors: Pascal Nasahl, Robert Schilling, Mario Werner, Stefan Mangard

Abstract: To ensure secure and trustworthy execution of applications, vendors frequently embed trusted execution environments into their systems. Here, applications are protected from adversaries, including a malicious operating system. TEEs are usually built by integrating protection mechanisms directly into the processor or by using dedicated external secure elements. However, both of these approaches onl… ▽ More To ensure secure and trustworthy execution of applications, vendors frequently embed trusted execution environments into their systems. Here, applications are protected from adversaries, including a malicious operating system. TEEs are usually built by integrating protection mechanisms directly into the processor or by using dedicated external secure elements. However, both of these approaches only cover a narrow threat model resulting in limited security guarantees. Enclaves in the application processor typically provide weak isolation between the secure and non-secure domain, especially when considering side-channel attacks. Although secure elements do provide strong isolation, the slow communication interface to the application processor is exposed to adversaries and restricts the use cases. Independently of the used implementation approach, TEEs often lack the possibility to establish secure communication to external peripherals, and most operating systems executed inside TEEs do not provide state-of-the-art defense strategies, making them vulnerable against various attacks. We argue that TEEs implemented on the main application processor are insecure, especially when considering side-channel attacks. We demonstrate how a heterogeneous architecture can be utilized to realize a secure TEE design. We directly embed a processor into our architecture to provide strong isolation between the secure and non-secure domain. The tight coupling of TEE and REE enables HECTOR-V to provide mechanisms for establishing secure communication channels. We further introduce RISC-V Secure Co-Processor, a security-hardened processor tailored for TEEs. To secure applications executed inside the TEE, RVSCP provides control-flow integrity, rigorously restricts I/O accesses to certain execution states, and provides operating system services directly in hardware. △ Less

Submitted 9 March, 2021; v1 submitted 11 September, 2020; originally announced September 2020.

arXiv:1809.08811 [pdf, other]

doi 10.1145/3274694.3274728

Pointing in the Right Direction - Securing Memory Accesses in a Faulty World

Authors: Robert Schilling, Mario Werner, Pascal Nasahl, Stefan Mangard

Abstract: Reading and writing memory are, besides computation, the most common operations a processor performs. The correctness of these operations is therefore essential for the proper execution of any program. However, as soon as fault attacks are considered, assuming that the hardware performs its memory operations as instructed is not valid anymore. In particular, attackers may induce faults with the go… ▽ More Reading and writing memory are, besides computation, the most common operations a processor performs. The correctness of these operations is therefore essential for the proper execution of any program. However, as soon as fault attacks are considered, assuming that the hardware performs its memory operations as instructed is not valid anymore. In particular, attackers may induce faults with the goal of reading or writing incorrectly addressed memory, which can have various critical safety and security implications. In this work, we present a solution to this problem and propose a new method for protecting every memory access inside a program against address tampering. The countermeasure comprises two building blocks. First, every pointer inside the program is redundantly encoded using a multi-residue error detection code. The redundancy information is stored in the unused upper bits of the pointer with zero overhead in terms of storage. Second, load and store instructions are extended to link data with the corresponding encoded address from the pointer. Wrong memory accesses subsequently infect the data value allowing the software to detect the error. For evaluation purposes, we implemented our countermeasure into a RISC-V processor, tested it on a FPGA development board, and evaluated the induced overhead. Furthermore, a LLVM-based C compiler has been modified to automatically encode all data pointers, to perform encoded pointer arithmetic, and to emit the extended load/store instructions with linking support. Our evaluations show that the countermeasure induces an average overhead of 10% in terms of code size and 7% regarding runtime, which makes it suitable for practical adoption. △ Less

Submitted 24 September, 2018; originally announced September 2018.

Comments: Accepted at ACSAC 2018

arXiv:1803.08359 [pdf, other]

Securing Conditional Branches in the Presence of Fault Attacks

Authors: Robert Schilling, Mario Werner, Stefan Mangard

Abstract: In typical software, many comparisons and subsequent branch operations are highly critical in terms of security. Examples include password checks, signature checks, secure boot, and user privilege checks. For embedded devices, these security-critical branches are a preferred target of fault attacks as a single bit flip or skipping a single instruction can lead to complete access to a system. In th… ▽ More In typical software, many comparisons and subsequent branch operations are highly critical in terms of security. Examples include password checks, signature checks, secure boot, and user privilege checks. For embedded devices, these security-critical branches are a preferred target of fault attacks as a single bit flip or skipping a single instruction can lead to complete access to a system. In the past, numerous redundancy schemes have been proposed in order to provide control-flow-integrity (CFI) and to enable error detection on processed data. However, current countermeasures for general purpose software do not provide protection mechanisms for conditional branches. Hence, critical branches are in practice often simply duplicated. We present a generic approach to protect conditional branches, which links an encoding-based comparison result with the redundancy of CFI protection mechanisms. The presented approach can be used for all types of data encodings and CFI mechanisms and maintains their error-detection capabilities throughout all steps of a conditional branch. We demonstrate our approach by realizing an encoded comparison based on AN-codes, which is a frequently used encoding scheme to detect errors on data during arithmetic operations. We extended the LLVM compiler so that standard code and conditional branches can be protected automatically and analyze its security. Our design shows that the overhead in terms of size and runtime is lower than state-of-the-art duplication schemes. △ Less

Submitted 22 March, 2018; originally announced March 2018.

Comments: Accepted at DATE 2018

arXiv:1802.06691 [pdf, other]

Sponge-Based Control-Flow Protection for IoT Devices

Authors: Mario Werner, Thomas Unterluggauer, David Schaffenrath, Stefan Mangard

Abstract: Embedded devices in the Internet of Things (IoT) face a wide variety of security challenges. For example, software attackers perform code injection and code-reuse attacks on their remote interfaces, and physical access to IoT devices allows to tamper with code in memory, steal confidential Intellectual Property (IP), or mount fault attacks to manipulate a CPU's control flow. In this work, we pre… ▽ More Embedded devices in the Internet of Things (IoT) face a wide variety of security challenges. For example, software attackers perform code injection and code-reuse attacks on their remote interfaces, and physical access to IoT devices allows to tamper with code in memory, steal confidential Intellectual Property (IP), or mount fault attacks to manipulate a CPU's control flow. In this work, we present Sponge-based Control Flow Protection (SCFP). SCFP is a stateful, sponge-based scheme to ensure the confidentiality of software IP and its authentic execution on IoT devices. At compile time, SCFP encrypts and authenticates software with instruction-level granularity. During execution, an SCFP hardware extension between the CPU's fetch and decode stage continuously decrypts and authenticates instructions. Sponge-based authenticated encryption in SCFP yields fine-grained control-flow integrity and thus prevents code-reuse, code-injection, and fault attacks on the code and the control flow. In addition, SCFP withstands any modification of software in memory. For evaluation, we extended a RISC-V core with SCFP and fabricated a real System on Chip (SoC). The average overhead in code size and execution time of SCFP on this design is 19.8% and 9.1%, respectively, and thus meets the requirements of embedded IoT devices. △ Less

Submitted 19 February, 2018; originally announced February 2018.

Comments: accepted at IEEE EuroS&P 2018

arXiv:1801.01207 [pdf, other]

Meltdown

Authors: Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, Mike Hamburg

Abstract: The security of computer systems fundamentally relies on memory isolation, e.g., kernel address ranges are marked as non-accessible and are protected from user access. In this paper, we present Meltdown. Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispen… ▽ More The security of computer systems fundamentally relies on memory isolation, e.g., kernel address ranges are marked as non-accessible and are protected from user access. In this paper, we present Meltdown. Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispensable performance feature and present in a wide range of modern processors. The attack works on different Intel microarchitectures since at least 2010 and potentially other processors are affected. The root cause of Meltdown is the hardware. The attack is independent of the operating system, and it does not rely on any software vulnerabilities. Meltdown breaks all security assumptions given by address space isolation as well as paravirtualized environments and, thus, every security mechanism building upon this foundation. On affected systems, Meltdown enables an adversary to read memory of other processes or virtual machines in the cloud without any permissions or privileges, affecting millions of customers and virtually every user of a personal computer. We show that the KAISER defense mechanism for KASLR has the important (but inadvertent) side effect of impeding Meltdown. We stress that KAISER must be deployed immediately to prevent large-scale exploitation of this severe information leakage. △ Less

Submitted 3 January, 2018; originally announced January 2018.

arXiv:1801.01203 [pdf, ps, other]

Spectre Attacks: Exploiting Speculative Execution

Authors: Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, Yuval Yarom

Abstract: Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful… ▽ More Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access to the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, static analysis, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing/side-channel attacks. These attacks represent a serious threat to actual systems, since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak. △ Less

Submitted 3 January, 2018; originally announced January 2018.

arXiv:1711.01254 [pdf, other]

Automated Detection, Exploitation, and Elimination of Double-Fetch Bugs using Modern CPU Features

Authors: Michael Schwarz, Daniel Gruss, Moritz Lipp, Clémentine Maurice, Thomas Schuster, Anders Fogh, Stefan Mangard

Abstract: Double-fetch bugs are a special type of race condition, where an unprivileged execution thread is able to change a memory location between the time-of-check and time-of-use of a privileged execution thread. If an unprivileged attacker changes the value at the right time, the privileged operation becomes inconsistent, leading to a change in control flow, and thus an escalation of privileges for the… ▽ More Double-fetch bugs are a special type of race condition, where an unprivileged execution thread is able to change a memory location between the time-of-check and time-of-use of a privileged execution thread. If an unprivileged attacker changes the value at the right time, the privileged operation becomes inconsistent, leading to a change in control flow, and thus an escalation of privileges for the attacker. More severely, such double-fetch bugs can be introduced by the compiler, entirely invisible on the source-code level. We propose novel techniques to efficiently detect, exploit, and eliminate double-fetch bugs. We demonstrate the first combination of state-of-the-art cache attacks with kernel-fuzzing techniques to allow fully automated identification of double fetches. We demonstrate the first fully automated reliable detection and exploitation of double-fetch bugs, making manual analysis as in previous work superfluous. We show that cache-based triggers outperform state-of-the-art exploitation techniques significantly, leading to an exploitation success rate of up to 97%. Our modified fuzzer automatically detects double fetches and automatically narrows down this candidate set for double-fetch bugs to the exploitable ones. We present the first generic technique based on hardware transactional memory, to eliminate double-fetch bugs in a fully automated and transparent manner. We extend defensive programming techniques by retrofitting arbitrary code with automated double-fetch prevention, both in trusted execution environments as well as in syscalls, with a performance overhead below 1%. △ Less

Submitted 3 November, 2017; originally announced November 2017.

arXiv:1706.06381 [pdf, other]

KeyDrown: Eliminating Keystroke Timing Side-Channel Attacks

Authors: Michael Schwarz, Moritz Lipp, Daniel Gruss, Samuel Weiser, Clémentine Maurice, Raphael Spreitzer, Stefan Mangard

Abstract: Besides cryptographic secrets, side-channel attacks also leak sensitive user input. The most accurate attacks exploit cache timings or interrupt information to monitor keystroke timings and subsequently infer typed words and sentences. Previously proposed countermeasures fail to prevent keystroke timing attacks as they do not protect keystroke processing among the entire software stack. We close… ▽ More Besides cryptographic secrets, side-channel attacks also leak sensitive user input. The most accurate attacks exploit cache timings or interrupt information to monitor keystroke timings and subsequently infer typed words and sentences. Previously proposed countermeasures fail to prevent keystroke timing attacks as they do not protect keystroke processing among the entire software stack. We close this gap with KeyDrown, a new defense mechanism against keystroke timing attacks. KeyDrown injects a large number of fake keystrokes in the kernel to prevent interrupt-based attacks and Prime+Probe attacks on the kernel. All keystrokes, including fake keystrokes, are carefully propagated through the shared library in order to hide any cache activity and thus to prevent Flush+Reload attacks. Finally, we provide additional protection against Prime+Probe for password input in user space programs. We show that attackers cannot distinguish fake keystrokes from real keystrokes anymore and we evaluate KeyDrown on a commodity notebook as well as on two Android smartphones. We show that KeyDrown eliminates any advantage an attacker can gain from using interrupt or cache side-channel information. △ Less

Submitted 20 June, 2017; originally announced June 2017.

arXiv:1702.08719 [pdf, other]

Malware Guard Extension: Using SGX to Conceal Cache Attacks

Authors: Michael Schwarz, Samuel Weiser, Daniel Gruss, Clémentine Maurice, Stefan Mangard

Abstract: In modern computer systems, user processes are isolated from each other by the operating system and the hardware. Additionally, in a cloud scenario it is crucial that the hypervisor isolates tenants from other tenants that are co-located on the same physical machine. However, the hypervisor does not protect tenants against the cloud provider and thus the supplied operating system and hardware. Int… ▽ More In modern computer systems, user processes are isolated from each other by the operating system and the hardware. Additionally, in a cloud scenario it is crucial that the hypervisor isolates tenants from other tenants that are co-located on the same physical machine. However, the hypervisor does not protect tenants against the cloud provider and thus the supplied operating system and hardware. Intel SGX provides a mechanism that addresses this scenario. It aims at protecting user-level software from attacks from other processes, the operating system, and even physical attackers. In this paper, we demonstrate fine-grained software-based side-channel attacks from a malicious SGX enclave targeting co-located enclaves. Our attack is the first malware running on real SGX hardware, abusing SGX protection features to conceal itself. Furthermore, we demonstrate our attack both in a native environment and across multiple Docker containers. We perform a Prime+Probe cache side-channel attack on a co-located SGX enclave running an up-to-date RSA implementation that uses a constant-time multiplication primitive. The attack works although in SGX enclaves there are no timers, no large pages, no physical addresses, and no shared memory. In a semi-synchronous attack, we extract 96% of an RSA private key from a single trace. We extract the full RSA private key in an automated attack from 11 traces within 5 minutes. △ Less

Submitted 22 May, 2019; v1 submitted 28 February, 2017; originally announced February 2017.

Comments: Extended version of DIMVA 2017 submission

arXiv:1612.05974 [pdf, other]

doi 10.1109/TCSI.2017.2698019

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

Authors: Francesco Conti, Robert Schilling, Pasquale Davide Schiavone, Antonio Pullini, Davide Rossi, Frank Kagan Gürkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard, Luca Benini

Abstract: Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data securi… ▽ More Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op. △ Less

Submitted 23 April, 2017; v1 submitted 18 December, 2016; originally announced December 2016.

Comments: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Papers

arXiv:1611.03748 [pdf, other]

doi 10.1109/COMST.2017.2779824

Systematic Classification of Side-Channel Attacks: A Case Study for Mobile Devices

Authors: Raphael Spreitzer, Veelasha Moonsamy, Thomas Korak, Stefan Mangard

Abstract: Side-channel attacks on mobile devices have gained increasing attention since their introduction in 2007. While traditional side-channel attacks, such as power analysis attacks and electromagnetic analysis attacks, required physical presence of the attacker as well as expensive equipment, an (unprivileged) application is all it takes to exploit the leaking information on modern mobile devices. Giv… ▽ More Side-channel attacks on mobile devices have gained increasing attention since their introduction in 2007. While traditional side-channel attacks, such as power analysis attacks and electromagnetic analysis attacks, required physical presence of the attacker as well as expensive equipment, an (unprivileged) application is all it takes to exploit the leaking information on modern mobile devices. Given the vast amount of sensitive information that are stored on smartphones, the ramifications of side-channel attacks affect both the security and privacy of users and their devices. In this paper, we propose a new categorization system for side-channel attacks, which is necessary as side-channel attacks have evolved significantly since their scientific investigations during the smart card era in the 1990s. Our proposed classification system allows to analyze side-channel attacks systematically, and facilitates the development of novel countermeasures. Besides this new categorization system, the extensive survey of existing attacks and attack strategies provides valuable insights into the evolving field of side-channel attacks, especially when focusing on mobile devices. We conclude by discussing open issues and challenges in this context and outline possible future research directions. △ Less

Submitted 6 December, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

arXiv:1511.08756 [pdf, other]

DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks

Authors: Peter Pessl, Daniel Gruss, Clémentine Maurice, Michael Schwarz, Stefan Mangard

Abstract: In cloud computing environments, multiple tenants are often co-located on the same multi-processor system. Thus, preventing information leakage between tenants is crucial. While the hypervisor enforces software isolation, shared hardware, such as the CPU cache or memory bus, can leak sensitive information. For security reasons, shared memory between tenants is typically disabled. Furthermore, tena… ▽ More In cloud computing environments, multiple tenants are often co-located on the same multi-processor system. Thus, preventing information leakage between tenants is crucial. While the hypervisor enforces software isolation, shared hardware, such as the CPU cache or memory bus, can leak sensitive information. For security reasons, shared memory between tenants is typically disabled. Furthermore, tenants often do not share a physical CPU. In this setting, cache attacks do not work and only a slow cross-CPU covert channel over the memory bus is known. In contrast, we demonstrate a high-speed covert channel as well as the first side-channel attack working across processors and without any shared memory. To build these attacks, we use the undocumented DRAM address mappings. We present two methods to reverse engineer the mapping of memory addresses to DRAM channels, ranks, and banks. One uses physical probing of the memory bus, the other runs entirely in software and is fully automated. Using this mapping, we introduce DRAMA attacks, a novel class of attacks that exploit the DRAM row buffer that is shared, even in multi-processor systems. Thus, our attacks work in the most restrictive environments. First, we build a covert channel with a capacity of up to 2 Mbps, which is three to four orders of magnitude faster than memory-bus-based channels. Second, we build a side-channel template attack that can automatically locate and monitor memory accesses. Third, we show how using the DRAM mappings improves existing attacks and in particular enables practical Rowhammer attacks on DDR4. △ Less

Submitted 28 June, 2016; v1 submitted 27 November, 2015; originally announced November 2015.

Comments: Original publication in the Proceedings of the 25th Annual USENIX Security Symposium (USENIX Security 2016). https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/pessl

arXiv:1511.04897 [pdf, other]

ARMageddon: Cache Attacks on Mobile Devices

Authors: Moritz Lipp, Daniel Gruss, Raphael Spreitzer, Clémentine Maurice, Stefan Mangard

Abstract: In the last 10 years, cache attacks on Intel x86 CPUs have gained increasing attention among the scientific community and powerful techniques to exploit cache side channels have been developed. However, modern smartphones use one or more multi-core ARM CPUs that have a different cache organization and instruction set than Intel x86 CPUs. So far, no cross-core cache attacks have been demonstrated o… ▽ More In the last 10 years, cache attacks on Intel x86 CPUs have gained increasing attention among the scientific community and powerful techniques to exploit cache side channels have been developed. However, modern smartphones use one or more multi-core ARM CPUs that have a different cache organization and instruction set than Intel x86 CPUs. So far, no cross-core cache attacks have been demonstrated on non-rooted Android smartphones. In this work, we demonstrate how to solve key challenges to perform the most powerful cross-core cache attacks Prime+Probe, Flush+Reload, Evict+Reload, and Flush+Flush on non-rooted ARM-based devices without any privileges. Based on our techniques, we demonstrate covert channels that outperform state-of-the-art covert channels on Android by several orders of magnitude. Moreover, we present attacks to monitor tap and swipe events as well as keystrokes, and even derive the lengths of words entered on the touchscreen. Eventually, we are the first to attack cryptographic primitives implemented in Java. Our attacks work across CPUs and can even monitor cache activity in the ARM TrustZone from the normal world. The techniques we present can be used to attack hundreds of millions of Android devices. △ Less

Submitted 19 June, 2016; v1 submitted 16 November, 2015; originally announced November 2015.

Comments: Original publication in the Proceedings of the 25th Annual USENIX Security Symposium (USENIX Security 2016). https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/lipp

arXiv:1511.04594 [pdf, other]

Flush+Flush: A Fast and Stealthy Cache Attack

Authors: Daniel Gruss, Clémentine Maurice, Klaus Wagner, Stefan Mangard

Abstract: Research on cache attacks has shown that CPU caches leak significant information. Proposed detection mechanisms assume that all cache attacks cause more cache hits and cache misses than benign applications and use hardware performance counters for detection. In this article, we show that this assumption does not hold by developing a novel attack technique: the Flush+Flush attack. The Flush+Flush… ▽ More Research on cache attacks has shown that CPU caches leak significant information. Proposed detection mechanisms assume that all cache attacks cause more cache hits and cache misses than benign applications and use hardware performance counters for detection. In this article, we show that this assumption does not hold by developing a novel attack technique: the Flush+Flush attack. The Flush+Flush attack only relies on the execution time of the flush instruction, which depends on whether data is cached or not. Flush+Flush does not make any memory accesses, contrary to any other cache attack. Thus, it causes no cache misses at all and the number of cache hits is reduced to a minimum due to the constant cache flushes. Therefore, Flush+Flush attacks are stealthy, i.e., the spy process cannot be detected based on cache hits and misses, or state-of-the-art detection mechanisms. The Flush+Flush attack runs in a higher frequency and thus is faster than any existing cache attack. With 496 KB/s in a cross-core covert channel it is 6.7 times faster than any previously published cache covert channel. △ Less

Submitted 5 April, 2016; v1 submitted 14 November, 2015; originally announced November 2015.

Comments: This paper has been accepted at the 13th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA) 2016. The final publication is available at link.springer.com

arXiv:1507.06955 [pdf, other]

Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript

Authors: Daniel Gruss, Clémentine Maurice, Stefan Mangard

Abstract: A fundamental assumption in software security is that a memory location can only be modified by processes that may write to this memory location. However, a recent study has shown that parasitic effects in DRAM can change the content of a memory cell without accessing it, but by accessing other memory locations in a high frequency. This so-called Rowhammer bug occurs in most of today's memory modu… ▽ More A fundamental assumption in software security is that a memory location can only be modified by processes that may write to this memory location. However, a recent study has shown that parasitic effects in DRAM can change the content of a memory cell without accessing it, but by accessing other memory locations in a high frequency. This so-called Rowhammer bug occurs in most of today's memory modules and has fatal consequences for the security of all affected systems, e.g., privilege escalation attacks. All studies and attacks related to Rowhammer so far rely on the availability of a cache flush instruction in order to cause accesses to DRAM modules at a sufficiently high frequency. We overcome this limitation by defeating complex cache replacement policies. We show that caches can be forced into fast cache eviction to trigger the Rowhammer bug with only regular memory accesses. This allows to trigger the Rowhammer bug in highly restricted and even scripting environments. We demonstrate a fully automated attack that requires nothing but a website with JavaScript to trigger faults on remote hardware. Thereby we can gain unrestricted access to systems of website visitors. We show that the attack works on off-the-shelf systems. Existing countermeasures fail to protect against this new Rowhammer attack. △ Less

Submitted 5 April, 2016; v1 submitted 24 July, 2015; originally announced July 2015.

Comments: This paper has been accepted at the 13th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA) 2016. The final publication is available at link.springer.com

arXiv:0907.4273 [pdf, other]

On the Duality of Probing and Fault Attacks

Authors: Berndt M. Gammel, Stefan Mangard

Abstract: In this work we investigate the problem of simultaneous privacy and integrity protection in cryptographic circuits. We consider a white-box scenario with a powerful, yet limited attacker. A concise metric for the level of probing and fault security is introduced, which is directly related to the capabilities of a realistic attacker. In order to investigate the interrelation of probing and fault… ▽ More In this work we investigate the problem of simultaneous privacy and integrity protection in cryptographic circuits. We consider a white-box scenario with a powerful, yet limited attacker. A concise metric for the level of probing and fault security is introduced, which is directly related to the capabilities of a realistic attacker. In order to investigate the interrelation of probing and fault security we introduce a common mathematical framework based on the formalism of information and coding theory. The framework unifies the known linear masking schemes. We proof a central theorem about the properties of linear codes which leads to optimal secret sharing schemes. These schemes provide the lower bound for the number of masks needed to counteract an attacker with a given strength. The new formalism reveals an intriguing duality principle between the problems of probing and fault security, and provides a unified view on privacy and integrity protection using error detecting codes. Finally, we introduce a new class of linear tamper-resistant codes. These are eligible to preserve security against an attacker mounting simultaneous probing and fault attacks. △ Less

Submitted 24 July, 2009; originally announced July 2009.

Journal ref: Cryptology ePrint Archive, Report 2009/352

Showing 1–25 of 25 results for author: Mangard, S