-
TopicTag: Automatic Annotation of NMF Topic Models Using Chain of Thought and Prompt Tuning with LLMs
Authors:
Selma Wanna,
Ryan Barron,
Nick Solovyev,
Maksim E. Eren,
Manish Bhattarai,
Kim Rasmussen,
Boian S. Alexandrov
Abstract:
Topic modeling is a technique for organizing and extracting themes from large collections of unstructured text. Non-negative matrix factorization (NMF) is a common unsupervised approach that decomposes a term frequency-inverse document frequency (TF-IDF) matrix to uncover latent topics and segment the dataset accordingly. While useful for highlighting patterns and clustering documents, NMF does no…
▽ More
Topic modeling is a technique for organizing and extracting themes from large collections of unstructured text. Non-negative matrix factorization (NMF) is a common unsupervised approach that decomposes a term frequency-inverse document frequency (TF-IDF) matrix to uncover latent topics and segment the dataset accordingly. While useful for highlighting patterns and clustering documents, NMF does not provide explicit topic labels, necessitating subject matter experts (SMEs) to assign labels manually. We present a methodology for automating topic labeling in documents clustered via NMF with automatic model determination (NMFk). By leveraging the output of NMFk and employing prompt engineering, we utilize large language models (LLMs) to generate accurate topic labels. Our case study on over 34,000 scientific abstracts on Knowledge Graphs demonstrates the effectiveness of our method in enhancing knowledge management and document organization.
△ Less
Submitted 28 July, 2024;
originally announced July 2024.
-
Dating ancient manuscripts using radiocarbon and AI-based writing style analysis
Authors:
Mladen Popović,
Maruf A. Dhali,
Lambert Schomaker,
Johannes van der Plicht,
Kaare Lund Rasmussen,
Jacopo La Nasa,
Ilaria Degano,
Maria Perla Colombini,
Eibert Tigchelaar
Abstract:
Determining the chronology of ancient handwritten manuscripts is essential for reconstructing the evolution of ideas. For the Dead Sea Scrolls, this is particularly important. However, there is an almost complete lack of date-bearing manuscripts evenly distributed across the timeline and written in similar scripts available for palaeographic comparison. Here, we present Enoch, a state-of-the-art A…
▽ More
Determining the chronology of ancient handwritten manuscripts is essential for reconstructing the evolution of ideas. For the Dead Sea Scrolls, this is particularly important. However, there is an almost complete lack of date-bearing manuscripts evenly distributed across the timeline and written in similar scripts available for palaeographic comparison. Here, we present Enoch, a state-of-the-art AI-based date-prediction model, trained on the basis of new radiocarbon-dated samples of the scrolls. Enoch uses established handwriting-style descriptors and applies Bayesian ridge regression. The challenge of this study is that the number of radiocarbon-dated manuscripts is small, while current machine learning requires an abundance of training data. We show that by using combined angular and allographic writing style feature vectors and applying Bayesian ridge regression, Enoch could predict the radiocarbon-based dates from style, supported by leave-one-out validation, with varied MAEs of 27.9 to 30.7 years relative to the radiocarbon dating. Enoch was then used to estimate the dates of 135 unseen manuscripts, revealing that 79 per cent of the samples were considered 'realistic' upon palaeographic post-hoc evaluation. We present a new chronology of the scrolls. The radiocarbon ranges and Enoch's style-based predictions are often older than the traditionally assumed palaeographic estimates. In the range of 300-50 BCE, Enoch's date prediction provides an improved granularity. The study is in line with current developments in multimodal machine-learning techniques, and the methods can be used for date prediction in other partially-dated manuscript collections. This research shows how Enoch's quantitative, probability-based approach can be a tool for palaeographers and historians, re-dating ancient Jewish key texts and contributing to current debates on Jewish and Christian origins.
△ Less
Submitted 26 June, 2024;
originally announced July 2024.
-
Cyber-Security Knowledge Graph Generation by Hierarchical Nonnegative Matrix Factorization
Authors:
Ryan Barron,
Maksim E. Eren,
Manish Bhattarai,
Selma Wanna,
Nicholas Solovyev,
Kim Rasmussen,
Boian S. Alexandrov,
Charles Nicholas,
Cynthia Matuszek
Abstract:
Much of human knowledge in cybersecurity is encapsulated within the ever-growing volume of scientific papers. As this textual data continues to expand, the importance of document organization methods becomes increasingly crucial for extracting actionable insights hidden within large text datasets. Knowledge Graphs (KGs) serve as a means to store factual information in a structured manner, providin…
▽ More
Much of human knowledge in cybersecurity is encapsulated within the ever-growing volume of scientific papers. As this textual data continues to expand, the importance of document organization methods becomes increasingly crucial for extracting actionable insights hidden within large text datasets. Knowledge Graphs (KGs) serve as a means to store factual information in a structured manner, providing explicit, interpretable knowledge that includes domain-specific information from the cybersecurity scientific literature. One of the challenges in constructing a KG from scientific literature is the extraction of ontology from unstructured text. In this paper, we address this topic and introduce a method for building a multi-modal KG by extracting structured ontology from scientific papers. We demonstrate this concept in the cybersecurity domain. One modality of the KG represents observable information from the papers, such as the categories in which they were published or the authors. The second modality uncovers latent (hidden) patterns of text extracted through hierarchical and semantic non-negative matrix factorization (NMF), such as named entities, topics or clusters, and keywords. We illustrate this concept by consolidating more than two million scientific papers uploaded to arXiv into the cyber-domain, using hierarchical and semantic NMF, and by building a cyber-domain-specific KG.
△ Less
Submitted 26 March, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Catch'em all: Classification of Rare, Prominent, and Novel Malware Families
Authors:
Maksim E. Eren,
Ryan Barron,
Manish Bhattarai,
Selma Wanna,
Nicholas Solovyev,
Kim Rasmussen,
Boian S. Alexandrov,
Charles Nicholas
Abstract:
National security is threatened by malware, which remains one of the most dangerous and costly cyber threats. As of last year, researchers reported 1.3 billion known malware specimens, motivating the use of data-driven machine learning (ML) methods for analysis. However, shortcomings in existing ML approaches hinder their mass adoption. These challenges include detection of novel malware and the a…
▽ More
National security is threatened by malware, which remains one of the most dangerous and costly cyber threats. As of last year, researchers reported 1.3 billion known malware specimens, motivating the use of data-driven machine learning (ML) methods for analysis. However, shortcomings in existing ML approaches hinder their mass adoption. These challenges include detection of novel malware and the ability to perform malware classification in the face of class imbalance: a situation where malware families are not equally represented in the data. Our work addresses these shortcomings with MalwareDNA: an advanced dimensionality reduction and feature extraction framework. We demonstrate stable task performance under class imbalance for the following tasks: malware family classification and novel malware detection with a trade-off in increased abstention or reject-option rate.
△ Less
Submitted 4 March, 2024;
originally announced March 2024.
-
Iris: Dynamic Privacy Preserving Search in Structured Peer-to-Peer Networks
Authors:
Angeliki Aktypi,
Kasper Rasmussen
Abstract:
In structured peer-to-peer networks like Chord, the users manage to retrieve the information they seek by asking other nodes from the network for the information they search. Revealing to other nodes the search target makes structured peer-to-peer networks unsuitable for applications that demand query privacy, i.e., hiding the query's target from the intermediate nodes that take part in the routin…
▽ More
In structured peer-to-peer networks like Chord, the users manage to retrieve the information they seek by asking other nodes from the network for the information they search. Revealing to other nodes the search target makes structured peer-to-peer networks unsuitable for applications that demand query privacy, i.e., hiding the query's target from the intermediate nodes that take part in the routing. This paper studies the query privacy of structured P2P networks, particularly the Chord protocol.
We initially observe that already proposed privacy notions, such as $k$-anonymity, do not allow us to reason about the privacy guarantees of a query in Chord in the presence of a strong adversary. Thus, we introduce a new privacy notion that we call $(α,δ)$-privacy that allows us to evaluate the privacy guarantees even when considering the worst-case scenario regarding an attacker's background knowledge.
We then design Iris, an algorithm that allows a requester to conceal the target of a query in Chord from the intermediate nodes that take part in the routing. Iris achieves that by having the requester query for other than the target addresses so as reaching each one of them allows the requester to get closer to the target address.
We perform a security analysis of the proposed algorithm, based on the privacy notion we introduce. We also develop a prototype of the algorithm in Matlab and evaluate its performance. Our analysis proves Iris to be $(α,δ)$-private while introducing a modest performance overhead.
△ Less
Submitted 30 October, 2023;
originally announced October 2023.
-
Ask for Alice: Online Covert Distress Signal in the Presence of a Strong Adversary
Authors:
Hayyu Imanda,
Kasper Rasmussen
Abstract:
In this paper we propose a protocol that can be used to covertly send a distress signal through a seemingly normal webserver, even if the adversary is monitoring both the network and the user's device. This allows a user to call for help even when they are in the same physical space as their adversaries. We model such a scenario by introducing a strong adversary model that captures a high degree o…
▽ More
In this paper we propose a protocol that can be used to covertly send a distress signal through a seemingly normal webserver, even if the adversary is monitoring both the network and the user's device. This allows a user to call for help even when they are in the same physical space as their adversaries. We model such a scenario by introducing a strong adversary model that captures a high degree of access to the user's device and full control over the network.
Our model fits into scenarios where a user is under surveillance and wishes to inform a trusted party of the situation. To do this, our method uses existing websites to act as intermediaries between the user and a trusted backend; this enables the user to initiate the distress signal without arousing suspicion, even while being actively monitored. We accomplish this by utilising the TLS handshake to convey additional information; this means that any website wishing to participate can do so with minimal effort and anyone monitoring the traffic will just see common TLS connections. In order for websites to be willing to host such a functionality the protocol must coexist gracefully with users who use normal TLS and the computational overhead must be minimal. We provide a full security analysis of the architecture and prove that the adversary cannot distinguish between a set of communications which contains a distress call and a normal communication.
△ Less
Submitted 4 October, 2023;
originally announced October 2023.
-
Interactive Distillation of Large Single-Topic Corpora of Scientific Papers
Authors:
Nicholas Solovyev,
Ryan Barron,
Manish Bhattarai,
Maksim E. Eren,
Kim O. Rasmussen,
Boian S. Alexandrov
Abstract:
Highly specific datasets of scientific literature are important for both research and education. However, it is difficult to build such datasets at scale. A common approach is to build these datasets reductively by applying topic modeling on an established corpus and selecting specific topics. A more robust but time-consuming approach is to build the dataset constructively in which a subject matte…
▽ More
Highly specific datasets of scientific literature are important for both research and education. However, it is difficult to build such datasets at scale. A common approach is to build these datasets reductively by applying topic modeling on an established corpus and selecting specific topics. A more robust but time-consuming approach is to build the dataset constructively in which a subject matter expert (SME) handpicks documents. This method does not scale and is prone to error as the dataset grows. Here we showcase a new tool, based on machine learning, for constructively generating targeted datasets of scientific literature. Given a small initial "core" corpus of papers, we build a citation network of documents. At each step of the citation network, we generate text embeddings and visualize the embeddings through dimensionality reduction. Papers are kept in the dataset if they are "similar" to the core or are otherwise pruned through human-in-the-loop selection. Additional insight into the papers is gained through sub-topic modeling using SeNMFk. We demonstrate our new tool for literature review by applying it to two different fields in machine learning.
△ Less
Submitted 19 September, 2023;
originally announced September 2023.
-
MalwareDNA: Simultaneous Classification of Malware, Malware Families, and Novel Malware
Authors:
Maksim E. Eren,
Manish Bhattarai,
Kim Rasmussen,
Boian S. Alexandrov,
Charles Nicholas
Abstract:
Malware is one of the most dangerous and costly cyber threats to national security and a crucial factor in modern cyber-space. However, the adoption of machine learning (ML) based solutions against malware threats has been relatively slow. Shortcomings in the existing ML approaches are likely contributing to this problem. The majority of current ML approaches ignore real-world challenges such as t…
▽ More
Malware is one of the most dangerous and costly cyber threats to national security and a crucial factor in modern cyber-space. However, the adoption of machine learning (ML) based solutions against malware threats has been relatively slow. Shortcomings in the existing ML approaches are likely contributing to this problem. The majority of current ML approaches ignore real-world challenges such as the detection of novel malware. In addition, proposed ML approaches are often designed either for malware/benign-ware classification or malware family classification. Here we introduce and showcase preliminary capabilities of a new method that can perform precise identification of novel malware families, while also unifying the capability for malware/benign-ware classification and malware family classification into a single framework.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
Robust Adversarial Defense by Tensor Factorization
Authors:
Manish Bhattarai,
Mehmet Cagri Kaymak,
Ryan Barron,
Ben Nebgen,
Kim Rasmussen,
Boian Alexandrov
Abstract:
As machine learning techniques become increasingly prevalent in data analysis, the threat of adversarial attacks has surged, necessitating robust defense mechanisms. Among these defenses, methods exploiting low-rank approximations for input data preprocessing and neural network (NN) parameter factorization have shown potential. Our work advances this field further by integrating the tensorization…
▽ More
As machine learning techniques become increasingly prevalent in data analysis, the threat of adversarial attacks has surged, necessitating robust defense mechanisms. Among these defenses, methods exploiting low-rank approximations for input data preprocessing and neural network (NN) parameter factorization have shown potential. Our work advances this field further by integrating the tensorization of input data with low-rank decomposition and tensorization of NN parameters to enhance adversarial defense. The proposed approach demonstrates significant defense capabilities, maintaining robust accuracy even when subjected to the strongest known auto-attacks. Evaluations against leading-edge robust performance benchmarks reveal that our results not only hold their ground against the best defensive methods available but also exceed all current defense strategies that rely on tensor factorizations. This study underscores the potential of integrating tensorization and low-rank decomposition as a robust defense against adversarial attacks in machine learning.
△ Less
Submitted 3 September, 2023;
originally announced September 2023.
-
Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization
Authors:
Erik Skau,
Andrew Hollis,
Stephan Eidenbenz,
Kim Rasmussen,
Boian Alexandrov
Abstract:
Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability of production cycles, quick emergency response, and national security. Process monitoring allows users to gauge the progress of an organization in an industrial process or predict the degradation or aging of machine parts in processes taking place at a remote location. Similar to many da…
▽ More
Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability of production cycles, quick emergency response, and national security. Process monitoring allows users to gauge the progress of an organization in an industrial process or predict the degradation or aging of machine parts in processes taking place at a remote location. Similar to many data science applications, we usually only have access to limited raw data, such as satellite imagery, short video clips, event logs, and signatures captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of Subject Matter Experts (SMEs) who are familiar with the actions of interest. SMEs provide expert knowledge of the essential activities required for task completion and the resources necessary to carry out each of these activities. Various process mining techniques have been developed for this type of analysis; typically such approaches combine theoretical process models built based on domain expert insights with ad-hoc integration of available pieces of raw data. Here, we introduce a novel mathematically sound method that integrates theoretical process models (as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMM), built via nonnegative tensor factorization. Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative matrix-tensor factorizations, and (d) custom model selection. To demonstrate our methodology and its abilities, we apply it on simple synthetic and real world process models.
△ Less
Submitted 26 April, 2024; v1 submitted 3 October, 2022;
originally announced October 2022.
-
SeNMFk-SPLIT: Large Corpora Topic Modeling by Semantic Non-negative Matrix Factorization with Automatic Model Selection
Authors:
Maksim E. Eren,
Nick Solovyev,
Manish Bhattarai,
Kim Rasmussen,
Charles Nicholas,
Boian S. Alexandrov
Abstract:
As the amount of text data continues to grow, topic modeling is serving an important role in understanding the content hidden by the overwhelming quantity of documents. One popular topic modeling approach is non-negative matrix factorization (NMF), an unsupervised machine learning (ML) method. Recently, Semantic NMF with automatic model selection (SeNMFk) has been proposed as a modification to NMF…
▽ More
As the amount of text data continues to grow, topic modeling is serving an important role in understanding the content hidden by the overwhelming quantity of documents. One popular topic modeling approach is non-negative matrix factorization (NMF), an unsupervised machine learning (ML) method. Recently, Semantic NMF with automatic model selection (SeNMFk) has been proposed as a modification to NMF. In addition to heuristically estimating the number of topics, SeNMFk also incorporates the semantic structure of the text. This is performed by jointly factorizing the term frequency-inverse document frequency (TF-IDF) matrix with the co-occurrence/word-context matrix, the values of which represent the number of times two words co-occur in a predetermined window of the text. In this paper, we introduce a novel distributed method, SeNMFk-SPLIT, for semantic topic extraction suitable for large corpora. Contrary to SeNMFk, our method enables the joint factorization of large documents by decomposing the word-context and term-document matrices separately. We demonstrate the capability of SeNMFk-SPLIT by applying it to the entire artificial intelligence (AI) and ML scientific literature uploaded on arXiv.
△ Less
Submitted 21 August, 2022;
originally announced August 2022.
-
Electromagnetic Signal Injection Attacks on Differential Signaling
Authors:
Youqian Zhang,
Kasper Rasmussen
Abstract:
Differential signaling is a method of data transmission that uses two complementary electrical signals to encode information. This allows a receiver to reject any noise by looking at the difference between the two signals, assuming the noise affects both signals in the same way. Many protocols such as USB, Ethernet, and HDMI use differential signaling to achieve a robust communication channel in a…
▽ More
Differential signaling is a method of data transmission that uses two complementary electrical signals to encode information. This allows a receiver to reject any noise by looking at the difference between the two signals, assuming the noise affects both signals in the same way. Many protocols such as USB, Ethernet, and HDMI use differential signaling to achieve a robust communication channel in a noisy environment. This generally works well and has led many to believe that it is infeasible to remotely inject attacking signals into such a differential pair. In this paper we challenge this assumption and show that an adversary can in fact inject malicious signals from a distance, purely using common-mode injection, i.e., injecting into both wires at the same time. We show how this allows an attacker to inject bits or even arbitrary messages into a communication line. Such an attack is a significant threat to many applications, from home security and privacy to automotive systems, critical infrastructure, or implantable medical devices; in which incorrect data or unauthorized control could cause significant damage, or even fatal accidents.
We show in detail the principles of how an electromagnetic signal can bypass the noise rejection of differential signaling, and eventually result in incorrect bits in the receiver. We show how an attacker can exploit this to achieve a successful injection of an arbitrary bit, and we analyze the success rate of injecting longer arbitrary messages. We demonstrate the attack on a real system and show that the success rate can reach as high as $90\%$. Finally, we present a case study where we wirelessly inject a message into a Controller Area Network (CAN) bus, which is a differential signaling bus protocol used in many critical applications, including the automotive and aviation sector.
△ Less
Submitted 30 July, 2022;
originally announced August 2022.
-
Detection of Electromagnetic Signal Injection Attacks on Actuator Systems
Authors:
Youqian Zhang,
Kasper Rasmussen
Abstract:
An actuator is a device that converts electricity into another form of energy, typically physical movement. They are absolutely essential for any system that needs to impact or modify the physical world, and are used in millions of systems of all sizes, all over the world, from cars and spacecraft to factory control systems and critical infrastructure. An actuator is a "dumb device" that is entire…
▽ More
An actuator is a device that converts electricity into another form of energy, typically physical movement. They are absolutely essential for any system that needs to impact or modify the physical world, and are used in millions of systems of all sizes, all over the world, from cars and spacecraft to factory control systems and critical infrastructure. An actuator is a "dumb device" that is entirely controlled by the surrounding electronics, e.g., a microcontroller, and thus cannot authenticate its control signals or do any other form of processing. The problem we look at in this paper is how the wires that connect an actuator to its control electronics can act like antennas, picking up electromagnetic signals from the environment. This makes it possible for a remote attacker to wirelessly inject signals (energy) into these wires to bypass the controller and directly control the actuator.
To detect such attacks, we propose a novel detection method that allows the microcontroller to monitor the control signal and detect attacks as a deviation from the intended value. We have managed to do this without requiring the microcontroller to sample the signal at a high rate or run any signal processing. That makes our defense mechanism practical and easy to integrate into existing systems. Our method is general and applies to any type of actuator (provided a few basic assumptions are met), and can deal with adversaries with arbitrarily high transmission power. We implement our detection method on two different practical systems to show its generality, effectiveness, and robustness.
△ Less
Submitted 14 March, 2022;
originally announced March 2022.
-
Silently Disabling ECUs and Enabling Blind Attacks on the CAN Bus
Authors:
Matthew Rogers,
Kasper Rasmussen
Abstract:
The CAN Bus is crucial to the efficiency, and safety of modern vehicle infrastructure. Electronic Control Units (ECUs) exchange data across a shared bus, dropping messages whenever errors occur. If an ECU generates enough errors, their transmitter is put in a bus-off state, turning it off. Previous work abuses this process to disable ECUs, but is trivial to detect through the multiple errors trans…
▽ More
The CAN Bus is crucial to the efficiency, and safety of modern vehicle infrastructure. Electronic Control Units (ECUs) exchange data across a shared bus, dropping messages whenever errors occur. If an ECU generates enough errors, their transmitter is put in a bus-off state, turning it off. Previous work abuses this process to disable ECUs, but is trivial to detect through the multiple errors transmitted over the bus. We propose a novel attack, undetectable by prior intrusion detection systems, which disables ECUs within a single message without generating any errors on the bus. Performing this attack requires the ability to flip bits on the bus, but not with any level of sophistication. We show that an attacker who can only flip bits 40% of the time can execute our stealthy attack 100% of the time. But this attack, and all prior CAN attacks, rely on the ability to read the bus. We propose a new technique which synchronizes the bus, such that even a blind attacker, incapable of reading the bus, can know when to transmit. Taking a limited attacker's chance of success from the percentage of dead bus time, to 100%. Finally, we propose a small modification to the CAN error process to ensure an ECU cannot fail without being detected, no matter how advanced the attacker is. Taken together we advance the state of the art for CAN attacks and blind attackers, while proposing a detection system against stealthy attacks, and the larger problem of CAN's abusable error frames.
△ Less
Submitted 17 January, 2022;
originally announced January 2022.
-
RegGuard: Leveraging CPU Registers for Mitigation of Control- and Data-Oriented Attacks
Authors:
Munir Geden,
Kasper Rasmussen
Abstract:
CPU registers are small discrete storage units, used to hold temporary data and instructions within the CPU. Registers are not addressable in the same way memory is, which makes them immune from memory attacks and manipulation by other means. In this paper, we take advantage of this to provide a protection mechanism for critical program data; both active local variables and control objects on the…
▽ More
CPU registers are small discrete storage units, used to hold temporary data and instructions within the CPU. Registers are not addressable in the same way memory is, which makes them immune from memory attacks and manipulation by other means. In this paper, we take advantage of this to provide a protection mechanism for critical program data; both active local variables and control objects on the stack. This protection effectively eliminates the threat of control- and data-oriented attacks, even by adversaries with full knowledge of the active stack. Our solution RegGuard, is a compiler register allocation strategy that utilises the available CPU registers to hold critical variables during execution. Unlike conventional allocations schemes, RegGuard prioritises the security significance of a program variable over its expected performance gain. Our scheme can deal effectively with saved registers to the stack, i.e., when the compiler needs to free up registers to make room for the variables of a new function call. With RegGuard, critical data objects anywhere on the entire stack are effectively protected from corruption, even by adversaries with arbitrary read and write access. While our primary design focus is on security, performance is very important for a scheme to be adopted in practice. RegGuard is still benefiting from the performance gain normally associated with register allocations, and the overhead is within a few percent of other unsecured register allocation schemes for most cases. We present detailed experiments that showcase the performance of RegGuard using different benchmark programs and the C library on ARM64 platform.
△ Less
Submitted 20 October, 2021;
originally announced October 2021.
-
Open-Source Verification with Chisel and Scala
Authors:
Andrew Dobis,
Tjark Petersen,
Kasper Juul Hesse Rasmussen,
Enrico Tolotto,
Hans Jakob Damsgaard,
Simon Thye Andersen,
Richard Lin,
Martin Schoeberl
Abstract:
Performance increase with general-purpose processors has come to a halt. We can no longer depend on Moore's Law to increase computing performance. The only way to achieve higher performance or lower energy consumption is by building domain-specific hardware accelerators. To efficiently design and verify those domain-specific accelerators, we need agile hardware development. One of the main obstacl…
▽ More
Performance increase with general-purpose processors has come to a halt. We can no longer depend on Moore's Law to increase computing performance. The only way to achieve higher performance or lower energy consumption is by building domain-specific hardware accelerators. To efficiently design and verify those domain-specific accelerators, we need agile hardware development. One of the main obstacles when proposing such a modern method is the lack of modern tools to attack it. To be able to verify a design in such a time-constrained development method, one needs to have efficient tools both for design and verification.
This paper thus proposes ChiselVerify, an open-source tool for verifying circuits described in any Hardware Description Language. It builds on top of the Chisel hardware construction language and uses Scala to drive the verification using a testing strategy inspired by the Universal Verification Methodology (UVM) and adapted for designs described in Chisel. ChiselVerify is created based on three key ideas. First, our solution highly increases the productivity of the verification engineer, by allowing hardware testing to be done in a modern high-level programming environment. Second, the framework functions with any hardware description language thanks to the flexibility of Chisel blackboxes. Finally, the solution is well integrated into the existing Chisel universe, making it an extension of currently existing testing libraries.
We implement ChiselVerify in a way inspired by the functionalities found in SystemVerilog. This allows one to use functional coverage, constrained-random verification, bus functional models, transaction-level modeling and much more during the verification process of a design in a contemporary high-level programming ecosystem.
△ Less
Submitted 26 February, 2021;
originally announced February 2021.
-
Identification of Anomalous Diffusion Sources by Unsupervised Learning
Authors:
Raviteja Vangara,
Kim Ø. Rasmussen,
Dimiter N. Petsev,
Golan Bel,
Boian S. Alexandrov
Abstract:
Fractional Brownian motion (fBm) is a ubiquitous diffusion process in which the memory effects of the stochastic transport result in the mean squared particle displacement following a power law, $\langle {Δr}^2 \rangle \sim t^α$, where the diffusion exponent $α$ characterizes whether the transport is subdiffusive, ($α<1$), diffusive ($α= 1$), or superdiffusive, ($α>1$). Due to the abundance of fBm…
▽ More
Fractional Brownian motion (fBm) is a ubiquitous diffusion process in which the memory effects of the stochastic transport result in the mean squared particle displacement following a power law, $\langle {Δr}^2 \rangle \sim t^α$, where the diffusion exponent $α$ characterizes whether the transport is subdiffusive, ($α<1$), diffusive ($α= 1$), or superdiffusive, ($α>1$). Due to the abundance of fBm processes in nature, significant efforts have been devoted to the identification and characterization of fBm sources in various phenomena. In practice, the identification of the fBm sources often relies on solving a complex and ill-posed inverse problem based on limited observed data. In the general case, the detected signals are formed by an unknown number of release sources, located at different locations and with different strengths, that act simultaneously. This means that the observed data is composed of mixtures of releases from an unknown number of sources, which makes the traditional inverse modeling approaches unreliable. Here, we report an unsupervised learning method, based on Nonnegative Matrix Factorization, that enables the identification of the unknown number of release sources as well the anomalous diffusion characteristics based on limited observed data and the general form of the corresponding fBm Green's function. We show that our method performs accurately for different types of sources and configurations with a predetermined number of sources with specific characteristics and introduced noise.
△ Less
Submitted 5 October, 2020;
originally announced October 2020.
-
BLURtooth: Exploiting Cross-Transport Key Derivation in Bluetooth Classic and Bluetooth Low Energy
Authors:
Daniele Antonioli,
Nils Ole Tippenhauer,
Kasper Rasmussen,
Mathias Payer
Abstract:
The Bluetooth standard specifies two transports: Bluetooth Classic (BT) for high-throughput wireless services and Bluetooth Low Energy (BLE) for very low-power scenarios. BT and BLE have dedicated pairing protocols and devices have to pair over BT and BLE to use both securely. In 2014, the Bluetooth standard (v4.2) addressed this usability issue by introducing Cross-Transport Key Derivation (CTKD)…
▽ More
The Bluetooth standard specifies two transports: Bluetooth Classic (BT) for high-throughput wireless services and Bluetooth Low Energy (BLE) for very low-power scenarios. BT and BLE have dedicated pairing protocols and devices have to pair over BT and BLE to use both securely. In 2014, the Bluetooth standard (v4.2) addressed this usability issue by introducing Cross-Transport Key Derivation (CTKD). CTKD allows establishing BT and BLE pairing keys just by pairing over one of the two transports. While CTKD crosses the security boundary between BT and BLE, little is known about the internals of CTKD and its security implications.
In this work, we present the first complete description of CTKD obtained by merging the scattered information from the Bluetooth standard with the results from our reverse-engineering experiments. Then, we perform a security evaluation of CTKD and uncover four cross-transport issues in its specification. We leverage these issues to design four standard-compliant attacks on CTKD enabling new ways to exploit Bluetooth (e.g., exploiting BT and BLE by targeting only one of the two). Our attacks work even if the strongest security mechanism for BT and BLE are in place, including Numeric Comparison and Secure Connections. They allow to impersonate, man-in-the-middle, and establish unintended sessions with arbitrary devices. We refer to our attacks as BLUR attacks, as they blur the security boundary between BT and BLE. We provide a low-cost implementation of the BLUR attacks and we successfully evaluate them on 14 devices with 16 unique Bluetooth chips from popular vendors. We discuss the attacks' root causes and present effective countermeasures to fix them. We disclosed our findings and countermeasures to the Bluetooth SIG in May 2020 (CVE-2020-15802), and we reported additional unmitigated issues in May 2021.
△ Less
Submitted 8 November, 2021; v1 submitted 24 September, 2020;
originally announced September 2020.
-
$α$ Belief Propagation for Approximate Inference
Authors:
Dong Liu,
Minh Thành Vu,
Zuxing Li,
Lars K. Rasmussen
Abstract:
Belief propagation (BP) algorithm is a widely used message-passing method for inference in graphical models. BP on loop-free graphs converges in linear time. But for graphs with loops, BP's performance is uncertain, and the understanding of its solution is limited. To gain a better understanding of BP in general graphs, we derive an interpretable belief propagation algorithm that is motivated by m…
▽ More
Belief propagation (BP) algorithm is a widely used message-passing method for inference in graphical models. BP on loop-free graphs converges in linear time. But for graphs with loops, BP's performance is uncertain, and the understanding of its solution is limited. To gain a better understanding of BP in general graphs, we derive an interpretable belief propagation algorithm that is motivated by minimization of a localized $α$-divergence. We term this algorithm as $α$ belief propagation ($α$-BP). It turns out that $α$-BP generalizes standard BP. In addition, this work studies the convergence properties of $α$-BP. We prove and offer the convergence conditions for $α$-BP. Experimental simulations on random graphs validate our theoretical results. The application of $α$-BP to practical problems is also demonstrated.
△ Less
Submitted 27 June, 2020;
originally announced June 2020.
-
Region-based Energy Neural Network for Approximate Inference
Authors:
Dong Liu,
Ragnar Thobaben,
Lars K. Rasmussen
Abstract:
Region-based free energy was originally proposed for generalized belief propagation (GBP) to improve loopy belief propagation (loopy BP). In this paper, we propose a neural network based energy model for inference in general Markov random fields (MRFs), which directly minimizes the region-based free energy defined on region graphs. We term our model Region-based Energy Neural Network (RENN). Unlik…
▽ More
Region-based free energy was originally proposed for generalized belief propagation (GBP) to improve loopy belief propagation (loopy BP). In this paper, we propose a neural network based energy model for inference in general Markov random fields (MRFs), which directly minimizes the region-based free energy defined on region graphs. We term our model Region-based Energy Neural Network (RENN). Unlike message-passing algorithms, RENN avoids iterative message propagation and is faster. Also different from recent deep neural network based models, inference by RENN does not require sampling, and RENN works on general MRFs. RENN can also be employed for MRF learning. Our experiments on marginal distribution estimation, partition function estimation, and learning of MRFs show that RENN outperforms the mean field method, loopy BP, GBP, and the state-of-the-art neural network based model.
△ Less
Submitted 17 June, 2020;
originally announced June 2020.
-
On Dominant Interference in Random Networks and Communication Reliability
Authors:
Dong Liu,
Baptiste Cavarec,
Lars K. Rasmussen,
Jing Yue
Abstract:
In this paper, we study the characteristics of dominant interference power with directional reception in a random network modelled by a Poisson Point Process. Additionally, the Laplace functional of cumulative interference excluding the $n$ dominant interferers is also derived, which turns out to be a generalization of omni-directional reception and complete accumulative interference. As an applic…
▽ More
In this paper, we study the characteristics of dominant interference power with directional reception in a random network modelled by a Poisson Point Process. Additionally, the Laplace functional of cumulative interference excluding the $n$ dominant interferers is also derived, which turns out to be a generalization of omni-directional reception and complete accumulative interference. As an application of these results, we study the impact of directional receivers in random networks in terms of outage probability and error probability with queue length constraint.
△ Less
Submitted 3 March, 2020;
originally announced March 2020.
-
Powering Hidden Markov Model by Neural Network based Generative Models
Authors:
Dong Liu,
Antoine Honoré,
Saikat Chatterjee,
Lars K. Rasmussen
Abstract:
Hidden Markov model (HMM) has been successfully used for sequential data modeling problems. In this work, we propose to power the modeling capacity of HMM by bringing in neural network based generative models. The proposed model is termed as GenHMM. In the proposed GenHMM, each HMM hidden state is associated with a neural network based generative model that has tractability of exact likelihood and…
▽ More
Hidden Markov model (HMM) has been successfully used for sequential data modeling problems. In this work, we propose to power the modeling capacity of HMM by bringing in neural network based generative models. The proposed model is termed as GenHMM. In the proposed GenHMM, each HMM hidden state is associated with a neural network based generative model that has tractability of exact likelihood and provides efficient likelihood computation. A generative model in GenHMM consists of mixture of generators that are realized by flow models. A learning algorithm for GenHMM is proposed in expectation-maximization framework. The convergence of the learning GenHMM is analyzed. We demonstrate the efficiency of GenHMM by classification tasks on practical sequential data. Code available at https://github.com/FirstHandScientist/genhmm.
△ Less
Submitted 24 May, 2020; v1 submitted 13 October, 2019;
originally announced October 2019.
-
$α$ Belief Propagation as Fully Factorized Approximation
Authors:
Dong Liu,
Nima N. Moghadam,
Lars K. Rasmussen,
Jinliang Huang,
Saikat Chatterjee
Abstract:
Belief propagation (BP) can do exact inference in loop-free graphs, but its performance could be poor in graphs with loops, and the understanding of its solution is limited. This work gives an interpretable belief propagation rule that is actually minimization of a localized $α$-divergence. We term this algorithm as $α$ belief propagation ($α$-BP). The performance of $α$-BP is tested in MAP (maxim…
▽ More
Belief propagation (BP) can do exact inference in loop-free graphs, but its performance could be poor in graphs with loops, and the understanding of its solution is limited. This work gives an interpretable belief propagation rule that is actually minimization of a localized $α$-divergence. We term this algorithm as $α$ belief propagation ($α$-BP). The performance of $α$-BP is tested in MAP (maximum a posterior) inference problems, where $α$-BP can outperform (loopy) BP by a significant margin even in fully-connected graphs.
△ Less
Submitted 23 August, 2019;
originally announced August 2019.
-
Neural Network based Explicit Mixture Models and Expectation-maximization based Learning
Authors:
Dong Liu,
Minh Thành Vu,
Saikat Chatterjee,
Lars K. Rasmussen
Abstract:
We propose two neural network based mixture models in this article. The proposed mixture models are explicit in nature. The explicit models have analytical forms with the advantages of computing likelihood and efficiency of generating samples. Computation of likelihood is an important aspect of our models. Expectation-maximization based algorithms are developed for learning parameters of the propo…
▽ More
We propose two neural network based mixture models in this article. The proposed mixture models are explicit in nature. The explicit models have analytical forms with the advantages of computing likelihood and efficiency of generating samples. Computation of likelihood is an important aspect of our models. Expectation-maximization based algorithms are developed for learning parameters of the proposed models. We provide sufficient conditions to realize the expectation-maximization based learning. The main requirements are invertibility of neural networks that are used as generators and Jacobian computation of functional form of the neural networks. The requirements are practically realized using a flow-based neural network. In our first mixture model, we use multiple flow-based neural networks as generators. Naturally the model is complex. A single latent variable is used as the common input to all the neural networks. The second mixture model uses a single flow-based neural network as a generator to reduce complexity. The single generator has a latent variable input that follows a Gaussian mixture distribution. We demonstrate efficiency of proposed mixture models through extensive experiments for generating samples and maximum likelihood based classification.
△ Less
Submitted 24 May, 2020; v1 submitted 31 July, 2019;
originally announced July 2019.
-
Taxonomy and Challenges of Out-of-Band Signal Injection Attacks and Defenses
Authors:
Ilias Giechaskiel,
Kasper Bonne Rasmussen
Abstract:
Recent research has shown that the integrity of sensor measurements can be violated through out-of-band signal injection attacks. These attacks target the conversion process from a physical quantity to an analog property---a process that fundamentally cannot be authenticated. Out-of-band signal injection attacks thus pose previously-unexplored security risks by exploiting hardware imperfections in…
▽ More
Recent research has shown that the integrity of sensor measurements can be violated through out-of-band signal injection attacks. These attacks target the conversion process from a physical quantity to an analog property---a process that fundamentally cannot be authenticated. Out-of-band signal injection attacks thus pose previously-unexplored security risks by exploiting hardware imperfections in the sensors themselves, or in their interfaces to microcontrollers. In response to the growing-yet-disjointed literature in the subject, this article presents the first survey of out-of-band signal injection attacks. It focuses on unifying their terminology and identifying commonalities in their causes and effects through a chronological, evolutionary, and thematic taxonomy of attacks. By highlighting cross-influences between different types of out-of-band signal injections, this paper underscores the need for a common language irrespective of the attack method. By placing attack and defense mechanisms in the wider context of their dual counterparts of side-channel leakage and electromagnetic interference, this study identifies common threads and gaps that can help guide and inform future research. Overall, the ever-increasing reliance on sensors embedded in everyday commodity devices necessitates that a stronger focus be placed on improving the security of such systems against out-of-band signal injection attacks.
△ Less
Submitted 16 March, 2020; v1 submitted 21 January, 2019;
originally announced January 2019.
-
A Framework for Evaluating Security in the Presence of Signal Injection Attacks
Authors:
Ilias Giechaskiel,
Youqian Zhang,
Kasper B. Rasmussen
Abstract:
Sensors are embedded in security-critical applications from medical devices to nuclear power plants, but their outputs can be spoofed through electromagnetic and other types of signals transmitted by attackers at a distance. To address the lack of a unifying framework for evaluating the effects of such transmissions, we introduce a system and threat model for signal injection attacks. We further d…
▽ More
Sensors are embedded in security-critical applications from medical devices to nuclear power plants, but their outputs can be spoofed through electromagnetic and other types of signals transmitted by attackers at a distance. To address the lack of a unifying framework for evaluating the effects of such transmissions, we introduce a system and threat model for signal injection attacks. We further define the concepts of existential, selective, and universal security, which address attacker goals from mere disruptions of the sensor readings to precise waveform injections. Moreover, we introduce an algorithm which allows circuit designers to concretely calculate the security level of real systems. Finally, we apply our definitions and algorithm in practice using measurements of injections against a smartphone microphone, and analyze the demodulation characteristics of commercial Analog-to-Digital Converters (ADCs). Overall, our work highlights the importance of evaluating the susceptibility of systems against signal injection attacks, and introduces both the terminology and the methodology to do so.
△ Less
Submitted 10 November, 2019; v1 submitted 11 January, 2019;
originally announced January 2019.
-
Entropy-regularized Optimal Transport Generative Models
Authors:
Dong Liu,
Minh Thành Vu,
Saikat Chatterjee,
Lars K. Rasmussen
Abstract:
We investigate the use of entropy-regularized optimal transport (EOT) cost in developing generative models to learn implicit distributions. Two generative models are proposed. One uses EOT cost directly in an one-shot optimization problem and the other uses EOT cost iteratively in an adversarial game. The proposed generative models show improved performance over contemporary models for image gener…
▽ More
We investigate the use of entropy-regularized optimal transport (EOT) cost in developing generative models to learn implicit distributions. Two generative models are proposed. One uses EOT cost directly in an one-shot optimization problem and the other uses EOT cost iteratively in an adversarial game. The proposed generative models show improved performance over contemporary models for image generation on MNSIT.
△ Less
Submitted 16 November, 2018;
originally announced November 2018.
-
On the Feasibility of Fine-Grained TLS Security Configurations in Web Browsers Based on the Requested Domain Name
Authors:
Eman Salem Alashwali,
Kasper Rasmussen
Abstract:
Most modern web browsers today sacrifice optimal TLS security for backward compatibility. They apply coarse-grained TLS configurations that support (by default) legacy versions of the protocol that have known design weaknesses, and weak ciphersuites that provide fewer security guarantees (e.g. non Forward Secrecy), and silently fall back to them if the server selects to. This introduces various ri…
▽ More
Most modern web browsers today sacrifice optimal TLS security for backward compatibility. They apply coarse-grained TLS configurations that support (by default) legacy versions of the protocol that have known design weaknesses, and weak ciphersuites that provide fewer security guarantees (e.g. non Forward Secrecy), and silently fall back to them if the server selects to. This introduces various risks including downgrade attacks such as the POODLE attack [15] that exploits the browsers silent fallback mechanism to downgrade the protocol version in order to exploit the legacy version flaws. To achieve a better balance between security and backward compatibility, we propose a mechanism for fine-grained TLS configurations in web browsers based on the sensitivity of the domain name in the HTTPS request using a whitelisting technique. That is, the browser enforces optimal TLS configurations for connections going to sensitive domains while enforcing default configurations for the rest of the connections. We demonstrate the feasibility of our proposal by implementing a proof-of-concept as a Firefox browser extension. We envision this mechanism as a built-in security feature in web browsers, e.g. a button similar to the \quotes{Bookmark} button in Firefox browsers and as a standardised HTTP header, to augment browsers security.
△ Less
Submitted 15 September, 2018;
originally announced September 2018.
-
What's in a Downgrade? A Taxonomy of Downgrade Attacks in the TLS Protocol and Application Protocols Using TLS
Authors:
Eman Salem Alashwali,
Kasper Rasmussen
Abstract:
A number of important real-world protocols including the Transport Layer Security (TLS) protocol have the ability to negotiate various security-related choices such as the protocol version and the cryptographic algorithms to be used in a particular session. Furthermore, some insecure application-layer protocols such as the Simple Mail Transfer Protocol (SMTP) negotiate the use of TLS itself on top…
▽ More
A number of important real-world protocols including the Transport Layer Security (TLS) protocol have the ability to negotiate various security-related choices such as the protocol version and the cryptographic algorithms to be used in a particular session. Furthermore, some insecure application-layer protocols such as the Simple Mail Transfer Protocol (SMTP) negotiate the use of TLS itself on top of the application protocol to secure the communication channel. These protocols are often vulnerable to a class of attacks known as downgrade attacks which targets this negotiation mechanism. In this paper we create the first taxonomy of TLS downgrade attacks. Our taxonomy classifies possible attacks with respect to four different vectors: the protocol element that is targeted, the type of vulnerability that enables the attack, the attack method, and the level of damage that the attack causes. We base our taxonomy on a thorough analysis of fifteen notable published attacks. Our taxonomy highlights clear and concrete aspects that many downgrade attacks have in common, and allows for a common language, classification, and comparison of downgrade attacks. We demonstrate the application of our taxonomy by classifying the surveyed attacks.
△ Less
Submitted 26 January, 2019; v1 submitted 15 September, 2018;
originally announced September 2018.
-
Locally Convex Sparse Learning over Networks
Authors:
Ahmed Zaki,
Saikat Chatterjee,
Partha P. Mitra,
Lars K. Rasmussen
Abstract:
We consider a distributed learning setup where a sparse signal is estimated over a network. Our main interest is to save communication resource for information exchange over the network and reduce processing time. Each node of the network uses a convex optimization based algorithm that provides a locally optimum solution for that node. The nodes exchange their signal estimates over the network in…
▽ More
We consider a distributed learning setup where a sparse signal is estimated over a network. Our main interest is to save communication resource for information exchange over the network and reduce processing time. Each node of the network uses a convex optimization based algorithm that provides a locally optimum solution for that node. The nodes exchange their signal estimates over the network in order to refine their local estimates. At a node, the optimization algorithm is based on an $\ell_1$-norm minimization with appropriate modifications to promote sparsity as well as to include influence of estimates from neighboring nodes. Our expectation is that local estimates in each node improve fast and converge, resulting in a limited demand for communication of estimates between nodes and reducing the processing time. We provide restricted-isometry-property (RIP)-based theoretical analysis on estimation quality. In the scenario of clean observation, it is shown that the local estimates converge to the exact sparse signal under certain technical conditions. Simulation results show that the proposed algorithms show competitive performance compared to a globally optimum distributed LASSO algorithm in the sense of convergence speed and estimation error.
△ Less
Submitted 31 March, 2018;
originally announced April 2018.
-
Golden Angle Modulation: Approaching the AWGN Capacity
Authors:
Peter Larsson,
Lars K. Rasmussen,
Mikael Skoglund
Abstract:
In this work, targeting, e.g., future generation cellular, microwave-links, or optical fiber systems, we propose a new geometric shaping design for golden angle modulation (GAM) based on a (double) truncated Gaussian input distribution. The design improves the mutual information (MI), and the peak-to-average power ratio, over the full signal-to-noise ratio (SNR) range relative to two key GAM schem…
▽ More
In this work, targeting, e.g., future generation cellular, microwave-links, or optical fiber systems, we propose a new geometric shaping design for golden angle modulation (GAM) based on a (double) truncated Gaussian input distribution. The design improves the mutual information (MI), and the peak-to-average power ratio, over the full signal-to-noise ratio (SNR) range relative to two key GAM schemes introduced in [1],[2]. Inspired by the proposed geometric shaping, a simpler, SNR-dependent, design is also suggested. The performance is numerically evaluated with respect to MI and compared with classical modulation schemes. With the proposed design, the SNR can be decreased relative to classical quadrature amplitude modulation, even for relatively modest target spectral efficiencies. As the GAM design can approach the Gaussian channel capacity, the power/energy efficiency is expected to improve.
△ Less
Submitted 27 February, 2018;
originally announced February 2018.
-
The Golden Quantizer: The Complex Gaussian Random Variable Case
Authors:
Peter Larsson,
Lars K. Rasmussen,
Mikael Skoglund
Abstract:
The problem of quantizing a circularly-symmetric complex Gaussian random variable is considered. For this purpose, we design two non-uniform quantizers, a high-rate-, and a Lloyd-Max-, quantizer that are both based on the (golden angle) spiral-phyllotaxis packing principle. We find that the proposed schemes have lower mean-square error distortion compared to (non)-uniform polar/rectangular-quantiz…
▽ More
The problem of quantizing a circularly-symmetric complex Gaussian random variable is considered. For this purpose, we design two non-uniform quantizers, a high-rate-, and a Lloyd-Max-, quantizer that are both based on the (golden angle) spiral-phyllotaxis packing principle. We find that the proposed schemes have lower mean-square error distortion compared to (non)-uniform polar/rectangular-quantizers, and near-identical to the best performing trained vector quantizers. The proposed quantizer scheme offers a structured design, a simple natural index ordering, and allow for any number of centroids.
△ Less
Submitted 10 September, 2017;
originally announced September 2017.
-
The Matrix Exponential Distribution - A Tool for Wireless System Performance Analysis
Authors:
Peter Larsson,
Lars K. Rasmussen,
Mikael Skoglund
Abstract:
In [1], we introduced a new, matrix algebraic, performance analysis framework for wireless systems with fading channels based on the matrix exponential distribution. The main idea was to use the compact, powerful, and easy-to-use, matrix exponential (ME)-distribution for i) modeling the unprocessed channel signal to noise ratio (SNR), ii) exploiting the closure property of the ME-distribution for…
▽ More
In [1], we introduced a new, matrix algebraic, performance analysis framework for wireless systems with fading channels based on the matrix exponential distribution. The main idea was to use the compact, powerful, and easy-to-use, matrix exponential (ME)-distribution for i) modeling the unprocessed channel signal to noise ratio (SNR), ii) exploiting the closure property of the ME-distribution for SNR processing operations to give the effective channel random variable (r.v.) on ME-distribution form, and then to iii) express the performance measure in a closed-form based on ME-distribution matrix/vector parameters only. In this work, we aim to more clearly present, formalize, refine and develop this unified bottom-up analysis framework, show its versatility to handle important communication cases, performance evaluation levels, and performance metrics. The bivariate ME-distribution is introduced here as yet another useful ME-tool, e.g. to account for dependency among two r.v.s. We propose that the ME-distribution may, in addition to fading, also characterize the pdf of discrete-time signal r.v.s, thus extending the ME-distribution matrix form to new generalized 1D/2D-Gaussian-, and Rayleigh-, distribution-like matrix forms. Our findings here, strengthen the observation from [1], [2], and indicates that the ME-distribution can be a promising tool for wireless system modeling and performance analysis.
△ Less
Submitted 20 December, 2016;
originally announced December 2016.
-
Leaky Wires: Information Leakage and Covert Communication Between FPGA Long Wires
Authors:
Ilias Giechaskiel,
Kasper B. Rasmussen,
Ken Eguro
Abstract:
Field-Programmable Gate Arrays (FPGAs) are integrated circuits that implement reconfigurable hardware. They are used in modern systems, creating specialized, highly-optimized integrated circuits without the need to design and manufacture dedicated chips. As the capacity of FPGAs grows, it is increasingly common for designers to incorporate implementations of algorithms and protocols from a range o…
▽ More
Field-Programmable Gate Arrays (FPGAs) are integrated circuits that implement reconfigurable hardware. They are used in modern systems, creating specialized, highly-optimized integrated circuits without the need to design and manufacture dedicated chips. As the capacity of FPGAs grows, it is increasingly common for designers to incorporate implementations of algorithms and protocols from a range of third-party sources. The monolithic nature of FPGAs means that all on-chip circuits, including third party black-box designs, must share common on-chip infrastructure, such as routing resources. In this paper, we observe that a "long" routing wire carrying a logical 1 reduces the propagation delay of other adjacent but unconnected long wires in the FPGA interconnect, thereby leaking information about its state. We exploit this effect and propose a communication channel that can be used for both covert transmissions between circuits, and for exfiltration of secrets from the chip. We show that the effect is measurable for both static and dynamic signals, and that it can be detected using very small on-board circuits. In our prototype, we are able to correctly infer the logical state of an adjacent long wire over 99% of the time, even without error correction, and for signals that are maintained for as little as 82us. Using a Manchester encoding scheme, our channel bandwidth is as high as 6kbps. We characterize the channel in detail and show that it is measurable even when multiple competing circuits are present and can be replicated on different generations and families of Xilinx devices (Virtex 5, Virtex 6, and Artix 7). Finally, we propose countermeasures that can be deployed by systems and tools designers to reduce the impact of this information leakage.
△ Less
Submitted 13 March, 2018; v1 submitted 27 November, 2016;
originally announced November 2016.
-
Effective Capacity of Retransmission Schemes - A Recurrence Relation Approach
Authors:
Peter Larsson,
James Gross,
Hussein Al-Zubaidy,
Lars K. Rasmussen,
Mikael Skoglund
Abstract:
We consider the effective capacity performance measure of persistent- and truncated-retransmission schemes that can involve any combination of multiple transmissions per packet, multiple communication modes, or multiple packet communication. We present a structured unified analytical approach, based on a random walk model and recurrence relation formulation, and give exact effective capacity expre…
▽ More
We consider the effective capacity performance measure of persistent- and truncated-retransmission schemes that can involve any combination of multiple transmissions per packet, multiple communication modes, or multiple packet communication. We present a structured unified analytical approach, based on a random walk model and recurrence relation formulation, and give exact effective capacity expressions for persistent hybrid automatic repeat request (HARQ) and for truncated-retransmission schemes. For the latter, effective capacity expressions are given for systems with finite (infinite) time horizon on an algebraic (spectral radius-based) form of a special block companion matrix. In contrast to prior HARQ models, assuming infinite time horizon, the proposed method does not involve a non-trivial per case modeling step. We give effective capacity expressions for several important cases that have not been addressed before, e.g. persistent-HARQ, truncated-HARQ, network-coded ARQ (NC-ARQ), two-mode-ARQ, and multilayer-ARQ. We propose an alternative QoS parameter (instead of the commonly used moment generating function parameter) that represents explicitly the target delay and the delay violation probability. This also enables closed-form expressions for many of the studied systems. Moreover, we use the recently proposed matrix-exponential distributed (MED) modeling of wireless fading channels to provide the basis for numerous new effective capacity results for HARQ.
△ Less
Submitted 17 August, 2016; v1 submitted 28 January, 2016;
originally announced January 2016.
-
Cooperative Communication Using Network Coding
Authors:
Nan Li,
Lars K. Rasmussen,
Ming Xiao
Abstract:
We consider a cognitive radio network scenario where a primary transmitter and a secondary transmitter, respectively, communicate a message to their respective primary receiver and secondary receiver over a packet-based wireless link, using a joint automatic-repeat-request (ARQ) error control scheme. The secondary transmitter assists in the retransmission of the primary message, which improves the…
▽ More
We consider a cognitive radio network scenario where a primary transmitter and a secondary transmitter, respectively, communicate a message to their respective primary receiver and secondary receiver over a packet-based wireless link, using a joint automatic-repeat-request (ARQ) error control scheme. The secondary transmitter assists in the retransmission of the primary message, which improves the primary performance, and is granted limited access to the transmission resources. Conventional ARQ, as well as two network-coding schemes are investigated for application in the retransmission phase; namely the static network-coding (SNC) scheme and the adaptive network-coding (ANC) scheme. For each scheme we analyze the transmission process by investigating the distribution of the number of transmission attempts and approximate it by normal distributions. Considering both the cases of an adaptive frame size and a truncated frame size, we derive analytical results on packet throughput and infer that the ANC scheme outperforms the SNC scheme.
△ Less
Submitted 10 December, 2015; v1 submitted 27 November, 2015;
originally announced November 2015.
-
On the Transmit Beamforming for MIMO Wiretap Channels: Large-System Analysis
Authors:
Maksym A. Girnyk,
Frédéric Gabry,
Mikko Vehkaperä,
Lars K. Rasmussen,
Mikael Skoglund
Abstract:
With the growth of wireless networks, security has become a fundamental issue in wireless communications due to the broadcast nature of these networks. In this work, we consider MIMO wiretap channels in a fast fading environment, for which the overall performance is characterized by the ergodic MIMO secrecy rate. Unfortunately, the direct solution to finding ergodic secrecy rates is prohibitive du…
▽ More
With the growth of wireless networks, security has become a fundamental issue in wireless communications due to the broadcast nature of these networks. In this work, we consider MIMO wiretap channels in a fast fading environment, for which the overall performance is characterized by the ergodic MIMO secrecy rate. Unfortunately, the direct solution to finding ergodic secrecy rates is prohibitive due to the expectations in the rates expressions in this setting. To overcome this difficulty, we invoke the large-system assumption, which allows a deterministic approximation to the ergodic mutual information. Leveraging results from random matrix theory, we are able to characterize the achievable ergodic secrecy rates. Based on this characterization, we address the problem of covariance optimization at the transmitter. Our numerical results demonstrate a good match between the large-system approximation and the actual simulated secrecy rates, as well as some interesting features of the precoder optimization.
△ Less
Submitted 1 November, 2014;
originally announced November 2014.
-
On the Optimal Precoding for MIMO Gaussian Wire-Tap Channels
Authors:
Arash Khabbazibasmenj,
Maksym A. Girnyk,
Sergiy A. Vorobyov,
Mikko Vehkaperä,
Lars K. Rasmussen
Abstract:
We consider the problem of finding secrecy rate of a multiple-input multiple-output (MIMO) wire-tap channel. A transmitter, a legitimate receiver, and an eavesdropper are all equipped with multiple antennas. The channel states from the transmitter to the legitimate user and to the eavesdropper are assumed to be known at the transmitter. In this contribution, we address the problem of finding the o…
▽ More
We consider the problem of finding secrecy rate of a multiple-input multiple-output (MIMO) wire-tap channel. A transmitter, a legitimate receiver, and an eavesdropper are all equipped with multiple antennas. The channel states from the transmitter to the legitimate user and to the eavesdropper are assumed to be known at the transmitter. In this contribution, we address the problem of finding the optimal precoder/transmit covariance matrix maximizing the secrecy rate of the given wiretap channel. The problem formulation is shown to be equivalent to a difference of convex functions programming problem and an efficient algorithm for addressing this problem is developed.
△ Less
Submitted 1 November, 2014;
originally announced November 2014.
-
Asymptotic Performance Analysis of a K-Hop Amplify-and-Forward Relay MIMO Channel
Authors:
Maksym A. Girnyk,
Mikko Vehkaperä,
Lars K. Rasmussen
Abstract:
The present paper studies the asymptotic performance of multi-hop amplify-and-forward relay multiple-antenna communication channels. Each multi-antenna terminal in the network amplifies the received signal, sent by a source, and retransmits it upstream towards a destination. Achievable ergodic rates under both jointly optimal detection and decoding and practical separate decoding schemes for arbit…
▽ More
The present paper studies the asymptotic performance of multi-hop amplify-and-forward relay multiple-antenna communication channels. Each multi-antenna terminal in the network amplifies the received signal, sent by a source, and retransmits it upstream towards a destination. Achievable ergodic rates under both jointly optimal detection and decoding and practical separate decoding schemes for arbitrary signaling schemes, along with the average bit error rate for various receiver structures are derived in the regime where the number of antennas at each terminal grows large without a bound. To overcome the difficulty of averaging over channel realizations we apply large-system analysis based on the replica method from statistical physics. The validity of the large-system analysis is further verified through Monte Carlo simulations of realistic finite-sized systems.
△ Less
Submitted 9 March, 2016; v1 submitted 21 October, 2014;
originally announced October 2014.
-
Buffer-Based Distributed LT Codes
Authors:
Iqbal Hussain,
Ming Xiao,
Lars K. Rasmussen
Abstract:
We focus on the design of distributed Luby transform (DLT) codes for erasure networks with multiple sources and multiple relays, communicating to a single destination. The erasure-floor performance of DLT codes improves with the maximum degree of the relay-degree distribution. However, for conventional DLT codes, the maximum degree is upper-bounded by the number of sources. An additional constrain…
▽ More
We focus on the design of distributed Luby transform (DLT) codes for erasure networks with multiple sources and multiple relays, communicating to a single destination. The erasure-floor performance of DLT codes improves with the maximum degree of the relay-degree distribution. However, for conventional DLT codes, the maximum degree is upper-bounded by the number of sources. An additional constraint is that the sources are required to have the same information block length. We introduce a $D$-bit buffer for each source-relay link, which allows the relay to select multiple encoded bits from the same source for the relay-encoding process; thus, the number of sources no longer limits the maximum degree at the relay. Furthermore, the introduction of buffers facilitates the use of different information block sizes across sources. Based on density evolution we develop an asymptotic analytical framework for optimization of the relay-degree distribution. We further integrate techniques for unequal erasure protection into the optimization framework. The proposed codes are considered for both lossless and lossy source-relay links. Numerical examples show that there is no loss in erasure performance for transmission over lossy source-relay links as compared to lossless links. Additional delays, however, may occur. The design framework and our contributions are demonstrated by a number of illustrative examples, showing the improvements obtained by the proposed buffer-based DLT codes.
△ Less
Submitted 1 October, 2014;
originally announced October 2014.
-
Rateless Codes for the Multi-Way Relay Channel
Authors:
Iqbal Hussain,
Ming Xiao,
Lars K. Rasmussen
Abstract:
We consider distributed Luby transform (DLT) codes for efficient packet transmission in a multi-way relay network, where the links are modeled as erasure channels. Density evolution is applied for asymptotic performance analysis, and subsequently used in a linear-programming design framework for optimizing the degree distribution at the relay in terms of overhead. Moreover a buffer is introduced a…
▽ More
We consider distributed Luby transform (DLT) codes for efficient packet transmission in a multi-way relay network, where the links are modeled as erasure channels. Density evolution is applied for asymptotic performance analysis, and subsequently used in a linear-programming design framework for optimizing the degree distribution at the relay in terms of overhead. Moreover a buffer is introduced at the relay to enable efficient downlink transmission even if packets are lost during uplink transmission. Performance losses in terms of delay and/or erasure rates caused by link erasures during uplink transmission are thus alleviated. The proposed DLT codes provide significant improvements in overhead and decoded erasure rates. Numerical results for finite-length codes follow closely the asymptotic analysis. Our results demonstrate that the proposed buffer-based DLT codes outperform its counterparts for lossy uplink transmission.
△ Less
Submitted 1 October, 2014;
originally announced October 2014.
-
Causal/Predictive Imperfect Channel State Information in Block-Fading Channels
Authors:
Khoa D. Nguyen,
Nick Letzepis,
Albert Guillen i Fabregas,
Lars K. Rasmussen
Abstract:
We consider a multi-input multi-output (MIMO) block-fading channel with a general model for channel state information at the transmitter (CSIT). The model covers systems with causal CSIT, where only CSIT of past fading blocks is available, and predictive CSIT, where CSIT of some future fading blocks is available. The optimal diversity-multiplexing tradeoff (DMT) and rate-diversity tradeoff (RDT) o…
▽ More
We consider a multi-input multi-output (MIMO) block-fading channel with a general model for channel state information at the transmitter (CSIT). The model covers systems with causal CSIT, where only CSIT of past fading blocks is available, and predictive CSIT, where CSIT of some future fading blocks is available. The optimal diversity-multiplexing tradeoff (DMT) and rate-diversity tradeoff (RDT) of the channel are studied under long-term power constraints. The impact of imperfect (mismatched) CSIT on the optimal DMT and RDT is also investigated. Our results show the outage diversity gain obtained by providing imperfect causal/predictive CSIT, leading to new insights into system design and analysis.
△ Less
Submitted 25 September, 2014;
originally announced September 2014.
-
Asymptotic Analysis of SU-MIMO Channels With Transmitter Noise and Mismatched Joint Decoding
Authors:
Mikko Vehkaperä,
Taneli Riihonen,
Maksym A. Girnyk,
Emil Björnson,
Mérouane Debbah,
Lars K. Rasmussen,
Risto Wichman
Abstract:
Hardware impairments in radio-frequency components of a wireless system cause unavoidable distortions to transmission that are not captured by the conventional linear channel model. In this paper, a 'binoisy' single-user multiple-input multiple-output (SU-MIMO) relation is considered where the additional distortions are modeled via an additive noise term at the transmit side. Through this extended…
▽ More
Hardware impairments in radio-frequency components of a wireless system cause unavoidable distortions to transmission that are not captured by the conventional linear channel model. In this paper, a 'binoisy' single-user multiple-input multiple-output (SU-MIMO) relation is considered where the additional distortions are modeled via an additive noise term at the transmit side. Through this extended SU-MIMO channel model, the effects of transceiver hardware impairments on the achievable rate of multi-antenna point-to-point systems are studied. Channel input distributions encompassing practical discrete modulation schemes, such as, QAM and PSK, as well as Gaussian signaling are covered. In addition, the impact of mismatched detection and decoding when the receiver has insufficient information about the non-idealities is investigated. The numerical results show that for realistic system parameters, the effects of transmit-side noise and mismatched decoding become significant only at high modulation orders.
△ Less
Submitted 22 October, 2014; v1 submitted 19 June, 2014;
originally announced June 2014.
-
Large-System Analysis of Correlated MIMO Multiple Access Channels with Arbitrary Signaling in the Presence of Interference
Authors:
Maksym A. Girnyk,
Mikko Vehkaperä,
Lars K. Rasmussen
Abstract:
Presence of multiple antennas on both sides of a communication channel promises significant improvements in system throughput and power efficiency. In effect, a new class of large multiple-input multiple-output (MIMO) communication systems has recently emerged and attracted both scientific and industrial attention. To analyze these systems in realistic scenarios, one has to include such aspects as…
▽ More
Presence of multiple antennas on both sides of a communication channel promises significant improvements in system throughput and power efficiency. In effect, a new class of large multiple-input multiple-output (MIMO) communication systems has recently emerged and attracted both scientific and industrial attention. To analyze these systems in realistic scenarios, one has to include such aspects as co-channel interference, multiple access and spatial correlation. In this paper, we study the properties of correlated MIMO multiple-access channels in the presence of external interference. Using the replica method from statistical physics, we derive the ergodic sum-rate of the communication for arbitrary signal constellations when the numbers of antennas at both ends of the channel grow large. Based on these asymptotic expressions, we also address the problem of sum-rate maximization using statistical channel information and linear precoding. The numerical results demonstrate that when the interfering terminals use discrete constellations, the resulting interference becomes easier to handle compared to Gaussian signals. Thus, it may be possible to accommodate more interfering transmitter-receiver pairs within the same area as compared to the case of Gaussian signals. In addition, we demonstrate numerically for the Gaussian and QPSK signaling schemes that it is possible to design precoder matrices that significantly improve the achievable rates at low-to-mid range of signal-to-noise ratios when compared to isotropic precoding.
△ Less
Submitted 26 January, 2014; v1 submitted 21 May, 2013;
originally announced May 2013.
-
Evolutionary Dynamics of Scientific Collaboration Networks: Multi-Levels and Cross-time Analysis
Authors:
Alireza Abbasi,
Liaquat Hossain,
Shahadat Uddin,
Kim J. R. Rasmussen
Abstract:
Several studies exist which use scientific literature for comparing scientific activities (e.g., productivity, and collaboration). In this study, using co-authorship data over the last 40 years, we present the evolutionary dynamics of multi level (i.e., individual, institutional and national) collaboration networks for exploring the emergence of collaborations in the research field of "steel struc…
▽ More
Several studies exist which use scientific literature for comparing scientific activities (e.g., productivity, and collaboration). In this study, using co-authorship data over the last 40 years, we present the evolutionary dynamics of multi level (i.e., individual, institutional and national) collaboration networks for exploring the emergence of collaborations in the research field of "steel structures". The collaboration network of scientists in the field has been analyzed using author affiliations extracted from Scopus between 1970 and 2009. We have studied collaboration distribution networks at the micro-, meso- and macro-levels for the 40 years. We compared and analyzed a number of properties of these networks (i.e., density, centrality measures, the giant component and clustering coefficient) for presenting a longitudinal analysis and statistical validation of the evolutionary dynamics of "steel structures" collaboration networks. At all levels, the scientific collaborations network structures were central considering the closeness centralization while betweenness and degree centralization were much lower. In general networks density, connectedness, centralization and clustering coefficient were highest in marco-level and decreasing as the network size grow to the lowest in micro-level. We also find that the average distance between countries about two and institutes five and for authors eight meaning that only about eight steps are necessary to get from one randomly chosen author to another.
△ Less
Submitted 29 July, 2011;
originally announced July 2011.
-
MIMO ARQ with Multi-bit Feedback: Outage Analysis
Authors:
Khoa D. Nguyen,
Lars K. Rasmussen,
Albert Guillen i Fabregas,
Nick Letzepis
Abstract:
We study the asymptotic outage performance of incremental redundancy automatic repeat request (INR-ARQ) transmission over the multiple-input multiple-output (MIMO) block-fading channels with discrete input constellations. We first show that transmission with random codes using a discrete signal constellation across all transmit antennas achieves the optimal outage diversity given by the Singleton…
▽ More
We study the asymptotic outage performance of incremental redundancy automatic repeat request (INR-ARQ) transmission over the multiple-input multiple-output (MIMO) block-fading channels with discrete input constellations. We first show that transmission with random codes using a discrete signal constellation across all transmit antennas achieves the optimal outage diversity given by the Singleton bound. We then analyze the optimal SNR-exponent and outage diversity of INR-ARQ transmission over the MIMO block-fading channel. We show that a significant gain in outage diversity is obtained by providing more than one bit feedback at each ARQ round. Thus, the outage performance of INR-ARQ transmission can be remarkably improved with minimal additional overhead. A suboptimal feedback and power adaptation rule, which achieves the optimal outage diversity, is proposed for MIMO INR-ARQ, demonstrating the benefits provided by multi-bit feedback.
△ Less
Submitted 10 June, 2010; v1 submitted 6 June, 2010;
originally announced June 2010.
-
Adaptive Decoding of LDPC Codes with Binary Messages
Authors:
Ingmar Land,
Gottfried Lechner,
Lars K. Rasmussen
Abstract:
A novel adaptive binary decoding algorithm for LDPC codes is proposed, which reduces the decoding complexity while having a comparable or even better performance than corresponding non-adaptive alternatives. In each iteration the variable node decoders use the binary check node decoders multiple times; each single use is referred to as a sub-iteration. To process the sequences of binary messages…
▽ More
A novel adaptive binary decoding algorithm for LDPC codes is proposed, which reduces the decoding complexity while having a comparable or even better performance than corresponding non-adaptive alternatives. In each iteration the variable node decoders use the binary check node decoders multiple times; each single use is referred to as a sub-iteration. To process the sequences of binary messages in each iteration, the variable node decoders employ pre-computed look-up tables. These look-up tables as well as the number of sub-iterations per iteration are dynamically adapted during the decoding process based on the decoder state, represented by the mutual information between the current messages and the syndrome bits. The look-up tables and the number of sub-iterations per iteration are determined and optimized using density evolution. The performance and the complexity of the proposed adaptive decoding algorithm is exemplified by simulations.
△ Less
Submitted 23 April, 2009; v1 submitted 18 February, 2009;
originally announced February 2009.
-
Power Allocation for Fading Channels with Peak-to-Average Power Constraints
Authors:
Khoa D. Nguyen,
Albert Guillen i Fabregas,
Lars K. Rasmussen
Abstract:
Power allocation with peak-to-average power ratio constraints is investigated for transmission over Nakagami-m fading channels with arbitrary input distributions. In the case of delay-limited block-fading channels, we find the solution to the minimum outage power allocation scheme with peak-to-average power constraints and arbitrary input distributions, and show that the signal-to-noise ratio ex…
▽ More
Power allocation with peak-to-average power ratio constraints is investigated for transmission over Nakagami-m fading channels with arbitrary input distributions. In the case of delay-limited block-fading channels, we find the solution to the minimum outage power allocation scheme with peak-to-average power constraints and arbitrary input distributions, and show that the signal-to-noise ratio exponent for any finite peak-to-average power ratio is the same as that of the peak-power limited problem, resulting in an error floor. In the case of the ergodic fully-interleaved channel, we find the power allocation rule that yields the maximal information rate for an arbitrary input distribution and show that capacities with peak-to-average power ratio constraints, even for small ratios, are very close to capacities without peak-power restrictions.
△ Less
Submitted 25 February, 2008;
originally announced February 2008.
-
A Tight Lower Bound to the Outage Probability of Discrete-Input Block-Fading Channels
Authors:
Khoa D. Nguyen,
Albert Guillen i Fabregas,
Lars K. Rasmussen
Abstract:
In this correspondence, we propose a tight lower bound to the outage probability of discrete-input Nakagami-m block-fading channels. The approach permits an efficient method for numerical evaluation of the bound, providing an additional tool for system design. The optimal rate-diversity trade-off for the Nakagami-m block-fading channel is also derived and a tight upper bound is obtained for the…
▽ More
In this correspondence, we propose a tight lower bound to the outage probability of discrete-input Nakagami-m block-fading channels. The approach permits an efficient method for numerical evaluation of the bound, providing an additional tool for system design. The optimal rate-diversity trade-off for the Nakagami-m block-fading channel is also derived and a tight upper bound is obtained for the optimal coding gain constant.
△ Less
Submitted 11 July, 2007;
originally announced July 2007.
-
Power Allocation for Discrete-Input Delay-Limited Fading Channels
Authors:
Khoa D. Nguyen,
Albert Guillen i Fabregas,
Lars K. Rasmussen
Abstract:
We consider power allocation algorithms for fixed-rate transmission over Nakagami-m non-ergodic block-fading channels with perfect transmitter and receiver channel state information and discrete input signal constellations, under both short- and long-term power constraints. Optimal power allocation schemes are shown to be direct applications of previous results in the literature. We show that th…
▽ More
We consider power allocation algorithms for fixed-rate transmission over Nakagami-m non-ergodic block-fading channels with perfect transmitter and receiver channel state information and discrete input signal constellations, under both short- and long-term power constraints. Optimal power allocation schemes are shown to be direct applications of previous results in the literature. We show that the SNR exponent of the optimal short-term scheme is given by m times the Singleton bound. We also illustrate the significant gains available by employing long-term power constraints. In particular, we analyze the optimal long-term solution, showing that zero outage can be achieved provided that the corresponding short-term SNR exponent with the same system parameters is strictly greater than one. Conversely, if the short-term SNR exponent is smaller than one, we show that zero outage cannot be achieved. In this case, we derive the corresponding long-term SNR exponent as a function of the Singleton bound. Due to the nature of the expressions involved, the complexity of optimal schemes may be prohibitive for system implementation. We therefore propose simple sub-optimal power allocation schemes whose outage probability performance is very close to the minimum outage probability obtained by optimal schemes. We also show the applicability of these techniques to practical systems employing orthogonal frequency division multiplexing.
△ Less
Submitted 13 June, 2007;
originally announced June 2007.