-
Privacy-Preserving Data Deduplication for Enhancing Federated Learning of Language Models
Authors:
Aydin Abadi,
Vishnu Asutosh Dasu,
Sumanta Sarkar
Abstract:
Deduplication is a vital preprocessing step that enhances machine learning model performance and saves training time and energy. However, enhancing federated learning through deduplication poses challenges, especially regarding scalability and potential privacy violations if deduplication involves sharing all clients' data. In this paper, we address the problem of deduplication in a federated setu…
▽ More
Deduplication is a vital preprocessing step that enhances machine learning model performance and saves training time and energy. However, enhancing federated learning through deduplication poses challenges, especially regarding scalability and potential privacy violations if deduplication involves sharing all clients' data. In this paper, we address the problem of deduplication in a federated setup by introducing a pioneering protocol, Efficient Privacy-Preserving Multi-Party Deduplication (EP-MPD). It efficiently removes duplicates from multiple clients' datasets without compromising data privacy. EP-MPD is constructed in a modular fashion, utilizing two novel variants of the Private Set Intersection protocol. Our extensive experiments demonstrate the significant benefits of deduplication in federated learning of large language models. For instance, we observe up to 19.61% improvement in perplexity and up to 27.95% reduction in running time. EP-MPD effectively balances privacy and performance in federated learning, making it a valuable solution for large-scale applications.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Supersonic OT: Fast Unconditionally Secure Oblivious Transfer
Authors:
Aydin Abadi,
Yvo Desmedt
Abstract:
Oblivious Transfer (OT) is a fundamental cryptographic protocol with applications in secure Multi-Party Computation, Federated Learning, and Private Set Intersection. With the advent of quantum computing, it is crucial to develop unconditionally secure core primitives like OT to ensure their continued security in the post-quantum era. Despite over four decades since OT's introduction, the literatu…
▽ More
Oblivious Transfer (OT) is a fundamental cryptographic protocol with applications in secure Multi-Party Computation, Federated Learning, and Private Set Intersection. With the advent of quantum computing, it is crucial to develop unconditionally secure core primitives like OT to ensure their continued security in the post-quantum era. Despite over four decades since OT's introduction, the literature has predominantly relied on computational assumptions, except in cases using unconventional methods like noisy channels or a fully trusted party. Introducing "Supersonic OT", a highly efficient and unconditionally secure OT scheme that avoids public-key-based primitives, we offer an alternative to traditional approaches. Supersonic OT enables a receiver to obtain a response of size O(1). Its simple (yet non-trivial) design facilitates easy security analysis and implementation. The protocol employs a basic secret-sharing scheme, controlled swaps, the one-time pad, and a third-party helper who may be corrupted by a semi-honest adversary. Our implementation and runtime analysis indicate that a single instance of Supersonic OT completes in 0.35 milliseconds, making it up to 2000 times faster than the state-of-the-art base OT.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Tempora-Fusion: Time-Lock Puzzle with Efficient Verifiable Homomorphic Linear Combination
Authors:
Aydin Abadi
Abstract:
To securely transmit sensitive information into the future, Time-Lock Puzzles (TLPs) have been developed. Their applications include scheduled payments, timed commitments, e-voting, and sealed-bid auctions. Homomorphic TLP is a key variant of TLP that enables computation on puzzles from different clients. This allows a solver/server to tackle only a single puzzle encoding the computation's result.…
▽ More
To securely transmit sensitive information into the future, Time-Lock Puzzles (TLPs) have been developed. Their applications include scheduled payments, timed commitments, e-voting, and sealed-bid auctions. Homomorphic TLP is a key variant of TLP that enables computation on puzzles from different clients. This allows a solver/server to tackle only a single puzzle encoding the computation's result. However, existing homomorphic TLPs lack support for verifying the correctness of the computation results. We address this limitation by introducing Tempora-Fusion, a TLP that allows a server to perform homomorphic linear combinations of puzzles from different clients while ensuring verification of computation correctness. This scheme avoids asymmetric-key cryptography for verification, thus paving the way for efficient implementations. We discuss our scheme's application in various domains, such as federated learning, scheduled payments in online banking, and e-voting.
△ Less
Submitted 24 June, 2024; v1 submitted 21 June, 2024;
originally announced June 2024.
-
Delegated-Query Oblivious Transfer and its Practical Applications
Authors:
Yvo Desmedt,
Aydin Abadi
Abstract:
Databases play a pivotal role in the contemporary World Wide Web and the world of cloud computing. Unfortunately, numerous privacy violations have recently garnered attention in the news. To enhance database privacy, we consider Oblivious Transfer (OT), an elegant cryptographic technology. Our observation reveals that existing research in this domain primarily concentrates on theoretical cryptogra…
▽ More
Databases play a pivotal role in the contemporary World Wide Web and the world of cloud computing. Unfortunately, numerous privacy violations have recently garnered attention in the news. To enhance database privacy, we consider Oblivious Transfer (OT), an elegant cryptographic technology. Our observation reveals that existing research in this domain primarily concentrates on theoretical cryptographic applications, overlooking various practical aspects:
- OTs assume parties have direct access to databases. Our "1-out-of-2 Delegated-Query OT" enables parties to privately query a database, without direct access.
- With the rise of cloud computing, physically separated databases may no longer remain so. Our "1-out-of-2 Delegated-Query Multi-Receiver OT" protects privacy in such evolving scenarios.
- Research often ignores the limitations of thin clients, e.g., Internet of Things devices. To address this, we propose a compiler that transforms any 1-out-of-n OT into a thin client version.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection
Authors:
Aydin Abadi,
Bradley Doyle,
Francesco Gini,
Kieron Guinamard,
Sasi Kumar Murakonda,
Jack Liddell,
Paul Mellor,
Steven J. Murdoch,
Mohammad Naseri,
Hector Page,
George Theodorakopoulos,
Suzanne Weller
Abstract:
Federated Learning (FL) is a data-minimization approach enabling collaborative model training across diverse clients with local data, avoiding direct data exchange. However, state-of-the-art FL solutions to identify fraudulent financial transactions exhibit a subset of the following limitations. They (1) lack a formal security definition and proof, (2) assume prior freezing of suspicious customers…
▽ More
Federated Learning (FL) is a data-minimization approach enabling collaborative model training across diverse clients with local data, avoiding direct data exchange. However, state-of-the-art FL solutions to identify fraudulent financial transactions exhibit a subset of the following limitations. They (1) lack a formal security definition and proof, (2) assume prior freezing of suspicious customers' accounts by financial institutions (limiting the solutions' adoption), (3) scale poorly, involving either $O(n^2)$ computationally expensive modular exponentiation (where $n$ is the total number of financial institutions) or highly inefficient fully homomorphic encryption, (4) assume the parties have already completed the identity alignment phase, hence excluding it from the implementation, performance evaluation, and security analysis, and (5) struggle to resist clients' dropouts. This work introduces Starlit, a novel scalable privacy-preserving FL mechanism that overcomes these limitations. It has various applications, such as enhancing financial fraud detection, mitigating terrorism, and enhancing digital health. We implemented Starlit and conducted a thorough performance analysis using synthetic data from a key player in global financial transactions. The evaluation indicates Starlit's scalability, efficiency, and accuracy.
△ Less
Submitted 22 January, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
An Assessment of PC-mer's Performance in Alignment-Free Phylogenetic Tree Construction
Authors:
Saeedeh Akbari Rokn Abadi,
Melika Honarmand,
Ali Hajialinaghi,
Somayyeh Koohi
Abstract:
Background: Sequence comparison is essential in bioinformatics, serving various purposes such as taxonomy, functional inference, and drug discovery. The traditional method of aligning sequences for comparison is time-consuming, especially with large datasets. To overcome this, alignment-free methods have emerged as an alternative approach, prioritizing comparison scores over alignment itself. Thes…
▽ More
Background: Sequence comparison is essential in bioinformatics, serving various purposes such as taxonomy, functional inference, and drug discovery. The traditional method of aligning sequences for comparison is time-consuming, especially with large datasets. To overcome this, alignment-free methods have emerged as an alternative approach, prioritizing comparison scores over alignment itself. These methods directly compare sequences without the need for alignment. However, accurately representing the relationships between sequences is a significant challenge in the design of these tools. Methods:One of the alignment-free comparison approaches utilizes the frequency of fixed-length substrings, known as K-mers, which serves as the foundation for many sequence comparison methods. However, a challenge arises in these methods when increasing the length of the substring (K), as it leads to an exponential growth in the number of possible states. In this work, we explore the PC-mer method, which utilizes a more limited set of words that experience slower growth 2^k instead of 4^k compared to K. We conducted a comparison of sequences and evaluated how the reduced input vector size influenced the performance of the PC-mer method. Results: For the evaluation, we selected the Clustal Omega method as our reference approach, alongside three alignment-free methods: kmacs, FFP, and alfpy (word count). These methods also leverage the frequency of K-mers. We applied all five methods to 9 datasets for comprehensive analysis. The results were compared using phylogenetic trees and metrics such as Robinson-Foulds and normalized quartet distance (nQD). Conclusion: Our findings indicate that, unlike reducing the input features in other alignment-independent methods, the PC-mer method exhibits competitive performance when compared to the aforementioned methods especially when input sequences are very varied.
△ Less
Submitted 21 November, 2023;
originally announced November 2023.
-
Delegated Time-Lock Puzzle
Authors:
Aydin Abadi,
Dan Ristea,
Steven J. Murdoch
Abstract:
Time-Lock Puzzles (TLPs) are cryptographic protocols that enable a client to lock a message in such a way that a server can only unlock it after a specific time period. However, existing TLPs have certain limitations: (i) they assume that both the client and server always possess sufficient computational resources and (ii) they solely focus on the lower time bound for finding a solution, disregard…
▽ More
Time-Lock Puzzles (TLPs) are cryptographic protocols that enable a client to lock a message in such a way that a server can only unlock it after a specific time period. However, existing TLPs have certain limitations: (i) they assume that both the client and server always possess sufficient computational resources and (ii) they solely focus on the lower time bound for finding a solution, disregarding the upper bound that guarantees a regular server can find a solution within a certain time frame. Additionally, existing TLPs designed to handle multiple puzzles either (a) entail high verification costs or (b) lack generality, requiring identical time intervals between consecutive solutions. To address these limitations, this paper introduces, for the first time, the concept of a "Delegated Time-Lock Puzzle" and presents a protocol called "Efficient Delegated Time-Lock Puzzle" (ED-TLP) that realises this concept. ED-TLP allows the client and server to delegate their resource-demanding tasks to third-party helpers. It facilitates real-time verification of solution correctness and efficiently handles multiple puzzles with varying time intervals. ED-TLP ensures the delivery of solutions within predefined time limits by incorporating both an upper bound and a fair payment algorithm. We have implemented ED-TLP and conducted a comprehensive analysis of its overheads, demonstrating the efficiency of the construction.
△ Less
Submitted 2 August, 2023;
originally announced August 2023.
-
Safeguarding Physical Sneaker Sale Through a Decentralized Medium
Authors:
Marwan Zeggari,
Aydin Abadi,
Renaud Lambiotte,
Mohamad Kassab
Abstract:
Sneakers were designated as the most counterfeited fashion item online, with three times more risk in a trade than any other fashion purchase. As the market expands, the current sneaker scene displays several vulnerabilities and trust flaws, mostly related to the legitimacy of assets or actors. In this paper, we investigate various blockchain-based mechanisms to address these large-scale trust iss…
▽ More
Sneakers were designated as the most counterfeited fashion item online, with three times more risk in a trade than any other fashion purchase. As the market expands, the current sneaker scene displays several vulnerabilities and trust flaws, mostly related to the legitimacy of assets or actors. In this paper, we investigate various blockchain-based mechanisms to address these large-scale trust issues. We argue that (i) pre-certified and tracked assets through the use of non-fungible tokens can ensure the genuine nature of an asset and authenticate its owner more effectively during peer-to-peer trading across a marketplace; (ii) a game-theoretic-based system with economic incentives for participating users can greatly reduce the rate of online fraud and address missed delivery deadlines; (iii) a decentralized dispute resolution system biased in favour of an honest party can solve potential conflicts more reliably.
△ Less
Submitted 9 June, 2023; v1 submitted 20 May, 2023;
originally announced June 2023.
-
Earn While You Reveal: Private Set Intersection that Rewards Participants
Authors:
Aydin Abadi
Abstract:
In Private Set Intersection protocols (PSIs), a non-empty result always reveals something about the private input sets of the parties. Moreover, in various variants of PSI, not all parties necessarily receive or are interested in the result. Nevertheless, to date, the literature has assumed that those parties who do not receive or are not interested in the result still contribute their private inp…
▽ More
In Private Set Intersection protocols (PSIs), a non-empty result always reveals something about the private input sets of the parties. Moreover, in various variants of PSI, not all parties necessarily receive or are interested in the result. Nevertheless, to date, the literature has assumed that those parties who do not receive or are not interested in the result still contribute their private input sets to the PSI for free, although doing so would cost them their privacy. In this work, for the first time, we propose a multi-party PSI, called "Anesidora", that rewards parties who contribute their private input sets to the protocol. Anesidora is efficient; it mainly relies on symmetric key primitives and its computation and communication complexities are linear with the number of parties and set cardinality. It remains secure even if the majority of parties are corrupted by active colluding adversaries.
△ Less
Submitted 26 April, 2024; v1 submitted 10 January, 2023;
originally announced January 2023.
-
An Efficient and Decentralized Blockchain-based Commercial Alternative (Full Version)
Authors:
Marwan Zeggari,
Renaud Lambiotte,
Aydin Abadi,
Louise Axon,
Mohamad Kassab
Abstract:
While online interactions and exchanges have grown exponentially over the past decade, most commercial infrastructures still operate through centralized protocols, and their success essentially depends on trust between different economic actors. Digital advances such as blockchain technology has led to a massive wave of \textit{Decentralized Ledger Technology} (\textit{DLT}) initiatives, protocols…
▽ More
While online interactions and exchanges have grown exponentially over the past decade, most commercial infrastructures still operate through centralized protocols, and their success essentially depends on trust between different economic actors. Digital advances such as blockchain technology has led to a massive wave of \textit{Decentralized Ledger Technology} (\textit{DLT}) initiatives, protocols and solutions. This advance makes it possible to implement trustless systems in the real world, which, combined with appropriate economic and participatory incentives, would foster the proper functioning and drive the adoption of a decentralized platform among different actors. This paper describes an alternative to current commercial structures and networks by introducing \textit{Lyzis Labs}, which is is an incentive-driven and democratic protocol designed to support a decentralized online marketplace, based on blockchain technology. The proposal, \textit{Lyzis Marketplace}, allows to connect two or more people in a decentralized and secure way without having to rely on a \textit{Trusted Third Party} (\textit{TTP}) in order to perform physical asset exchanges while mainly providing transparent and fully protected data storage. This approach can potentially lead to the creation of a permissionless, efficient, secure and transparent business environment where each user can gain purchasing and decision-making power by supporting the collective welfare while following their personal interests during their various interactions on the network.
△ Less
Submitted 5 November, 2022; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Glass-Vault: A Generic Transparent Privacy-preserving Exposure Notification Analytics Platform
Authors:
Lorenzo Martinico,
Aydin Abadi,
Thomas Zacharias,
Thomas Win
Abstract:
The highly transmissible COVID-19 disease is a serious threat to people's health and life. To automate tracing those who have been in close physical contact with newly infected people and/or to analyse tracing-related data, researchers have proposed various ad-hoc programs that require being executed on users' smartphones. Nevertheless, the existing solutions have two primary limitations: (1) lack…
▽ More
The highly transmissible COVID-19 disease is a serious threat to people's health and life. To automate tracing those who have been in close physical contact with newly infected people and/or to analyse tracing-related data, researchers have proposed various ad-hoc programs that require being executed on users' smartphones. Nevertheless, the existing solutions have two primary limitations: (1) lack of generality: for each type of analytic task, a certain kind of data needs to be sent to an analyst; (2) lack of transparency: parties who provide data to an analyst are not necessarily infected individuals; therefore, infected individuals' data can be shared with others (e.g., the analyst) without their fine-grained and direct consent. In this work, we present Glass-Vault, a protocol that addresses both limitations simultaneously. It allows an analyst to run authorised programs over the collected data of infectious users, without learning the input data. Glass-Vault relies on a new variant of generic Functional Encryption that we propose in this work. This new variant, called DD-Steel, offers these two additional properties: dynamic and decentralised. We illustrate the security of both Glass-Vault and DD-Steel in the Universal Composability setting. Glass-Vault is the first UC-secure protocol that allows analysing the data of Exposure Notification users in a privacy-preserving manner. As a sample application, we indicate how it can be used to generate "infection heatmaps".
△ Less
Submitted 19 August, 2022;
originally announced August 2022.
-
A Forward-secure Efficient Two-factor Authentication Protocol
Authors:
Steven J. Murdoch,
Aydin Abadi
Abstract:
Two-factor authentication (2FA) schemes that rely on a combination of knowledge factors (e.g., PIN) and device possession have gained popularity. Some of these schemes remain secure even against strong adversaries that (a) observe the traffic between a client and server, and (b) have physical access to the client's device, or its PIN, or breach the server. However, these solutions have several sho…
▽ More
Two-factor authentication (2FA) schemes that rely on a combination of knowledge factors (e.g., PIN) and device possession have gained popularity. Some of these schemes remain secure even against strong adversaries that (a) observe the traffic between a client and server, and (b) have physical access to the client's device, or its PIN, or breach the server. However, these solutions have several shortcomings; namely, they (i) require a client to remember multiple secret values to prove its identity, (ii) involve several modular exponentiations, and (iii) are in the non-standard random oracle model. In this work, we present a 2FA protocol that resists such a strong adversary while addressing the above shortcomings. Our protocol requires a client to remember only a single secret value/PIN, does not involve any modular exponentiations, and is in a standard model. It is the first one that offers these features without using trusted chipsets. This protocol also imposes up to 40% lower communication overhead than the state-of-the-art solutions do.
△ Less
Submitted 4 August, 2022;
originally announced August 2022.
-
Recurring Contingent Service Payment
Authors:
Aydin Abadi,
Steven J. Murdoch,
Thomas Zacharias
Abstract:
Fair exchange protocols let two mutually distrustful parties exchange digital data in a way that neither party can cheat. They have various applications such as the exchange of digital items, or the exchange of digital coins and digital services between a buyer/client and seller/server.
In this work, we formally define and propose a generic blockchain-based construction called "Recurring Conting…
▽ More
Fair exchange protocols let two mutually distrustful parties exchange digital data in a way that neither party can cheat. They have various applications such as the exchange of digital items, or the exchange of digital coins and digital services between a buyer/client and seller/server.
In this work, we formally define and propose a generic blockchain-based construction called "Recurring Contingent Service Payment" (RC-S-P). It (i) lets a fair exchange of digital coins and verifiable service reoccur securely between clients and a server while ensuring that the server is paid if and only if it delivers a valid service, and (ii) ensures the parties' privacy is preserved. RC-S-P supports arbitrary verifiable services, such as "Proofs of Retrievability" (PoR) or verifiable computation and imposes low on-chain overheads. Our formal treatment and construction, for the first time, consider the setting where either client or server is malicious.
We also present a concrete efficient instantiation of RC- S-P when the verifiable service is PoR. We implemented the concrete instantiation and analysed its cost. When it deals with a 4-GB outsourced file, a verifier can check a proof in only 90 milliseconds, and a dispute between a prover and verifier is resolved in 0.1 milliseconds.
At CCS 2017, two blockchain-based protocols were proposed to support the fair exchange of digital coins and a certain verifiable service; namely, PoR. In this work, we show that these protocols (i) are susceptible to a free-riding attack which enables a client to receive the service without paying the server, and (ii) are not suitable for cases where parties' privacy matters, e.g., when the server's proof status or buyer's file size must remain private from the public. RC- S-P simultaneously mitigates the above attack and preserves the parties' privacy.
△ Less
Submitted 5 April, 2023; v1 submitted 30 July, 2022;
originally announced August 2022.
-
NoCFG: A Lightweight Approach for Sound Call Graph Approximation
Authors:
Aharon Abadi,
Bar Makovitzki,
Ron Shemer,
Shmuel Tyszberowicz
Abstract:
Interprocedural analysis refers to gathering information about the entire program rather than for a single procedure only, as in intraprocedural analysis. Interprocedural analysis enables a more precise analysis; however, it is complicated due to the difficulty of constructing an accurate program call graph. Current algorithms for constructing sound and precise call graphs analyze complex program…
▽ More
Interprocedural analysis refers to gathering information about the entire program rather than for a single procedure only, as in intraprocedural analysis. Interprocedural analysis enables a more precise analysis; however, it is complicated due to the difficulty of constructing an accurate program call graph. Current algorithms for constructing sound and precise call graphs analyze complex program dependencies, therefore they might be difficult to scale. Their complexity stems from the kind of type-inference analysis they use, in particular the use of some variations of points-to analysis. To address this problem, we propose NoCFG, a new sound and scalable method for approximating a call graph that supports a wide variety of programming languages. A key property of NoCFG is that it works on a coarse abstraction of the program, discarding many of the programming language constructs. Due to the coarse program abstraction, extending it to support also other languages is easy. We provide a formal proof for the soundness of NoCFG and evaluations for real-world projects written in both Python and C#. The experimental results demonstrate a high precision rate of 90% (lower bound) and scalability through a security use-case over projects with up to 2 million lines of code.
△ Less
Submitted 7 May, 2021;
originally announced May 2021.
-
TI-Capsule: Capsule Network for Stock Exchange Prediction
Authors:
Ramin Mousa,
Sara Nazari,
Ali Karhe Abadi,
Reza Shoukhcheshm,
Mohammad Niknam Pirzadeh,
Leila Safari
Abstract:
Today, the use of social networking data has attracted a lot of academic and commercial attention in predicting the stock market. In most studies in this area, the sentiment analysis of the content of user posts on social networks is used to predict market fluctuations. Predicting stock marketing is challenging because of the variables involved. In the short run, the market behaves like a voting m…
▽ More
Today, the use of social networking data has attracted a lot of academic and commercial attention in predicting the stock market. In most studies in this area, the sentiment analysis of the content of user posts on social networks is used to predict market fluctuations. Predicting stock marketing is challenging because of the variables involved. In the short run, the market behaves like a voting machine, but in the long run, it acts like a weighing machine. The purpose of this study is to predict EUR/USD stock behavior using Capsule Network on finance texts and Candlestick images. One of the most important features of Capsule Network is the maintenance of features in a vector, which also takes into account the space between features. The proposed model, TI-Capsule (Text and Image information based Capsule Neural Network), is trained with both the text and image information simultaneously. Extensive experiments carried on the collected dataset have demonstrated the effectiveness of TI-Capsule in solving the stock exchange prediction problem with 91% accuracy.
△ Less
Submitted 15 February, 2021;
originally announced February 2021.
-
Codeless Screen-Oriented Programming for Enterprise Mobile Applications
Authors:
Aharon Abadi,
Yael Dubinsky,
Andrei Kirshin,
Yossi Mesika,
Idan Ben-Harrush
Abstract:
Designing the user interface (UI) of mobile applications in the enterprise is performed in many cases by the application users who represent the business needs. This is based on existing developed enterprise services that can be accessed by these applications. Design the UI of the mobile applications require programming environments that are as much as possible codeless and easy to use. In this pa…
▽ More
Designing the user interface (UI) of mobile applications in the enterprise is performed in many cases by the application users who represent the business needs. This is based on existing developed enterprise services that can be accessed by these applications. Design the UI of the mobile applications require programming environments that are as much as possible codeless and easy to use. In this paper, we present NitroGen which is a cloud-based platform-independent tool that provides with no installation, a consumable integrated set of capabilities to construct mobile solutions aiming at reducing development and maintenance costs. We further illustrate how to use NitroGen showing its visual touch-based mostly codeless way of programming. NitroGen can easily connect to back-end services thus enable fast and facile development in enterprises.
△ Less
Submitted 5 October, 2013;
originally announced October 2013.