-
Self-supervised Pretraining for Partial Differential Equations
Authors:
Varun Madhavan,
Amal S Sebastian,
Bharath Ramsundar,
Venkatasubramanian Viswanathan
Abstract:
In this work, we describe a novel approach to building a neural PDE solver leveraging recent advances in transformer based neural network architectures. Our model can provide solutions for different values of PDE parameters without any need for retraining the network. The training is carried out in a self-supervised manner, similar to pretraining approaches applied in language and vision tasks. We…
▽ More
In this work, we describe a novel approach to building a neural PDE solver leveraging recent advances in transformer based neural network architectures. Our model can provide solutions for different values of PDE parameters without any need for retraining the network. The training is carried out in a self-supervised manner, similar to pretraining approaches applied in language and vision tasks. We hypothesize that the model is in effect learning a family of operators (for multiple parameters) mapping the initial condition to the solution of the PDE at any future time step t. We compare this approach with the Fourier Neural Operator (FNO), and demonstrate that it can generalize over the space of PDE parameters, despite having a higher prediction error for individual parameter values compared to the FNO. We show that performance on a specific parameter can be improved by finetuning the model with very small amounts of data. We also demonstrate that the model scales with data as well as model size.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Image Classification with Rotation-Invariant Variational Quantum Circuits
Authors:
Paul San Sebastian,
Mikel Cañizo,
Román Orús
Abstract:
Variational quantum algorithms are gaining attention as an early application of Noisy Intermediate-Scale Quantum (NISQ) devices. One of the main problems of variational methods lies in the phenomenon of Barren Plateaus, present in the optimization of variational parameters. Adding geometric inductive bias to the quantum models has been proposed as a potential solution to mitigate this problem, lea…
▽ More
Variational quantum algorithms are gaining attention as an early application of Noisy Intermediate-Scale Quantum (NISQ) devices. One of the main problems of variational methods lies in the phenomenon of Barren Plateaus, present in the optimization of variational parameters. Adding geometric inductive bias to the quantum models has been proposed as a potential solution to mitigate this problem, leading to a new field called Geometric Quantum Machine Learning. In this work, an equivariant architecture for variational quantum classifiers is introduced to create a label-invariant model for image classification with $C_4$ rotational label symmetry. The equivariant circuit is benchmarked against two different architectures, and it is experimentally observed that the geometric approach boosts the model's performance. Finally, a classical equivariant convolution operation is proposed to extend the quantum model for the processing of larger images, employing the resources available in NISQ devices.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
The Rise of GoodFATR: A Novel Accuracy Comparison Methodology for Indicator Extraction Tools
Authors:
Juan Caballero,
Gibran Gomez,
Srdjan Matic,
Gustavo Sánchez,
Silvia Sebastián,
Arturo Villacañas
Abstract:
To adapt to a constantly evolving landscape of cyber threats, organizations actively need to collect Indicators of Compromise (IOCs), i.e., forensic artifacts that signal that a host or network might have been compromised. IOCs can be collected through open-source and commercial structured IOC feeds. But, they can also be extracted from a myriad of unstructured threat reports written in natural la…
▽ More
To adapt to a constantly evolving landscape of cyber threats, organizations actively need to collect Indicators of Compromise (IOCs), i.e., forensic artifacts that signal that a host or network might have been compromised. IOCs can be collected through open-source and commercial structured IOC feeds. But, they can also be extracted from a myriad of unstructured threat reports written in natural language and distributed using a wide array of sources such as blogs and social media. There exist multiple indicator extraction tools that can identify IOCs in natural language reports. But, it is hard to compare their accuracy due to the difficulty of building large ground truth datasets. This work presents a novel majority vote methodology for comparing the accuracy of indicator extraction tools, which does not require a manually-built ground truth. We implement our methodology into GoodFATR, an automated platform for collecting threat reports from a wealth of sources, extracting IOCs from the collected reports using multiple tools, and comparing their accuracy.
GoodFATR supports 6 threat report sources: RSS, Twitter, Telegram, Malpedia, APTnotes, and ChainSmith. GoodFATR continuously monitors the sources, downloads new threat reports, extracts 41 indicator types from the collected reports, and filters non-malicious indicators to output the IOCs. We run GoodFATR over 15 months to collect 472,891 reports from the 6 sources; extract 978,151 indicators from the reports; and identify 618,217 IOCs. We analyze the collected data to identify the top IOC contributors and the IOC class distribution. We apply GoodFATR to compare the IOC extraction accuracy of 7 popular open-source tools with GoodFATR's own indicator extraction module.
△ Less
Submitted 8 March, 2023; v1 submitted 29 July, 2022;
originally announced August 2022.
-
AVClass2: Massive Malware Tag Extraction from AV Labels
Authors:
Silvia Sebastián,
Juan Caballero
Abstract:
Tags can be used by malware repositories and analysis services to enable searches for samples of interest across different dimensions. Automatically extracting tags from AV labels is an efficient approach to categorize and index massive amounts of samples. Recent tools like AVClass and Euphony have demonstrated that, despite their noisy nature, it is possible to extract family names from AV labels…
▽ More
Tags can be used by malware repositories and analysis services to enable searches for samples of interest across different dimensions. Automatically extracting tags from AV labels is an efficient approach to categorize and index massive amounts of samples. Recent tools like AVClass and Euphony have demonstrated that, despite their noisy nature, it is possible to extract family names from AV labels. However, beyond the family name, AV labels contain much valuable information such as malware classes, file properties, and behaviors.
This work presents AVClass2, an automatic malware tagging tool that given the AV labels for a potentially massive number of samples, extracts clean tags that categorize the samples. AVClass2 uses, and helps building, an open taxonomy that organizes concepts in AV labels, but is not constrained to a predefined set of tags. To keep itself updated as AV vendors introduce new tags, it provides an update module that automatically identifies new taxonomy entries, as well as tagging and expansion rules that capture relations between tags. We have evaluated AVClass2 on 42M and showed how it enables advanced malware searches and to maintain an updated knowledge base of malware concepts in AV labels.
△ Less
Submitted 28 October, 2020; v1 submitted 18 June, 2020;
originally announced June 2020.
-
PySPH: a Python-based framework for smoothed particle hydrodynamics
Authors:
Prabhu Ramachandran,
Aditya Bhosale,
Kunal Puri,
Pawan Negi,
Abhinav Muta,
A Dinesh,
Dileep Menon,
Rahul Govind,
Suraj Sanka,
Amal S Sebastian,
Ananyo Sen,
Rohan Kaushik,
Anshuman Kumar,
Vikas Kurapati,
Mrinalgouda Patil,
Deep Tavker,
Pankaj Pandey,
Chandrashekhar Kaushik,
Arkopal Dutt,
Arpit Agarwal
Abstract:
PySPH is an open-source, Python-based, framework for particle methods in general and Smoothed Particle Hydrodynamics (SPH) in particular. PySPH allows a user to define a complete SPH simulation using pure Python. High-performance code is generated from this high-level Python code and executed on either multiple cores, or on GPUs, seamlessly. It also supports distributed execution using MPI. PySPH…
▽ More
PySPH is an open-source, Python-based, framework for particle methods in general and Smoothed Particle Hydrodynamics (SPH) in particular. PySPH allows a user to define a complete SPH simulation using pure Python. High-performance code is generated from this high-level Python code and executed on either multiple cores, or on GPUs, seamlessly. It also supports distributed execution using MPI. PySPH supports a wide variety of SPH schemes and formulations. These include, incompressible and compressible fluid flow, elastic dynamics, rigid body dynamics, shallow water equations, and other problems. PySPH supports a variety of boundary conditions including mirror, periodic, solid wall, and inlet/outlet boundary conditions. The package is written to facilitate reuse and reproducibility. This paper discusses the overall design of PySPH and demonstrates many of its features. Several example results are shown to demonstrate the range of features that PySPH provides.
△ Less
Submitted 28 December, 2020; v1 submitted 10 September, 2019;
originally announced September 2019.
-
A 700uW 1GS/s 4-bit Folding-Flash ADC in 65nm CMOS for Wideband Wireless Communications
Authors:
Bayan Nasri,
Sunit P. Sebastian,
Kae-Dyi You,
RamKumar RanjithKumar,
Davood Shahrjerdi
Abstract:
We present the design of a low-power 4-bit 1GS/s folding-flash ADC with a folding factor of two. The design of a new unbalanced double-tail dynamic comparator affords an ultra-low power operation and a high dynamic range. Unlike the conventional approaches, this design uses a fully matched input stage, an unbalanced latch stage, and a two-clock operation scheme. A combination of these features yie…
▽ More
We present the design of a low-power 4-bit 1GS/s folding-flash ADC with a folding factor of two. The design of a new unbalanced double-tail dynamic comparator affords an ultra-low power operation and a high dynamic range. Unlike the conventional approaches, this design uses a fully matched input stage, an unbalanced latch stage, and a two-clock operation scheme. A combination of these features yields significant reduction of the kick-back noise, while allowing the design flexibility for adjusting the trip points of the comparators. As a result, the ADC achieves SNDR of 22.3 dB at 100MHz and 21.8 dB at 500MHz (i.e. the Nyquist frequency). The maximum INL and DNL are about 0.2 LSB. The converter consumes about 700uW from a 1-V supply yielding a figure of merit of 65fJ/conversion step. These attributes make the proposed folding-flash ADC attractive for the next-generation wireless applications.
△ Less
Submitted 14 December, 2016;
originally announced December 2016.