-
Incorporating Anatomical Awareness for Enhanced Generalizability and Progression Prediction in Deep Learning-Based Radiographic Sacroiliitis Detection
Authors:
Felix J. Dorfner,
Janis L. Vahldiek,
Leonhard Donle,
Andrei Zhukov,
Lina Xu,
Hartmut Häntze,
Marcus R. Makowski,
Hugo J. W. L. Aerts,
Fabian Proft,
Valeria Rios Rodriguez,
Judith Rademacher,
Mikhail Protopopov,
Hildrun Haibel,
Torsten Diekhoff,
Murat Torgutalp,
Lisa C. Adams,
Denis Poddubnyy,
Keno K. Bressem
Abstract:
Purpose: To examine whether incorporating anatomical awareness into a deep learning model can improve generalizability and enable prediction of disease progression.
Methods: This retrospective multicenter study included conventional pelvic radiographs of 4 different patient cohorts focusing on axial spondyloarthritis (axSpA) collected at university and community hospitals. The first cohort, which consisted of 1483 radiographs, was split into training (n=1261) and validation (n=222) sets. The other cohorts comprising 436, 340, and 163 patients, respectively, were used as independent test datasets. For the second cohort, follow-up data of 311 patients was used to examine progression prediction capabilities. Two neural networks were trained, one on images cropped to the bounding box of the sacroiliac joints (anatomy-aware) and the other one on full radiographs. The performance of the models was compared using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity.
Results: On the three test datasets, the standard model achieved AUC scores of 0.853, 0.817, and 0.947, with accuracies of 0.770, 0.724, and 0.850, whereas the anatomy-aware model achieved AUC scores of 0.899, 0.846, and 0.957, with accuracies of 0.821, 0.744, and 0.906, respectively. The patients who were identified as high risk by the anatomy-aware model had an odds ratio of 2.16 (95% CI: 1.19, 3.86) for having progression of radiographic sacroiliitis within 2 years.
Conclusion: Anatomical awareness can improve the generalizability of a deep learning model in detecting radiographic sacroiliitis. The model is published as fully open source alongside this study.
Submitted 12 May, 2024;
originally announced May 2024.
-
MRSegmentator: Robust Multi-Modality Segmentation of 40 Classes in MRI and CT Sequences
Authors:
Hartmut Häntze,
Lina Xu,
Felix J. Dorfner,
Leonhard Donle,
Daniel Truhn,
Hugo Aerts,
Mathias Prokop,
Bram van Ginneken,
Alessa Hering,
Lisa C. Adams,
Keno K. Bressem
Abstract:
Purpose: To introduce a deep learning model capable of multi-organ segmentation in MRI scans, offering a solution to the current limitations in MRI analysis due to challenges in resolution, standardized intensity values, and variability in sequences.
Materials and Methods: The model was trained on 1,200 manually annotated MRI scans from the UK Biobank, 221 in-house MRI scans, and 1,228 CT scans, leveraging cross-modality transfer learning from CT segmentation models. A human-in-the-loop annotation workflow was employed to efficiently create high-quality segmentations. The model's performance was evaluated on the NAKO and AMOS22 datasets, containing 600 and 60 MRI examinations, respectively. The Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD) were used to assess segmentation accuracy. The model will be open sourced.
Results: The model showcased high accuracy in segmenting well-defined organs, achieving Dice Similarity Coefficient (DSC) scores of 0.97 for the right and left lungs, and 0.95 for the heart. It also demonstrated robustness in organs like the liver (DSC: 0.96) and kidneys (DSC: 0.95 left, 0.95 right), which present more variability. However, segmentation of smaller and complex structures such as the portal and splenic veins (DSC: 0.54) and adrenal glands (DSC: 0.65 left, 0.61 right) revealed the need for further model optimization.
Conclusion: The proposed model is a robust tool for accurate segmentation of 40 anatomical structures in MRI and CT images. By leveraging cross-modality learning and interactive annotation, the model achieves strong performance and generalizability across diverse datasets, making it a valuable resource for researchers and clinicians. It is open source and can be downloaded from https://github.com/hhaentze/MRSegmentator.
Submitted 13 May, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
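The Dice Similarity Coefficient reported in this abstract is conventionally defined as twice the overlap between two masks divided by the sum of their sizes. The following is a minimal sketch of that standard definition for flattened binary masks; the function name `dice_coefficient` and the toy masks are illustrative assumptions and are not taken from the MRSegmentator code.

```python
def dice_coefficient(pred, truth):
    """Dice similarity for two flattened binary masks of equal length.

    Hypothetical helper illustrating the standard definition
    DSC = 2 * |A ∩ B| / (|A| + |B|); not the authors' implementation.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention)
        return 1.0
    return 2.0 * intersection / total


# Toy example: two 5-voxel masks that share 2 foreground voxels
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
dsc = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

Returning 1.0 for two empty masks is one common convention; some evaluation pipelines skip such cases instead.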
-
Improve Cross-Modality Segmentation by Treating MRI Images as Inverted CT Scans
Authors:
Hartmut Häntze,
Lina Xu,
Leonhard Donle,
Felix J. Dorfner,
Alessa Hering,
Lisa C. Adams,
Keno K. Bressem
Abstract:
Computed tomography (CT) segmentation models frequently include classes that are not currently supported by magnetic resonance imaging (MRI) segmentation models. In this study, we show that a simple image inversion technique can significantly improve the segmentation quality of CT segmentation models on MRI data, using the TotalSegmentator model applied to T1-weighted MRI images as an example. Image inversion is straightforward to implement and does not require dedicated graphics processing units (GPUs), thus providing a quick alternative to complex deep modality-transfer models for generating segmentation masks for MRI data.
Submitted 4 May, 2024;
originally announced May 2024.
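The abstract does not state the exact inversion formula, so the following is only a sketch of one plausible reading: each intensity is reflected within the image's own range, so that bright voxels become dark and vice versa. The function name `invert_intensities` and the toy 2D slice are assumptions for illustration; the paper's actual preprocessing may differ.

```python
def invert_intensities(image):
    """Invert a 2D image given as a list of rows.

    Assumed formula (not confirmed by the abstract): v -> max + min - v,
    which keeps the output inside the original intensity range.
    """
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    return [[hi + lo - v for v in row] for row in image]


# Toy 2x2 "slice" with made-up intensities
slice_2d = [[0, 50], [200, 100]]
inverted = invert_intensities(slice_2d)
```

Because the mapping is a reflection within the range, applying it twice returns the original image, and it needs no GPU, which matches the abstract's point about low computational cost.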
-
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports
Authors:
Felix J. Dorfner,
Liv Jürgensen,
Leonhard Donle,
Fares Al Mohamad,
Tobias R. Bodenmann,
Mason C. Cleveland,
Felix Busch,
Lisa C. Adams,
James Sato,
Thomas Schultz,
Albert E. Kim,
Jameson Merkow,
Keno K. Bressem,
Christopher P. Bridge
Abstract:
Introduction: With the rapid advances in large language models (LLMs), there have been numerous new open source as well as commercial models. While recent publications have explored GPT-4 in its application to extracting information of interest from radiology reports, there has not been a real-world comparison of GPT-4 to different leading open-source models.
Materials and Methods: Two independent datasets were used. The first dataset consists of 540 chest x-ray reports that were created at the Massachusetts General Hospital between July 2019 and July 2021. The second dataset consists of 500 chest x-ray reports from the ImaGenome dataset. We then compared the commercial models GPT-3.5 Turbo and GPT-4 from OpenAI to the open-source models Mistral-7B, Mixtral-8x7B, Llama2-13B, Llama2-70B, and QWEN1.5-72B, as well as CheXbert and CheXpert-labeler, in their ability to accurately label the presence of multiple findings in x-ray text reports using different prompting techniques.
Results: On the ImaGenome dataset, the best performing open-source model was Llama2-70B with micro F1-scores of 0.972 and 0.970 for zero- and few-shot prompts, respectively. GPT-4 achieved micro F1-scores of 0.975 and 0.984, respectively. On the institutional dataset, the best performing open-source model was QWEN1.5-72B with micro F1-scores of 0.952 and 0.965 for zero- and few-shot prompting, respectively. GPT-4 achieved micro F1-scores of 0.975 and 0.973, respectively.
Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. This shows that open-source models could be a performant and privacy preserving alternative to GPT-4 for the task of radiology report classification.
Submitted 19 February, 2024;
originally announced February 2024.
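The micro F1-scores reported above pool true positives, false positives, and false negatives over all findings and reports before computing a single F1. A minimal sketch of that standard micro-averaging for multi-label data follows; the function name `micro_f1` and the toy label matrices are illustrative assumptions, not data or code from the study.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true and y_pred are lists of rows (one per report), each row a list
    of 0/1 flags (one per finding). Counts are pooled over all cells.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1      # finding present and predicted
            elif p and not t:
                fp += 1      # predicted but absent
            elif t and not p:
                fn += 1      # present but missed
    denom = 2 * tp + fp + fn
    # With no positives anywhere, F1 is undefined; 0.0 is returned by convention
    return 2 * tp / denom if denom else 0.0


# Toy example: 2 reports x 3 findings (made-up labels)
labels = [[1, 0, 1], [0, 1, 0]]
preds = [[1, 0, 0], [0, 1, 1]]
score = micro_f1(labels, preds)  # tp=2, fp=1, fn=1
```

Micro-averaging weights frequent findings more heavily than rare ones, which is worth keeping in mind when comparing scores across datasets with different label distributions.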
-
Large Language Models for User Interest Journeys
Authors:
Konstantina Christakopoulou,
Alberto Lalama,
Cj Adams,
Iris Qu,
Yifat Amir,
Samer Chucri,
Pierce Vollucci,
Fabio Soldo,
Dina Bseiso,
Sarah Scodel,
Lucas Dixon,
Ed H. Chi,
Minmin Chen
Abstract:
Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation. Their potential for deeper user understanding and improved personalized user experience on recommendation platforms is, however, largely untapped. This paper aims to address this gap. Recommender systems today capture users' interests through encoding their historical activities on the platforms. The generated user representations are hard to examine or interpret. On the other hand, if we were to ask people about interests they pursue in their life, they might talk about their hobbies, like I just started learning the ukulele, or their relaxation routines, e.g., I like to watch Saturday Night Live, or I want to plant a vertical garden. We argue, and demonstrate through extensive experiments, that LLMs as foundation models can reason through user activities, and describe their interests in nuanced and interesting ways, similar to how a human would.
We define interest journeys as the persistent and overarching user interests, in other words, the non-transient ones. These are the interests that we believe will benefit most from the nuanced and personalized descriptions. We introduce a framework in which we first perform personalized extraction of interest journeys, and then summarize the extracted journeys via LLMs, using techniques like few-shot prompting, prompt-tuning and fine-tuning. Together, our results in prompting LLMs to name extracted user journeys in a large-scale industrial platform demonstrate great potential of these models in providing deeper, more interpretable, and controllable user understanding. We believe LLM powered user understanding can be a stepping stone to entirely new user experiences on recommendation platforms that are journey-aware, assistive, and enabling frictionless conversation down the line.
Submitted 24 May, 2023;
originally announced May 2023.
-
MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data
Authors:
Tianyu Han,
Lisa C. Adams,
Jens-Michalis Papaioannou,
Paul Grundmann,
Tom Oberhauser,
Alexander Löser,
Daniel Truhn,
Keno K. Bressem
Abstract:
As large language models (LLMs) like OpenAI's GPT series continue to make strides, we witness the emergence of artificial intelligence applications in an ever-expanding range of fields. In medicine, these LLMs hold considerable promise for improving medical workflows, diagnostics, patient care, and education. Yet, there is an urgent need for open-source models that can be deployed on-premises to safeguard patient privacy. In our work, we present an innovative dataset consisting of over 160,000 entries, specifically crafted to fine-tune LLMs for effective medical applications. We investigate the impact of fine-tuning these datasets on publicly accessible pre-trained LLMs, and subsequently, we juxtapose the performance of pre-trained-only models against the fine-tuned models concerning the examinations that future medical doctors must pass to achieve certification.
Submitted 4 October, 2023; v1 submitted 14 April, 2023;
originally announced April 2023.
-
MEDBERT.de: A Comprehensive German BERT Model for the Medical Domain
Authors:
Keno K. Bressem,
Jens-Michalis Papaioannou,
Paul Grundmann,
Florian Borchert,
Lisa C. Adams,
Leonhard Liu,
Felix Busch,
Lina Xu,
Jan P. Loyen,
Stefan M. Niehues,
Moritz Augustin,
Lennart Grosser,
Marcus R. Makowski,
Hugo JWL. Aerts,
Alexander Löser
Abstract:
This paper presents medBERTde, a pre-trained German BERT model specifically designed for the German medical domain. The model has been trained on a large corpus of 4.7 Million German medical documents and has been shown to achieve new state-of-the-art performance on eight different medical benchmarks covering a wide range of disciplines and medical document types. In addition to evaluating the overall performance of the model, this paper also conducts a more in-depth analysis of its capabilities. We investigate the impact of data deduplication on the model's performance, as well as the potential benefits of using more efficient tokenization methods. Our results indicate that domain-specific models such as medBERTde are particularly useful for longer texts, and that deduplication of training data does not necessarily lead to improved performance. Furthermore, we found that efficient tokenization plays only a minor role in improving model performance, and attribute most of the improved performance to the large amount of training data. To encourage further research, the pre-trained model weights and new benchmarks based on radiological data are made publicly available for use by the scientific community.
Submitted 24 March, 2023; v1 submitted 14 March, 2023;
originally announced March 2023.
-
What Does DALL-E 2 Know About Radiology?
Authors:
Lisa C. Adams,
Felix Busch,
Daniel Truhn,
Marcus R. Makowski,
Hugo JWL. Aerts,
Keno K. Bressem
Abstract:
Generative models such as DALL-E 2 could represent a promising future tool for image generation, augmentation, and manipulation for artificial intelligence research in radiology provided that these models have sufficient medical domain knowledge. Here we show that DALL-E 2 has learned relevant representations of X-ray images with promising capabilities in terms of zero-shot text-to-image generation of new images, continuation of an image beyond its original boundaries, or removal of elements, while pathology generation or CT, MRI, and ultrasound images are still limited. The use of generative models for augmenting and generating radiological data thus seems feasible, even if further fine-tuning and adaptation of these models to the respective domain is required beforehand.
Submitted 27 September, 2022;
originally announced September 2022.
-
Modification tolerant signature schemes: location and correction
Authors:
Thais Bardini Idalino,
Lucia Moura,
Carlisle Adams
Abstract:
This paper considers malleable digital signatures, for situations where data is modified after it is signed. They can be used in applications where either the data can be modified (collaborative work), or the data must be modified (redactable and content extraction signatures) or we need to know which parts of the data have been modified (data forensics). A classical digital signature is valid for a message only if the signature is authentic and not even one bit of the message has been modified. We propose a general framework of modification tolerant signature schemes (MTSS), which can provide either location only or both location and correction, for modifications in a signed message divided into $n$ blocks. This general scheme uses a set of allowed modifications that must be specified. We present an instantiation of MTSS with a tolerance level of $d$, indicating modifications can appear in any set of up to $d$ message blocks. This tolerance level $d$ is needed in practice for parametrizing and controlling the growth of the signature size with respect to the number $n$ of blocks; using combinatorial group testing (CGT) the signature has size $O(d^2 \log n)$ which is close to the best known lower bound of $\Omega(\frac{d^2}{\log d} \log n)$. There has been work in this very same direction using CGT by Goodrich et al. (ACNS 2005) and Idalino et al. (IPL 2015). Our work differs from theirs in that in one scheme we extend these ideas to include corrections of modification with provable security, and in another variation of the scheme we go in the opposite direction and guarantee privacy for redactable signatures, in this case preventing any leakage of redacted information.
Submitted 31 July, 2022;
originally announced August 2022.
-
A Machine Learning Paradigm for Studying Pictorial Realism: Are Constable's Clouds More Real than His Contemporaries?
Authors:
Zhuomin Zhang,
Elizabeth C. Mansfield,
Jia Li,
John Russell,
George S. Young,
Catherine Adams,
James Z. Wang
Abstract:
The British landscape painter John Constable is considered foundational for the Realist movement in 19th-century European painting. Constable's painted skies, in particular, were seen as remarkably accurate by his contemporaries, an impression shared by many viewers today. Yet, assessing the accuracy of realist paintings like Constable's is subjective or intuitive, even for professional art historians, making it difficult to say with certainty what set Constable's skies apart from those of his contemporaries. Our goal is to contribute to a more objective understanding of Constable's realism. We propose a new machine-learning-based paradigm for studying pictorial realism in an explainable way. Our framework assesses realism by measuring the similarity between clouds painted by artists noted for their skies, like Constable, and photographs of clouds. The experimental results of cloud classification show that Constable approximates more consistently than his contemporaries the formal features of actual clouds in his paintings. The study, as a novel interdisciplinary approach that combines computer vision and machine learning, meteorology, and art history, is a springboard for broader and deeper analyses of pictorial realism.
Submitted 12 October, 2023; v1 submitted 18 February, 2022;
originally announced February 2022.
-
Testing Self-Organized Criticality Across the Main Sequence using Stellar Flares from TESS
Authors:
Adina D. Feinstein,
Darryl Z. Seligman,
Maximilian N. Günther,
Fred C. Adams
Abstract:
Self-organized criticality describes a class of dynamical systems that maintain themselves in an attractor state with no intrinsic length or time scale. Fundamentally, this theoretical construct requires a mechanism for instability that may trigger additional instabilities locally via dissipative processes. This concept has been invoked to explain nonlinear dynamical phenomena such as featureless energy spectra that have been observed empirically for earthquakes, avalanches, and solar flares. If this interpretation proves correct, it implies that the solar coronal magnetic field maintains itself in a critical state via a delicate balance between the dynamo-driven injection of magnetic energy and the release of that energy via flaring events. All-sky high-cadence surveys like the Transiting Exoplanet Survey Satellite (TESS) provide the necessary data to compare the energy distribution of flaring events in stars of different spectral types to that observed in the Sun. We identified $\sim 10^6$ flaring events on $\sim 10^5$ stars observed by TESS at 2-minute cadence. By fitting the flare frequency distribution for different mass bins, we find that all main sequence stars exhibit distributions of flaring events similar to that observed in the Sun, independent of their mass or age. This may suggest that stars universally maintain a critical state in their coronal topologies via magnetic reconnection events. If this interpretation proves correct, we may be able to infer properties of magnetic fields, interior structure, and dynamo mechanisms for stars that are otherwise unresolved point sources.
Submitted 12 January, 2022; v1 submitted 14 September, 2021;
originally announced September 2021.
-
3D U-Net for segmentation of COVID-19 associated pulmonary infiltrates using transfer learning: State-of-the-art results on affordable hardware
Authors:
Keno K. Bressem,
Stefan M. Niehues,
Bernd Hamm,
Marcus R. Makowski,
Janis L. Vahldiek,
Lisa C. Adams
Abstract:
Segmentation of pulmonary infiltrates can help assess severity of COVID-19, but manual segmentation is labor- and time-intensive. Using neural networks to segment pulmonary infiltrates would enable automation of this task. However, training a 3D U-Net from computed tomography (CT) data is time- and resource-intensive. In this work, we therefore developed and tested a solution showing how transfer learning can be used to train state-of-the-art segmentation models on limited hardware and in shorter time. We use the recently published RSNA International COVID-19 Open Radiology Database (RICORD) to train a fully three-dimensional U-Net architecture using an 18-layer 3D ResNet, pretrained on the Kinetics-400 dataset, as encoder. The generalization of the model was then tested on two openly available datasets of patients with COVID-19 who received chest CTs (Corona Cases and MosMed datasets). Our model performed comparably to previously published 3D U-Net architectures, achieving a mean Dice score of 0.679 on the tuning dataset, 0.648 on the Coronacases dataset and 0.405 on the MosMed dataset. Notably, these results were achieved with shorter training time on a single GPU with less memory available than the GPUs used in previous studies.
Submitted 25 January, 2021;
originally announced January 2021.
-
Cyberphysical Security Through Resiliency: A Systems-centric Approach
Authors:
Cody Fleming,
Carl Elks,
Georgios Bakirtzis,
Stephen C. Adams,
Bryan Carter,
Peter A. Beling,
Barry Horowitz
Abstract:
Cyber-physical systems (CPS) are often defended in the same manner as information technology (IT) systems -- by using perimeter security. Multiple factors make such defenses insufficient for CPS. Resiliency shows potential in overcoming these shortfalls. Techniques for achieving resilience exist; however, methods and theory for evaluating resilience in CPS are lacking. We argue that such methods and theory should assist stakeholders in deciding where and how to apply design patterns for resilience. Such a problem potentially involves tradeoffs between different objectives and criteria, and such decisions need to be driven by traceable, defensible, repeatable engineering evidence. Multi-criteria resiliency problems require a system-oriented approach that evaluates systems in the presence of threats as well as potential design solutions once vulnerabilities have been identified. We present a systems-oriented view of cyber-physical security, termed Mission Aware, that is based on a holistic understanding of mission goals, system dynamics, and risk.
Submitted 9 October, 2021; v1 submitted 29 November, 2020;
originally announced November 2020.
-
PILArNet: Public Dataset for Particle Imaging Liquid Argon Detectors in High Energy Physics
Authors:
Corey Adams,
Kazuhiro Terao,
Taritree Wongjirad
Abstract:
Rapid advancement of machine learning solutions has often coincided with the production of a public test dataset. Such datasets reduce the largest barrier to entry for tackling a problem -- procuring data -- while also providing a benchmark to compare different solutions. Furthermore, large datasets have been used to train high-performing feature finders which are then used in new approaches to problems beyond that initially defined. In order to encourage rapid development in the analysis of data collected using liquid argon time projection chambers, a class of particle detectors used in high energy physics experiments, we have produced PILArNet, the first 2D and 3D open dataset to be used for a couple of key analysis tasks. The initial dataset presented in this paper contains 300,000 samples simulated and recorded in three different volume sizes. The dataset is stored efficiently in sparse 2D and 3D matrix format with auxiliary information about simulated particles in the volume, and is made available for public research use. In this paper we describe the dataset, tasks, and the method used to procure the sample.
Submitted 2 June, 2020;
originally announced June 2020.
-
The Newspaper Navigator Dataset: Extracting And Analyzing Visual Content from 16 Million Historic Newspaper Pages in Chronicling America
Authors:
Benjamin Charles Germain Lee,
Jaime Mears,
Eileen Jakeway,
Meghan Ferriter,
Chris Adams,
Nathan Yarasavage,
Deborah Thomas,
Kate Zwaard,
Daniel S. Weld
Abstract:
Chronicling America is a product of the National Digital Newspaper Program, a partnership between the Library of Congress and the National Endowment for the Humanities to digitize historic newspapers. Over 16 million pages of historic American newspapers have been digitized for Chronicling America to date, complete with high-resolution images and machine-readable METS/ALTO OCR. Of considerable interest to Chronicling America users is a semantified corpus, complete with extracted visual content and headlines. To accomplish this, we introduce a visual content recognition model trained on bounding box annotations of photographs, illustrations, maps, comics, and editorial cartoons collected as part of the Library of Congress's Beyond Words crowdsourcing initiative and augmented with additional annotations including those of headlines and advertisements. We describe our pipeline that utilizes this deep learning model to extract 7 classes of visual content: headlines, photographs, illustrations, maps, comics, editorial cartoons, and advertisements, complete with textual content such as captions derived from the METS/ALTO OCR, as well as image embeddings for fast image similarity querying. We report the results of running the pipeline on 16.3 million pages from the Chronicling America corpus and describe the resulting Newspaper Navigator dataset, the largest dataset of extracted visual content from historic newspapers ever produced. The Newspaper Navigator dataset, finetuned visual content recognition model, and all source code are placed in the public domain for unrestricted re-use.
Submitted 4 May, 2020;
originally announced May 2020.
-
On the security and privacy of Interac e-Transfers
Authors:
Fabian Willems,
Mohammad Raahemi,
Prasadith Buddhitha,
Carlisle Adams,
Thomas Tran
Abstract:
Nowadays, the Interac e-Transfer is one of the most important remote payment methods for Canadian consumers. To the best of our knowledge, this paper is the very first to examine the privacy and security of Interac e-Transfers. Experimental results show that the notifications sent to customers via email and SMS contain sensitive private information that can potentially be observed by third parties. Anyone with illegitimate intent can use this information to carry out attacks, including the fraudulent redirection of Standard e-Transfers. Such an attack is shown to be possible at least in an experimental setup but likely also in reality. Recent news articles support this finding. Improvements to overcome these interconnected privacy and security problems are proposed and discussed.
Submitted 11 December, 2019; v1 submitted 3 October, 2019;
originally announced October 2019.
-
Scaling Distributed Training of Flood-Filling Networks on HPC Infrastructure for Brain Mapping
Authors:
Wushi Dong,
Murat Keceli,
Rafael Vescovi,
Hanyu Li,
Corey Adams,
Elise Jennings,
Samuel Flender,
Tom Uram,
Venkatram Vishwanath,
Nicola Ferrier,
Narayanan Kasthuri,
Peter Littlewood
Abstract:
Mapping all the neurons in the brain requires automatic reconstruction of entire cells from volume electron microscopy data. The flood-filling network (FFN) architecture has demonstrated leading performance for segmenting structures from this data. However, the training of the network is computationally expensive. In order to reduce the training time, we implemented synchronous and data-parallel distributed training using the Horovod library, which is different from the asynchronous training scheme used in the published FFN code. We demonstrated that our distributed training scaled well up to 2048 Intel Knights Landing (KNL) nodes on the Theta supercomputer. Our trained models achieved a similar level of inference performance, but took less training time compared to previous methods. Our study on the effects of different batch sizes on FFN training suggests ways to further improve training efficiency. Our findings on optimal learning rate and batch sizes agree with previous works.
Submitted 9 December, 2019; v1 submitted 13 May, 2019;
originally announced May 2019.
-
A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber
Authors:
MicroBooNE collaboration,
C. Adams,
M. Alrashed,
R. An,
J. Anthony,
J. Asaadi,
A. Ashkenazi,
M. Auger,
S. Balasubramanian,
B. Baller,
C. Barnes,
G. Barr,
M. Bass,
F. Bay,
A. Bhat,
K. Bhattacharya,
M. Bishai,
A. Blake,
T. Bolton,
L. Camilleri,
D. Caratelli,
I. Caro Terrazas,
R. Carr,
R. Castillo Fernandez,
F. Cavanna
, et al. (148 additional authors not shown)
Abstract:
We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stopping muon and $\nu_\mu$ charged-current neutral pion data samples.
Submitted 22 August, 2018;
originally announced August 2018.
-
Multi-agent Inverse Reinforcement Learning for Certain General-sum Stochastic Games
Authors:
Xiaomin Lin,
Stephen C. Adams,
Peter A. Beling
Abstract:
This paper addresses the problem of multi-agent inverse reinforcement learning (MIRL) in a two-player general-sum stochastic game framework. Five variants of MIRL are considered: uCS-MIRL, advE-MIRL, cooE-MIRL, uCE-MIRL, and uNE-MIRL, each distinguished by its solution concept. Problem uCS-MIRL is a cooperative game in which the agents employ cooperative strategies that aim to maximize the total game value. In problem uCE-MIRL, agents are assumed to follow strategies that constitute a correlated equilibrium while maximizing total game value. Problem uNE-MIRL is similar to uCE-MIRL in total game value maximization, but it is assumed that the agents are playing a Nash equilibrium. Problems advE-MIRL and cooE-MIRL assume agents are playing an adversarial equilibrium and a coordination equilibrium, respectively. We propose novel approaches to address these five problems under the assumption that the game observer either knows or is able to accurately estimate the policies and solution concepts for players. For uCS-MIRL, we first develop a characteristic set of solutions ensuring that the observed bi-policy is a uCS and then apply a Bayesian inverse learning method. For uCE-MIRL, we develop a linear programming problem subject to constraints that define necessary and sufficient conditions for the observed policies to be correlated equilibria. The objective is to choose a solution that not only minimizes the total game value difference between the observed bi-policy and a local uCS, but also maximizes the scale of the solution. We apply a similar treatment to the problem of uNE-MIRL. The remaining two problems can be solved efficiently by taking advantage of solution uniqueness and setting up a convex optimization problem. Results are validated on various benchmark grid-world games.
Submitted 10 October, 2019; v1 submitted 26 June, 2018;
originally announced June 2018.
-
Coverage and Field Estimation on Bounded Domains by Diffusive Swarms
Authors:
Karthik Elamvazhuthi,
Chase Adams,
Spring Berman
Abstract:
In this paper, we consider stochastic coverage of bounded domains by a diffusing swarm of robots that take local measurements of an underlying scalar field. We introduce three control methodologies with diffusion, advection, and reaction as independent control inputs. We analyze the diffusion-based control strategy using standard operator semigroup-theoretic arguments. We show that the diffusion coefficient can be chosen to be dependent only on the robots' local measurements to ensure that the swarm density converges to a function proportional to the scalar field. The boundedness of the domain precludes the need to impose assumptions on decaying properties of the scalar field at infinity. Moreover, exponential convergence of the swarm density to the equilibrium follows from properties of the spectrum of the semigroup generator. In addition, we use the proposed coverage method to construct a time-inhomogeneous diffusion process and apply the observability of the heat equation to reconstruct the scalar field over the entire domain from observations of the robots' random motion over a small subset of the domain. We verify our results through simulations of the coverage scenario on a 2D domain and the field estimation scenario on a 1D domain.
Submitted 1 October, 2016; v1 submitted 24 September, 2016;
originally announced September 2016.
-
Stochastic Matrix Factorization
Authors:
Christopher Adams
Abstract:
This paper considers a restriction to non-negative matrix factorization in which at least one matrix factor is stochastic. That is, the elements of the matrix factors are non-negative and the columns of one matrix factor sum to 1. This restriction includes topic models, a popular method for analyzing unstructured data. It also includes a method for storing and finding pictures. The paper presents necessary and sufficient conditions on the observed data such that the factorization is unique. In addition, the paper characterizes natural bounds on the parameters for any observed data and presents a consistent least squares estimator. The results are illustrated using a topic model analysis of PhD abstracts in economics and the problem of storing and retrieving a set of pictures of faces.
Submitted 19 September, 2016;
originally announced September 2016.
-
Secure Data Storage Structure and Privacy-Preserving Mobile Search Scheme for Public Safety Networks
Authors:
Hamidreza Ghafghazi,
Amr ElMougy,
Hussein T. Mouftah,
Carlisle Adams
Abstract:
In a Public Safety (PS) situation, agents may require critical and personally identifiable information. Therefore, not only does context- and location-aware information need to be available, but the privacy of such information should also be preserved. Existing solutions do not address such a problem in a PS environment. This paper proposes a framework in which anonymized Personal Information (PI) is accessible to authorized public safety agents under a PS circumstance. In particular, we propose a secure data storage structure along with a privacy-preserving mobile search framework, suitable for Public Safety Networks (PSNs). As a result, availability and privacy of PI are achieved simultaneously. However, the design of such a framework encounters substantial challenges, including scalability, reliability of the data, and computation, communication, and storage efficiency. We leverage Secure Indexing (SI) methods and modify Bloom Filters (BFs) to create a secure data storage structure to store encrypted meta-data. As a result, our construction enables secure and privacy-preserving multi-keyword search capability. In addition, our system scales very well, maintains availability of data, imposes minimal delay, and has affordable storage overhead. We provide extensive security analysis, simulation studies, and performance comparison with the state-of-the-art solutions to demonstrate the efficiency and effectiveness of the proposed approach. To the best of our knowledge, this work is the first to address such issues in the context of PSNs.
Submitted 14 February, 2016;
originally announced February 2016.