Controlling Space and Time with Diffusion Models

Authors: Daniel Watson, Saurabh Saxena, Lala Li, Andrea Tagliasacchi, David J. Fleet

Abstract: We present 4DiM, a cascaded diffusion model for 4D novel view synthesis (NVS), conditioned on one or more images of a general scene, and a set of camera poses and timestamps. To overcome challenges due to limited availability of 4D training data, we advocate joint training on 3D (with camera pose), 4D (pose+time) and video (time but no pose) data and propose a new architecture that enables the same. We further advocate the calibration of SfM posed data using monocular metric depth estimators for metric scale camera control. For model evaluation, we introduce new metrics to enrich and overcome shortcomings of current evaluation schemes, demonstrating state-of-the-art results in both fidelity and pose control compared to existing diffusion models for 3D NVS, while at the same time adding the ability to handle temporal dynamics. 4DiM is also used for improved panorama stitching, pose-conditioned video to video translation, and several other tasks. For an overview see https://4d-diffusion.github.io

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06234 [pdf]

doi 10.1007/s11023-022-09612-y

The US Algorithmic Accountability Act of 2022 vs. The EU Artificial Intelligence Act: What can they learn from each other?

Authors: Jakob Mokander, Prathm Juneja, David Watson, Luciano Floridi

Abstract: On the whole, the U.S. Algorithmic Accountability Act of 2022 (US AAA) is a pragmatic approach to balancing the benefits and risks of automated decision systems. Yet there is still room for improvement. This commentary highlights how the US AAA can both inform and learn from the European Artificial Intelligence Act (EU AIA).

Submitted 7 July, 2024; originally announced July 2024.

Comments: Minds & Machines (2022)

arXiv:2407.05341 [pdf]

doi 10.1007/s11023-022-09620-y

The Switch, the Ladder, and the Matrix: Models for Classifying AI Systems

Authors: Jakob Mokander, Margi Sheth, David Watson, Luciano Floridi

Abstract: Organisations that design and deploy artificial intelligence (AI) systems increasingly commit themselves to high-level, ethical principles. However, there still exists a gap between principles and practices in AI ethics. One major obstacle organisations face when attempting to operationalise AI Ethics is the lack of a well-defined material scope. Put differently, the question to which systems and processes AI ethics principles ought to apply remains unanswered. Of course, there exists no universally accepted definition of AI, and different systems pose different ethical challenges. Nevertheless, pragmatic problem-solving demands that things should be sorted so that their grouping will promote successful actions for some specific end. In this article, we review and compare previous attempts to classify AI systems for the purpose of implementing AI governance in practice. We find that attempts to classify AI systems found in previous literature use one of three mental models. The Switch, i.e., a binary approach according to which systems either are or are not considered AI systems depending on their characteristics. The Ladder, i.e., a risk-based approach that classifies systems according to the ethical risks they pose. And the Matrix, i.e., a multi-dimensional classification of systems that takes various aspects into account, such as context, data input, and decision-model. Each of these models for classifying AI systems comes with its own set of strengths and weaknesses. By conceptualising different ways of classifying AI systems into simple mental models, we hope to provide organisations that design, deploy, or regulate AI systems with the conceptual tools needed to operationalise AI governance in practice.

Submitted 7 July, 2024; originally announced July 2024.

Journal ref: Minds and Machines, 2023

arXiv:2404.10754 [pdf]

A Systematic Survey of the Gemini Principles for Digital Twin Ontologies

Authors: James Michael Tooth, Nilufer Tuptuk, Jeremy Daniel McKendrick Watson

Abstract: Ontologies are widely used for achieving interoperable Digital Twins (DTws), yet competing DTw definitions compound interoperability issues. Semantically linking these differing twins is feasible through ontologies and Cognitive Digital Twins (CDTws). However, it is often unclear how ontology use bolsters broader DTw advancements. This article presents a systematic survey following the PRISMA method, to explore the potential of ontologies to support DTws to meet the Centre for Digital Built Britain's Gemini Principles and aims to link progress in ontologies to this framework. The Gemini Principles focus on common DTw requirements, considering: Purpose for 1) Public Good, 2) Value Creation, and 3) Insight; Trustworthiness with sufficient 4) Security, 5) Openness, and 6) Quality; and appropriate Functionality of 7) Federation, 8) Curation, and 9) Evolution. This systematic literature review examines the role of ontologies in facilitating each principle. Existing research uses ontologies to solve DTw challenges within these principles, particularly by connecting DTws, optimising decision-making, and reasoning governance policies. Furthermore, analysing the sectoral distribution of literature found that research encompassing the crossover of ontologies, DTws and the Gemini Principles is emerging, and that most innovation is predominantly within manufacturing and built environment sectors. Critical gaps for researchers, industry practitioners, and policymakers are subsequently identified.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 35 pages + 4 page appendix, 8 figures

arXiv:2404.04446 [pdf, other]

Bounding Causal Effects with Leaky Instruments

Authors: David S. Watson, Jordan Penn, Lee M. Gunderson, Gecia Bravo-Hermsdorff, Afsaneh Mastouri, Ricardo Silva

Abstract: Instrumental variables (IVs) are a popular and powerful tool for estimating causal effects in the presence of unobserved confounding. However, classical approaches rely on strong assumptions such as the $\textit{exclusion criterion}$, which states that instrumental effects must be entirely mediated by treatments. This assumption often fails in practice. When IV methods are improperly applied to data that do not meet the exclusion criterion, estimated causal effects may be badly biased. In this work, we propose a novel solution that provides $\textit{partial}$ identification in linear systems given a set of $\textit{leaky instruments}$, which are allowed to violate the exclusion criterion to some limited degree. We derive a convex optimization objective that provides provably sharp bounds on the average treatment effect under some common forms of information leakage, and implement inference procedures to quantify the uncertainty of resulting estimates. We demonstrate our method in a set of experiments with simulated data, where it performs favorably against the state of the art. An accompanying $\texttt{R}$ package, $\texttt{leakyIV}$, is available from $\texttt{CRAN}$.

Submitted 8 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: Camera ready version (UAI 2024)

Journal ref: 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

arXiv:2404.01203 [pdf, other]

Video Interpolation with Diffusion Models

Authors: Siddhant Jain, Daniel Watson, Eric Tabellion, Aleksander Hołyński, Ben Poole, Janne Kontkanen

Abstract: We present VIDIM, a generative model for video interpolation, which creates short videos given a start and end frame. In order to achieve high fidelity and generate motions unseen in the input data, VIDIM uses cascaded diffusion models to first generate the target video at low resolution, and then generate the high-resolution video conditioned on the low-resolution generated video. We compare VIDIM to previous state-of-the-art methods on video interpolation, and demonstrate how such works fail in most settings where the underlying motion is complex, nonlinear, or ambiguous while VIDIM can easily handle such cases. We additionally demonstrate how classifier-free guidance on the start and end frame and conditioning the super-resolution model on the original high-resolution frames without additional parameters unlocks high-fidelity results. VIDIM is fast to sample from as it jointly denoises all the frames to be generated, requires less than a billion parameters per diffusion model to produce compelling results, and still enjoys scalability and improved quality at larger parameter counts.

Submitted 1 April, 2024; originally announced April 2024.

Comments: CVPR 2024, Project page at https://vidim-interpolation.github.io/

arXiv:2403.14607 [pdf, other]

Polynomial-Time Classical Simulation of Noisy IQP Circuits with Constant Depth

Authors: Joel Rajakumar, James D. Watson, Yi-Kai Liu

Abstract: Sampling from the output distributions of quantum computations comprising only commuting gates, known as instantaneous quantum polynomial (IQP) computations, is believed to be intractable for classical computers, and hence this task has become a leading candidate for testing the capabilities of quantum devices. Here we demonstrate that for an arbitrary IQP circuit undergoing dephasing or depolarizing noise, whose depth is greater than a critical $O(1)$ threshold, the output distribution can be efficiently sampled by a classical computer. Unlike other simulation algorithms for quantum supremacy tasks, we do not require assumptions on the circuit's architecture, on anti-concentration properties, nor do we require $\Omega(\log(n))$ circuit depth. We take advantage of the fact that IQP circuits have deep sections of diagonal gates, which allows the noise to build up predictably and induce a large-scale breakdown of entanglement within the circuit. Our results suggest that quantum supremacy experiments based on IQP circuits may be more susceptible to classical simulation than previously thought.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 17 pages, 5 figures

arXiv:2403.01538 [pdf]

A Preliminary Exploration of the Disruption of a Generative AI Systems: Faculty/Staff and Student Perceptions of ChatGPT and its Capability of Completing Undergraduate Engineering Coursework

Authors: Lance White, Trini Balart, Sara Amani, Dr. Kristi J. Shryock, Dr. Karan L. Watson

Abstract: The authors of this study aim to assess the capabilities of the OpenAI ChatGPT tool to understand just how effective such a system might be for students to utilize in their studies as well as deepen understanding of faculty/staff and student perceptions about ChatGPT in general. The purpose of what is learned from the study is to continue the design of a model to facilitate the development of faculty for becoming adept at embracing change, the DANCE model (Designing Adaptations for the Next Changes in Education). This model is used in this study to help faculty with examining the impact that a disruptive new tool, such as ChatGPT, can pose for the learning environment. The authors analyzed the performance of ChatGPT used to complete course assignments at a variety of levels by novice engineering students working as research assistants. Those completed works have been assessed by the faculty who created those assignments to understand how these completed assignments might compare with the performance of a typical student. A set of surveys conducted by the authors of this work are discussed where students, faculty, and staff respondents in March of 2023 addressed their perceptions of ChatGPT (A follow-up survey is being administered now, February 2024). These survey instruments were analyzed, and the data visualized in this work to bring attention to relevant findings by the researchers. This work reports the findings of the researchers with the purpose of sharing the current state of this work at Texas A&M University with the intention to provide insights to scholars both at our own institution and around the world. This work is not intended to be a finished work but reports these findings with full transparency that this work is currently continuing as the researchers gather new data and develop and validate various measurement instruments.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 22 pages, 13 figures

arXiv:2312.02981 [pdf, other]

ReconFusion: 3D Reconstruction with Diffusion Priors

Authors: Rundi Wu, Ben Mildenhall, Philipp Henzler, Keunhong Park, Ruiqi Gao, Daniel Watson, Pratul P. Srinivasan, Dor Verbin, Jonathan T. Barron, Ben Poole, Aleksander Holynski

Abstract: 3D reconstruction methods such as Neural Radiance Fields (NeRFs) excel at rendering photorealistic novel views of complex scenes. However, recovering a high-quality NeRF typically requires tens to hundreds of input images, resulting in a time-consuming capture process. We present ReconFusion to reconstruct real-world scenes using only a few photos. Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets, which regularizes a NeRF-based 3D reconstruction pipeline at novel camera poses beyond those captured by the set of input images. Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions. We perform an extensive evaluation across various real-world datasets, including forward-facing and 360-degree scenes, demonstrating significant performance improvements over previous few-view NeRF reconstruction approaches.

Submitted 5 December, 2023; originally announced December 2023.

Comments: Project page: https://reconfusion.github.io/

arXiv:2306.05724 [pdf, other]

Explaining Predictive Uncertainty with Information Theoretic Shapley Values

Authors: David S. Watson, Joshua O'Hara, Niek Tax, Richard Mudd, Ido Guy

Abstract: Researchers in explainable artificial intelligence have developed numerous methods for helping users understand the predictions of complex supervised learning models. By contrast, explaining the $\textit{uncertainty}$ of model outputs has received relatively little attention. We adapt the popular Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of individual model outputs. We consider games with modified characteristic functions and find deep connections between the resulting Shapley values and fundamental quantities from information theory and conditional independence testing. We outline inference procedures for finite sample error rate control with provable guarantees, and implement efficient algorithms that perform well in a range of experiments on real and simulated data. Our method has applications to covariate shift detection, active learning, feature selection, and active feature-value acquisition.

Submitted 31 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Camera ready version (NeurIPS 2023)

arXiv:2306.04027 [pdf, other]

Intervention Generalization: A View from Factor Graph Models

Authors: Gecia Bravo-Hermsdorff, David S. Watson, Jialin Yu, Jakob Zeitler, Ricardo Silva

Abstract: One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard. Under a typical sparse experimental design, this mapping is ill-posed without relying on heavy regularization or prior distributions. Such assumptions may or may not be reliable, and can be hard to defend or test. In this paper, we take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system, communicated in the well-understood language of factor graph models. A postulated $\textit{interventional factor model}$ (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms, leading to directly testable claims. Given an IFM and datasets from a collection of experimental regimes, we derive conditions for identifiability of the expected outcomes of new regimes never observed in these training data. We implement our framework using several efficient algorithms, and apply them on a range of semi-synthetic experiments.

Submitted 8 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

Comments: Camera ready version (NeurIPS 2023)

arXiv:2305.14452 [pdf, other]

Fourier Neural Operators for Arbitrary Resolution Climate Data Downscaling

Authors: Qidong Yang, Alex Hernandez-Garcia, Paula Harder, Venkatesh Ramesh, Prasanna Sattegeri, Daniela Szwarcman, Campbell D. Watson, David Rolnick

Abstract: Climate simulations are essential in guiding our understanding of climate change and responding to its effects. However, it is computationally expensive to resolve complex climate processes at high spatial resolution. As one way to speed up climate simulations, neural networks have been used to downscale climate variables from fast-running low-resolution simulations, but high-resolution training data are often unobtainable or scarce, greatly limiting accuracy. In this work, we propose a downscaling method based on the Fourier neural operator. It trains with data of a small upsampling factor and then can zero-shot downscale its input to arbitrary unseen high resolution. Evaluated both on ERA5 climate model data and on the Navier-Stokes equation solution data, our downscaling model significantly outperforms state-of-the-art convolutional and generative adversarial downscaling models, both in standard single-resolution downscaling and in zero-shot generalization to higher upsampling factors. Furthermore, we show that our method also outperforms state-of-the-art data-driven partial differential equation solvers on Navier-Stokes equations. Overall, our work bridges the gap between simulation of a physical process and interpolation of low-resolution output, showing that it is possible to combine both approaches and significantly improve upon each other.

Submitted 30 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: Presented at the ICLR 2023 workshop on "Tackling Climate Change with Machine Learning"

arXiv:2304.14415 [pdf]

Generative AI Perceptions: A Survey to Measure the Perceptions of Faculty, Staff, and Students on Generative AI Tools in Academia

Authors: Sara Amani, Lance White, Trini Balart, Laksha Arora, Dr. Kristi J. Shryock, Dr. Kelly Brumbelow, Dr. Karan L. Watson

Abstract: ChatGPT is a natural language processing tool that can engage in human-like conversations and generate coherent and contextually relevant responses to various prompts. ChatGPT is capable of understanding natural text that is input by a user and generating appropriate responses in various forms. This tool represents a major step in how humans are interacting with technology. This paper specifically focuses on how ChatGPT is revolutionizing the realm of engineering education and the relationship between technology, students, and faculty and staff. Because this tool is quickly changing and improving with the potential for even greater future capability, it is a critical time to collect pertinent data. A survey was created to measure the effects of ChatGPT on students, faculty, and staff. This survey is shared as a Texas A&M University technical report to allow other universities and entities to use this survey and measure the effects elsewhere.

Submitted 21 April, 2023; originally announced April 2023.

Comments: 17 pages, 3 figures

arXiv:2302.07864 [pdf, other]

Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild

Authors: Hshmat Sahak, Daniel Watson, Chitwan Saharia, David Fleet

Abstract: Diffusion models have shown promising results on single-image super-resolution and other image-to-image translation tasks. Despite this success, they have not outperformed state-of-the-art GAN models on the more challenging blind super-resolution task, where the input images are out of distribution, with unknown degradations. This paper introduces SR3+, a diffusion-based model for blind super-resolution, establishing a new state-of-the-art. To this end, we advocate self-supervised training with a combination of composite, parameterized degradations, and noise-conditioning augmentation during training and testing. With these innovations, a large-scale convolutional architecture, and large-scale datasets, SR3+ greatly outperforms SR3. It outperforms Real-ESRGAN when trained on the same data, with a DRealSR FID score of 36.82 vs. 37.22, which further improves to FID of 32.37 with larger models, and further still with larger training sets.

Submitted 15 February, 2023; originally announced February 2023.

arXiv:2301.00886 [pdf]

Effect of emotions and personalisation on cancer website reuse intentions

Authors: Suncica Hadzidedic, Alexandra I. Cristea, Derrick G. Watson

Abstract: The effect of emotions and personalisation on continuance use intentions in online health services is underexplored. Accordingly, we propose a research model for examining the impact of emotion- and personalisation-based factors on cancer website reuse intentions. We conducted a study using a real-world NGO cancer-support website, which was evaluated by 98 participants via an online questionnaire. Model relations were estimated using the PLS-SEM method. Our findings indicated that pre-use emotions did not significantly influence perceived personalisation. However, satisfaction with personalisation, and perceived usefulness mediated by satisfaction, increased reuse intentions. In addition, post-use positive emotions potentially influenced reuse intentions. Our paper, therefore, illustrates the applicability of theory regarding continuance use intentions to cancer-support websites and highlights the importance of personalisation for these purposes.

Submitted 2 January, 2023; originally announced January 2023.

Comments: 19 pages, 4 figures, 3 tables

arXiv:2210.13752 [pdf, other]

Aboveground carbon biomass estimate with Physics-informed deep network

Authors: Juan Nathaniel, Levente J. Klein, Campbell D. Watson, Gabrielle Nyirjesy, Conrad M. Albrecht

Abstract: The global carbon cycle is a key process to understand how our climate is changing. However, monitoring the dynamics is difficult because a high-resolution robust measurement of key state parameters including the aboveground carbon biomass (AGB) is required. Here, we use a deep neural network to generate a wall-to-wall map of AGB within the Continental USA (CONUS) with 30-meter spatial resolution for the year 2021. We combine radar and optical hyperspectral imagery with a physical climate parameter of SIF-based GPP. Validation results show that a masked variation of UNet has the lowest validation RMSE of 37.93 $\pm$ 1.36 Mg C/ha, as compared to 52.30 $\pm$ 0.03 Mg C/ha for a random forest algorithm. Furthermore, models that learn from SIF-based GPP in addition to radar and optical imagery reduce validation RMSE by almost 10% and the standard deviation by 40%. Finally, we apply our model to measure losses in AGB from the recent 2021 Caldor wildfire in California, and validate our analysis with a Sentinel-based burn index.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 6 pages, 5 figures

arXiv:2210.04628 [pdf, other]

Novel View Synthesis with Diffusion Models

Authors: Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi

Abstract: We present 3DiM, a diffusion model for 3D novel view synthesis, which is able to translate a single input view into consistent and sharp completions across many views. The core component of 3DiM is a pose-conditional image-to-image diffusion model, which takes a source view and its pose as inputs, and generates a novel view for a target pose as output. 3DiM can generate multiple views that are 3D consistent using a novel technique called stochastic conditioning. The output views are generated autoregressively, and during the generation of each novel view, one selects a random conditioning view from the set of available views at each denoising step. We demonstrate that stochastic conditioning significantly improves the 3D consistency of a naive sampler for an image-to-image diffusion model, which involves conditioning on a single fixed view. We compare 3DiM to prior work on the SRN ShapeNet dataset, demonstrating that 3DiM's generated completions from a single view achieve much higher fidelity, while being approximately 3D consistent. We also introduce a new evaluation methodology, 3D consistency scoring, to measure the 3D consistency of a generated object by training a neural field on the model's output views. 3DiM is geometry free, does not rely on hyper-networks or test-time optimization for novel view synthesis, and allows a single model to easily scale to a large number of scenes.

Submitted 6 October, 2022; originally announced October 2022.
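
The stochastic conditioning loop described in the abstract above can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: views are plain floats, and `toy_denoise_step`, `sample_views_autoregressively`, and all parameter names are hypothetical stand-ins for a pose-conditional image-to-image denoiser.

```python
import random


def sample_views_autoregressively(denoise_step, source_view, target_poses,
                                  num_steps=8, seed=0):
    """Generate one view per target pose, autoregressively.

    At every denoising step a conditioning view is drawn at random from the
    views available so far (the source view plus earlier outputs).
    """
    rng = random.Random(seed)
    available = [source_view]      # views we are allowed to condition on
    chosen = []                    # conditioning indices used, per new view
    for pose in target_poses:
        x = rng.gauss(0.0, 1.0)    # stand-in for an initial noise image
        picks = []
        for step in reversed(range(num_steps)):
            idx = rng.randrange(len(available))  # fresh random view each step
            picks.append(idx)
            x = denoise_step(x, available[idx], pose, step)
        available.append(x)        # the finished view joins the conditioning set
        chosen.append(picks)
    return available[1:], chosen


def toy_denoise_step(x, cond_view, pose, step):
    # Hypothetical placeholder: pulls the sample toward the conditioning view.
    return 0.5 * x + 0.5 * cond_view


views, chosen = sample_views_autoregressively(
    toy_denoise_step, source_view=1.0, target_poses=[0, 1, 2], num_steps=4)
```

A naive sampler would instead fix `idx` to a single view for the whole inner loop; the abstract reports that drawing a new view at each step improves 3D consistency.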

arXiv:2210.03047 [pdf, other]

doi 10.1007/s10182-023-00477-9

Conditional Feature Importance for Mixed Data

Authors: Kristin Blesch, David S. Watson, Marvin N. Wright

Abstract: Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analyzing a variable's importance before and after adjusting for covariates - i.e., between $\textit{marginal}$ and $\textit{conditional}$ measures. Our work draws attention to this rarely acknowledged, yet crucial distinction and showcases its implications. Further, we reveal that for testing conditional FI, only few methods are available and practitioners have hitherto been severely restricted in method application due to mismatching data requirements. Most real-world data exhibits complex feature dependencies and incorporates both continuous and categorical data (mixed data). Both properties are oftentimes neglected by conditional FI measures. To fill this gap, we propose to combine the conditional predictive impact (CPI) framework with sequential knockoff sampling. The CPI enables conditional FI measurement that controls for any feature dependencies by sampling valid knockoffs - hence, generating synthetic data with similar statistical properties - for the data to be analyzed. Sequential knockoffs were deliberately designed to handle mixed data and thus allow us to extend the CPI approach to such datasets. We demonstrate through numerous simulations and a real-world example that our proposed workflow controls type I error, achieves high power and is in line with results given by other conditional FI measures, whereas marginal FI metrics result in misleading interpretations. Our findings highlight the necessity of developing statistically adequate, specialized methods for mixed data.

Submitted 2 May, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

Journal ref: AStA Advances in Statistical Analysis (2023)
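
As a rough illustration of the CPI idea mentioned in the abstract above, the sketch below compares per-sample losses of a fitted model on the original data with losses after one feature has been swapped for its knockoff copy. It assumes, as a simplification, that the statistic is the mean paired loss difference tested with a one-sample t statistic; knockoff sampling itself is not shown, and `cpi_test` and its arguments are hypothetical names, not the API of any package.

```python
import math
import statistics


def cpi_test(loss_original, loss_knockoff):
    """Return (cpi, standard_error, t_statistic) from paired per-sample losses.

    loss_original[i]: loss on sample i with the real feature.
    loss_knockoff[i]: loss on sample i with the feature replaced by a knockoff.
    A clearly positive CPI suggests the feature carries conditional signal.
    """
    if len(loss_original) != len(loss_knockoff):
        raise ValueError("loss vectors must be paired")
    diffs = [k - o for o, k in zip(loss_original, loss_knockoff)]
    cpi = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    # With zero variance the t statistic is undefined; report NaN in that case.
    t_stat = cpi / se if se > 0 else float("nan")
    return cpi, se, t_stat


cpi, se, t_stat = cpi_test([1.0, 2.0, 3.0, 4.0], [2.0, 2.5, 4.5, 5.0])
```

The t statistic would then be compared against a t distribution with n - 1 degrees of freedom for a one-sided test.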

arXiv:2208.07965 [pdf]

Improving the Cybersecurity of Critical National Infrastructure using Modelling and Simulation

Authors: Uchenna D Ani, Jeremy D McK Watson, Nilufer Tuptuk, Steve Hailes, Madeline Carr, Carsten Maple

Abstract: The UK Critical National Infrastructure is critically dependent on digital technologies that provide communications, monitoring, control, and decision-support functionalities. Digital technologies are progressively enhancing efficiency, reliability, and availability of infrastructure, and enabling new benefits not previously available. These benefits can introduce vulnerabilities through the connectivity enabled by the digital systems, thus, making it easier for would-be attackers, who frequently use socio-technical approaches, exploiting humans-in-the-loop to break in and sabotage an organization. Therefore, policies and strategies that minimize and manage risks must include an understanding of operator and corporate behaviors, as well as technical elements and the interfaces between them and humans. Better security via socio-technical security Modelling and Simulation can be achieved if backed by government effort, including appropriate policy interventions. Government, through its departments and agencies, can contribute by sign-posting and shaping the decision-making environment concerning cybersecurity M&S approaches and tools, showing how they can contribute to enhancing security in Modern Critical Infrastructure Systems.

Submitted 16 August, 2022; originally announced August 2022.

Comments: 7 pages, 5 Figures, Policy Briefing

arXiv:2207.11417 [pdf, other]

Multiscale Neural Operator: Learning Fast and Grid-independent PDE Solvers

Authors: Björn Lütjens, Catherine H. Crawford, Campbell D Watson, Christopher Hill, Dava Newman

Abstract: Numerical simulations in climate, chemistry, or astrophysics are computationally too expensive for uncertainty quantification or parameter-exploration at high-resolution. Reduced-order or surrogate models are multiple orders of magnitude faster, but traditional surrogates are inflexible or inaccurate and pure machine learning (ML)-based surrogates too data-hungry. We propose a hybrid, flexible surrogate model that exploits known physics for simulating large-scale dynamics and limits learning to the hard-to-model term, which is called parametrization or closure and captures the effect of fine- onto large-scale dynamics. Leveraging neural operators, we are the first to learn grid-independent, non-local, and flexible parametrizations. Our \textit{multiscale neural operator} is motivated by a rich literature in multiscale modeling, has quasilinear runtime complexity, is more accurate or flexible than state-of-the-art parametrizations and demonstrated on the chaotic equation multiscale Lorenz96.

Submitted 23 July, 2022; originally announced July 2022.

Comments: Presented at International Conference on Machine Learning Workshop AI for Science, 2022

arXiv:2205.09435 [pdf, other]

Adversarial random forests for density estimation and generative modeling

Authors: David S. Watson, Kristin Blesch, Jan Kapar, Marvin N. Wright

Abstract: We propose methods for density estimation and data synthesis using a novel form of unsupervised random forests. Inspired by generative adversarial networks, we implement a recursive procedure in which trees gradually learn structural properties of the data through alternating rounds of generation and discrimination. The method is provably consistent under minimal assumptions. Unlike classic tree-based alternatives, our approach provides smooth (un)conditional densities and allows for fully synthetic data generation. We achieve comparable or superior performance to state-of-the-art probabilistic circuits and deep learning models on various tabular data benchmarks while executing about two orders of magnitude faster on average. An accompanying $\texttt{R}$ package, $\texttt{arf}$, is available on $\texttt{CRAN}$.

Submitted 13 March, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

Comments: Camera ready version (AISTATS 2023)

Journal ref: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

arXiv:2205.05715 [pdf, other]

Causal discovery under a confounder blanket

Authors: David S. Watson, Ricardo Silva

Abstract: Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering the causal order among a subgraph of variables known to descend from some (possibly large) set of confounding covariates, i.e. a $\textit{confounder blanket}$. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing background information. Under a structural assumption called the $\textit{confounder blanket principle}$, which we argue is essential for tractable causal discovery in high dimensions, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We present a structure learning algorithm that is provably sound and complete with respect to a so-called $\textit{lazy oracle}$. We design inference procedures with finite sample error control for linear and nonlinear systems, and demonstrate our approach on a range of simulated and real-world datasets. An accompanying $\texttt{R}$ package, $\texttt{cbl}$, is available from $\texttt{CRAN}$.

Submitted 28 June, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

Comments: Camera ready version (UAI 2022)

Journal ref: 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022)

arXiv:2202.10806 [pdf, other]

Stochastic Causal Programming for Bounding Treatment Effects

Authors: Kirtan Padh, Jakob Zeitler, David Watson, Matt Kusner, Ricardo Silva, Niki Kilbertus

Abstract: Causal effect estimation is important for many tasks in the natural and social sciences. We design algorithms for the continuous partial identification problem: bounding the effects of multivariate, continuous treatments when unmeasured confounding makes identification impossible. Specifically, we cast causal effects as objective functions within a constrained optimization problem, and minimize/maximize these functions to obtain bounds. We combine flexible learning algorithms with Monte Carlo methods to implement a family of solutions under the name of stochastic causal programming. In particular, we show how the generic framework can be efficiently formulated in settings where auxiliary variables are clustered into pre-treatment and post-treatment sets, where no fine-grained causal graph can be easily specified. In these settings, we can avoid the need for fully specifying the distribution family of hidden common causes. Monte Carlo computation is also much simplified, leading to algorithms which are more computationally stable against alternatives.

Submitted 17 May, 2023; v1 submitted 22 February, 2022; originally announced February 2022.

Journal ref: Proceedings of Machine Learning Research vol 213:1-35, 2023

arXiv:2202.05830 [pdf, other]

Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality

Authors: Daniel Watson, William Chan, Jonathan Ho, Mohammad Norouzi

Abstract: Diffusion models have emerged as an expressive family of generative models rivaling GANs in sample quality and autoregressive models in likelihood scores. Standard diffusion models typically require hundreds of forward passes through the model to generate a single high-fidelity sample. We introduce Differentiable Diffusion Sampler Search (DDSS): a method that optimizes fast samplers for any pre-trained diffusion model by differentiating through sample quality scores. We also present Generalized Gaussian Diffusion Models (GGDM), a family of flexible non-Markovian samplers for diffusion models. We show that optimizing the degrees of freedom of GGDM samplers by maximizing sample quality scores via gradient descent leads to improved sample quality. Our optimization procedure backpropagates through the sampling process using the reparametrization trick and gradient rematerialization. DDSS achieves strong results on unconditional image generation across various datasets (e.g., FID scores on LSUN church 128x128 of 11.6 with only 10 inference steps, and 4.82 with 20 steps, compared to 51.1 and 14.9 with strongest DDPM/DDIM baselines). Our method is compatible with any pre-trained diffusion model without fine-tuning or re-training required.

Submitted 11 February, 2022; originally announced February 2022.

Comments: Published as a conference paper at ICLR 2022

arXiv:2201.01837 [pdf, other]

Frame Shift Prediction

Authors: Zheng-Xin Yong, Patrick D. Watson, Tiago Timponi Torrent, Oliver Czulo, Collin F. Baker

Abstract: Frame shift is a cross-linguistic phenomenon in translation which results in corresponding pairs of linguistic material evoking different frames. The ability to predict frame shifts enables automatic creation of multilingual FrameNets through annotation projection. Here, we propose the Frame Shift Prediction task and demonstrate that graph attention networks, combined with auxiliary training, can learn cross-linguistic frame-to-frame correspondence and predict frame shifts.

Submitted 5 January, 2022; originally announced January 2022.

arXiv:2112.05254 [pdf, other]

Addressing Deep Learning Model Uncertainty in Long-Range Climate Forecasting with Late Fusion

Authors: Ken C. L. Wong, Hongzhi Wang, Etienne E. Vos, Bianca Zadrozny, Campbell D. Watson, Tanveer Syeda-Mahmood

Abstract: Global warming leads to the increase in frequency and intensity of climate extremes that cause tremendous loss of lives and property. Accurate long-range climate prediction allows more time for preparation and disaster risk management for such extreme events. Although machine learning approaches have shown promising results in long-range climate forecasting, the associated model uncertainties may reduce their reliability. To address this issue, we propose a late fusion approach that systematically combines the predictions from multiple models to reduce the expected errors of the fused results. We also propose a network architecture with the novel denormalization layer to gain the benefits of data normalization without actually normalizing the data. The experimental results on long-range 2m temperature forecasting show that the framework outperforms the 30-year climate normals, and the accuracy can be improved by increasing the number of models.

Submitted 9 December, 2021; originally announced December 2021.

Comments: Accepted by the NeurIPS 2021 Workshop on Tackling Climate Change with Machine Learning

arXiv:2106.10191 [pdf, other]

doi 10.1145/3531146.3533170

Rational Shapley Values

Authors: David S. Watson

Abstract: Explaining the predictions of opaque machine learning algorithms is an important and challenging task, especially as complex models are increasingly used to assist in high-stakes decisions such as those arising in healthcare and finance. Most popular tools for post-hoc explainable artificial intelligence (XAI) are either insensitive to context (e.g., feature attributions) or difficult to summarize (e.g., counterfactuals). In this paper, I introduce $\textit{rational Shapley values}$, a novel XAI method that synthesizes and extends these seemingly incompatible approaches in a rigorous, flexible manner. I leverage tools from decision theory and causal modeling to formalize and implement a pragmatic approach that resolves a number of known challenges in XAI. By pairing the distribution of random variables with the appropriate reference class for a given explanation task, I illustrate through theory and experiments how user goals and knowledge can inform and constrain the solution set in an iterative fashion. The method compares favorably to state of the art XAI tools in a range of quantitative and qualitative comparisons.

Submitted 16 May, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: To be presented at the 2022 ACM FAccT Conference

Journal ref: 2022 ACM Conference on Fairness, Accountability, and Transparency

arXiv:2106.05074 [pdf, other]

Operationalizing Complex Causes: A Pragmatic View of Mediation

Authors: Limor Gultchin, David S. Watson, Matt J. Kusner, Ricardo Silva

Abstract: We examine the problem of causal response estimation for complex objects (e.g., text, images, genomics). In this setting, classical \emph{atomic} interventions are often not available (e.g., changes to characters, pixels, DNA base-pairs). Instead, we only have access to indirect or \emph{crude} interventions (e.g., enrolling in a writing program, modifying a scene, applying a gene therapy). In this work, we formalize this problem and provide an initial solution. Given a collection of candidate mediators, we propose (a) a two-step method for predicting the causal responses of crude interventions; and (b) a testing procedure to identify mediators of crude interventions. We demonstrate, on a range of simulated and real-world-inspired examples, that our approach allows us to efficiently estimate the effect of crude interventions with limited data from new treatment regimes.

Submitted 10 June, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

Journal ref: International Conference on Machine Learning 2021

arXiv:2106.03802 [pdf, other]

Learning to Efficiently Sample from Diffusion Probabilistic Models

Authors: Daniel Watson, Jonathan Ho, Mohammad Norouzi, William Chan

Abstract: Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a powerful family of generative models that can yield high-fidelity samples and competitive log-likelihoods across a range of domains, including image and speech synthesis. Key advantages of DDPMs include ease of training, in contrast to generative adversarial networks, and speed of generation, in contrast to autoregressive models. However, DDPMs typically require hundreds-to-thousands of steps to generate a high fidelity sample, making them prohibitively expensive for high dimensional problems. Fortunately, DDPMs allow trading generation speed for sample quality through adjusting the number of refinement steps as a post process. Prior work has been successful in improving generation speed through handcrafting the time schedule by trial and error. We instead view the selection of the inference time schedules as an optimization problem, and introduce an exact dynamic programming algorithm that finds the optimal discrete time schedules for any pre-trained DDPM. Our method exploits the fact that ELBO can be decomposed into separate KL terms, and given any computation budget, discovers the time schedule that maximizes the training ELBO exactly. Our method is efficient, has no hyper-parameters of its own, and can be applied to any pre-trained DDPM with no retraining. We discover inference time schedules requiring as few as 32 refinement steps, while sacrificing less than 0.1 bits per dimension compared to the default 4,000 steps used on ImageNet 64x64 [Ho et al., 2020; Nichol and Dhariwal, 2021].

Submitted 7 June, 2021; originally announced June 2021.
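
Because the objective in the abstract above decomposes into one term per jump, a fixed-budget schedule search can be written as a shortest-path dynamic program. The sketch below is a generic version of that idea under stated assumptions, not the paper's code: `cost[s][t]` is a hypothetical precomputed penalty for denoising directly from time `t` to an earlier time `s`, and the toy table used at the end is invented for illustration.

```python
import math


def best_schedule(cost, num_steps):
    """Cheapest path from time T down to 0 using exactly num_steps jumps.

    cost[s][t] is the penalty for jumping from t to s (only s < t is read).
    Returns (total_cost, schedule) with schedule = [T, ..., 0].
    """
    T = len(cost) - 1
    # best[k][t]: minimal cost of reaching time 0 from t in exactly k jumps.
    best = [[math.inf] * (T + 1) for _ in range(num_steps + 1)]
    back = [[None] * (T + 1) for _ in range(num_steps + 1)]
    best[0][0] = 0.0
    for k in range(1, num_steps + 1):
        for t in range(1, T + 1):
            for s in range(t):
                cand = cost[s][t] + best[k - 1][s]
                if cand < best[k][t]:
                    best[k][t] = cand
                    back[k][t] = s
    if math.isinf(best[num_steps][T]):
        raise ValueError("no schedule with that many steps")
    # Follow the back-pointers from (num_steps, T) down to time 0.
    schedule, t = [T], T
    for k in range(num_steps, 0, -1):
        t = back[k][t]
        schedule.append(t)
    return best[num_steps][T], schedule


# Toy cost table: longer jumps are penalised quadratically.
toy_cost = [[(t - s) ** 2 for t in range(5)] for s in range(5)]
total, schedule = best_schedule(toy_cost, num_steps=2)
```

The triple loop runs in O(K·T²) time for K steps and T discrete times, which is practical once the per-jump terms have been tabulated.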

arXiv:2103.14651 [pdf, other]

Local Explanations via Necessity and Sufficiency: Unifying Theory and Practice

Authors: David Watson, Limor Gultchin, Ankur Taly, Luciano Floridi

Abstract: Necessity and sufficiency are the building blocks of all successful explanations. Yet despite their importance, these notions have been conceptually underdeveloped and inconsistently applied in explainable artificial intelligence (XAI), a fast-growing research area that is so far lacking in firm theoretical foundations. Building on work in logic, probability, and causality, we establish the central role of necessity and sufficiency in XAI, unifying seemingly disparate methods in a single formal framework. We provide a sound and complete algorithm for computing explanatory factors with respect to a given context, and demonstrate its flexibility and competitive performance against state of the art alternatives on various tasks.

Submitted 10 June, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Journal ref: 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)

arXiv:2102.04534 [pdf, other]

A modular framework for extreme weather generation

Authors: Bianca Zadrozny, Campbell D. Watson, Daniela Szwarcman, Daniel Civitarese, Dario Oliveira, Eduardo Rodrigues, Jorge Guevara

Abstract: Extreme weather events have an enormous impact on society and are expected to become more frequent and severe with climate change. In this context, resilience planning becomes crucial for risk mitigation and coping with these extreme events. Machine learning techniques can play a critical role in resilience planning through the generation of realistic extreme weather event scenarios that can be used to evaluate possible mitigation actions. This paper proposes a modular framework that relies on interchangeable components to produce extreme weather event scenarios. We discuss possible alternatives for each of the components and show initial results comparing two approaches on the task of generating precipitation scenarios.

Submitted 5 February, 2021; originally announced February 2021.

arXiv:2012.12717 [pdf, ps, other]

The Complexity of Translationally Invariant Problems beyond Ground State Energies

Authors: James D. Watson, Johannes Bausch, Sevag Gharibian

Abstract: It is known that three fundamental questions regarding local Hamiltonians -- approximating the ground state energy (the Local Hamiltonian problem), simulating local measurements on the ground space (APX-SIM), and deciding if the low energy space has an energy barrier (GSCON) -- are $\mathsf{QMA}$-hard, $\mathsf{P}^{\mathsf{QMA}[log]}$-hard and $\mathsf{QCMA}$-hard, respectively, meaning they are likely intractable even on a quantum computer. Yet while hardness for the Local Hamiltonian problem is known to hold even for translationally-invariant systems, it is not yet known whether APX-SIM and GSCON remain hard in such "simple" systems. In this work, we show that the translationally invariant versions of both APX-SIM and GSCON remain intractable, namely are $\mathsf{P}^{\mathsf{QMA}_{\mathsf{EXP}}}$- and $\mathsf{QCMA}_{\mathsf{EXP}}$-complete, respectively. Each of these results is attained by giving a respective generic "lifting theorem" for producing hardness results. For APX-SIM, for example, we give a framework for "lifting" any abstract local circuit-to-Hamiltonian mapping $H$ (satisfying mild assumptions) to hardness of APX-SIM on the family of Hamiltonians produced by $H$, while preserving the structural and geometric properties of $H$ (e.g. translation invariance, geometry, locality, etc). Each result also leverages counterintuitive properties of our constructions: for APX-SIM, we "compress" the answers to polynomially many parallel queries to a QMA oracle into a single qubit. For GSCON, we give a hardness construction robust against highly non-local unitaries, i.e. even if the adversary acts on all but one qudit in the system in each step.

Submitted 23 December, 2020; originally announced December 2020.

Comments: 58 pages, 4 figures

arXiv:2005.08792 [pdf, other]

Causal Feature Learning for Utility-Maximizing Agents

Authors: David Kinney, David Watson

Abstract: Discovering high-level causal relations from low-level data is an important and challenging problem that comes up frequently in the natural and social sciences. In a series of papers, Chalupka et al. (2015, 2016a, 2016b, 2017) develop a procedure for causal feature learning (CFL) in an effort to automate this task. We argue that CFL does not recommend coarsening in cases where pragmatic considerations rule in favor of it, and recommends coarsening in cases where pragmatic considerations rule against it. We propose a new technique, pragmatic causal feature learning (PCFL), which extends the original CFL algorithm in useful and intuitive ways. We show that PCFL has the same attractive measure-theoretic properties as the original CFL algorithm. We compare the performance of both methods through theoretical analysis and experiments.

Submitted 27 August, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: Forthcoming in the Proceedings of the 10th International Conference on Probabilistic Graphical Models

arXiv:1904.01551 [pdf]

A Review of Critical Infrastructure Protection Approaches: Improving Security through Responsiveness to the Dynamic Modelling Landscape

Authors: Uchenna D Ani, Jeremy D McK. Watson, Jason R. C. Nurse, Al Cook, Carsten Maple

Abstract: As new technologies such as the Internet of Things (IoT) are integrated into Critical National Infrastructures (CNI), new cybersecurity threats emerge that require specific security solutions. Approaches used for analysis include the modelling and simulation of critical infrastructure systems using attributes, functionalities, operations, and behaviours to support various security analysis viewpoints, recognising and appropriately managing associated security risks. With several critical infrastructure protection approaches available, the question of how to effectively model the complex behaviour of interconnected CNI elements and to configure their protection as a system-of-systems remains a challenge. Using a systematic review approach, existing critical infrastructure protection approaches (tools and techniques) are examined to determine their suitability given trends like IoT, and effective security modelling and analysis issues. It is found that empirical-based, agent-based, system dynamics-based, and network-based modelling are more commonly applied than economic-based and equation-based techniques, and empirical-based modelling is the most widely used. The energy and transportation critical infrastructure sectors reflect the most responsive sectors, and no one Critical Infrastructure Protection (CIP) approach - tool, technique, methodology or framework - provides a fit-for-all capacity for all-round attribute modelling and simulation of security risks. Typically, deciding factors for CIP choices to adopt are often dominated by trade-offs between complexity of use and popularity of approach, as well as between specificity and generality of application in sectors.

Submitted 2 April, 2019; originally announced April 2019.

Comments: PETRAS/IET Conference Living in the Internet of Things: Cybersecurity of the IoT 2019

arXiv:1901.09917 [pdf, other]

Testing Conditional Independence in Supervised Learning Algorithms

Authors: David S. Watson, Marvin N. Wright

Abstract: We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.

Submitted 13 May, 2021; v1 submitted 28 January, 2019; originally announced January 2019.

arXiv:1811.03416 [pdf]

doi 10.1177/2053951719842540

Are the Dead Taking Over Facebook? A Big Data Approach to the Future of Death Online

Authors: Carl Öhman, David Watson

Abstract: We project the future accumulation of profiles belonging to deceased Facebook users. Our analysis suggests that a minimum of 1.4 billion users will pass away before 2100 if Facebook ceases to attract new users as of 2018. If the network continues expanding at current rates, however, this number will exceed 4.9 billion. In both cases, a majority of the profiles will belong to non-Western users. In discussing our findings, we draw on the emerging scholarship on digital preservation and stress the challenges arising from curating the profiles of the deceased. We argue that an exclusively commercial approach to data preservation poses important ethical and political risks that demand urgent consideration. We call for a scalable, sustainable, and dignified curation model that incorporates the interests of multiple stakeholders.

Submitted 6 May, 2019; v1 submitted 30 October, 2018; originally announced November 2018.

Comments: 22 pages, 4 figures. Big Data & Society (2019)

arXiv:1809.01534 [pdf, other]

Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models

Authors: Daniel Watson, Nasser Zalmout, Nizar Habash

Abstract: Text normalization is an important enabling technology for several NLP tasks. Recently, neural-network-based approaches have outperformed well-established models in this task. However, in languages other than English, there has been little exploration in this direction. Both the scarcity of annotated data and the complexity of the language increase the difficulty of the problem. To address these challenges, we use a sequence-to-sequence model with character-based attention, which in addition to its self-learned character embeddings, uses word embeddings pre-trained with an approach that also models subword information. This provides the neural model with access to more linguistic information especially suitable for text normalization, without large parallel corpora. We show that providing the model with word-level features bridges the gap for the neural network approach to achieve a state-of-the-art F1 score on a standard Arabic language correction shared task dataset.

Submitted 5 September, 2018; originally announced September 2018.

Comments: Accepted in EMNLP 2018

ACM Class: I.2.6

arXiv:1610.09485 [pdf]

doi 10.1007/s11229-016-1238-2

Crowdsourced science: sociotechnical epistemology in the e-research paradigm

Authors: David Watson, Luciano Floridi

Abstract: Recent years have seen a surge in online collaboration between experts and amateurs on scientific research. In this article, we analyse the epistemological implications of these crowdsourced projects, with a focus on Zooniverse, the world's largest citizen science web portal. We use quantitative methods to evaluate the platform's success in producing large volumes of observation statements and high impact scientific discoveries relative to more conventional means of data processing. Through empirical evidence, Bayesian reasoning, and conceptual analysis, we show how information and communication technologies enhance the reliability, scalability, and connectivity of crowdsourced e-research, giving online citizen science projects powerful epistemic advantages over more traditional modes of scientific investigation. These results highlight the essential role played by technologically mediated social interaction in contemporary knowledge production. We conclude by calling for an explicitly sociotechnical turn in the philosophy of science that combines insights from statistics and logic to analyse the latest developments in scientific research.

Submitted 29 October, 2016; originally announced October 2016.

Comments: Synthese, October 2016

arXiv:1601.00210 [pdf]

doi 10.1109/BIBE.2008.4696789

Susceptibility of texture measures to noise: an application to lung tumor CT images

Authors: O. S. Al-Kadi, D. Watson

Abstract: Five different texture methods are used to investigate their susceptibility to subtle noise occurring in lung tumor Computed Tomography (CT) images caused by acquisition and reconstruction deficiencies. Noise of Gaussian and Rayleigh distributions with varying mean and variance was encountered in the analyzed CT images. Fisher and Bhattacharyya distance measures were used to differentiate between an original extracted lung tumor region of interest (ROI) and its filtered and noisy reconstructed versions. Through examining the texture characteristics of the lung tumor areas by five different texture measures, it was determined that the autocovariance measure was least affected and the gray level co-occurrence matrix was the most affected by noise. Depending on the selected ROI size, it was concluded that the number of extracted features from each texture measure increases susceptibility to noise.

Submitted 2 January, 2016; originally announced January 2016.

Comments: 8th International Conference on BioInformatics and BioEngineering, Greece, 2008

arXiv:cs/0604019 [pdf, ps, other]

The Case for Modeling Security, Privacy, Usability and Reliability (SPUR) in Automotive Software

Authors: K. Venkatesh Prasad, TJ Giuli, David Watson

Abstract: Over the past five years, there has been considerable growth and established value in the practice of modeling automotive software requirements. Much of this growth has been centered on requirements of software associated with the established functional areas of an automobile, such as those associated with powertrain, chassis, body, safety and infotainment. This paper makes a case for modeling four additional attributes that are increasingly important as vehicles become information conduits: security, privacy, usability, and reliability. These four attributes are important in creating specifications for embedded in-vehicle automotive software.

Submitted 6 April, 2006; originally announced April 2006.

Comments: 12 pages, 3 figures, presented at the 2006 Automotive Software Workshop, San Diego, CA

ACM Class: D.2.4; K.4.1; H.5.2; K.6.5

Showing 1–40 of 40 results for author: Watson, D