Showing 1–50 of 63 results for author: Arushi

  1. arXiv:2407.03451  [pdf, other]

    cs.CR cs.HC

    The Role of Privacy Guarantees in Voluntary Donation of Private Data for Altruistic Goals

    Authors: Ruizhe Wang, Roberta De Viti, Aarushi Dubey, Elissa M. Redmiles

    Abstract: Voluntary donation of private information for altruistic purposes, such as advancing research, is common. However, concerns about data misuse and leakage may deter individuals from donating their information. While prior research has indicated that Privacy Enhancement Technologies (PETs) can alleviate these concerns, the extent to which these techniques influence willingness to donate data remains…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2406.00314  [pdf, other]

    cs.CL cs.AI cs.LG

    CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models

    Authors: Sarthak Harne, Monjoy Narayan Choudhury, Madhav Rao, TK Srikanth, Seema Mehrotra, Apoorva Vashisht, Aarushi Basu, Manjit Sodhi

    Abstract: The limited availability of psychologists necessitates efficient identification of individuals requiring urgent mental healthcare. This study explores the use of Natural Language Processing (NLP) pipelines to analyze text data from online mental health forums used for consultations. By analyzing forum posts, these pipelines can flag users who may require immediate professional attention. A crucial…

    Submitted 16 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  3. arXiv:2405.15152  [pdf, other]

    cs.CL cs.AI

    Machine Unlearning in Large Language Models

    Authors: Saaketh Koundinya Gundavarapu, Shreya Agarwal, Arushi Arora, Chandana Thimmalapura Jagadeeshaiah

    Abstract: Machine unlearning, a novel area within artificial intelligence, focuses on addressing the challenge of selectively forgetting or reducing undesirable knowledge or behaviors in machine learning models, particularly in the context of large language models (LLMs). This paper introduces a methodology to align LLMs, such as Open Pre-trained Transformer Language Models, with ethical, privacy, and safet…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 10 pages

  4. arXiv:2405.12842  [pdf, other]

    cs.RO cs.CV

    SmartFlow: Robotic Process Automation using LLMs

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Lovekesh Vig, Gautam Shroff

    Abstract: Robotic Process Automation (RPA) systems face challenges in handling complex processes and diverse screen layouts that require advanced human-like decision-making capabilities. These systems typically rely on pixel-level encoding through drag-and-drop or automation frameworks such as Selenium to create navigation workflows, rather than visual understanding of screen elements. In this context, we p…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 32nd ACM International Conference on Information and Knowledge Management

  5. arXiv:2405.12742  [pdf, other]

    cs.CV

    Multi-Subject Personalization

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Creative story illustration requires a consistent interplay of multiple characters or objects. However, conventional text-to-image models face significant challenges while producing images featuring multiple personalized subjects. For example, they distort the subject rendering, or the text descriptions fail to render coherent subject interactions. We present Multi-Subject Personalization (MSP) to…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 2023 Conference on Neural Information Processing Systems

  6. arXiv:2405.12531  [pdf, other]

    cs.CV cs.LG

    CustomText: Customized Textual Image Generation using Diffusion Models

    Authors: Shubham Paliwal, Arushi Jain, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding. Despite recent strides in language-guided image synthesis using diffusion models, current models excel in image generation but struggle with accurate text rendering and offer limited control over font attributes. In this paper, we aim to enhance the s…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024

  7. arXiv:2405.07838  [pdf, other]

    cs.LG cs.AI

    Adaptive Exploration for Data-Efficient General Value Function Evaluations

    Authors: Arushi Jain, Josiah P. Hanna, Doina Precup

    Abstract: General Value Functions (GVFs) (Sutton et al., 2011) are an established way to represent predictive knowledge in reinforcement learning. Each GVF computes the expected return for a given policy, based on a unique pseudo-reward. Multiple GVFs can be estimated in parallel using off-policy learning from a single stream of data, often sourced from a fixed behavior policy or pre-collected dataset. This…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 20 pages, 9 figures, Under Review

  8. arXiv:2405.07284  [pdf]

    cs.CV cs.AI

    Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP)

    Authors: Saaketh Koundinya Gundavarapu, Arushi Arora, Shreya Agarwal

    Abstract: We present SLIP (SAM+CLIP), an enhanced architecture for zero-shot object segmentation. SLIP combines the Segment Anything Model (SAM) (Kirillov et al., 2023) with the Contrastive Language-Image Pretraining (CLIP) (Radford et al., 2021). By incorporating text prompts into SAM using CLIP, SLIP enables object segmentation without prior training on specific classes or categories. We fine-tune…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures

  9. arXiv:2404.08156  [pdf, other]

    cs.CL

    Multimodal Contextual Dialogue Breakdown Detection for Conversational AI Models

    Authors: Md Messal Monem Miah, Ulie Schnaithmann, Arushi Raghuvanshi, Youngseo Son

    Abstract: Detecting dialogue breakdown in real time is critical for conversational AI systems, because it enables taking corrective action to successfully complete a task. In spoken dialog systems, this breakdown can be caused by a variety of unexpected situations including high levels of background noise, causing STT mistranscriptions, or unexpected user flows. In particular, industry settings like healthc…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Published in NAACL 2024 Industry Track

  10. arXiv:2404.08155  [pdf, other]

    cs.CL

    Graph Integrated Language Transformers for Next Action Prediction in Complex Phone Calls

    Authors: Amin Hosseiny Marani, Ulie Schnaithmann, Youngseo Son, Akil Iyer, Manas Paldhe, Arushi Raghuvanshi

    Abstract: Current Conversational AI systems employ different machine learning pipelines, as well as external knowledge sources and business logic to predict the next action. Maintaining various components in dialogue managers' pipeline adds complexity in expansion and updates, increases processing time, and causes additive noise through the pipeline that can lead to incorrect next action prediction. This pa…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Published in NAACL 2024 Industry Track

  11. arXiv:2404.07616  [pdf, other]

    cs.CL cs.SD eess.AS

    Audio Dialogues: Dialogues dataset for audio and music understanding

    Authors: Arushi Goel, Zhifeng Kong, Rafael Valle, Bryan Catanzaro

    Abstract: Existing datasets for audio understanding primarily focus on single-turn interactions (i.e. audio captioning, audio question answering) for describing audio in natural language, thus limiting the understanding of audio via interactive dialogue. To address this gap, we introduce Audio Dialogues: a multi-turn dialogue dataset containing 163.8k samples for general audio sounds and music. In addition to dial…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Demo website: https://audiodialogues.github.io/

  12. arXiv:2403.17225  [pdf]

    cs.HC cs.CR cs.CY

    Measuring Compliance with the California Consumer Privacy Act Over Space and Time

    Authors: Van Tran, Aarushi Mehrotra, Marshini Chetty, Nick Feamster, Jens Frankenreiter, Lior Strahilevitz

    Abstract: The widespread sharing of consumers' personal information with third parties raises significant privacy concerns. The California Consumer Privacy Act (CCPA) mandates that online businesses offer consumers the option to opt out of the sale and sharing of personal information. Our study automatically tracks the presence of the opt-out link longitudinally across multiple states after the California Pr…

    Submitted 25 March, 2024; originally announced March 2024.

  13. arXiv:2402.01831  [pdf, other]

    cs.SD cs.LG eess.AS

    Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

    Authors: Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro

    Abstract: Augmenting large language models (LLMs) to understand audio -- including non-speech sounds and non-verbal speech -- is critically important for diverse real-world applications of LLMs. In this paper, we propose Audio Flamingo, a novel audio language model with 1) strong audio understanding abilities, 2) the ability to quickly adapt to unseen tasks via in-context learning and retrieval, and 3) stro…

    Submitted 28 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  14. arXiv:2311.05779  [pdf, other]

    cs.RO cs.CV

    Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter

    Authors: Georgios Tziafas, Yucheng Xu, Arushi Goel, Mohammadreza Kasaei, Zhibin Li, Hamidreza Kasaei

    Abstract: Robots operating in human-centric environments require the integration of visual grounding and grasping capabilities to effectively manipulate objects based on user instructions. This work focuses on the task of referring grasp synthesis, which predicts a grasp pose for an object referred through natural language in cluttered scenes. Existing approaches often employ multi-stage pipelines that firs…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Poster CoRL 2023. Dataset and code available here: https://github.com/gtziafas/OCID-VLG

  15. arXiv:2310.17567  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

    Authors: Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora

    Abstract: With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents, how should LLM evaluations change? Arguably, a key ability of an AI agent is to flexibly combine, as needed, the basic skills it has learned. The capability to combine skills plays an important role in (human) pedagogy and also in a paper on emergence phenomena (Arora & Goyal, 2023). This…

    Submitted 26 October, 2023; originally announced October 2023.

  16. arXiv:2310.13619  [pdf, other]

    cs.CL cs.CV

    Semi-supervised multimodal coreference resolution in image narrations

    Authors: Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

    Abstract: In this paper, we study multimodal coreference resolution, specifically where a longer descriptive text, i.e., a narration, is paired with an image. This poses significant challenges due to fine-grained image-text alignment, inherent ambiguity present in narrative language, and unavailability of large annotated training sets. To tackle these challenges, we present a data efficient semi-supervised a…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Long paper at EMNLP'23-Main

  17. arXiv:2310.07093  [pdf, other]

    cs.CL

    Argumentative Stance Prediction: An Exploratory Study on Multimodality and Few-Shot Learning

    Authors: Arushi Sharma, Abhibha Gupta, Maneesh Bilalpur

    Abstract: To advance argumentative stance prediction as a multimodal problem, the First Shared Task in Multimodal Argument Mining hosted stance prediction in crucial social topics of gun control and abortion. Our exploratory study attempts to evaluate the necessity of images for stance prediction in tweets and compare out-of-the-box text-based large language models (LLMs) in few-shot settings against fine-tu…

    Submitted 10 October, 2023; originally announced October 2023.

  18. arXiv:2309.11580  [pdf, other]

    cs.RO

    A real-time, hardware agnostic framework for close-up branch reconstruction using RGB data

    Authors: Alexander You, Aarushi Mehta, Luke Strohbehn, Jochen Hemming, Cindy Grimm, Joseph R. Davidson

    Abstract: Creating accurate 3D models of tree topology is an important task for tree pruning. The 3D model is used to decide which branches to prune and then to execute the pruning cuts. Previous methods for creating 3D tree models have typically relied on point clouds, which are often computationally expensive to process and can suffer from data defects, especially with thin branches. In this paper, we pro…

    Submitted 18 June, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

  19. arXiv:2309.08687  [pdf, other]

    cs.DC physics.plasm-ph

    Speeding up charge exchange recombination spectroscopy analysis in support of NERSC/DIII-D realtime workflow

    Authors: Aarushi Jain, Laurie Stephey, Erik Linsenmayer, Colin Chrystal, Jonathan Dursi, Hannah Ross

    Abstract: We report optimization work made in support of the development of a realtime Superfacility workflow between DIII-D and NERSC. At DIII-D, the ion properties measured by charge exchange recombination (CER) spectroscopy are required inputs for a Superfacility realtime workflow that computes the full plasma kinetic equilibrium. In this workflow, minutes matter since the results must be ready during th…

    Submitted 18 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 8 pages, 5 figures. Not a preprint: this work was rejected from a conference proceedings, so arXiv will hopefully be the final home. Updated to add arXiv link/DOI to header of paper.

  20. arXiv:2308.02748  [pdf]

    cs.CV

    Discrimination of Radiologists Utilizing Eye-Tracking Technology and Machine Learning: A Case Study

    Authors: Stanford Martinez, Carolina Ramirez-Tamayo, Syed Hasib Akhter Faruqui, Kal L. Clark, Adel Alaeddini, Nicholas Czarnek, Aarushi Aggarwal, Sahra Emamzadeh, Jeffrey R. Mock, Edward J. Golob

    Abstract: Perception-related errors comprise most diagnostic mistakes in radiology. To mitigate this problem, radiologists employ personalized and high-dimensional visual search strategies, otherwise known as search patterns. Qualitative descriptions of these search patterns, which involve the physician verbalizing or annotating the order in which he/she analyzes the image, can be unreliable due to discrepancies in…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: Submitting for Review in "IEEE Journal of Biomedical and Health Informatics"

  21. arXiv:2307.16382  [pdf, other]

    cs.LG cs.CL

    Does fine-tuning GPT-3 with the OpenAI API leak personally-identifiable information?

    Authors: Albert Yu Sun, Eliott Zemour, Arushi Saxena, Udith Vaidyanathan, Eric Lin, Christian Lau, Vaikkunth Mugunthan

    Abstract: Machine learning practitioners often fine-tune generative pre-trained models like GPT-3 to improve model performance at specific tasks. Previous works, however, suggest that fine-tuned machine learning models memorize and emit sensitive information from the original fine-tuning dataset. Companies such as OpenAI offer fine-tuning services for their models, but no prior work has conducted a memoriza…

    Submitted 15 April, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

  22. arXiv:2306.09224  [pdf, other]

    cs.CV

    Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories

    Authors: Thomas Mensink, Jasper Uijlings, Lluis Castrejon, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araujo, Vittorio Ferrari

    Abstract: We propose Encyclopedic-VQA, a large scale visual question answering (VQA) dataset featuring visual questions about detailed properties of fine-grained categories and instances. It contains 221k unique question+answer pairs each matched with (up to) 5 images, resulting in a total of 1M VQA samples. Moreover, our dataset comes with a controlled knowledge base derived from Wikipedia, marking the evi…

    Submitted 24 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: ICCV'23

  23. arXiv:2306.03959  [pdf, other]

    cs.CL cs.IR

    Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction

    Authors: Julia White, Arushi Raghuvanshi, Yada Pruksachatkun

    Abstract: Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests. While large language models have found success automating these dialogues in constrained environments, their widespread deployment is limited by the substantial quantities of task-specific data required for training. The following paper presents a data-efficient solution to construc…

    Submitted 6 June, 2023; originally announced June 2023.

  24. arXiv:2305.17552  [pdf, other]

    cs.LG math.OC

    Online Nonstochastic Model-Free Reinforcement Learning

    Authors: Udaya Ghai, Arushi Gupta, Wenhan Xia, Karan Singh, Elad Hazan

    Abstract: We investigate robust model-free reinforcement learning algorithms designed for environments that may be dynamic or even adversarial. Traditional state-based policies often struggle to accommodate the challenges imposed by the presence of unmodeled disturbances in such settings. Moreover, optimizing linear state-based policies poses an obstacle for efficient optimization, leading to nonconvex objec…

    Submitted 31 October, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: Camera-ready version for NeurIPS 2023

  25. arXiv:2305.00875  [pdf, other]

    cs.SE cs.AI cs.LG

    Redundancy and Concept Analysis for Code-trained Language Models

    Authors: Arushi Sharma, Zefu Hu, Christopher Quinn, Ali Jannesari

    Abstract: Code-trained language models have proven to be highly effective for various code intelligence tasks. However, they can be challenging to train and deploy for many software engineering applications due to computational bottlenecks and memory constraints. Implementing effective strategies to address these issues requires a better understanding of these 'black box' models. In this paper, we perform t…

    Submitted 15 February, 2024; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: 4 figures, 6 tables

  26. arXiv:2303.09608  [pdf, other]

    cs.CV

    VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection

    Authors: Arushi Rai, Adriana Kovashka

    Abstract: The use of large-scale vision-language datasets is limited for object detection due to the negative impact of label noise on localization. Prior methods have shown how such large-scale datasets can be used for pretraining, which can provide initial signal for localization, but is insufficient without clean bounding-box data for at least some categories. We propose a technique to "vet" labels extra…

    Submitted 10 March, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL) 2024 camera-ready

  27. arXiv:2303.05323  [pdf, other]

    cs.CV

    Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE

    Authors: Yucheng Xu, Li Nanbo, Arushi Goel, Zijian Guo, Zonghai Yao, Hamidreza Kasaei, Mohammadreza Kasaei, Zhibin Li

    Abstract: Videos depict the change of complex dynamical systems over time in the form of discrete image sequences. Generating controllable videos by learning the dynamical system is an important yet underexplored topic in the computer vision community. This paper presents a novel framework, TiV-ODE, to generate highly controllable videos from a static image and a text caption. Specifically, our framework le…

    Submitted 4 April, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  28. arXiv:2301.02242  [pdf, other]

    q-bio.GN cs.LG

    Graph Contrastive Learning for Multi-omics Data

    Authors: Nishant Rajadhyaksha, Aarushi Chitkara

    Abstract: Advancements in technologies related to working with omics data require novel computation methods to fully leverage information and help develop a better understanding of human diseases. This paper studies the effects of introducing graph contrastive learning to help leverage graph structure and information to produce better representations for downstream classification tasks for multi-omics datas…

    Submitted 3 January, 2023; originally announced January 2023.

  29. arXiv:2211.14563  [pdf, other]

    cs.CV cs.CL

    Who are you referring to? Coreference resolution in image narrations

    Authors: Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

    Abstract: Coreference resolution aims to identify words and phrases which refer to the same entity in a text, a core task in natural language processing. In this paper, we extend this task to resolving coreferences in long-form narrations of visual scenes. First we introduce a new dataset with annotated coreference chains and their bounding boxes, as most existing image-text datasets only contain short sentence…

    Submitted 17 March, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

    Comments: 15 pages

  30. arXiv:2211.02912  [pdf, other]

    stat.ML cs.LG

    New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

    Authors: Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

    Abstract: Saliency methods compute heat maps that highlight portions of an input that were most important for the label assigned to it by a deep net. Evaluations of saliency methods convert this heat map into a new masked input by retaining the $k$ highest-ranked pixels of the original input and replacing the rest with "uninformative" pixels, and checking if th…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 (Oral)

  31. arXiv:2210.06257  [pdf, other]

    cs.CV cs.LG eess.IV

    What can we learn about a generated image corrupting its latent representation?

    Authors: Agnieszka Tomczak, Aarushi Gupta, Slobodan Ilic, Nassir Navab, Shadi Albarqouni

    Abstract: Generative adversarial networks (GANs) offer an effective solution to the image-to-image translation problem, thereby allowing for new possibilities in medical imaging. They can translate images from one imaging modality to another at a low cost. For unpaired datasets, they rely mostly on cycle loss. Despite its effectiveness in learning the underlying data distribution, it can lead to a discrepan…

    Submitted 12 October, 2022; originally announced October 2022.

  32. arXiv:2210.01072  [pdf, other]

    cs.LG cs.AI

    Understanding Influence Functions and Datamodels via Harmonic Analysis

    Authors: Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora

    Abstract: Influence functions estimate the effect of individual data points on predictions of the model on test data and were adapted to deep learning in Koh and Liang [2017]. They have been used for detecting data poisoning, detecting helpful and harmful examples, influence of groups of datapoints, etc. Recently, Ilyas et al. [2022] introduced a linear regression method they termed datamodels to predict the ef…

    Submitted 3 October, 2022; originally announced October 2022.

  33. arXiv:2208.11388  [pdf, other]

    cs.CV

    WiCV 2022: The Tenth Women In Computer Vision Workshop

    Authors: Doris Antensteiner, Silvia Bucci, Arushi Goel, Marah Halawa, Niveditha Kalavakonda, Tejaswi Kasarla, Miaomiao Liu, Nermin Samet, Ivaxi Sheth

    Abstract: In this paper, we present the details of Women in Computer Vision Workshop - WiCV 2022, organized alongside the hybrid CVPR 2022 in New Orleans, Louisiana. It provides a voice to a minority (female) group in the computer vision community and focuses on increasing the visibility of these researchers, both in academia and industry. WiCV believes that such an event can play an important role in lower…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Report on WiCV Workshop at CVPR 2022. arXiv admin note: substantial text overlap with arXiv:2203.05825, arXiv:2101.03787

  34. arXiv:2208.01041  [pdf]

    eess.AS cs.HC cs.LG cs.MM cs.SD

    Voice Analysis for Stress Detection and Application in Virtual Reality to Improve Public Speaking in Real-time: A Review

    Authors: Arushi, Roberto Dillon, Ai Ni Teoh, Denise Dillon

    Abstract: Stress during public speaking is common and adversely affects performance and self-confidence. Extensive research has been carried out to develop various models to recognize emotional states. However, minimal research has been conducted to detect stress during public speaking in real time using voice analysis. In this context, the current review showed that the application of algorithms was not pr…

    Submitted 31 July, 2022; originally announced August 2022.

    Comments: 41 pages, 7 figures, 4 tables

    ACM Class: I.6; K.3; K.4; A.2

  35. arXiv:2208.00235  [pdf]

    cs.CR

    'PeriHack': Designing a Serious Game for Cybersecurity Awareness

    Authors: Roberto Dillon, Arushi

    Abstract: This paper describes the design process for the cybersecurity serious game 'PeriHack'. Publicly released under a CC (BY-NC-SA) license, PeriHack is a board and card game for two players or teams that simulates the struggle between a red team (attackers) and a blue team (defenders). The game requires players to explore a sample network looking for vulnerabilities and then chain different attacks to…

    Submitted 30 July, 2022; originally announced August 2022.

    Comments: 5 pages, 6 figures, 2 tables. For associated files see https://github.com/rdillon73/PeriHack

  36. arXiv:2204.05176  [pdf, other]

    cs.LG cs.AI

    Towards Painless Policy Optimization for Constrained MDPs

    Authors: Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup

    Abstract: We study policy optimization in an infinite horizon, $γ$-discounted constrained Markov decision process (CMDP). Our objective is to return a policy that achieves large expected reward with a small constraint violation. We consider the online setting with linear function approximation and assume global access to the corresponding features. We propose a generic primal-dual framework that allows us t…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: Paper under submission. 27 pages, 12 figures

  37. arXiv:2203.06873  [pdf, other]

    cs.CV

    TSR-DSAW: Table Structure Recognition via Deep Spatial Association of Words

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Lovekesh Vig

    Abstract: Existing methods for Table Structure Recognition (TSR) from camera-captured or scanned documents perform poorly on complex tables consisting of nested rows / columns, multi-line texts and missing cell data. This is because current data-driven methods work by simply training deep models on large volumes of data and fail to generalize when an unseen table structure is encountered. In this paper, we…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 6 pages, 1 figure, 1 table, ESANN 2021 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Online event, 6-8 October 2021, i6doc.com publ., ISBN 978287587082-7

    Journal ref: In ESANN 2021 proceedings, pages 257-262

  38. arXiv:2203.05825  [pdf, other]

    cs.CV

    WiCV 2021: The Eighth Women In Computer Vision Workshop

    Authors: Arushi Goel, Niveditha Kalavakonda, Nour Karessli, Tejaswi Kasarla, Kathryn Leonard, Boyi Li, Nermin Samet, Ghada Zamzmi

    Abstract: In this paper, we present the details of Women in Computer Vision Workshop - WiCV 2021, organized alongside the virtual CVPR 2021. It provides a voice to a minority (female) group in the computer vision community and focuses on increasing the visibility of these researchers, both in academia and industry. WiCV believes that such an event can play an important role in lowering the gender imbalance…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Report on WiCV Workshop at CVPR 2021. arXiv admin note: substantial text overlap with arXiv:2101.03787

  39. arXiv:2201.10836  [pdf, other]

    cs.CV

    PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels

    Authors: Arushi Goel, Yunlong Jiao, Jordan Massiah

    Abstract: Acquiring accurate labels on large-scale datasets is both time consuming and expensive. To reduce the dependency of deep learning models on learning from clean labeled data, several recent research efforts are focused on learning with noisy labels. These methods typically fall into three design categories to learn a noise robust model: sample selection approaches, noise robust loss functions, or l…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 16 pages

  40. arXiv:2111.14212  [pdf, other]

    cs.LG

    On Predicting Generalization using GANs

    Authors: Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora

    Abstract: Research on generalization bounds for deep networks seeks to give ways to predict test error using just the training dataset and the network parameters. While generalization bounds can give many insights about architecture design, training algorithms, etc., what they do not currently do is yield good predictions for actual test error. A recently introduced Predicting Generalization in Deep Learnin…

    Submitted 17 March, 2022; v1 submitted 28 November, 2021; originally announced November 2021.

  41. arXiv:2111.13517  [pdf, other]

    cs.CV

    Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation

    Authors: Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

    Abstract: Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects, which is essential for full scene understanding. Existing SGG methods trained on the entire set of relations fail to acquire complex reasoning about visual and textual correlations due to various biases in training data. Learning on trivial relations that indicate generic spatial configuration lik…

    Submitted 4 April, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: 16 pages

    Journal ref: CVPR 2022

  42. arXiv:2110.09393  [pdf]

    cs.CL cs.AI

    Ceasing hate with MoH: Hate Speech Detection in Hindi-English Code-Switched Language

    Authors: Arushi Sharma, Anubha Kabra, Minni Jain

    Abstract: Social media has become a bedrock for people to voice their opinions worldwide. Due to the greater sense of freedom with the anonymity feature, it is possible to disregard social etiquette online and attack others without facing severe consequences, inevitably propagating hate speech. The current measures to sift the online content and offset the hatred spread do not go far enough. One factor cont…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: Accepted in Elsevier Journal of Information Processing and Management. Sharma and Kabra made equal contribution

  43. Digitize-PID: Automatic Digitization of Piping and Instrumentation Diagrams

    Authors: Shubham Paliwal, Arushi Jain, Monika Sharma, Lovekesh Vig

    Abstract: Digitization of scanned Piping and Instrumentation diagrams (P&ID), widely used in manufacturing or mechanical industries such as oil and gas over several decades, has become a critical bottleneck in dynamic inventory management and creation of smart P&IDs that are compatible with the latest CAD tools. Historically, P&ID sheets have been manually generated at the design stage, before being scanned…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: 13 pages

    Journal ref: Trends and Applications in Knowledge Discovery and Data Mining. 168-180, PAKDD 2021

  44. arXiv:2106.15615  [pdf, other]

    cs.LG cs.AI

    A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning

    Authors: Nikunj Saunshi, Arushi Gupta, Wei Hu

    Abstract: An effective approach in meta-learning is to utilize multiple "train tasks" to learn a good initialization for model parameters that can help solve unseen "test tasks" with very few samples by fine-tuning from this initialization. Although successful in practice, theoretical understanding of such methods is limited. This work studies an important aspect of these methods: splitting the data from ea…

    Submitted 29 June, 2021; originally announced June 2021.

    Comments: In proceedings of ICML 2021

  45. arXiv:2105.03494  [pdf, other]

    cs.CV

    The iWildCam 2021 Competition Dataset

    Authors: Sara Beery, Arushi Agarwal, Elijah Cole, Vighnesh Birodkar

    Abstract: Camera traps enable the automatic collection of large quantities of image data. Ecologists use camera traps to monitor animal populations all over the world. In order to estimate the abundance of a species from camera trap data, ecologists need to know not just which species were seen, but also how many individuals of each species were seen. Object detection techniques can be used to find the numb…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: FGVC8 Workshop at CVPR 2021. arXiv admin note: substantial text overlap with arXiv:2004.10340

  46. arXiv:2102.01985  [pdf, other]

    cs.LG cs.AI

    Variance Penalized On-Policy and Off-Policy Actor-Critic

    Authors: Arushi Jain, Gandharv Patil, Ayush Jain, Khimya Khetarpal, Doina Precup

    Abstract: Reinforcement learning algorithms are typically geared towards optimizing the expected return of an agent. However, in many practical applications, low variance in the return is desired to ensure the reliability of an algorithm. In this paper, we propose on-policy and off-policy actor-critic algorithms that optimize a performance criterion involving both mean and variance in the return. Previous w…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021

  47. DeepGamble: Towards unlocking real-time player intelligence using multi-layer instance segmentation and attribute detection

    Authors: Danish Syed, Naman Gandhi, Arushi Arora, Nilesh Kadam

    Abstract: Annually the gaming industry spends approximately $15 billion in marketing reinvestment. However, this amount is spent without any consideration for the skill and luck of the player. For a casino, an unskilled player could fetch ~4 times more revenue than a skilled player. This paper describes a video recognition system that is based on an extension of the Mask R-CNN model. Our system digitizes th…

    Submitted 14 December, 2020; originally announced December 2020.

    Comments: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)

  48. arXiv:2011.05910  [pdf, other]

    cs.CL cs.AI

    Audrey: A Personalized Open-Domain Conversational Bot

    Authors: Chung Hoon Hong, Yuan Liang, Sagnik Sinha Roy, Arushi Jain, Vihang Agarwal, Ryan Draves, Zhizhuo Zhou, William Chen, Yujian Liu, Martha Miracky, Lily Ge, Nikola Banovic, David Jurgens

    Abstract: Conversational Intelligence requires that a person engage on informational, personal and relational levels. Advances in Natural Language Understanding have helped recent chatbots succeed at dialog on the informational level. However, current techniques still lag for conversing with humans on a personal level and fully relating to them. The University of Michigan's submission to the Alexa Prize Gra…

    Submitted 11 November, 2020; originally announced November 2020.

  49. arXiv:2005.12743  [pdf, other]

    cs.LG stat.ML

    Inherent Noise in Gradient Based Methods

    Authors: Arushi Gupta

    Abstract: Previous work has examined the ability of larger capacity neural networks to generalize better than smaller ones, even without explicit regularizers, by analyzing gradient based algorithms such as GD and SGD. The presence of noise and its effect on robustness to parameter perturbations has been linked to generalization. We examine a property of GD and SGD, namely that instead of iterating through…

    Submitted 26 May, 2020; originally announced May 2020.

  50. arXiv:2003.07074  [pdf]

    cs.CY cs.CL cs.LG

    A Machine Learning Application for Raising WASH Awareness in the Times of COVID-19 Pandemic

    Authors: Rohan Pandey, Vaibhav Gautam, Ridam Pal, Harsh Bandhey, Lovedeep Singh Dhingra, Himanshu Sharma, Chirag Jain, Kanav Bhagat, Arushi, Lajjaben Patel, Mudit Agarwal, Samprati Agrawal, Rishabh Jalan, Akshat Wadhwa, Ayush Garg, Vihaan Misra, Yashwin Agrawal, Bhavika Rana, Ponnurangam Kumaraguru, Tavpritesh Sethi

    Abstract: Background: The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this Infodemic requires strong health messaging systems that are engaging, vernacular, scalable, effective and c…

    Submitted 30 October, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 14 pages, 7 figures