Showing 1–50 of 122 results for author: Woo, S

  1. arXiv:2407.19102  [pdf, ps, other]

    cs.CC

    The Computational Complexity of Factored Graphs

    Authors: Shreya Gupta, Boyang Huang, Russell Impagliazzo, Stanley Woo, Christopher Ye

    Abstract: Computational complexity is traditionally measured with respect to input size. For graphs, this is typically the number of vertices (or edges) of the graph. However, for large graphs even explicitly representing the graph could be prohibitively expensive. Instead, graphs with enough structure could admit more succinct representations. A number of previous works have considered various succinct rep…

    Submitted 26 July, 2024; originally announced July 2024.

  2. arXiv:2407.15554  [pdf, other]

    cs.CV

    Decomposition of Neural Discrete Representations for Large-Scale 3D Mapping

    Authors: Minseong Park, Suhan Woo, Euntai Kim

    Abstract: Learning efficient representations of local features is a key challenge in feature volume-based 3D neural mapping, especially in large-scale environments. In this paper, we introduce Decomposition-based Neural Mapping (DNMap), a storage-efficient large-scale 3D mapping method that employs a discrete representation based on a decomposition strategy. This decomposition strategy aims to efficiently c…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  3. arXiv:2407.11714  [pdf, other]

    cs.CV

    Improving Unsupervised Video Object Segmentation via Fake Flow Generation

    Authors: Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Seunghoon Lee, Sungmin Woo, Sangyoun Lee

    Abstract: Unsupervised video object segmentation (VOS), also known as video salient object detection, aims to detect the most prominent object in a video at the pixel level. Recently, two-stream approaches that leverage both RGB images and optical flow maps have gained significant attention. However, the limited amount of training data remains a substantial challenge. In this study, we propose a novel data…

    Submitted 16 July, 2024; originally announced July 2024.

  4. arXiv:2407.10784  [pdf, other]

    cs.LG cs.AI stat.ML

    AdapTable: Test-Time Adaptation for Tabular Data via Shift-Aware Uncertainty Calibrator and Label Distribution Handler

    Authors: Changhun Kim, Taewon Kim, Seungyeon Woo, June Yong Yang, Eunho Yang

    Abstract: In real-world applications, tabular data often suffer from distribution shifts due to their widespread and abundant nature, leading to erroneous predictions of pre-trained machine learning models. However, addressing such distribution shifts in the tabular domain has been relatively underexplored due to unique challenges such as varying attributes and dataset sizes, as well as the limited represen…

    Submitted 15 July, 2024; originally announced July 2024.

  5. arXiv:2407.10399  [pdf, other]

    cs.CV

    Exploring the Impact of Moire Pattern on Deepfake Detectors

    Authors: Razaib Tariq, Shahroz Tariq, Simon S. Woo

    Abstract: Deepfake detection is critical in mitigating the societal threats posed by manipulated videos. While various algorithms have been developed for this purpose, challenges arise when detectors operate externally, such as on smartphones, when users take a photo of deepfake images and upload on the Internet. One significant challenge in such scenarios is the presence of Moiré patterns, which degrade im…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 7 pages, 4 figures, 1 table. Accepted for publication in IEEE International Conference on Image Processing (ICIP 2024)

  6. arXiv:2407.10277  [pdf, other]

    cs.CV cs.AI cs.LG

    Disrupting Diffusion-based Inpainters with Semantic Digression

    Authors: Geonho Son, Juhun Lee, Simon S. Woo

    Abstract: The fabrication of visual misinformation on the web and social media has increased exponentially with the advent of foundational text-to-image diffusion models. Namely, Stable Diffusion inpainters allow the synthesis of maliciously inpainted images of personal and private figures, and copyrighted contents, also known as deepfakes. To combat such generations, a disruption framework, namely Photogua…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 16 pages, 13 figures, IJCAI 2024

  7. arXiv:2407.09303  [pdf, other]

    cs.CV

    ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion

    Authors: Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee

    Abstract: Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene. However, the presence of moving objects in dynamic scenes introduces inevitable inconsistencies, causing misaligned multi-frame feature matching and misleading self-supervision during training. In this paper, we propose a novel framework calle…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024. Project Page: https://sungmin-woo.github.io/prodepth/

  8. arXiv:2407.01073  [pdf, other]

    cs.RO

    No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection

    Authors: Soojin Woo, Donghwi Jung, Seong-Woo Kim

    Abstract: In this paper, we propose an algorithm to generate a static point cloud map based on LiDAR point cloud data. Our proposed pipeline detects dynamic objects using 3D object detectors and projects points of dynamic objects onto the ground. Typically, point cloud data acquired in real-time serves as a snapshot of the surrounding areas containing both static objects and dynamic objects. The static obje…

    Submitted 1 July, 2024; originally announced July 2024.

  9. arXiv:2406.16860  [pdf, other]

    cs.CV

    Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

    Authors: Shengbang Tong, Ellis Brown, Penghao Wu, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Austin Wang, Rob Fergus, Yann LeCun, Saining Xie

    Abstract: We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach. While stronger language models can enhance multimodal capabilities, the design choices for vision components are often insufficiently explored and disconnected from visual representation learning research. This gap hinders accurate sensory grounding in real-world scenarios. Our study uses LLMs and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Website at https://cambrian-mllm.github.io

  10. arXiv:2405.18012  [pdf, other]

    cs.CV eess.IV

    Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition

    Authors: Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Jinyoung Park, Yooseung Wang, Donguk Kim, Changick Kim

    Abstract: Weakly-Supervised Group Activity Recognition (WSGAR) aims to understand the activity performed together by a group of individuals with the video-level label and without actor-level labels. We propose Flow-Assisted Motion Learning Network (Flaming-Net) for WSGAR, which consists of the motion-aware actor encoder to extract actor features and the two-pathways relation module to infer the interaction…

    Submitted 28 May, 2024; originally announced May 2024.

  11. arXiv:2405.17928  [pdf, other]

    cs.CV

    Relational Self-supervised Distillation with Compact Descriptors for Image Copy Detection

    Authors: Juntae Kim, Sungwon Woo, Jongho Nang

    Abstract: Image copy detection is a task of detecting edited copies from any image within a reference database. While previous approaches have shown remarkable progress, the large size of their networks and descriptors remains a disadvantage, complicating their practical application. In this paper, we propose a novel method that achieves a competitive performance by using a lightweight network and compact des…

    Submitted 16 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    ACM Class: I.4.0; I.4.10

  12. arXiv:2405.17825  [pdf, other]

    cs.CV cs.AI

    Diffusion Model Patching via Mixture-of-Prompts

    Authors: Seokil Ham, Sangmin Woo, Jin-Young Kim, Hyojun Go, Byeongjun Park, Changick Kim

    Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://sangminwoo.github.io/DMP/

  13. arXiv:2405.17821  [pdf, other]

    cs.CV cs.AI

    RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs

    Authors: Sangmin Woo, Jaehyuk Jang, Donguk Kim, Yubin Choi, Changick Kim

    Abstract: Recent advancements in Large Vision Language Models (LVLMs) have revolutionized how machines understand and generate textual responses based on visual inputs. Despite their impressive capabilities, they often produce "hallucinatory" outputs that do not accurately reflect the visual information, posing challenges in reliability and trustworthiness. Current methods such as contrastive decoding have…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://sangminwoo.github.io/RITUAL/

  14. arXiv:2405.17820  [pdf, other]

    cs.CV cs.AI

    Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models

    Authors: Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim

    Abstract: This study addresses the issue observed in Large Vision Language Models (LVLMs), where excessive attention on a few image tokens, referred to as blind tokens, leads to hallucinatory responses in tasks requiring fine-grained understanding of visual objects. We found that tokens receiving lower attention weights often hold essential information for identifying nuanced object details -- ranging from…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://sangminwoo.github.io/AvisC/

  15. arXiv:2405.01934  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG

    Impact of Architectural Modifications on Deep Learning Adversarial Robustness

    Authors: Firuz Juraev, Mohammed Abuhamad, Simon S. Woo, George K Thiruvathukal, Tamer Abuhmed

    Abstract: Rapid advancements of deep learning are accelerating adoption in a wide variety of applications, including safety-critical applications such as self-driving vehicles, drones, robots, and surveillance systems. These advancements include applying variations of sophisticated techniques that improve the performance of models. However, such models are not immune to adversarial manipulations, which can…

    Submitted 3 May, 2024; originally announced May 2024.

  16. arXiv:2404.14617  [pdf, other]

    cs.AR

    TDRAM: Tag-enhanced DRAM for Efficient Caching

    Authors: Maryam Babaie, Ayaz Akram, Wendy Elsasser, Brent Haukness, Michael Miller, Taeksang Song, Thomas Vogelsang, Steven Woo, Jason Lowe-Power

    Abstract: As SRAM-based caches are hitting a scaling wall, manufacturers are integrating DRAM-based caches into system designs to continue increasing cache sizes. While DRAM caches can improve the performance of memory systems, existing DRAM cache designs suffer from high miss penalties, wasted data movement, and interference between misses and demand requests. In this paper, we propose TDRAM, a novel DRAM…

    Submitted 22 April, 2024; originally announced April 2024.

  17. arXiv:2403.20225  [pdf, other]

    cs.CV

    MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark

    Authors: Sanghyun Woo, Kwanyong Park, Inkyu Shin, Myungchul Kim, In So Kweon

    Abstract: Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras. This task has practical applications in various fields, such as visual surveillance, crowd behavior analysis, and anomaly detection. However, due to the difficulty and cost of collecting and labeling data, existing datasets for this task are e…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  18. arXiv:2403.14113  [pdf, other]

    cs.CV

    Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition

    Authors: Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim

    Abstract: Panoramic Activity Recognition (PAR) seeks to identify diverse human activities across different scales, from individual actions to social group and global activities in crowded panoramic scenes. PAR presents two major challenges: 1) recognizing the nuanced interactions among numerous individuals and 2) understanding multi-granular human activities. To address these, we propose Social Proximity-aw…

    Submitted 20 March, 2024; originally announced March 2024.

  19. arXiv:2403.11582  [pdf, other]

    cs.CV

    OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation

    Authors: Seungbeom Woo, Geonwoo Baek, Taehoon Kim, Jaemin Na, Joong-won Hwang, Wonjun Hwang

    Abstract: Multi-target domain adaptation (MTDA) for semantic segmentation poses a significant challenge, as it involves multiple target domains with varying distributions. The goal of MTDA is to minimize the domain discrepancies among a single source and multi-target domains, aiming to train a single model that excels across all target domains. Previous MTDA approaches typically employ multiple teacher arch…

    Submitted 18 March, 2024; originally announced March 2024.

  20. arXiv:2403.09176  [pdf, other]

    cs.CV

    Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

    Authors: Byeongjun Park, Hyojun Go, Jin-Young Kim, Sangmin Woo, Seokil Ham, Changick Kim

    Abstract: Diffusion models have achieved remarkable success across a range of generative tasks. Recent efforts to enhance diffusion model architectures have reimagined them as a form of multi-task learning, where each task corresponds to a denoising task at a specific noise level. While these efforts have focused on parameter isolation and task routing, they fall short of capturing detailed inter-task relat…

    Submitted 10 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Project Page: https://byeongjun-park.github.io/Switch-DiT/

  21. arXiv:2403.04981  [pdf, other]

    cs.ET

    Paving the Way for Pass Disturb Free Vertical NAND Storage via A Dedicated and String-Compatible Pass Gate

    Authors: Zijian Zhao, Sola Woo, Khandker Akif Aabrar, Sharadindu Gopal Kirtania, Zhouhang Jiang, Shan Deng, Yi Xiao, Halid Mulaosmanovic, Stefan Duenkel, Dominik Kleimaier, Steven Soss, Sven Beyer, Rajiv Joshi, Scott Meninger, Mohamed Mohamed, Kijoon Kim, Jongho Woo, Suhwan Lim, Kwangsoo Kim, Wanki Kim, Daewon Ha, Vijaykrishnan Narayanan, Suman Datta, Shimeng Yu, Kai Ni

    Abstract: In this work, we propose a dual-port cell design to address the pass disturb in vertical NAND storage, which can pass signals through a dedicated and string-compatible pass gate. We demonstrate that: i) the pass disturb-free feature originates from weakening of the depolarization field by the pass bias at the high-${V}_{TH}$ (HVT) state and the screening of the applied field by channel at the low-…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 29 pages, 7 figures

  22. arXiv:2402.18848  [pdf, other]

    cs.CV

    SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting

    Authors: Hoon Kim, Minje Jang, Wonjun Yoon, Jisoo Lee, Donghyun Na, Sanghyun Woo

    Abstract: We introduce a co-designed approach for human portrait relighting that combines a physics-guided architecture with a pre-training framework. Drawing on the Cook-Torrance reflectance model, we have meticulously configured the architecture design to precisely simulate light-surface interactions. Furthermore, to overcome the limitation of scarce high-quality lightstage data, we have developed a self-…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: CVPR 2024. Live demos available at https://www.beeble.ai/

  23. arXiv:2402.18817  [pdf, other]

    cs.CV

    Gradient Alignment for Cross-Domain Face Anti-Spoofing

    Authors: Binh M. Le, Simon S. Woo

    Abstract: Recent advancements in domain generalization (DG) for face anti-spoofing (FAS) have garnered considerable attention. Traditional methods have focused on designing learning objectives and additional modules to isolate domain-specific features while retaining domain-invariant characteristics in their representations. However, such approaches often lack guarantees of consistent maintenance of domain-…

    Submitted 11 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

  24. arXiv:2402.18293  [pdf, other]

    cs.CV

    Continuous Memory Representation for Anomaly Detection

    Authors: Joo Chan Lee, Taejune Kim, Eunbyung Park, Simon S. Woo, Jong Hwan Ko

    Abstract: There have been significant advancements in anomaly detection in an unsupervised manner, where only normal images are available for training. Several recent methods aim to detect anomalies based on a memory, comparing or reconstructing the input with directly stored normal features (or trained features with normal images). However, such memory-based approaches operate on a discrete feature space i…

    Submitted 24 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Project page: https://tae-mo.github.io/crad/

  25. arXiv:2402.17812  [pdf, other]

    cs.LG cs.CL

    DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation

    Authors: Sunghyeon Woo, Baeseong Park, Byeongwook Kim, Minjung Jo, Sejung Kwon, Dongsuk Jeon, Dongsoo Lee

    Abstract: Training deep neural networks typically involves substantial computational costs during both forward and backward propagation. Conventional layer dropping techniques drop certain layers during training to reduce the computational burden. However, dropping layers during forward propagation adversely affects the training process by degrading accuracy. In this paper, we propose Dropping Backwar…

    Submitted 27 February, 2024; originally announced February 2024.

  26. arXiv:2401.17690  [pdf, other]

    eess.AS cs.AI cs.SD

    EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning

    Authors: Jaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo

    Abstract: We propose EnCLAP, a novel framework for automated audio captioning. EnCLAP employs two acoustic representation models, EnCodec and CLAP, along with a pretrained language model, BART. We also introduce a new training objective called masked codec modeling that improves acoustic awareness of the pretrained language model. Experimental results on AudioCaps and Clotho demonstrate that our model surpa…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  27. arXiv:2401.16189  [pdf, other]

    cs.CV cs.RO

    FIMP: Future Interaction Modeling for Multi-Agent Motion Prediction

    Authors: Sungmin Woo, Minjung Kim, Donghyeong Kim, Sungjun Jang, Sangyoun Lee

    Abstract: Multi-agent motion prediction is a crucial concern in autonomous driving, yet it remains a challenge owing to the ambiguous intentions of dynamic agents and their intricate interactions. Existing studies have attempted to capture interactions between road entities by using the definite data in history timesteps, as future information is not available and involves high uncertainty. However, without…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted by ICRA 2024

  28. arXiv:2401.04364  [pdf, other]

    cs.CV cs.CR cs.LG

    SoK: Facial Deepfake Detectors

    Authors: Binh M. Le, Jiwon Kim, Shahroz Tariq, Kristen Moore, Alsharif Abuadbba, Simon S. Woo

    Abstract: Deepfakes have rapidly emerged as a profound and serious threat to society, primarily due to their ease of creation and dissemination. This situation has triggered an accelerated development of deepfake detection technologies. However, many existing detectors rely heavily on lab-generated datasets for validation, which may not effectively prepare them for novel, emerging, and real-world deepfake t…

    Submitted 25 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: 18 pages, 6 figures, 5 tables, under peer review

  29. arXiv:2401.02113  [pdf, other]

    cs.CV

    Source-Free Online Domain Adaptive Semantic Segmentation of Satellite Images under Image Degradation

    Authors: Fahim Faisal Niloy, Kishor Kumar Bhaumik, Simon S. Woo

    Abstract: Online adaptation to distribution shifts in satellite image segmentation stands as a crucial yet underexplored problem. In this paper, we address source-free and online domain adaptation, i.e., test-time adaptation (TTA), for satellite images, with the focus on mitigating distribution shifts caused by various forms of image degradation. Towards achieving this goal, we propose a novel TTA approach…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: ICASSP 2024

  30. arXiv:2312.16823  [pdf, other]

    cs.LG cs.CR

    Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation

    Authors: Hyunjune Kim, Sangyong Lee, Simon S. Woo

    Abstract: Recently, serious concerns have been raised about the privacy issues related to training datasets in machine learning algorithms when including personal data. Various regulations in different countries, including the GDPR, grant individuals the right to have personal data erased, known as 'the right to be forgotten' or 'the right to erasure'. However, there has been less research on effectively and practical…

    Submitted 27 December, 2023; originally announced December 2023.

  31. arXiv:2312.15980  [pdf, other]

    cs.CV cs.AI

    HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D

    Authors: Sangmin Woo, Byeongjun Park, Hyojun Go, Jin-Young Kim, Changick Kim

    Abstract: Recent progress in single-image 3D generation highlights the importance of multi-view coherency, leveraging 3D priors from large-scale diffusion models pretrained on Internet-scale images. However, the aspect of novel-view diversity remains underexplored within the research landscape due to the ambiguity in converting a 2D image into 3D content, where numerous potential shapes can emerge. Here, we…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: Project page: https://byeongjun-park.github.io/HarmonyView/

  32. arXiv:2312.12807  [pdf, other]

    cs.CV cs.AI

    All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models

    Authors: Seunghoo Hong, Juhun Lee, Simon S. Woo

    Abstract: Text-to-Image models such as Stable Diffusion have shown impressive image generation synthesis, thanks to the utilization of large-scale datasets. However, these datasets may contain sexually explicit, copyrighted, or undesirable content, which allows the model to directly generate them. Given that retraining these large models on individual concept deletion requests is infeasible, fine-tuning alg…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: Main paper with supplementary materials

  33. Blind-Touch: Homomorphic Encryption-Based Distributed Neural Network Inference for Privacy-Preserving Fingerprint Authentication

    Authors: Hyunmin Choi, Simon Woo, Hyoungshick Kim

    Abstract: Fingerprint authentication is a popular security mechanism for smartphones and laptops. However, its adoption in web and cloud environments has been limited due to privacy concerns over storing and processing biometric data on servers. This paper introduces Blind-Touch, a novel machine learning-based fingerprint authentication system leveraging homomorphic encryption to address these privacy conce…

    Submitted 1 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence (AAAI) 2024

  34. arXiv:2311.12344  [pdf, other]

    cs.CV

    Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition

    Authors: Sumin Lee, Sangmin Woo, Muhammad Adi Nugroho, Changick Kim

    Abstract: Due to the distinctive characteristics of sensors, each modality exhibits unique physical properties. For this reason, in the context of multi-modal action recognition, it is important to consider not only the overall action content but also the complementary nature of different modalities. In this paper, we propose a novel network, named Modality Mixer (M-Mixer) network, which effectively leverag…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2208.11314

  35. arXiv:2310.16354  [pdf]

    cs.AR

    RAMPART: RowHammer Mitigation and Repair for Server Memory Systems

    Authors: Steven C. Woo, Wendy Elsasser, Mike Hamburg, Eric Linstadt, Michael R. Miller, Taeksang Song, James Tringali

    Abstract: RowHammer attacks are a growing security and reliability concern for DRAMs and computer systems as they can induce many bit errors that overwhelm error detection and correction capabilities. System-level solutions are needed as process technology and circuit improvements alone are unlikely to provide complete protection against RowHammer attacks in the future. This paper introduces RAMPART, a nove…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 16 pages, 13 figures. A version of this paper will appear in the Proceedings of MEMSYS23

    ACM Class: B.3.1; B.3.4

  36. arXiv:2310.07138  [pdf, other]

    cs.CV cs.AI

    Denoising Task Routing for Diffusion Models

    Authors: Byeongjun Park, Sangmin Woo, Hyojun Go, Jin-Young Kim, Changick Kim

    Abstract: Diffusion models generate highly realistic images by learning a multi-step denoising process, naturally embodying the principles of multi-task learning (MTL). Despite the inherent connection between diffusion models and MTL, there remains an unexplored area in designing neural architectures that explicitly incorporate MTL into the framework of diffusion models. In this paper, we present Denoising…

    Submitted 20 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  37. arXiv:2309.05911  [pdf, other]

    cs.CV cs.AI

    Quality-Agnostic Deepfake Detection with Intra-model Collaborative Learning

    Authors: Binh M. Le, Simon S. Woo

    Abstract: Deepfake has recently raised a plethora of societal concerns over its possible security threats and dissemination of fake information. Much research on deepfake detection has been undertaken. However, detecting low quality as well as simultaneously detecting different qualities of deepfakes still remains a grave challenge. Most SOTA approaches are limited by using a single specific model for detec…

    Submitted 11 September, 2023; originally announced September 2023.

    Journal ref: International Conference on Computer Vision 2023

  38. Towards Understanding of Deepfake Videos in the Wild

    Authors: Beomsang Cho, Binh M. Le, Jiwon Kim, Simon Woo, Shahroz Tariq, Alsharif Abuadbba, Kristen Moore

    Abstract: Deepfakes have become a growing concern in recent years, prompting researchers to develop benchmark datasets and detection algorithms to tackle the issue. However, existing datasets suffer from significant drawbacks that hamper their effectiveness. Notably, these datasets fail to encompass the latest deepfake videos produced by state-of-the-art methods that are being shared across various platform…

    Submitted 6 September, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

    Journal ref: 32nd ACM International Conference on Information & Knowledge Management (CIKM), UK, 2023

  39. arXiv:2308.09322  [pdf, other]

    cs.CV cs.AI cs.MM

    Audio-Visual Glance Network for Efficient Video Recognition

    Authors: Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Changick Kim

    Abstract: Deep learning has made significant strides in video understanding tasks, but the computation required to classify lengthy and massive videos using clip-level video classifiers remains impractical and prohibitively expensive. To address this issue, we propose Audio-Visual Glance Network (AVGN), which leverages the commonly available audio and visual modalities to efficiently process the spatio-temp…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  40. arXiv:2307.11906  [pdf, other]

    cs.CV cs.CR cs.LG

    Unveiling Vulnerabilities in Interpretable Deep Learning Systems with Query-Efficient Black-box Attacks

    Authors: Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Eric Chan-Tin, Tamer Abuhmed

    Abstract: Deep learning has been rapidly employed in many applications revolutionizing many industries, but it is known to be vulnerable to adversarial attacks. Such attacks pose a serious threat to deep learning-based systems compromising their integrity, reliability, and trust. Interpretable Deep Learning Systems (IDLSes) are designed to make the system more transparent and explainable, but they are also…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2307.06496

  41. arXiv:2307.11052  [pdf, other]

    cs.CV

    HRFNet: High-Resolution Forgery Network for Localizing Satellite Image Manipulation

    Authors: Fahim Faisal Niloy, Kishor Kumar Bhaumik, Simon S. Woo

    Abstract: Existing high-resolution satellite image forgery localization methods rely on patch-based or downsampling-based training. Both of these training methods have major drawbacks, such as inaccurate boundaries between pristine and forged regions, the generation of unwanted artifacts, etc. To tackle the aforementioned challenges, inspired by the high-resolution image segmentation literature, we propose…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: ICIP 2023

  42. arXiv:2307.06496  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG

    Microbial Genetic Algorithm-based Black-box Attack against Interpretable Deep Learning Systems

    Authors: Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Eric Chan-Tin, Tamer Abuhmed

    Abstract: Deep learning models are susceptible to adversarial samples in white and black-box environments. Although previous studies have shown high attack success rates, coupling DNN models with interpretation models could offer a sense of security when a human expert is involved, who can identify whether a given sample is benign or malicious. However, in white-box environments, interpretable deep learning…

    Submitted 12 July, 2023; originally announced July 2023.

  43. arXiv:2307.03558  [pdf, other]

    cs.RO

    We, Vertiport 6, are temporarily closed: Interactional Ontological Methods for Changing the Destination

    Authors: Seungwan Woo, Jeongseok Kim, Kangjin Kim

    Abstract: This paper presents a continuation of the previous research on the interaction between a human traffic manager and the UATMS. In particular, we focus on the automation of the process of handling a vertiport outage, which was partially covered in the previous work. Once the manager reports that a vertiport is out of service, which means landings for all corresponding agents are prohibited, the air…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 8 pages, 1 figure, submitted to IEEE RO-MAN (RO-MAN 2023) Workshop on Ontologies for Autonomous Robotics (RobOntics)

  44. Integrating Psychometrics and Computing Perspectives on Bias and Fairness in Affective Computing: A Case Study of Automated Video Interviews

    Authors: Brandon M Booth, Louis Hickman, Shree Krishna Subburaj, Louis Tay, Sang Eun Woo, Sidney K. D'Mello

    Abstract: We provide a psychometric-grounded exposition of bias and fairness as applied to a typical machine learning pipeline for affective computing. We expand on an interpersonal communication framework to elucidate how to identify sources of bias that may arise in the process of inferring human emotions and other psychological constructs from observed behavior. Various methods and metrics for measuring…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 21 pages, 4 figures

    Journal ref: IEEE Signal Processing Magazine 38.6 (2021): 84-95

  45. arXiv:2304.00450  [pdf, other]

    cs.CV

    Sketch-based Video Object Localization

    Authors: Sangmin Woo, So-Yeong Jeon, Jinyoung Park, Minji Son, Sumin Lee, Changick Kim

    Abstract: We introduce Sketch-based Video Object Localization (SVOL), a new task aimed at localizing spatio-temporal object boxes in video queried by the input sketch. We first outline the challenges in the SVOL task and build the Sketch-Video Attention Network (SVANet) with the following design principles: (i) to consider temporal information of video and bridge the domain gap between sketch and video; (ii…

    Submitted 29 November, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: WACV 2024; Code: https://github.com/sangminwoo/SVOL

  46. arXiv:2303.11793  [pdf, other]

    cs.CV

    Bridging Optimal Transport and Jacobian Regularization by Optimal Trajectory for Enhanced Adversarial Defense

    Authors: Binh M. Le, Shahroz Tariq, Simon S. Woo

    Abstract: Deep neural networks, particularly in vision tasks, are notably susceptible to adversarial perturbations. To overcome this challenge, developing a robust classifier is crucial. In light of the recent advancements in the robustness of classifiers, we delve deep into the intricacies of adversarial training and Jacobian regularization, two pivotal defenses. Our work is the first that carefully analyzes an…

    Submitted 12 February, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

  47. arXiv:2303.09779  [pdf, other]

    cs.CV

    Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation

    Authors: Daehan Kim, Minseok Seo, Kwanyong Park, Inkyu Shin, Sanghyun Woo, In-So Kweon, Dong-Geol Choi

    Abstract: Mixup provides interpolated training samples and allows the model to obtain smoother decision boundaries for better generalization. The idea can be naturally applied to the domain adaptation task, where we can mix the source and target samples to obtain domain-mixed samples for better adaptation. However, the extension of the idea from classification to segmentation (i.e., structured output) is no…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 10 pages, 3 figures, Accepted at AAAI 2023

  48. Why Do Facial Deepfake Detectors Fail?

    Authors: Binh Le, Shahroz Tariq, Alsharif Abuadbba, Kristen Moore, Simon Woo

    Abstract: Recent rapid advancements in deepfake technology have allowed the creation of highly realistic fake media, such as video, image, and audio. These materials pose significant challenges to human authentication, such as impersonation, misinformation, or even a threat to national security. To keep pace with these rapid advancements, several deepfake detection algorithms have been proposed, leading to…

    Submitted 10 September, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: 5 pages, ACM ASIACCS 2023

  49. arXiv:2301.04333  [pdf, other]

    cs.LG cs.AI

    Learnable Path in Neural Controlled Differential Equations

    Authors: Sheo Yon Jhin, Minju Jo, Seungji Kook, Noseong Park, Sungpil Woo, Sunhwan Lim

    Abstract: Neural controlled differential equations (NCDEs), which are continuous analogues to recurrent neural networks (RNNs), are a specialized model in (irregular) time-series processing. In comparison with similar models, e.g., neural ordinary differential equations (NODEs), the key distinctive characteristics of NCDEs are i) the adoption of the continuous path created by an interpolation algorithm from…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI 2023

  50. arXiv:2301.00808  [pdf, other]

    cs.CV

    ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

    Authors: Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie

    Abstract: Driven by improved architectures and better representation learning frameworks, the field of visual recognition has enjoyed rapid modernization and performance boost in the early 2020s. For example, modern ConvNets, represented by ConvNeXt, have demonstrated strong performance in various scenarios. While these models were originally designed for supervised learning with ImageNet labels, they can a…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: Code and models available at https://github.com/facebookresearch/ConvNeXt-V2