Showing 1–50 of 645 results for author: Wu, D

  1. arXiv:2407.20170  [pdf, other]

    cs.IT

    Propagation of Uncertainty with the Koopman Operator

    Authors: Simone Servadio, Giovanni Lavezzi, Christian Hofmann, Di Wu, Richard Linares

    Abstract: This paper proposes a new method to propagate uncertainties undergoing nonlinear dynamics using the Koopman Operator (KO). Probability density functions are propagated directly using the Koopman approximation of the solution flow of the system, where the dynamics have been projected on a well-defined set of basis functions. The prediction technique is derived following both the analytical (Galerki…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 27th Conference of Information Fusion ID 14

  2. arXiv:2407.19828  [pdf]

    cs.LG cs.CR

    Federated Learning based Latent Factorization of Tensors for Privacy-Preserving QoS Prediction

    Authors: Shuai Zhong, Zengtong Tang, Di Wu

    Abstract: In applications related to big data and service computing, dynamic connections tend to be encountered, especially the dynamic data of user-perspective quality of service (QoS) in Web services. They are transformed into high-dimensional and incomplete (HDI) tensors which include abundant temporal pattern information. Latent factorization of tensors (LFT) is an extremely efficient and typical approa…

    Submitted 29 July, 2024; originally announced July 2024.

  3. arXiv:2407.19414  [pdf, other]

    cs.AI

    Appformer: A Novel Framework for Mobile App Usage Prediction Leveraging Progressive Multi-Modal Data Fusion and Feature Extraction

    Authors: Chuike Sun, Junzhou Chen, Yue Zhao, Hao Han, Ruihai Jing, Guang Tan, Di Wu

    Abstract: This article presents Appformer, a novel mobile application prediction framework inspired by the efficiency of Transformer-like architectures in processing sequential data through self-attention mechanisms. Combining a Multi-Modal Data Progressive Fusion Module with a sophisticated Feature Extraction Module, Appformer leverages the synergies of multi-modal data fusion and data mining techniques wh…

    Submitted 28 July, 2024; originally announced July 2024.

  4. arXiv:2407.14498  [pdf]

    cs.CV eess.IV

    Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation

    Authors: Dongyang Wu, Siyang Wang, Mehdi Kamal, Massoud Pedram

    Abstract: In this paper, we present a YOLO-based framework for layout hotspot detection, aiming to enhance the efficiency and performance of the design rule checking (DRC) process. Our approach leverages the YOLOv8 vision model to detect multiple hotspots within each layout image, even when dealing with large layout image sizes. Additionally, to enhance pattern-matching effectiveness, we introduce a novel a…

    Submitted 19 July, 2024; originally announced July 2024.

  5. arXiv:2407.14211  [pdf, other]

    cs.LG

    Enhanced Mortality Prediction in ICU Stroke Patients via Deep Learning

    Authors: Armin Abdollahi, Xinghong Ma, Jiahao Zhang, Daijia Wu, Tongshou Wu, Zizheng Ye, Maryam Pishgar

    Abstract: Background: Stroke is the second-leading cause of disability and death among adults. Approximately 17 million people suffer from a stroke annually, with about 85% being ischemic strokes. Predicting mortality of ischemic stroke patients in the intensive care unit (ICU) is crucial for optimizing treatment strategies, allocating resources, and improving survival rates. Methods: We acquired data on ICU ischem…

    Submitted 19 July, 2024; originally announced July 2024.

  6. arXiv:2407.14073  [pdf, other]

    cs.AR cs.AI cs.NE

    LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks

    Authors: Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have gained significant research attention in the last decade due to their potential to drive resource-constrained edge devices. Though existing SNN accelerators offer high efficiency in processing sparse spikes with dense weights, opportunities are less explored in SNNs with sparse weights, i.e., dual-sparsity. In this work, we study the acceleration of dual-sparse…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted to MICRO 2024. Will update with the camera-ready version once ready

  7. arXiv:2407.13338  [pdf, other]

    cs.CV

    Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM

    Authors: Baicheng Li, Zike Yan, Dong Wu, Hanqing Jiang, Hongbin Zha

    Abstract: Simultaneous localization and mapping (SLAM) with implicit neural representations has received extensive attention due to the expressive representation power and the innovative paradigm of continual learning. However, deploying such a system within a dynamic environment has not been well-studied. Such challenges are intractable even for conventional algorithms since observations from different vie…

    Submitted 18 July, 2024; originally announced July 2024.

  8. arXiv:2407.11626  [pdf]

    cs.LG cs.NE

    Dynamic Dimension Wrapping (DDW) Algorithm: A Novel Approach for Efficient Cross-Dimensional Search in Dynamic Multidimensional Spaces

    Authors: Dongnan Jin, Yali Liu, Qiuzhi Song, Xunju Ma, Yue Liu, Dehao Wu

    Abstract: In the real world, as the complexity of optimization problems continues to increase, there is an urgent need to research more efficient optimization methods. Current optimization algorithms excel in solving problems with a fixed number of dimensions. However, their efficiency in searching dynamic multi-dimensional spaces is unsatisfactory. In response to the challenge of cross-dimensional search i…

    Submitted 18 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  9. arXiv:2407.11027  [pdf, other]

    cs.LG cs.AI

    A robust three-way classifier with shadowed granular-balls based on justifiable granularity

    Authors: Jie Yang, Lingyun Xiaodiao, Guoyin Wang, Witold Pedrycz, Shuyin Xia, Qinghua Zhang, Di Wu

    Abstract: The granular-ball (GB)-based classifier introduced by Xia exhibits adaptability in creating coarse-grained information granules for input, thereby enhancing its generality and flexibility. Nevertheless, the current GB-based classifiers rigidly assign a specific class label to each data instance and lack the necessary strategies to address uncertain instances. These far-fetched certain classif…

    Submitted 3 July, 2024; originally announced July 2024.

  10. arXiv:2407.08457  [pdf, other]

    cs.CV

    Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending

    Authors: Delong Wu, Hao Zhu, Qi Zhang, You Li, Zhan Ma, Xun Cao

    Abstract: Implicit Neural Representation (INR) has become a popular method for representing visual signals (e.g., 2D images and 3D scenes), demonstrating promising results in various downstream applications. Given its potential as a medium for visual signals, exploring the development of a neural blending method that utilizes INRs is a natural progression. Neural blending involves merging two INRs to create…

    Submitted 11 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: accepted by ECCV 2024

  11. arXiv:2407.08127  [pdf, other]

    cs.CV

    Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment

    Authors: Yufan Liu, Wanqian Zhang, Dayan Wu, Zheng Lin, Jingzi Gu, Weiping Wang

    Abstract: Model inversion (MI) attack reconstructs the private training data of a target model given its output, posing a significant threat to deep learning models and data privacy. On one hand, most existing MI methods focus on searching for latent codes to represent the target identity, yet this iterative optimization-based scheme consumes a huge number of queries to the target model, making it unreal…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  12. arXiv:2407.07026  [pdf, other]

    cs.CV cs.CL cs.MM cs.SI

    Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition

    Authors: Daiqing Wu, Dongbao Yang, Huawen Shen, Can Ma, Yu Zhou

    Abstract: With the proliferation of social media posts in recent years, the need to detect sentiments in multimodal (image-text) content has grown rapidly. Since posts are user-generated, the image and text from the same post can express different or even contradictory sentiments, leading to potential sentiment discrepancy. However, existing works mainly adopt a single-branch fusion structure that…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 8 pages, 6 figures

  13. arXiv:2407.04162  [pdf, other]

    eess.IV cs.CV

    Measurement Embedded Schrödinger Bridge for Inverse Problems

    Authors: Yuang Wang, Pengfei Jin, Siyeop Yoon, Matthew Tivnan, Quanzheng Li, Li Zhang, Dufan Wu

    Abstract: Score-based diffusion models are frequently employed as structural priors in inverse problems. However, their iterative denoising process, initiated from Gaussian noise, often results in slow inference speeds. The Image-to-Image Schrödinger Bridge (I$^2$SB), which begins with the corrupted image, presents a promising alternative as a prior for addressing inverse problems. In this work, we introduc…

    Submitted 22 May, 2024; originally announced July 2024.

    Comments: 14 pages, 2 figures, Neurips preprint

  14. arXiv:2407.02208  [pdf, other]

    cs.CL cs.AI

    How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise on Machine Translation

    Authors: Yan Meng, Di Wu, Christof Monz

    Abstract: The massive amounts of web-mined parallel data contain large amounts of noise. Semantic misalignment, as the primary source of the noise, poses a challenge for training machine translation systems. In this paper, we first study the impact of real-world hard-to-detect misalignment noise by proposing a process to simulate the realistic misalignment controlled by semantic similarity. After quantitati…

    Submitted 2 July, 2024; originally announced July 2024.

  15. arXiv:2407.01511  [pdf, other]

    cs.AI

    CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

    Authors: Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian, Philip Torr, Bernard Ghanem, Guohao Li

    Abstract: The development of autonomous agents increasingly relies on Multimodal Language Models (MLMs) to perform tasks described in natural language with GUI environments, such as websites, desktop computers, or mobile phones. Existing benchmarks for MLM agents in interactive environments are limited by their focus on a single environment, lack of detailed and generalized evaluation methods, and the compl…

    Submitted 1 July, 2024; originally announced July 2024.

  16. arXiv:2407.00610  [pdf, other]

    cs.LG

    Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization

    Authors: Dongxia Wu, Nikki Lijing Kuang, Ruijia Niu, Yi-An Ma, Rose Yu

    Abstract: Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a smal…

    Submitted 30 June, 2024; originally announced July 2024.

  17. arXiv:2407.00377  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY

    The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention

    Authors: Yixin Wan, Di Wu, Haoran Wang, Kai-Wei Chang

    Abstract: Prompt-based "diversity interventions" are commonly adopted to improve the diversity of Text-to-Image (T2I) models depicting individuals with various racial or gender traits. However, will this strategy result in nonfactual demographic distribution, especially when generating real historical figures? In this work, we propose DemOgraphic FActualIty Representation (DoFaiR), a benchmark to systematic…

    Submitted 29 June, 2024; originally announced July 2024.

  18. arXiv:2407.00191  [pdf, other]

    cs.CL

    MetaKP: On-Demand Keyphrase Generation

    Authors: Di Wu, Xiaoxian Shen, Kai-Wei Chang

    Abstract: Traditional keyphrase prediction methods predict a single set of keyphrases per document, failing to cater to the diverse needs of users and downstream applications. To bridge the gap, we introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents. For this task, we present MetaKP, a large-scale benchmark comprising four…

    Submitted 28 June, 2024; originally announced July 2024.

  19. arXiv:2407.00167  [pdf, other]

    cs.CL cs.AI cs.ET cs.HC cs.SI

    Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach

    Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Wyatt Bellamy, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

    Abstract: In recent years, the United States has witnessed a significant surge in the popularity of vaping or e-cigarette use, leading to a notable rise in cases of e-cigarette and vaping use-associated lung injury (EVALI) that caused hospitalizations and fatalities during the EVALI outbreak in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. Due…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Accepted for the AI Applications in Public Health and Social Services workshop at the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

  20. arXiv:2406.18137  [pdf, ps, other]

    stat.ML cs.LG

    Sparse deep neural networks for nonparametric estimation in high-dimensional sparse regression

    Authors: Dongya Wu, Xin Li

    Abstract: Generalization theory has been established for sparse deep neural networks under high-dimensional regime. Beyond generalization, parameter estimation is also important since it is crucial for variable selection and interpretability of deep neural networks. Current theoretical studies concerning parameter estimation mainly focus on two-layer neural networks, which is due to the fact that the conver…

    Submitted 26 June, 2024; originally announced June 2024.

  21. arXiv:2406.17456  [pdf, other]

    cs.CL cs.AI

    Improving Grammatical Error Correction via Contextual Data Augmentation

    Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Qingfu Zhu, Dayong Wu, Wanxiang Che

    Abstract: Nowadays, data augmentation through synthetic data has been widely used in the field of Grammatical Error Correction (GEC) to alleviate the problem of data scarcity. However, these synthetic data are mainly used in the pre-training phase rather than the data-limited fine-tuning phase due to inconsistent error distribution and noisy labels. In this paper, we propose a synthetic data construction me…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted as Findings of ACL 2024

  22. arXiv:2406.13692  [pdf, other]

    cs.CL

    Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

    Authors: Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, Kai-Wei Chang

    Abstract: Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. However, there are significant trustworthiness concerns as RALMs are prone to generating unfaithful outputs, including baseless information or contradictions with the retrieved context. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decodin…

    Submitted 19 June, 2024; originally announced June 2024.

  23. arXiv:2406.12783  [pdf, ps, other]

    cs.NE cs.DC eess.SY math.NA

    Zeroing neural dynamics solving time-variant complex conjugate matrix equation

    Authors: Jiakuang He, Dongqing Wu

    Abstract: Complex conjugate matrix equations (CCME) have aroused the interest of many researchers because of computations and antilinear systems. Existing research is dominated by its time-invariant solving methods, but lacks proposed theories for solving its time-variant version. Moreover, artificial neural networks are rarely studied for solving CCME. In this paper, starting with the earliest CCME, zeroin…

    Submitted 18 June, 2024; originally announced June 2024.

  24. arXiv:2406.11828  [pdf, other]

    cs.LG stat.ML

    Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations

    Authors: Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu

    Abstract: We study the computational and sample complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure, that is, $f_*(x) = \frac{1}{\sqrt{M}}\sum_{m=1}^M f_m(\langle x, v_m\rangle)$, where $f_1,f_2,...,f_M:\mathbb{R}\to\mathbb{R}$ are nonlinear link functions of single-index models (ridge functions) with diverse and near-orthogonal index features $\{v_m\}_{m=1}^M$,…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: COLT 2024

  25. arXiv:2406.11551  [pdf, other]

    cs.CV

    Simple Yet Efficient: Towards Self-Supervised FG-SBIR with Unified Sample Feature Alignment

    Authors: Jianan Jiang, Di Wu, Zhilin Jiang, Weiren Yu

    Abstract: Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space. However, scalability is hindered by the growing complexity of solutions, mainly due to the abstract nature of fine-grained sketches. In this paper, we propose a simple yet efficient approach to narrow the gap between the two modes. It mainly facilitate…

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages, 8 figures, 4 tables

  26. arXiv:2406.09829  [pdf, other]

    cs.CV

    Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

    Authors: Xiangheng Shan, Dongyue Wu, Guilin Zhu, Yuanjie Shao, Nong Sang, Changxin Gao

    Abstract: Open-vocabulary semantic segmentation is a challenging task, which requires the model to output semantic masks of an image beyond a closed-set vocabulary. Although many efforts have been made to utilize powerful CLIP models to accomplish this task, they are still easily overfitting to training classes due to the natural gaps in semantic information between training and new classes. To overcome this…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  27. arXiv:2406.07880  [pdf, other]

    cs.CV eess.IV

    A Comprehensive Survey on Machine Learning Driven Material Defect Detection: Challenges, Solutions, and Future Prospects

    Authors: Jun Bai, Di Wu, Tristan Shelley, Peter Schubel, David Twine, John Russell, Xuesen Zeng, Ji Zhang

    Abstract: Material defects (MD) represent a primary challenge affecting product performance and giving rise to safety issues in related products. The rapid and accurate identification and localization of MD constitute crucial research endeavours in addressing contemporary challenges associated with MD. Although conventional non-destructive testing methods such as ultrasonic and X-ray approaches have mitigat…

    Submitted 12 June, 2024; originally announced June 2024.

  28. arXiv:2406.05498  [pdf, other]

    cs.CR cs.AI

    SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner

    Authors: Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel

    Abstract: Jailbreaking is an emerging adversarial attack that bypasses the safety alignment deployed in off-the-shelf large language models (LLMs) and has evolved into four major categories: optimization-based attacks such as Greedy Coordinate Gradient (GCG), jailbreak template-based attacks such as "Do-Anything-Now", advanced indirect attacks like DrAttack, and multilingual jailbreaks. However, delivering…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: This paper completes its earlier vision paper, available at arXiv:2402.15727

  29. arXiv:2406.05039  [pdf, other]

    cs.CV cs.CL

    Bootstrapping Referring Multi-Object Tracking

    Authors: Yani Zhang, Dongming Wu, Wencheng Han, Xingping Dong

    Abstract: Referring multi-object tracking (RMOT) aims at detecting and tracking multiple objects following human instruction represented by a natural language expression. Existing RMOT benchmarks are usually formulated through manual annotations, integrated with static regulations. This approach results in a dearth of notable diversity and a constrained scope of implementation. In this work, our key idea is…

    Submitted 7 June, 2024; originally announced June 2024.

  30. arXiv:2406.02059  [pdf, other]

    cs.LG

    Graph Adversarial Diffusion Convolution

    Authors: Songtao Liu, Jinghui Chen, Tianfan Fu, Lu Lin, Marinka Zitnik, Dinghao Wu

    Abstract: This paper introduces a min-max optimization formulation for the Graph Signal Denoising (GSD) problem. In this formulation, we first maximize the second term of GSD by introducing perturbations to the graph structure based on Laplacian distance and then minimize the overall loss of the GSD. By solving the min-max optimization problem, we derive a new variant of the Graph Diffusion Convolution (GDC…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  31. arXiv:2406.01581  [pdf, other]

    cs.LG stat.ML

    Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

    Authors: Jason D. Lee, Kazusato Oko, Taiji Suzuki, Denny Wu

    Abstract: We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \textstyle\sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$ under isotropic Gaussian data in $\mathbb{R}^d$, where the link function $\sigma_*:\mathbb{R}\to\mathbb{R}$ is an unknown degree $q$ polynomial with information exponent $p$ (defined as the lowest degree in the Hermite expansion)…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 34 pages

  32. arXiv:2406.00714  [pdf, other]

    cs.CV

    A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving

    Authors: Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu

    Abstract: With the rapid advancement of autonomous driving technology, there is a growing need for enhanced safety and efficiency in the automatic environmental perception of vehicles during their operation. In modern vehicle setups, cameras and mmWave radar (radar), being the most extensively employed sensors, demonstrate complementary characteristics, inherently rendering them conducive to fusion and faci…

    Submitted 2 June, 2024; originally announced June 2024.

  33. arXiv:2406.00645  [pdf, other]

    cs.LG cs.AI cs.CV

    FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning

    Authors: Yuwei Fu, Haichao Zhang, Di Wu, Wei Xu, Benoit Boulet

    Abstract: In this work, we investigate how to leverage pre-trained visual-language models (VLM) for online Reinforcement Learning (RL). In particular, we focus on sparse reward tasks with pre-defined textual task descriptions. We first identify the problem of reward misalignment when applying VLM as a reward in RL tasks. To address this issue, we introduce a lightweight fine-tuning method, named Fuzzy VLM r…

    Submitted 4 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  34. arXiv:2406.00262  [pdf, other]

    cs.LG cs.AI

    Contrastive Learning Via Equivariant Representation

    Authors: Sifan Song, Jinfeng Wang, Qiaochu Zhao, Xiang Li, Dufan Wu, Angelos Stefanidis, Jionglong Su, S. Kevin Zhou, Quanzheng Li

    Abstract: Invariant-based Contrastive Learning (ICL) methods have achieved impressive performance across various domains. However, the absence of latent space representation for distortion (augmentation)-related information in the latent space makes ICL sub-optimal regarding training efficiency and robustness in downstream tasks. Recent studies suggest that introducing equivariance into Contrastive Learning…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Preprint. Under review

  35. arXiv:2405.20849  [pdf, ps, other]

    cs.DS math.PR

    Locally Stationary Distributions: A Framework for Analyzing Slow-Mixing Markov Chains

    Authors: Kuikui Liu, Sidhanth Mohanty, Prasad Raghavendra, Amit Rajaraman, David X. Wu

    Abstract: Many natural Markov chains fail to mix to their stationary distribution in polynomially many steps. Often, this slow mixing is inevitable since it is computationally intractable to sample from their stationary measure. Nevertheless, Markov chains can be shown to always converge quickly to measures that are *locally stationary*, i.e., measures that don't change over a small number of steps. These…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 34 pages

  36. arXiv:2405.20614  [pdf, other]

    cs.CV

    EPIDetect: Video-based convulsive seizure detection in chronic epilepsy mouse model for anti-epilepsy drug screening

    Authors: Junming Ren, Zhoujian Xiao, Yujia Zhang, Yujie Yang, Ling He, Ezra Yoon, Stephen Temitayo Bello, Xi Chen, Dapeng Wu, Micky Tortorella, Jufang He

    Abstract: In the preclinical translational studies, drug candidates with remarkable anti-epileptic efficacy demonstrate long-term suppression of spontaneous recurrent seizures (SRSs), particularly convulsive seizures (CSs), in mouse models of chronic epilepsy. However, the current methods for monitoring CSs have limitations in terms of invasiveness, specific laboratory settings, high cost, and complex opera…

    Submitted 31 May, 2024; originally announced May 2024.

  37. arXiv:2405.20584  [pdf, other]

    cs.CV cs.AI

    Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization

    Authors: Yisu Liu, Jinyang An, Wanqian Zhang, Dayan Wu, Jingzi Gu, Zheng Lin, Weiping Wang

    Abstract: With the development of diffusion-based customization methods like DreamBooth, individuals now have access to train the models that can generate their personalized images. Despite the convenience, malicious users have misused these techniques to create fake images, thereby triggering a privacy security crisis. In light of this, proactive adversarial attacks are proposed to protect users against cu…

    Submitted 25 July, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by ACM MM2024

    ACM Class: I.2.10

  38. arXiv:2405.19630  [pdf]

    cs.RO

    The use of a humanoid robot for older people with dementia in aged care facilities

    Authors: Dongjun Wu, Lihui Pu, Jun Jo, Rene Hexel, Wendy Moyle

    Abstract: This paper presents an interdisciplinary PhD project using a humanoid robot to encourage interactive activities for people with dementia living in two aged care facilities. The aim of the project was to develop software and use technologies to achieve successful robot-led engagement with older people with dementia. This paper outlines the qualitative findings from the project's feasibility stage.…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted for the Second Workshop on Care Robots for Older Adults (CROA), RO-MAN 2023, Busan, Korea

  39. arXiv:2405.18361  [pdf, other]

    cs.CV

    Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?

    Authors: Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang

    Abstract: Rapid advancements in Autonomous Driving (AD) tasks turned a significant shift toward end-to-end fashion, particularly in the utilization of vision-language models (VLMs) that integrate robust logical reasoning and cognitive abilities to enable comprehensive end-to-end planning. However, these VLM-based approaches tend to integrate 2D vision tokenizers and a large language model (LLM) for ego-car…

    Submitted 28 May, 2024; originally announced May 2024.

  40. arXiv:2405.16789  [pdf, other]

    cs.IR

    NoteLLM-2: Multimodal Large Representation Models for Recommendation

    Authors: Chao Zhang, Haoxin Zhang, Shiwei Wu, Di Wu, Tong Xu, Yan Gao, Yao Hu, Enhong Chen

    Abstract: Large Language Models (LLMs) have demonstrated exceptional text understanding. Existing works explore their application in text embedding tasks. However, there are few works utilizing LLMs to assist multimodal representation tasks. In this work, we investigate the potential of LLMs to enhance multimodal representation in multimodal item-to-item (I2I) recommendations. One feasible method is the tra…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 19 pages, 5 figures

  41. arXiv:2405.15176  [pdf, other]

    cs.CV

    MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method

    Authors: Pan Liao, Feng Yang, Di Wu, Liu Bo

    Abstract: Monocular vision-based 3D object detection is crucial in various sectors, yet existing methods face significant challenges in terms of accuracy and computational efficiency. Building on the successful strategies in 2D detection and depth estimation, we propose MonoDETRNext, which seeks to optimally balance precision and processing speed. Our methodology includes the development of an efficient hyb…

    Submitted 23 May, 2024; originally announced May 2024.

  42. arXiv:2405.14691  [pdf, other]

    cs.AI cs.MA

    CityGPT: Towards Urban IoT Learning, Analysis and Interaction with Multi-Agent System

    Authors: Qinghua Guan, Jinhui Ouyang, Di Wu, Weiren Yu

    Abstract: The spatiotemporal data generated by massive sensors in the Internet of Things (IoT) is extremely dynamic, heterogeneous, large scale and time-dependent. It poses great challenges (e.g. accuracy, reliability, and stability) in real-time analysis and decision making for different IoT applications. The complexity of IoT data prevents the common people from gaining a deeper understanding of it. Agent…

    Submitted 23 May, 2024; originally announced May 2024.

  43. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  44. arXiv:2405.10812  [pdf, other]

    q-bio.GN cs.AI

    VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

    Authors: Siyuan Li, Zedong Wang, Zicheng Liu, Di Wu, Cheng Tan, Jiangbin Zheng, Yufei Huang, Stan Z. Li

    Abstract: Similar to natural language models, pre-trained genome language models are proposed to capture the underlying intricacies within genomes with unsupervised sequence modeling. They have become essential tools for researchers and practitioners in biology. However, the hand-crafted tokenization policies used in these models may not encode the most discriminative patterns from the limited vocabulary of…

    Submitted 2 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: ICML 2024. Preprint V2 with 17 pages and 5 figures

  45. arXiv:2405.07744  [pdf, other]

    cs.SE

    MoCo: Fuzzing Deep Learning Libraries via Assembling Code

    Authors: Pin Ji, Yang Feng, Duo Wu, Lingyue Yan, Pengling Chen, Jia Liu, Zhihong Zhao

    Abstract: The rapidly developing deep learning (DL) techniques have been applied in software systems with various application scenarios. However, they could also pose new safety threats with potentially serious consequences, especially in safety-critical domains. DL libraries serve as the underlying foundation for DL systems, and bugs in them can have unpredictable impacts that directly affect the behaviors…

    Submitted 13 May, 2024; originally announced May 2024.

  46. arXiv:2405.06616  [pdf, ps, other]

    math.PR cs.DS math.CO

    Fast Mixing in Sparse Random Ising Models

    Authors: Kuikui Liu, Sidhanth Mohanty, Amit Rajaraman, David X. Wu

    Abstract: Motivated by the community detection problem in Bayesian inference, as well as the recent explosion of interest in spin glasses from statistical physics, we study the classical Glauber dynamics for sampling from Ising models with sparse random interactions. It is now well-known that when the interaction matrix has spectral diameter less than $1$, Glauber dynamics mixes in $O(n\log n)$ steps. Unfor…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 66 pages, 4 figures

  47. arXiv:2405.05985  [pdf, other]

    cs.LG cs.AI

    TrafficGPT: Towards Multi-Scale Traffic Analysis and Generation with Spatial-Temporal Agent Framework

    Authors: Jinhui Ouyang, Yijie Zhu, Xiang Yuan, Di Wu

    Abstract: The precise prediction of multi-scale traffic is a ubiquitous challenge in the urbanization process for car owners, road administrators, and governments. In the case of complex road networks, current and past traffic information from both upstream and downstream roads are crucial since various road networks have different semantic information about traffic. Rationalizing the utilization of semanti…

    Submitted 8 May, 2024; originally announced May 2024.

  48. arXiv:2405.02580  [pdf, other]

    cs.SE cs.AI

    PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

    Authors: Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu

    Abstract: With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for…

    Submitted 4 May, 2024; originally announced May 2024.

  49. arXiv:2405.01844  [pdf, other]

    cs.NI cs.CR cs.DC

    A Survey on Privacy-Preserving Caching at Network Edge: Classification, Solutions, and Challenges

    Authors: Xianzhi Zhang, Yipeng Zhou, Di Wu, Shazia Riaz, Quan Z. Sheng, Miao Hu, Linchang Xiao

    Abstract: Caching content at the network edge is a popular and effective technique widely deployed to alleviate the burden of network backhaul, shorten service delay and improve service quality. However, there has been some controversy over privacy violations in caching content at the network edge. On the one hand, the multi-access open edge network provides an ideal surface for external attackers to obtain…

    Submitted 3 May, 2024; originally announced May 2024.

  50. arXiv:2405.00699  [pdf, other]

    cs.NE cs.AI cs.LG

    Direct Training Needs Regularisation: Anytime Optimal Inference Spiking Neural Network

    Authors: Dengyu Wu, Yi Qi, Kaiwen Cai, Gaojie Jin, Xinping Yi, Xiaowei Huang

    Abstract: Spiking Neural Network (SNN) is acknowledged as the next generation of Artificial Neural Network (ANN) and hold great promise in effectively processing spatial-temporal information. However, the choice of timestep becomes crucial as it significantly impacts the accuracy of the neural network training. Specifically, a smaller timestep indicates better performance in efficient computing, resulting i…

    Submitted 15 April, 2024; originally announced May 2024.