Showing 1–50 of 506 results for author: Jiang, L

  1. arXiv:2407.20018  [pdf, other]

    cs.DC

    Efficient Training of Large Language Models on Distributed Infrastructures: A Survey

    Authors: Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, Qinghao Hu, Guoteng Wang, Qizhen Weng, Hang Yan, Xingcheng Zhang, Xipeng Qiu, Dahua Lin, Yonggang Wen, Xin Jin, Tianwei Zhang, Peng Sun

    Abstract: Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with their sophisticated capabilities. Training these models requires vast GPU clusters and significant computing time, posing major challenges in terms of scalability, efficiency, and reliability. This survey explores recent advancements in training systems for LLMs, including innovations in training infrastructur…

    Submitted 29 July, 2024; originally announced July 2024.

  2. arXiv:2407.19397  [pdf, other]

    cs.CV

    Domain Adaptive Lung Nodule Detection in X-ray Image

    Authors: Haifeng Zhao, Lixiang Jiang, Leilei Ma, Dengdi Sun, Yanping Fu

    Abstract: Medical images from different healthcare centers exhibit varied data distributions, posing significant challenges for adapting lung nodule detection due to the domain shift between training and application phases. Traditional unsupervised domain adaptive detection methods often struggle with this shift, leading to suboptimal outcomes. To overcome these challenges, we introduce a novel domain adapt…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: This paper will be submitted to IEEE SMC 2024

  3. arXiv:2407.18449  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

    Authors: Jiabo Ma, Zhengrui Guo, Fengtao Zhou, Yihui Wang, Yingxue Xu, Yu Cai, Zhengjie Zhu, Cheng Jin, Yi Lin, Xinrui Jiang, Anjia Han, Li Liang, Ronald Cheong Kin Chan, Jiguang Wang, Kwang-Ting Cheng, Hao Chen

    Abstract: Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for the success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear.…

    Submitted 25 July, 2024; originally announced July 2024.

    Report number: I.2.10

  4. Adaptive Differentially Private Structural Entropy Minimization for Unsupervised Social Event Detection

    Authors: Zhiwei Yang, Yuecen Wei, Haoran Li, Qian Li, Lei Jiang, Li Sun, Xiaoyan Yu, Chunming Hu, Hao Peng

    Abstract: Social event detection refers to extracting relevant message clusters from social media data streams to represent specific events in the real world. Social event detection is important in numerous areas, such as opinion analysis, social safety, and decision-making. Most current methods are supervised and require access to large amounts of data. These methods need prior knowledge of the events and…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted to ACM CIKM 2024

  5. arXiv:2407.17468  [pdf, other]

    cs.CL cs.AI

    WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries

    Authors: Wenting Zhao, Tanya Goyal, Yu Ying Chiu, Liwei Jiang, Benjamin Newman, Abhilasha Ravichander, Khyathi Chandu, Ronan Le Bras, Claire Cardie, Yuntian Deng, Yejin Choi

    Abstract: While hallucinations of large language models (LLMs) prevail as a major challenge, existing evaluation benchmarks on factuality do not cover the diverse domains of knowledge that the real-world users of LLMs seek information about. To bridge this gap, we introduce WildHallucinations, a benchmark that evaluates factuality. It does so by prompting LLMs to generate information about entities mined fr…

    Submitted 24 July, 2024; originally announced July 2024.

  6. arXiv:2407.16394  [pdf, other]

    cs.CV

    SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval

    Authors: Longtao Jiang, Min Wang, Zecheng Li, Yao Fang, Wengang Zhou, Houqiang Li

    Abstract: Different from traditional video retrieval, sign language retrieval is more biased towards understanding the semantic information of human actions contained in video clips. Previous works typically only encode RGB videos to obtain high-level semantic features, resulting in local action details drowned in a large amount of visual information redundancy. Furthermore, existing RGB-based sign retrieva…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted to ACM International Conference on Multimedia (MM) 2024

  7. arXiv:2407.16165  [pdf, other]

    eess.IV cs.CV cs.LG

    Advanced AI Framework for Enhanced Detection and Assessment of Abdominal Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models

    Authors: Liheng Jiang, Xuechun Yang, Chang Yu, Zhizhong Wu, Yuting Wang

    Abstract: Trauma is a significant cause of mortality and disability, particularly among individuals under forty. Traditional diagnostic methods for traumatic injuries, such as X-rays, CT scans, and MRI, are often time-consuming and dependent on medical expertise, which can delay critical interventions. This study explores the application of artificial intelligence (AI) and machine learning (ML) to improve t…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 6 Pages

  8. arXiv:2407.15708  [pdf, other]

    cs.CV cs.AI

    SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams

    Authors: Liangyan Jiang, Chuang Zhu, Yanxu Chen

    Abstract: The spike camera, with its high temporal resolution, low latency, and high dynamic range, addresses high-speed imaging challenges like motion blur. It captures photons at each pixel independently, creating binary spike streams rich in temporal information but challenging for image reconstruction. Current algorithms, both traditional and deep learning-based, still need to be improved in the utiliza…

    Submitted 24 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

  9. arXiv:2407.11518  [pdf, other]

    stat.ML cs.LG stat.OT

    Ensemble Transport Filter via Optimized Maximum Mean Discrepancy

    Authors: Dengfei Zeng, Lijian Jiang

    Abstract: In this paper, we present a new ensemble-based filter method by reconstructing the analysis step of the particle filter through a transport map, which directly transports prior particles to posterior particles. The transport map is constructed through an optimization problem described by the Maximum Mean Discrepancy loss function, which matches the expectation information of the approximated poste…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 14 figures

  10. Style Alignment based Dynamic Observation Method for UAV-View Geo-localization

    Authors: Jie Shao, LingHao Jiang

    Abstract: The task of UAV-view geo-localization is to estimate the localization of a query satellite/drone image by matching it against a reference dataset consisting of drone/satellite images. Though tremendous strides have been made in feature alignment between satellite and drone views, vast differences in both inter and intra-class due to changes in viewpoint, altitude, and lighting remain a huge challe…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Published in IEEE Transactions on Geoscience and Remote Sensing, 2023

  11. arXiv:2406.19465  [pdf, other]

    cs.CL

    Can Large Language Models Generate High-quality Patent Claims?

    Authors: Lekang Jiang, Caiqi Zhang, Pascal A Scherz, Stephan Goetz

    Abstract: Large language models (LLMs) have shown exceptional performance across various text generation tasks but remain under-explored in the patent domain, which offers highly structured and precise language. This paper constructs a dataset to investigate the performance of current LLMs in patent claim generation. Our results demonstrate that generating claims based on patent descriptions outperforms pre…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages

  12. arXiv:2406.18510  [pdf, other]

    cs.CL

    WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

    Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri

    Abstract: We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with…

    Submitted 26 June, 2024; originally announced June 2024.

  13. arXiv:2406.18495  [pdf, other]

    cs.CL

    WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

    Authors: Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri

    Abstract: We introduce WildGuard -- an open, light-weight moderation tool for LLM safety that achieves three goals: (1) identifying malicious intent in user prompts, (2) detecting safety risks of model responses, and (3) determining model refusal rate. Together, WildGuard serves the increasing needs for automatic safety moderation and evaluation of LLM interactions, providing a one-stop tool with enhanced a…

    Submitted 9 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: First two authors contributed equally. Third and fourth authors contributed equally

  14. arXiv:2406.17443  [pdf, other]

    cs.CV

    Using joint angles based on the international biomechanical standards for human action recognition and related tasks

    Authors: Kevin Schlegel, Lei Jiang, Hao Ni

    Abstract: Keypoint data has received a considerable amount of attention in machine learning for tasks like action detection and recognition. However, human experts in movement such as doctors, physiotherapists, sports scientists and coaches use a notion of joint angles standardised by the International Society of Biomechanics to precisely and efficiently communicate static body poses and movements. In this…

    Submitted 25 June, 2024; originally announced June 2024.

  15. arXiv:2406.16328  [pdf, other]

    cs.CE

    Convolutional neural network based reduced order modeling for multiscale problems

    Authors: Xuhan Zhang, Lijian Jiang

    Abstract: In this paper, we combine convolutional neural networks (CNNs) with reduced order modeling (ROM) for efficient simulations of multiscale problems. These problems are modeled by partial differential equations with high-dimensional random inputs. The proposed method involves two separate CNNs: Basis CNNs and Coefficient CNNs (Coef CNNs), which correspond to two main parts of ROM. The method is calle…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 35 pages, 29 figures

  16. arXiv:2406.13987  [pdf]

    cs.CV cs.LG

    Image anomaly detection and prediction scheme based on SSA optimized ResNet50-BiGRU model

    Authors: Qianhui Wan, Zecheng Zhang, Liheng Jiang, Zhaoqi Wang, Yan Zhou

    Abstract: Image anomaly detection is a popular research direction, with many methods emerging in recent years due to rapid advancements in computing. The use of artificial intelligence for image anomaly detection has been widely studied. By analyzing images of athlete posture and movement, it is possible to predict injury status and suggest necessary adjustments. Most existing methods rely on convolutional…

    Submitted 20 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  17. arXiv:2406.10744  [pdf, other]

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, et al. (75 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

  18. arXiv:2406.10148  [pdf, other]

    math.OC cs.LG stat.ML

    A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints

    Authors: Liuyuan Jiang, Quan Xiao, Victor M. Tenorio, Fernando Real-Rojas, Antonio Marques, Tianyi Chen

    Abstract: Interest in bilevel optimization has grown in recent years, partially due to its applications to tackle challenging machine-learning problems. Several exciting recent works have been centered around developing efficient gradient-based algorithms that can solve bilevel optimization problems with provable guarantees. However, the existing literature mainly focuses on bilevel problems either without…

    Submitted 14 June, 2024; originally announced June 2024.

  19. arXiv:2406.07961  [pdf, other]

    cs.CV cs.AI

    Accurate Explanation Model for Image Classifiers using Class Association Embedding

    Authors: Ruitao Xie, Jingbang Chen, Limai Jiang, Rui Xiao, Yi Pan, Yunpeng Cai

    Abstract: Image classification is a primary task in data analysis where explainable models are crucially demanded in various applications. Although amounts of methods have been proposed to obtain explainable knowledge from the black-box classifiers, these approaches lack the efficiency of extracting global knowledge regarding the classification task, thus is vulnerable to local traps and often leads to poor…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 40th IEEE International Conference on Data Engineering

  20. arXiv:2406.05673  [pdf, other]

    cs.AI cs.CL

    Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking

    Authors: Fangxu Yu, Lai Jiang, Haoqiang Kang, Shibo Hao, Lianhui Qin

    Abstract: Divergent thinking, the cognitive process of generating diverse solutions, is a hallmark of human creativity and problem-solving. For machines, sampling diverse solution trajectories in complex reasoning problems is crucial for robust outcomes, data augmentation, and enhanced model generalization. Large language models (LLMs) often struggle with generating high-quality, diverse reasoning. While su…

    Submitted 24 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  21. arXiv:2406.05637  [pdf, ps, other]

    math.OC cs.LG math.PR stat.ML

    A Generalized Version of Chung's Lemma and its Applications

    Authors: Li Jiang, Xiao Li, Andre Milzarek, Junwen Qiu

    Abstract: Chung's lemma is a classical tool for establishing asymptotic convergence rates of (stochastic) optimization methods under strong convexity-type assumptions and appropriate polynomial diminishing step sizes. In this work, we develop a generalized version of Chung's lemma, which provides a simple non-asymptotic convergence framework for a more general family of step size rules. We demonstrate broad…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 43 pages, 5 figures

    MSC Class: 90C15; 90C30; 90C26

  22. arXiv:2406.03271  [pdf, other]

    cs.CV

    Image Copy-Move Forgery Detection and Localization Scheme: How to Avoid Missed Detection and False Alarm

    Authors: Li Jiang, Zhaowei Lu, Yuebing Gao, Yifan Wang

    Abstract: Image copy-move is an operation that replaces one part of the image with another part of the same image, which can be used for illegal purposes due to the potential semantic changes. Recent studies have shown that keypoint-based algorithms achieved excellent and robust localization performance even when small or smooth tampered areas were involved. However, when the input image is low-resolution,…

    Submitted 5 June, 2024; originally announced June 2024.

  23. arXiv:2406.02856  [pdf, other]

    cs.CL cs.AI

    Xmodel-LM Technical Report

    Authors: Yichuan Wang, Yang Liu, Yu Yan, Qun Wang, Xucheng Huang, Ling Jiang

    Abstract: We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints…

    Submitted 26 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  24. arXiv:2406.02746  [pdf, other]

    cs.CL

    RATT: A Thought Structure for Coherent and Correct LLM Reasoning

    Authors: Jinghan Zhang, Xiting Wang, Weijieying Ren, Lu Jiang, Dongjie Wang, Kunpeng Liu

    Abstract: Large Language Models (LLMs) gain substantial reasoning and decision-making capabilities from thought structures. However, existing methods such as Tree of Thought and Retrieval Augmented Thoughts often fall short in complex tasks due to the limitations of insufficient local retrieval of factual knowledge and inadequate global selection of strategies. These limitations make it challenging for thes…

    Submitted 11 July, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  25. arXiv:2406.01281  [pdf]

    physics.med-ph cs.HC

    Noninvasive Extraction of Maternal and Fetal ECG using Periodic Progressive FastICA Peel-off

    Authors: Yao Li, Xuanyu Luo, Haowen Zhao, Jiawen Cui, Yangfan She, Dongfang Li, Lai Jiang, Xu Zhang

    Abstract: The abdominal electrocardiogram (AECG) gives a safe and non-invasive way to monitor fetal well-being during pregnancy. However, due to the overlap with maternal ECG (MECG) as well as significant external noise, it is challenging to extract weak fetal ECG (FECG) using surface electrodes. In this study, we introduce a novel periodic progressive FastICA peel-off (PPFP) method for noninvasive extracti…

    Submitted 22 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  26. arXiv:2405.14231  [pdf, other]

    cs.CL

    From Role-Play to Drama-Interaction: An LLM Solution

    Authors: Weiqi Wu, Hongqiu Wu, Lai Jiang, Xingyuan Liu, Jiale Hong, Hai Zhao, Min Zhang

    Abstract: Drama is a form of storytelling inspired by human creativity, proceeding with a predefined storyline, carrying emotions and thoughts. This paper introduces LLM-based interactive drama, which endows traditional drama with an unprecedented immersion, where a person is allowed to walk into it and interact with the characters and scenes. We define this new artistic genre by 6 essential elements…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  27. arXiv:2405.13762  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

    Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the a…

    Submitted 22 May, 2024; originally announced May 2024.

  28. arXiv:2405.11647  [pdf, other]

    cs.AI cs.LG

    Hummer: Towards Limited Competitive Preference Dataset

    Authors: Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng

    Abstract: Preference datasets are essential for incorporating human preferences into pre-trained language models, playing a key role in the success of Reinforcement Learning from Human Feedback. However, these datasets often demonstrate conflicting alignment objectives, leading to increased vulnerability to jailbreak attacks and challenges in adapting downstream tasks to prioritize specific alignment object…

    Submitted 20 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: 9 pages, 5 figures

  29. arXiv:2405.11607  [pdf, other]

    cs.CR cs.AR

    OFHE: An Electro-Optical Accelerator for Discretized TFHE

    Authors: Mengxin Zheng, Cheng Chu, Qian Lou, Nathan Youngblood, Mo Li, Sajjad Moazeni, Lei Jiang

    Abstract: This paper presents OFHE, an electro-optical accelerator designed to process Discretized TFHE (DTFHE) operations, which encrypt multi-bit messages and support homomorphic multiplications, lookup table operations and full-domain functional bootstrappings. While DTFHE is more efficient and versatile than other fully homomorphic encryption schemes, it requires 32-, 64-, and 128-bit polynomia…

    Submitted 19 May, 2024; originally announced May 2024.

  30. arXiv:2405.11464  [pdf, other]

    cs.CL cs.AI cs.LG

    Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion

    Authors: Pengxiang Lan, Enneng Yang, Yuting Liu, Guibing Guo, Linying Jiang, Jianzhe Zhao, Xingwei Wang

    Abstract: Prompt tuning is a promising method to fine-tune a pre-trained language model without retraining its large-scale parameters. Instead, it attaches a soft prompt to the input text, whereby downstream tasks can be well adapted by merely learning the embeddings of prompt tokens. Nevertheless, existing methods still suffer from two challenges: (i) they are hard to balance accuracy and efficiency. A lon…

    Submitted 1 July, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

  31. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  32. arXiv:2405.09215  [pdf, other]

    cs.CV cs.AI

    Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

    Authors: Wanting Xu, Yang Liu, Langping He, Xucheng Huang, Ling Jiang

    Abstract: We introduce Xmodel-VLM, a cutting-edge multimodal vision language model. It is designed for efficient deployment on consumer GPU servers. Our work directly confronts a pivotal industry issue by grappling with the prohibitive service costs that hinder the broad adoption of large-scale multimodal systems. Through rigorous training, we have developed a 1B-scale language model from the ground up, emp…

    Submitted 20 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  33. arXiv:2405.08403  [pdf, other]

    cs.LG

    TFWT: Tabular Feature Weighting with Transformer

    Authors: Xinhao Zhang, Zaitian Wang, Lu Jiang, Wanfu Gao, Pengfei Wang, Kunpeng Liu

    Abstract: In this paper, we propose a novel feature weighting method to address the limitation of existing feature processing methods for tabular data. Typically the existing methods assume equal importance across all samples and features in one dataset. This simplified processing methods overlook the unique contributions of each feature, and thus may miss important feature information. As a result, it lead…

    Submitted 17 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  34. arXiv:2405.07530  [pdf, other]

    cs.SE

    Prompt-based Code Completion via Multi-Retrieval Augmented Generation

    Authors: Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

    Abstract: Automated code completion, aiming at generating subsequent tokens from unfinished code, has been significantly benefited from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) technique…

    Submitted 13 May, 2024; originally announced May 2024.

  35. arXiv:2405.04032  [pdf, other]

    cs.CR cs.AI

    Locally Differentially Private In-Context Learning

    Authors: Chunyan Zheng, Keke Sun, Wenhao Zhao, Haibo Zhou, Lixin Jiang, Shaoyang Song, Chunlai Zhou

    Abstract: Large pretrained language models (LLMs) have shown surprising In-Context Learning (ICL) ability. An important application in deploying large language models is to augment LLMs with a private database for some specific task. The main problem with this promising commercial use is that LLMs have been shown to memorize their training data and their prompt data are vulnerable to membership inference at…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: This paper was published at LREC-Coling 2024

  36. arXiv:2405.03280  [pdf, other]

    cs.CV cs.AI

    Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity

    Authors: Yizhuo Lu, Changde Du, Chong Wang, Xuanliu Zhu, Liuyun Jiang, Huiguang He

    Abstract: Reconstructing human dynamic vision from brain activity is a challenging task with great scientific significance. The difficulty stems from two primary issues: (1) vision-processing mechanisms in the brain are highly intricate and not fully revealed, making it challenging to directly learn a mapping between fMRI and video; (2) the temporal resolution of fMRI is significantly lower than that of nat…

    Submitted 6 May, 2024; originally announced May 2024.

  37. arXiv:2405.02798  [pdf, other]

    cs.SI

    Structural Balance in Real-World Social Networks: Incorporating Direction and Transitivity in Measuring Partial Balance

    Authors: Rezvaneh Rezapour, Ly Dinh, Lan Jiang, Jana Diesner

    Abstract: Structural balance theory predicts that triads in networks gravitate towards stable configurations. The theory has been verified for undirected graphs. Since real-world networks are often directed, we introduce a novel method for considering both transitivity and sign consistency for evaluating partial balance in signed digraphs. We test our approach on graphs constructed by using different method…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2006.02565

  38. arXiv:2405.02155  [pdf, other]

    cs.CV

    Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification

    Authors: Siqi Yin, Lifan Jiang

    Abstract: This paper introduces a novel framework for zero-shot learning (ZSL), i.e., to recognize new categories that are unseen during training, by using a multi-model and multi-alignment integration method. Specifically, we propose three strategies to enhance the model's performance to handle ZSL: 1) Utilizing the extensive knowledge of ChatGPT and the powerful image generation capabilities of DALL-E to…

    Submitted 3 May, 2024; originally announced May 2024.

  39. arXiv:2404.17199  [pdf, other]

    cs.CV

    Few-shot Calligraphy Style Learning

    Authors: Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng

    Abstract: We introduced "Presidifussion," a novel approach to learning and replicating the unique style of calligraphy of President Xu, using a pretrained diffusion model adapted through a two-stage training process. Initially, our model is pretrained on a diverse dataset containing works from various calligraphers. This is followed by fine-tuning on a smaller, specialized dataset of President Xu's calligra…

    Submitted 26 April, 2024; originally announced April 2024.

  40. arXiv:2404.13285  [pdf, other]

    cs.HC

    ARtivism: AR-Enabled Accessible Public Art and Advocacy

    Authors: Lucy Jiang

    Abstract: Activism can take a multitude of forms, including protests, social media campaigns, and even public art. The uniqueness of public art lies in that both the act of creation and the artifacts created can serve as activism. Furthermore, public art is often site-specific and can be created with (e.g., commissioned murals) or without permission (e.g., graffiti art) of the site's owner. However, the maj…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Presented at CHI 2024 (arXiv:2404.05889)

    Report number: ARSJ/2024/04

  41. arXiv:2404.10199  [pdf, other]

    cs.CL cs.AI

    CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting

    Authors: Huihan Li, Liwei Jiang, Jena D. Huang, Hyunwoo Kim, Sebastin Santy, Taylor Sorensen, Bill Yuchen Lin, Nouha Dziri, Xiang Ren, Yejin Choi

    Abstract: As the utilization of large language models (LLMs) has proliferated worldwide, it is crucial for them to have adequate knowledge and fair representation for diverse global cultures. In this work, we uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations, and extract symbols from these generations that are as…

    Submitted 26 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  42. arXiv:2404.07773  [pdf, other]

    cs.CV

    ConsistencyDet: A Robust Object Detector with a Denoising Paradigm of Consistency Model

    Authors: Lifan Jiang, Zhihui Wang, Changmiao Wang, Ming Li, Jiaxu Leng, Xindong Wu

    Abstract: Object detection, a quintessential task in the realm of perceptual computing, can be tackled using a generative methodology. In the present study, we introduce a novel framework designed to articulate object detection as a denoising diffusion process, which operates on the perturbed bounding boxes of annotated entities. This framework, termed ConsistencyDet, leverages an innovative denoising conce…

    Submitted 14 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  43. arXiv:2404.07577  [pdf, other]

    cs.LG eess.SP

    Generating Comprehensive Lithium Battery Charging Data with Generative AI

    Authors: Lidang Jiang, Changyan Hu, Sibei Ji, Hang Zhao, Junxiong Chen, Ge He

    Abstract: In optimizing performance and extending the lifespan of lithium batteries, accurate state prediction is pivotal. Traditional regression and classification methods have achieved some success in battery state prediction. However, the efficacy of these data-driven approaches heavily relies on the availability and quality of public datasets. Additionally, generating electrochemical data predominantly…

    Submitted 11 April, 2024; originally announced April 2024.

  44. arXiv:2404.07155  [pdf, other]

    cs.CV

    Unified Language-driven Zero-shot Domain Adaptation

    Authors: Senqiao Yang, Zhuotao Tian, Li Jiang, Jiaya Jia

    Abstract: This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA), a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge. We identify the constraints in the existing language-driven zero-shot domain adaptation task, particularly the requirement for domain IDs and domain-specific models, which may restrict flexibility…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  45. arXiv:2404.06664  [pdf, other]

    cs.CL cs.AI cs.HC

    CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge

    Authors: Yu Ying Chiu, Liwei Jiang, Maria Antoniak, Chan Young Park, Shuyue Stella Li, Mehar Bhatia, Sahithya Ravi, Yulia Tsvetkov, Vered Shwartz, Yejin Choi

    Abstract: Frontier large language models (LLMs) are developed by researchers and practitioners with skewed cultural backgrounds and on datasets with skewed sources. However, LLMs' (lack of) multicultural knowledge cannot be effectively assessed with current methods for developing benchmarks. Existing multicultural evaluations primarily rely on expensive and restricted human annotations or potentially outdat…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Preprint (under review)

  46. arXiv:2403.17898  [pdf, other]

    cs.CV

    Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians

    Authors: Kerui Ren, Lihan Jiang, Tao Lu, Mulin Yu, Linning Xu, Zhangkai Ni, Bo Dai

    Abstract: The recent 3D Gaussian splatting (3D-GS) has shown remarkable rendering fidelity and efficiency compared to NeRF-based neural scene representations. While demonstrating the potential for real-time rendering, 3D-GS encounters rendering bottlenecks in large scenes with complex details due to an excessive number of Gaussian primitives located within the viewing frustum. This limitation is particularl…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Project page: https://city-super.github.io/octree-gs/

  47. arXiv:2403.16964  [pdf, other]

    cs.CV

    GSDF: 3DGS Meets SDF for Improved Rendering and Reconstruction

    Authors: Mulin Yu, Tao Lu, Linning Xu, Lihan Jiang, Yuanbo Xiangli, Bo Dai

    Abstract: Presenting a 3D scene from multiview images remains a core and long-standing challenge in computer vision and computer graphics. Two main requirements lie in rendering and reconstruction. Notably, SOTA rendering quality is usually achieved with neural volumetric rendering techniques, which rely on aggregated point/primitive-wise color and neglect the underlying scene geometry. Learning of neural i…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Project page: https://city-super.github.io/GSDF

  48. arXiv:2403.15212  [pdf, other]

    cs.CV

    GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition

    Authors: Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni

    Abstract: Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision. The recent state-of-the-art (SOTA) models for SAR are primarily based on graph convolutional neural networks (GCNs), which are powerful in extracting the spatial information of skeleton data. However, it is not yet clear that such GCN-based models can effectively capture the temporal dynamics of…

    Submitted 26 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  49. arXiv:2403.14791  [pdf, other]

    cs.CY cs.AI

    Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits

    Authors: Jimin Mun, Liwei Jiang, Jenny Liang, Inyoung Cheong, Nicole DeCario, Yejin Choi, Tadayoshi Kohno, Maarten Sap

    Abstract: General purpose AI, such as ChatGPT, seems to have lowered the barriers for the public to use AI and harness its power. However, the governance and development of AI still remain in the hands of a few, and the pace of development is accelerating without a comprehensive assessment of risks. As a first step towards democratic risk assessment and design of general purpose AI, we introduce PARTICIP-AI…

    Submitted 26 July, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: AIES 2024, 34 pages, 4 figures, 23 tables

  50. arXiv:2403.14418  [pdf, other]

    cs.CV

    OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

    Authors: Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

    Abstract: The booming of 3D recognition in the 2020s began with the introduction of point cloud transformers. They quickly overwhelmed sparse CNNs and became state-of-the-art models, especially in 3D semantic segmentation. However, sparse CNNs are still valuable networks, due to their efficiency and ease of application. In this work, we reexamine the design distinctions and test the limits of what…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: CVPR 2024