Skip to main content

Showing 1–50 of 460 results for author: Ren, J

  1. arXiv:2407.18035  [pdf, other

    cs.CV cs.AI cs.CL

    RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models

    Authors: Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Sixiang Chen, Tian Ye, Renjing Pei, Kaiwen Zhou, Fenglong Song, Lei Zhu

    Abstract: Natural images captured by mobile devices often suffer from multiple types of degradation, such as noise, blur, and low light. Traditional image restoration methods require manual selection of specific tasks, algorithms, and execution sequences, which is time-consuming and may yield suboptimal results. All-in-one models, though capable of handling multiple tasks, typically support only a limited r… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

  2. arXiv:2407.15431  [pdf, other

    cs.SI cs.AI cs.LG

    Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs

    Authors: Huanjing Zhao, Beining Yang, Yukuo Cen, Junyu Ren, Chenhui Zhang, Yuxiao Dong, Evgeny Kharlamov, Shu Zhao, Jie Tang

    Abstract: The text-attributed graph (TAG) is one kind of important real-world graph-structured data with each node associated with raw texts. For TAGs, traditional few-shot node classification methods directly conduct training on the pre-processed node features and do not consider the raw texts. The performance is highly dependent on the choice of the feature pre-processing method. In this paper, we propose… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted to KDD'24

  3. arXiv:2407.15320  [pdf, other

    cs.DC cs.AI cs.LG cs.NI

    Edge Graph Intelligence: Reciprocally Empowering Edge Networks with Graph Intelligence

    Authors: Liekang Zeng, Shengyuan Ye, Xu Chen, Xiaoxi Zhang, Ju Ren, Jian Tang, Yang Yang, Xuemin, Shen

    Abstract: Recent years have witnessed a thriving growth of computing facilities connected at the network edge, cultivating edge computing networks as a fundamental infrastructure for supporting miscellaneous intelligent services. Meanwhile, Artificial Intelligence frontiers have extrapolated Machine Learning to the graph domain and promoted Graph Intelligence (GI), which unlocks unprecedented ability in lea… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 38 pages, 14 figures

  4. arXiv:2407.11966  [pdf, other

    cs.CV cs.AI cs.LG

    Efficient Training with Denoised Neural Weights

    Authors: Yifan Gong, Zheng Zhan, Yanyu Li, Yerlan Idelbayev, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

    Abstract: Good weight initialization serves as an effective measure to reduce the training cost of a deep neural network (DNN) model. The choice of how to initialize parameters is challenging and may require manual tuning, which can be time-consuming and prone to human error. To overcome such limitations, this work takes a novel step towards building a weight generator to synthesize the neural weights for i… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: ECCV 2024. Project Page: https://yifanfanfanfan.github.io/denoised-weights/

  5. arXiv:2407.07850  [pdf, other

    cs.DC

    Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper

    Authors: Gabin Schieffer, Jacob Wahlgren, Jie Ren, Jennifer Faj, Ivy Peng

    Abstract: Memory management across discrete CPU and GPU physical memory is traditionally achieved through explicit GPU allocations and data copy or unified virtual memory. The Grace Hopper Superchip, for the first time, supports an integrated CPU-GPU system page table, hardware-level addressing of system allocated memory, and cache-coherent NVLink-C2C interconnect, bringing an alternative solution for enabl… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted to ICPP '24 (The 53rd International Conference on Parallel Processing)

  6. arXiv:2407.05680  [pdf, other

    cs.CV cs.AI

    Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering

    Authors: Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu

    Abstract: Reconstructing high-fidelity hand models with intricate textures plays a crucial role in enhancing human-object interaction and advancing real-world applications. Despite the state-of-the-art methods excelling in texture generation and image rendering, they often face challenges in accurately capturing geometric details. Learning-based approaches usually offer better robustness and faster inferenc… ▽ More

    Submitted 8 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by AAAI 2024

  7. arXiv:2407.05232  [pdf, other

    cs.LG

    PAPM: A Physics-aware Proxy Model for Process Systems

    Authors: Pengwei Liu, Zhongkai Hao, Xingyu Ren, Hangjie Yuan, Jiayang Ren, Dong Ni

    Abstract: In the context of proxy modeling for process systems, traditional data-driven deep learning approaches frequently encounter significant challenges, such as substantial training costs induced by large amounts of data, and limited generalization capabilities. As a promising alternative, physics-aware models incorporate partial physics knowledge to ameliorate these challenges. Although demonstrating… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  8. arXiv:2407.02158  [pdf, other

    cs.CV

    UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

    Authors: Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu

    Abstract: Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands. We present UltraPixel, a novel architecture utilizing cascade diffusion models to generate high-quality images at multiple resolutions (\textit{e.g.}, 1K to 6K) within a single model, while maintaining comp… ▽ More

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Project page https://jingjingrenabc.github.io/ultrapixel

  9. arXiv:2407.00867  other

    cs.DC

    Proceedings of 3rd Workshop on Heterogeneous Composable and Disaggregated Systems

    Authors: Christian Pinto, Dong Li, Thaleia Dimitra Doudali, Christina Giannoula, Jie Ren

    Abstract: The future of computing systems is inevitably embracing a disaggregated and composable pattern: from clusters of computers to pools of resources that can be dynamically combined together and tailored around applications requirements. Transitioning to this new paradigm requires ground-breaking research, ranging from new hardware architectures up to new models and abstractions at all levels of the s… ▽ More

    Submitted 22 April, 2024; originally announced July 2024.

    Comments: Proceedings of 3rd Workshop on Heterogeneous Composable and Disaggregated Systems

  10. arXiv:2406.19434  [pdf, other

    cs.GR cs.AI

    Lightweight Predictive 3D Gaussian Splats

    Authors: Junli Cao, Vidit Goel, Chaoyang Wang, Anil Kag, Ju Hu, Sergei Korolev, Chenfanfu Jiang, Sergey Tulyakov, Jian Ren

    Abstract: Recent approaches representing 3D objects and scenes using Gaussian splats show increased rendering speed across a variety of platforms and devices. While rendering such representations is indeed extremely efficient, storing and transmitting them is often prohibitively expensive. To represent large-scale scenes, one often needs to store millions of 3D Gaussians, occupying gigabytes of disk space.… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Project Page: https://plumpuddings.github.io/LPGS//

  11. arXiv:2406.14855  [pdf, other

    cs.CV cs.CR

    Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models

    Authors: Jie Ren, Kangrui Chen, Yingqian Cui, Shenglai Zeng, Hui Liu, Yue Xing, Jiliang Tang, Lingjuan Lyu

    Abstract: Text-to-image (T2I) diffusion models have shown exceptional capabilities in generating images that closely correspond to textual prompts. However, the advancement of T2I diffusion models presents significant risks, as the models could be exploited for malicious purposes, such as generating images with violence or nudity, or creating unauthorized portraits of public figures in inappropriate context… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  12. arXiv:2406.14773  [pdf, other

    cs.CR

    Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Jie Ren, Tianqi Zheng, Hanqing Lu, Han Xu, Hui Liu, Yue Xing, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources. However, when the retrieval process involves private data, RAG systems may face severe privacy risks, potentially leading to the leakage of sensitive information. To address this issue, we propose using synthetic data as a privacy-preserving al… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.13933  [pdf, other

    cs.CR

    EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations

    Authors: Jie Ren, Yingqian Cui, Chen Chen, Vikash Sehwag, Yue Xing, Jiliang Tang, Lingjuan Lyu

    Abstract: Generative models, especially text-to-image diffusion models, have significantly advanced in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While the datasets play an important role, their protection has remained as an unsolved issue. Current protection strategies, such as watermarks and membership inference, are e… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  14. arXiv:2406.10667  [pdf, other

    cs.LG

    UniZero: Generalized and Efficient Planning with Scalable Latent World Models

    Authors: Yuan Pu, Yazhe Niu, Jiyuan Ren, Zhenjie Yang, Hongsheng Li, Yu Liu

    Abstract: Learning predictive world models is essential for enhancing the planning capabilities of reinforcement learning agents. Notably, the MuZero-style algorithms, based on the value equivalence principle and Monte Carlo Tree Search (MCTS), have achieved superhuman performance in various domains. However, in environments that require capturing long-term dependencies, MuZero's performance deteriorates ra… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 32 pages, 16 figures

  15. arXiv:2406.10324  [pdf, other

    cs.CV cs.LG

    L4GM: Large 4D Gaussian Reconstruction Model

    Authors: Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling

    Abstract: We present L4GM, the first 4D Large Reconstruction Model that produces animated objects from a single-view video input -- in a single feed-forward pass that takes only a second. Key to our success is a novel dataset of multiview videos containing curated, rendered animated objects from Objaverse. This dataset depicts 44K diverse objects with 110K animations rendered in 48 viewpoints, resulting in… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/l4gm

  16. arXiv:2406.07255  [pdf, other

    cs.CV eess.IV

    Towards Realistic Data Generation for Real-World Super-Resolution

    Authors: Long Peng, Wenbo Li, Renjing Pei, Jingjing Ren, Xueyang Fu, Yang Wang, Yang Cao, Zheng-Jun Zha

    Abstract: Existing image super-resolution (SR) techniques often fail to generalize effectively in complex real-world settings due to the significant divergence between training data and practical scenarios. To address this challenge, previous efforts have either manually simulated intricate physical-based degradations or utilized learning-based techniques, yet these approaches remain inadequate for producin… ▽ More

    Submitted 11 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  17. arXiv:2406.05700  [pdf, other

    cs.CV eess.IV

    HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model

    Authors: Hang Fu, Genyun Sun, Yinhe Li, Jinchang Ren, Aizhu Zhang, Cheng Jing, Pedram Ghamisi

    Abstract: Haze contamination in hyperspectral remote sensing images (HSI) can lead to spatial visibility degradation and spectral distortion. Haze in HSI exhibits spatial irregularity and inhomogeneous spectral distribution, with few dehazing networks available. Current CNN and Transformer-based dehazing methods fail to balance global scene recovery, local detail retention, and computational efficiency. Ins… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  18. arXiv:2406.04333  [pdf, other

    cs.CV

    BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

    Authors: Yang Sui, Yanyu Li, Anil Kag, Yerlan Idelbayev, Junli Cao, Ju Hu, Dhritiman Sagar, Bo Yuan, Sergey Tulyakov, Jian Ren

    Abstract: Diffusion-based image generation models have achieved great success in recent years by showing the capability of synthesizing high-quality content. However, these models contain a huge number of parameters, resulting in a significantly large model size. Saving and transferring them is a major bottleneck for various applications, especially those running on resource-constrained devices. In this wor… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project Page: https://snap-research.github.io/BitsFusion

  19. arXiv:2406.04324  [pdf, other

    cs.CV eess.IV

    SF-V: Single Forward Video Generation Model

    Authors: Zhixing Zhang, Yanyu Li, Yushu Wu, Yanwu Xu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Dimitris Metaxas, Sergey Tulyakov, Jian Ren

    Abstract: Diffusion-based video generation models have demonstrated remarkable success in obtaining high-fidelity videos through the iterative denoising process. However, these models require multiple denoising steps during sampling, resulting in high computational costs. In this work, we propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune p… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project Page: https://snap-research.github.io/SF-V

  20. arXiv:2405.20614  [pdf, other

    cs.CV

    EPIDetect: Video-based convulsive seizure detection in chronic epilepsy mouse model for anti-epilepsy drug screening

    Authors: Junming Ren, Zhoujian Xiao, Yujia Zhang, Yujie Yang, Ling He, Ezra Yoon, Stephen Temitayo Bello, Xi Chen, Dapeng Wu, Micky Tortorella, Jufang He

    Abstract: In the preclinical translational studies, drug candidates with remarkable anti-epileptic efficacy demonstrate long-term suppression of spontaneous recurrent seizures (SRSs), particularly convulsive seizures (CSs), in mouse models of chronic epilepsy. However, the current methods for monitoring CSs have limitations in terms of invasiveness, specific laboratory settings, high cost, and complex opera… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

  21. arXiv:2405.18737  [pdf

    cs.CV

    WLC-Net: a robust and fast deep-learning wood-leaf classification method

    Authors: Hanlong Li, Pei Wang, Yuhan Wu, Jing Ren, Yuhang Gao, Lingyun Zhang, Mingtai Zhang, Wenxin Chen

    Abstract: Wood-leaf classification is an essential and fundamental prerequisite in the analysis and estimation of forest attributes from terrestrial laser scanning (TLS) point clouds,including critical measurements such as diameter at breast height(DBH),above-ground biomass(AGB),wood volume.To address this,we introduce the Wood-Leaf Classification Network(WLC-Net),a deep learning model derived from PointNet… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 41 pages, 14 figures, 5 tables

    ACM Class: I.4.6

  22. arXiv:2405.17477  [pdf, other

    cs.LG cs.AI

    OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning

    Authors: Sheng Yue, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: In this paper, we study offline-to-online Imitation Learning (IL) that pretrains an imitation policy from static demonstration data, followed by fast finetuning with minimal environmental interaction. We find the naïve combination of existing offline IL and online IL methods tends to behave poorly in this context, because the initial discriminator (often used in online IL) operates randomly and di… ▽ More

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  23. arXiv:2405.17476  [pdf, other

    cs.LG cs.AI

    How to Leverage Diverse Demonstrations in Offline Imitation Learning

    Authors: Sheng Yue, Jiani Liu, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data in many real-world domains. A fundamental problem in this scenario is how to extract positive behaviors from noisy data. In general, current approaches to the problem select data building on state-action similarity to given expert demonstrations, neglecting precious… ▽ More

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  24. arXiv:2405.17474  [pdf, other

    cs.LG cs.AI

    Federated Offline Policy Optimization with Dual Regularization

    Authors: Sheng Yue, Zerui Qin, Xingyuan Hua, Yongheng Deng, Ju Ren

    Abstract: Federated Reinforcement Learning (FRL) has been deemed as a promising solution for intelligent decision-making in the era of Artificial Internet of Things. However, existing FRL approaches often entail repeated interactions with the environment during local updating, which can be prohibitively expensive or even infeasible in many real-world domains. To overcome this challenge, this paper proposes… ▽ More

    Submitted 28 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: IEEE International Conference on Computer Communications (INFOCOM)

  25. arXiv:2405.17471  [pdf, other

    cs.LG cs.AI

    Momentum-Based Federated Reinforcement Learning with Interaction and Communication Efficiency

    Authors: Sheng Yue, Xingyuan Hua, Lili Chen, Ju Ren

    Abstract: Federated Reinforcement Learning (FRL) has garnered increasing attention recently. However, due to the intrinsic spatio-temporal non-stationarity of data distributions, the current approaches typically suffer from high interaction and communication costs. In this paper, we introduce a new FRL algorithm, named $\texttt{MFPO}$, that utilizes momentum, importance sampling, and additional server-side… ▽ More

    Submitted 28 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: IEEE International Conference on Computer Communications (INFOCOM)

  26. arXiv:2405.17426  [pdf, other

    cs.CV cs.RO

    Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving

    Authors: Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Recent advancements in bird's eye view (BEV) representations have shown remarkable promise for in-vehicle 3D perception. However, while these methods have achieved impressive results on standard benchmarks, their robustness in varied conditions remains insufficiently assessed. In this study, we present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. Thi… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 13 figures, 11 tables; Code at this https URL: https://github.com/Daniel-xsy/RoboBEV

  27. arXiv:2405.15705  [pdf, other

    cs.AR eess.SY

    Sums: Sniffing Unknown Multiband Signals under Low Sampling Rates

    Authors: Jinbo Peng, Zhe Chen, Zheng Lin, Haoxuan Yuan, Zihan Fang, Lingzhong Bao, Zihang Song, Ying Li, Jing Ren, Yue Gao

    Abstract: Due to sophisticated deployments of all kinds of wireless networks (e.g., 5G, Wi-Fi, Bluetooth, LEO satellite, etc.), multiband signals distribute in a large bandwidth (e.g., from 70 MHz to 8 GHz). Consequently, for network monitoring and spectrum sharing applications, a sniffer for extracting physical layer information, such as structure of packet, with low sampling rate (especially, sub-Nyquist… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 12 pages, 9 figures

  28. arXiv:2405.14209  [pdf, other

    cs.PF cs.AR

    Exploring and Evaluating Real-world CXL: Use Cases and System Adoption

    Authors: Jie Liu, Xi Wang, Jianbo Wu, Shuangyan Yang, Jie Ren, Bhanu Shankar, Dong Li

    Abstract: Compute eXpress Link (CXL) is emerging as a promising memory interface technology. Because of the common unavailiability of CXL devices, the performance of the CXL memory is largely unknown. What are the use cases for the CXL memory? What are the impacts of the CXL memory on application performance? How to use the CXL memory in combination with existing memory components? In this work, we study th… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  29. arXiv:2405.14018  [pdf, other

    cs.CR cs.LG stat.AP

    Watermarking Generative Tabular Data

    Authors: Hengzhi He, Peiyu Yu, Junpeng Ren, Ying Nian Wu, Guang Cheng

    Abstract: In this paper, we introduce a simple yet effective tabular data watermarking mechanism with statistical guarantees. We show theoretically that the proposed watermark can be effectively detected, while faithfully preserving the data fidelity, and also demonstrates appealing robustness against additive noise attack. The general idea is to achieve the watermarking through a strategic embedding based… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  30. arXiv:2405.10288  [pdf, other

    cs.CL cs.AI

    Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction

    Authors: Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu

    Abstract: Facts extraction is pivotal for constructing knowledge graphs. Recently, the increasing demand for temporal facts in downstream tasks has led to the emergence of the task of temporal fact extraction. In this paper, we specifically address the extraction of temporal facts from natural language text. Previous studies fail to handle the challenge of establishing time-to-fact correspondences in comple… ▽ More

    Submitted 18 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL2024 main conference

  31. arXiv:2405.05258  [pdf, other

    cs.CV cs.LG cs.RO

    Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

    Authors: Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

    Abstract: Efficient data utilization is crucial for advancing 3D scene understanding in autonomous driving, where reliance on heavily human-annotated LiDAR point clouds challenges fully supervised methods. Addressing this, our study extends into semi-supervised learning for LiDAR semantic segmentation, leveraging the intrinsic spatial priors of driving scenes and multi-sensor complements to augment the effi… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 6 figures, 8 tables; Code at https://github.com/ldkong1205/LaserMix

  32. arXiv:2405.04867  [pdf, other

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng , et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  33. arXiv:2405.02191  [pdf

    cs.CV cs.LG eess.IV

    Non-Destructive Peat Analysis using Hyperspectral Imaging and Machine Learning

    Authors: Yijun Yan, Jinchang Ren, Barry Harrison, Oliver Lewis, Yinhe Li, Ping Ma

    Abstract: Peat, a crucial component in whisky production, imparts distinctive and irreplaceable flavours to the final product. However, the extraction of peat disrupts ancient ecosystems and releases significant amounts of carbon, contributing to climate change. This paper aims to address this issue by conducting a feasibility study on enhancing peat use efficiency in whisky manufacturing through non-destru… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 4 pages,4 figures

  34. arXiv:2405.00736  [pdf, other

    eess.SP cs.LG

    Joint Signal Detection and Automatic Modulation Classification via Deep Learning

    Authors: Huijun Xing, Xuhui Zhang, Shuo Chang, Jinke Ren, Zixun Zhang, Jie Xu, Shuguang Cui

    Abstract: Signal detection and modulation classification are two crucial tasks in various wireless communication systems. Different from prior works that investigate them independently, this paper studies the joint signal detection and automatic modulation classification (AMC) by considering a realistic and complex scenario, in which multiple signals with different modulation schemes coexist at different ca… ▽ More

    Submitted 29 April, 2024; originally announced May 2024.

  35. arXiv:2404.17997  [pdf, other

    cs.LG stat.ML

    Optimal Initialization of Batch Bayesian Optimization

    Authors: Jiuge Ren, David Sweet

    Abstract: Field experiments and computer simulations are effective but time-consuming methods of measuring the quality of engineered systems at different settings. To reduce the total time required, experimenters may employ Bayesian optimization, which is parsimonious with measurements, and take measurements of multiple settings simultaneously, in a batch. In practice, experimenters use very few batches, th… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 10 pages, 8 figures

  36. arXiv:2404.17845  [pdf, other

    cs.CV

    Instance-free Text to Point Cloud Localization with Relative Position Awareness

    Authors: Lichao Wang, Zhihao Yuan, Jinke Ren, Shuguang Cui, Zhen Li

    Abstract: Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration. It seeks to localize a position from a city-scale point cloud scene based on a few natural language instructions. In this paper, we address two key limitations of existing approaches: 1) their reliance on ground-truth instances as input; and 2) their neglect of the relati… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 12 pages, 10 figures, conference

  37. arXiv:2404.17484  [pdf, other

    cs.CV eess.IV

    Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model

    Authors: Zhenghong Li, Jiaxiang Ren, Wensheng Cheng, Congwu Du, Yingtian Pan, Haibin Ling

    Abstract: Optical Doppler Tomography (ODT) is a blood flow imaging technique popularly used in bioengineering applications. The fundamental unit of ODT is the 1D frequency response along the A-line (depth), named raw A-scan. A 2D ODT image (B-scan) is obtained by first sensing raw A-scans along the B-line (width), and then constructing the B-scan from these raw A-scans via magnitude-phase analysis and post-… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 19 pages, 5 figures

  38. arXiv:2404.15256  [pdf, other

    cs.RO cs.AI cs.CV eess.SY

    TOP-Nav: Legged Navigation Integrating Terrain, Obstacle and Proprioception Estimation

    Authors: Junli Ren, Yikai Liu, Yingru Dai, Junfeng Long, Guijin Wang

    Abstract: Legged navigation is typically examined within open-world, off-road, and challenging environments. In these scenarios, estimating external disturbances requires a complex synthesis of multi-modal information. This underlines a major limitation in existing works that primarily focus on avoiding obstacles. In this work, we propose TOP-Nav, a novel legged navigation framework that integrates a compre… ▽ More

    Submitted 12 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  39. arXiv:2404.14546  [pdf, other

    cs.RO

    Closing the Perception-Action Loop for Semantically Safe Navigation in Semi-Static Environments

    Authors: Jingxing Qian, Siqi Zhou, Nicholas Jianrui Ren, Veronica Chatrath, Angela P. Schoellig

    Abstract: Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-l… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Manuscript accepted to ICRA 2024

  40. arXiv:2404.12636  [pdf, other

    cs.SE

    Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs

    Authors: Boyang Yang, Haoye Tian, Jiadong Ren, Hongyu Zhang, Jacques Klein, Tegawendé F. Bissyandé, Claire Le Goues, Shunfu Jin

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities on a broad spectrum of downstream tasks. Within the realm of software engineering, specialized tasks on code, such as program repair, present unique challenges, necessitating fine-tuning to unlock state-of-the-art performance. Fine-tuning approaches proposed in the literature for LLMs on program repair tasks are however general… ▽ More

    Submitted 22 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  41. arXiv:2404.07178  [pdf, other

    cs.CV

    Move Anything with Layered Scene Diffusion

    Authors: Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu, Ziwei Liu, Tao Xiang, Antoine Toisoul

    Abstract: Diffusion models generate images with an unprecedented level of quality, but how can we freely rearrange image layouts? Recent works generate controllable scenes via learning spatially disentangled latent codes, but these methods do not apply to diffusion models due to their fixed forward process. In this work, we propose SceneDiffusion to optimize a layered scene representation during the diffusi… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 camera-ready

  42. arXiv:2404.04098  [pdf, other

    cs.CR

    You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks

    Authors: Qiushi Li, Yan Zhang, Ju Ren, Qi Li, Yaoxue Zhang

    Abstract: Image data have been extensively used in Deep Neural Network (DNN) tasks in various scenarios, e.g., autonomous driving and medical image analysis, which incurs significant privacy concerns. Existing privacy protection techniques are unable to efficiently protect such data. For example, Differential Privacy (DP) that is an emerging technique protects data with strong privacy guarantee cannot effec… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 18 pages, 11 figures

  43. arXiv:2404.03190  [pdf, other

    cs.CV cs.AI cs.RO

    Adaptive Discrete Disparity Volume for Self-supervised Monocular Depth Estimation

    Authors: Jianwei Ren

    Abstract: In self-supervised monocular depth estimation tasks, discrete disparity prediction has been proven to attain higher quality depth maps than common continuous methods. However, current discretization strategies often divide depth ranges of scenes into bins in a handcrafted and rigid manner, limiting model performance. In this paper, we propose a learnable module, Adaptive Discrete Disparity Volume… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  44. arXiv:2404.03080  [pdf, other

    cs.CL cs.AI

    Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model

    Authors: Yanpeng Ye, Jie Ren, Shaozhou Wang, Yuwei Wan, Haofen Wang, Imran Razzak, Tong Xie, Wenjie Zhang

    Abstract: Knowledge in materials science is widely dispersed across extensive scientific literature, posing significant challenges for efficient discovery and integration of new materials. Traditional methods, often reliant on costly and time-consuming experimental approaches, further complicate rapid innovation. Addressing these challenges, the integration of artificial intelligence with materials science… ▽ More

    Submitted 4 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures, 3 tables

  45. arXiv:2404.01614  [pdf, other

    cs.CV

    LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network

    Authors: Hanqian Li, Ruinan Zhang, Ye Pan, Junchi Ren, Fei Shen

    Abstract: Remote sensing target detection aims to identify and locate critical targets within remote sensing images, finding extensive applications in agriculture and urban planning. Feature pyramid networks (FPNs) are commonly used to extract multi-scale features. However, existing FPNs often overlook extracting low-level positional information and fine-grained context interaction. To address this, we prop… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  46. arXiv:2404.01219  [pdf, other

    cs.RO cs.FL

    LTL-D*: Incrementally Optimal Replanning for Feasible and Infeasible Tasks in Linear Temporal Logic Specifications

    Authors: Jiming Ren, Haris Miller, Karen M. Feigh, Samuel Coogan, Ye Zhao

    Abstract: This paper presents an incremental replanning algorithm, dubbed LTL-D*, for temporal-logic-based task planning in a dynamically changing environment. Unexpected changes in the environment may lead to failures in satisfying a task specification in the form of a Linear Temporal Logic (LTL). In this study, the considered failures are categorized into two classes: (i) the desired LTL specification can… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 8 pages,9 figures

  47. arXiv:2403.18978  [pdf, other

    cs.CV cs.AI cs.LG

    TextCraftor: Your Text Encoder Can be Image Quality Controller

    Authors: Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren

    Abstract: Diffusion-based text-to-image generative models, e.g., Stable Diffusion, have revolutionized the field of content generation, enabling significant advancements in areas like image editing and video synthesis. Despite their formidable capabilities, these models are not without their limitations. It is still challenging to synthesize an image that aligns well with the input text, and multiple runs w… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

  48. arXiv:2403.18760  [pdf, other

    cs.RO

    MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model

    Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song

    Abstract: In the realm of data-driven AI technology, the application of open-source large language models (LLMs) in robotic task planning represents a significant milestone. Recent robotic task planning methods based on open-source LLMs typically leverage vast task planning datasets to enhance models' planning abilities. While these methods show promise, they struggle with complex long-horizon tasks, which… ▽ More

    Submitted 1 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  49. arXiv:2403.15733  [pdf, other

    cs.SI cs.CY

    Spatio-Temporal Graph Convolutional Network Combined Large Language Model: A Deep Learning Framework for Bike Demand Forecasting

    Authors: Peisen Li, Yizhe Pang, Junyu Ren

    Abstract: This study presents a new deep learning framework, combining Spatio-Temporal Graph Convolutional Network (STGCN) with a Large Language Model (LLM), for bike demand forecasting. Addressing challenges in transforming discrete datasets and integrating unstructured language data, the framework leverages LLMs to extract insights from Points of Interest (POI) text data. The proposed STGCN-L model demons… ▽ More

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: ISNN 2024

  50. arXiv:2403.13563  [pdf

    cs.CR cs.AR cs.LG

    DL2Fence: Integrating Deep Learning and Frame Fusion for Enhanced Detection and Localization of Refined Denial-of-Service in Large-Scale NoCs

    Authors: Haoyu Wang, Basel Halak, Jianjie Ren, Ahmad Atamli

    Abstract: This study introduces a refined Flooding Injection Rate-adjustable Denial-of-Service (DoS) model for Network-on-Chips (NoCs) and more importantly presents DL2Fence, a novel framework utilizing Deep Learning (DL) and Frame Fusion (2F) for DoS detection and localization. Two Convolutional Neural Networks models for classification and segmentation were developed to detect and localize DoS respectivel… ▽ More

    Submitted 23 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.