Showing 1–50 of 186 results for author: Qi, H

  1. arXiv:2407.18902  [pdf, other]

    cs.RO cs.AI cs.LG

    Lessons from Learning to Spin "Pens"

    Authors: Jun Wang, Ying Yuan, Haichuan Che, Haozhi Qi, Yi Ma, Jitendra Malik, Xiaolong Wang

    Abstract: In-hand manipulation of pen-like objects is an important skill in our daily lives, as many tools such as hammers and screwdrivers are similarly shaped. However, current learning-based methods struggle with this task due to a lack of high-quality demonstrations and the significant gap between simulation and the real world. In this work, we push the boundaries of learning-based in-hand manipulation…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: Website: https://penspin.github.io/

  2. arXiv:2407.17535  [pdf, other]

    cs.AI cs.LG cs.SE

    LAMBDA: A Large Model Based Data Agent

    Authors: Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan, Jian Huang

    Abstract: We introduce "LAMBDA," a novel open-source, code-free multi-agent data analysis system that harnesses the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through the use of innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the prog…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 30 pages, 21 figures and 5 tables

    MSC Class: 62-04; 62-08; 68T01; 68T09

  3. arXiv:2407.07885  [pdf, other]

    cs.RO cs.LG

    Learning In-Hand Translation Using Tactile Skin With Shear and Normal Force Sensing

    Authors: Jessica Yin, Haozhi Qi, Jitendra Malik, James Pikul, Mark Yim, Tess Hellebrekers

    Abstract: Recent progress in reinforcement learning (RL) and tactile sensing has significantly advanced dexterous manipulation. However, these methods often utilize simplified tactile signals due to the gap between tactile simulation and the real world. We introduce a sensor model for tactile skin that enables zero-shot sim-to-real transfer of ternary shear and binary normal forces. Using this model, we dev…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Website: https://jessicayin.github.io/tactile-skin-rl/

  4. arXiv:2407.05657  [pdf, other]

    cs.CV

    DMSD-CDFSAR: Distillation from Mixed-Source Domain for Cross-Domain Few-shot Action Recognition

    Authors: Fei Guo, YiKang Wang, Han Qi, Li Zhu, Jing Sun

    Abstract: Few-shot action recognition is an emerging field in computer vision, primarily focused on meta-learning within the same domain. However, challenges arise in real-world scenario deployment, as gathering extensive labeled data within a specific domain is laborious and time-intensive. Thus, attention shifts towards cross-domain few-shot action recognition, requiring the model to generalize across dom…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.04608  [pdf, other]

    math.OC cs.GT cs.MA

    A Multi-Player Potential Game Approach for Sensor Network Localization with Noisy Measurements

    Authors: Gehui Xu, Guanpu Chen, Baris Fidan, Yiguang Hong, Hongsheng Qi, Thomas Parisini, Karl H. Johansson

    Abstract: Sensor network localization (SNL) is a challenging problem due to its inherent non-convexity and the effects of noise in inter-node ranging measurements and anchor node position. We formulate a non-convex SNL problem as a multi-player non-convex potential game and investigate the existence and uniqueness of a Nash equilibrium (NE) in both the ideal setting without measurement noise and the practic…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.03326, arXiv:2401.02471

  6. arXiv:2406.08848  [pdf, other]

    cs.CL cs.AI

    An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants

    Authors: G P Shrivatsa Bhargav, Sumit Neelam, Udit Sharma, Shajith Ikbal, Dheeraj Sreedhar, Hima Karanam, Sachindra Joshi, Pankaj Dhoolia, Dinesh Garg, Kyle Croutwater, Haode Qi, Eric Wayne, J William Murdock

    Abstract: We present an approach to build Large Language Model (LLM) based slot-filling system to perform Dialogue State Tracking in conversational assistants serving across a wide variety of industry-grade applications. Key requirements of this system include: 1) usage of smaller-sized models to meet low latency requirements and to enable convenient and cost-effective cloud and customer premise deployments…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2405.11171  [pdf, other]

    cs.LG

    Graph Feedback Bandits with Similar Arms

    Authors: Han Qi, Guo Fei, Li Zhu

    Abstract: In this paper, we study the stochastic multi-armed bandit problem with graph feedback. Motivated by the clinical trials and recommendation problem, we assume that two arms are connected if and only if they are similar (i.e., their means are close enough). We establish a regret lower bound for this novel feedback structure and introduce two UCB-based algorithms: D-UCB with problem-independent regre…

    Submitted 18 May, 2024; originally announced May 2024.

  8. arXiv:2404.16823  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Learning Visuotactile Skills with Two Multifingered Hands

    Authors: Toru Lin, Yu Zhang, Qiyang Li, Haozhi Qi, Brent Yi, Sergey Levine, Jitendra Malik

    Abstract: Aiming to replicate human-like dexterity, perceptual experiences, and motion patterns, we explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data. Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hard…

    Submitted 22 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Code and Project Website: https://toruowo.github.io/hato/

  9. arXiv:2404.12659  [pdf, ps, other]

    cs.CL

    SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis

    Authors: Hongzhi Qi, Hanfei Liu, Jianqiang Li, Qing Zhao, Wei Zhai, Dan Luo, Tian Yu He, Shuo Liu, Bing Xiang Yang, Guanghui Fu

    Abstract: On social media, users frequently express personal emotions, a subset of which may indicate potential suicidal tendencies. The implicit and varied forms of expression in internet language complicate accurate and rapid identification of suicidal intent on social media, thus creating challenges for timely intervention efforts. The development of deep learning models for suicide risk detection is…

    Submitted 19 April, 2024; originally announced April 2024.

  10. arXiv:2404.11449  [pdf, other]

    cs.CL cs.LG

    AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts

    Authors: Meng Jiang, Yi Jing Yu, Qing Zhao, Jianqiang Li, Changwei Song, Hongzhi Qi, Wei Zhai, Dan Luo, Xiaoqin Wang, Guanghui Fu, Bing Xiang Yang

    Abstract: Cognitive Behavioral Therapy (CBT) is an effective technique for addressing the irrational thoughts stemming from mental illnesses, but it necessitates precise identification of cognitive pathways to be successfully implemented in patient care. In current society, individuals frequently express negative emotions on social media on specific topics, often exhibiting cognitive distortions, including…

    Submitted 17 April, 2024; originally announced April 2024.

  11. arXiv:2403.11163  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.CO

    A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

    Authors: Xuetong Li, Yuan Gao, Hong Chang, Danyang Huang, Yingying Ma, Rui Pan, Haobo Qi, Feifei Wang, Shuyuan Wu, Ke Xu, Jing Zhou, Xuening Zhu, Yingqiu Zhu, Hansheng Wang

    Abstract: This paper presents a selective review of statistical computation methods for massive data analysis. A large number of statistical methods for massive data computation have been developed rapidly in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first clas…

    Submitted 17 March, 2024; originally announced March 2024.

  12. arXiv:2403.09990  [pdf, other]

    cs.RO

    CLOSURE: Fast Quantification of Pose Uncertainty Sets

    Authors: Yihuai Gao, Yukai Tang, Han Qi, Heng Yang

    Abstract: We investigate uncertainty quantification of 6D pose estimation from learned noisy measurements (e.g. keypoints and pose hypotheses). Assuming unknown-but-bounded measurement noises, a pose uncertainty set (PURSE) is a subset of SE(3) that contains all possible 6D poses compatible with the measurements. Despite being simple to formulate and its ability to embed uncertainty, the PURSE is difficult…

    Submitted 27 May, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  13. arXiv:2403.02338  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Twisting Lids Off with Two Hands

    Authors: Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, Jitendra Malik

    Abstract: Manipulating objects with two multi-fingered hands has been a long-standing challenge in robotics, attributed to the contact-rich nature of many manipulation tasks and the complexity inherent in coordinating a high-dimensional bimanual system. In this work, we consider the problem of twisting lids of various bottle-like objects with two hands, and demonstrate that policies trained in simulation us…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Project page can be found at https://toruowo.github.io/bimanual-twist

  14. arXiv:2402.18527  [pdf, other]

    cs.CV cs.LG eess.IV

    Defect Detection in Tire X-Ray Images: Conventional Methods Meet Deep Structures

    Authors: Andrei Cozma, Landon Harris, Hairong Qi, Ping Ji, Wenpeng Guo, Song Yuan

    Abstract: This paper introduces a robust approach for automated defect detection in tire X-ray images by harnessing traditional feature extraction methods such as Local Binary Pattern (LBP) and Gray Level Co-Occurrence Matrix (GLCM) features, as well as Fourier and Wavelet-based features, complemented by advanced machine learning techniques. Recognizing the challenges inherent in the complex patterns and te…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 2 figures, 3 tables, submitted to ICIP2024

    ACM Class: I.4.7; I.4.9; I.4.0

  15. arXiv:2402.17062  [pdf, other]

    cs.CV

    HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields

    Authors: Haozhe Qi, Chen Zhao, Mathieu Salzmann, Alexander Mathis

    Abstract: Human hands are highly articulated and versatile at handling objects. Jointly estimating the 3D poses of a hand and the object it manipulates from a monocular camera is challenging due to frequent occlusions. Thus, existing methods often rely on intermediate 3D shape representations to increase performance. These representations are typically explicit, such as 3D point clouds or meshes, and thus p…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted at CVPR 2024. 9 figures, many tables

  16. arXiv:2402.09151  [pdf, other]

    cs.CL cs.LG

    Chinese MentalBERT: Domain-Adaptive Pre-training on Social Media for Chinese Mental Health Text Analysis

    Authors: Wei Zhai, Hongzhi Qi, Qing Zhao, Jianqiang Li, Ziqi Wang, Han Wang, Bing Xiang Yang, Guanghui Fu

    Abstract: In the current environment, psychological issues are prevalent and widespread, with social media serving as a key outlet for individuals to share their feelings. This results in the generation of vast quantities of data daily, where negative emotions have the potential to precipitate crisis situations. There is a recognized need for models capable of efficient analysis. While pre-trained language…

    Submitted 12 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  17. arXiv:2402.03631  [pdf, other]

    cs.CV

    CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model

    Authors: Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu

    Abstract: The recent Segment Anything Model (SAM) has demonstrated remarkable zero-shot capability and flexible geometric prompting in general image segmentation. However, SAM often struggles when handling various unconventional images, such as aerial, medical, and non-RGB images. This paper presents CAT-SAM, a ConditionAl Tuning network that adapts SAM toward various unconventional target tasks with just f…

    Submitted 15 July, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ECCV 2024

  18. arXiv:2401.15855  [pdf, other]

    cs.CV

    Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing

    Authors: Maofeng Tang, Andrei Cozma, Konstantinos Georgiou, Hairong Qi

    Abstract: Remote sensing images present unique challenges to image analysis due to the extensive geographic coverage, hardware limitations, and misaligned multi-scale images. This paper revisits the classical multi-scale representation learning problem but under the general framework of self-supervised learning for remote sensing image understanding. We present Cross-Scale MAE, a self-supervised model built…

    Submitted 28 January, 2024; originally announced January 2024.

  19. BugsInPy: A Database of Existing Bugs in Python Programs to Enable Controlled Testing and Debugging Studies

    Authors: Ratnadira Widyasari, Sheng Qin Sim, Camellia Lok, Haodi Qi, Jack Phan, Qijin Tay, Constance Tan, Fiona Wee, Jodie Ethelda Tan, Yuheng Yieh, Brian Goh, Ferdian Thung, Hong Jin Kang, Thong Hoang, David Lo, Eng Lieh Ouh

    Abstract: The 2019 edition of Stack Overflow developer survey highlights that, for the first time, Python outperformed Java in terms of popularity. The gap between Python and Java further widened in the 2020 edition of the survey. Unfortunately, despite the rapid increase in Python's popularity, there are not many testing and debugging tools that are designed for Python. This is in stark contrast with the a…

    Submitted 27 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2020) 1556-1560

  20. arXiv:2401.08345  [pdf, other]

    cs.CV

    Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition(CLIP-$\mathrm{M^2}$DF)

    Authors: Fei Guo, YiKang Wang, Han Qi, WenPing Jin, Li Zhu

    Abstract: In recent years, few-shot action recognition has attracted increasing attention. It generally adopts the paradigm of meta-learning. In this field, overcoming the overlapping distribution of classes and outliers is still a challenging problem based on limited samples. We believe the combination of Multi-modal and Multi-view can improve this issue depending on information complementarity. Therefore,…

    Submitted 16 January, 2024; originally announced January 2024.

  21. arXiv:2312.14991  [pdf, other]

    cs.CV

    FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

    Authors: Yuehao Yin, Huiyan Qi, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo

    Abstract: Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks. Nevertheless, the performance of general LMMs in specific domains is still far from satisfactory. This paper proposes FoodLMM, a versatile food assistant based on LMMs with various capabilities, including food recognition, ingredient recognition, recipe generation, nutrition estimation, food segmentation a…

    Submitted 12 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  22. arXiv:2312.13469  [pdf, other]

    cs.RO cs.CV cs.LG

    Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation

    Authors: Sudharshan Suresh, Haozhi Qi, Tingfan Wu, Taosha Fan, Luis Pineda, Mike Lambeta, Jitendra Malik, Mrinal Kalakrishnan, Roberto Calandra, Michael Kaess, Joseph Ortiz, Mustafa Mukadam

    Abstract: To achieve human-level dexterity, robots must infer spatial awareness from multimodal sensing to reason over contact interactions. During in-hand manipulation of novel objects, such spatial awareness involves estimating the object's pose and shape. The status quo for in-hand perception primarily employs vision, and restricts to tracking a priori known objects. Moreover, visual occlusion of objects…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 43 pages, 20 figures, 1 table; https://suddhu.github.io/neural-feels/

  23. arXiv:2312.07285  [pdf, other]

    cs.LG stat.ML

    Forced Exploration in Bandit Problems

    Authors: Han Qi, Fei Guo, Li Zhu

    Abstract: The multi-armed bandit (MAB) is a classical sequential decision problem. Most work requires assumptions about the reward distribution (e.g., bounded), while practitioners may have difficulty obtaining information about these distributions to design models for their problems, especially in non-stationary MAB problems. This paper aims to design a multi-armed bandit algorithm that can be implemented w…

    Submitted 12 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  24. arXiv:2312.04578  [pdf, other]

    cs.AI cs.CL cs.LG

    Towards a Psychological Generalist AI: A Survey of Current Applications of Large Language Models and Future Prospects

    Authors: Tianyu He, Guanghui Fu, Yijing Yu, Fan Wang, Jianqiang Li, Qing Zhao, Changwei Song, Hongzhi Qi, Dan Luo, Huijing Zou, Bing Xiang Yang

    Abstract: The complexity of psychological principles underscores a significant societal challenge, given the vast social implications of psychological problems. Bridging the gap between understanding these principles and their actual clinical and real-world applications demands rigorous exploration and adept implementation. In recent times, the swift advancement of highly adaptive and reusable artificial int…

    Submitted 1 December, 2023; originally announced December 2023.

  25. Bridge the Present and Future: A Cross-Layer Matching Game in Dynamic Cloud-Aided Mobile Edge Networks

    Authors: Houyi Qi, Minghui Liwang, Xianbin Wang, Li Li, Wei Gong, Jian Jin, Zhenzhen Jiao

    Abstract: Cloud-aided mobile edge networks (CAMENs) allow edge servers (ESs) to purchase resources from remote cloud servers (CSs), while overcoming resource shortage when handling computation-intensive tasks of mobile users (MUs). Conventional trading mechanisms (e.g., onsite trading) confront many challenges, including decision-making overhead (e.g., latency) and potential trading failures. This paper inv…

    Submitted 8 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Journal ref: IEEE Transactions on Mobile Computing, 2024

  26. arXiv:2312.01083  [pdf, other]

    cs.CV

    Consistency Prototype Module and Motion Compensation for Few-Shot Action Recognition (CLIP-CP$\mathbf{M^2}$C)

    Authors: Fei Guo, Li Zhu, YiKang Wang, Han Qi

    Abstract: Recently, few-shot action recognition has significantly progressed by learning the feature discriminability and designing suitable comparison methods. Still, there are the following restrictions. (a) Previous works are mainly based on visual mono-modal. Although some multi-modal works use labels as supplementary to construct prototypes of support videos, they cannot use this information for query…

    Submitted 2 December, 2023; originally announced December 2023.

  27. arXiv:2311.14689  [pdf]

    cs.CY

    Analyze Factors Influencing Drivers' Cell Phone Online Ride-hailing Software Using While driving: A Case Study in China

    Authors: Xiangnan Song, Xianghong Li, Kai Yin, Huimin Qi, Xufei Fang

    Abstract: The road safety of traffic is greatly affected by the driving performance of online ride-hailing, which has become an increasingly popular travel option for many people. Little attention has been paid to the fact that the use of cell phone online ride-hailing software by drivers to accept orders while driving is one of the causes of traffic accidents involving online ride-hailing. This paper, adop…

    Submitted 5 November, 2023; originally announced November 2023.

    Comments: 17 pages, 7 tables and 2 figures

    ACM Class: F.1.0

  28. arXiv:2311.11383  [pdf, other]

    cs.CV

    A Survey of Emerging Applications of Diffusion Probabilistic Models in MRI

    Authors: Yuheng Fan, Hanxi Liao, Shiqi Huang, Yimin Luo, Huazhu Fu, Haikun Qi

    Abstract: Diffusion probabilistic models (DPMs), which employ explicit likelihood characterization and a gradual sampling process to synthesize data, have gained increasing research interest. Despite their huge computational burdens due to the large number of steps involved during sampling, DPMs are widely appreciated in various medical imaging tasks for the high quality and diversity of their generation. Magnet…

    Submitted 7 May, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  29. arXiv:2311.08746  [pdf, other]

    eess.IV cs.CV

    A Diffusion Model Based Quality Enhancement Method for HEVC Compressed Video

    Authors: Zheng Liu, Honggang Qi

    Abstract: Video post-processing methods can improve the quality of compressed videos at the decoder side. Most of the existing methods need to train corresponding models for compressed videos with different quantization parameters to improve the quality of compressed videos. However, in most cases, the quantization parameters of the decoded video are unknown. This makes existing methods have their limitatio…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 10 pages, conference

  30. Computational synthesis of locomotive soft robots by topology optimization

    Authors: Hiroki Kobayashi, Farzad Gholami, S. Macrae Montgomery, Masato Tanaka, Liang Yue, Changyoung Yuhn, Yuki Sato, Atsushi Kawamoto, H. Jerry Qi, Tsuyoshi Nomura

    Abstract: Locomotive soft robots (SoRos) have gained prominence due to their adaptability. Traditional locomotive SoRo design is based on limb structures inspired by biological organisms and requires human intervention. Evolutionary robotics, designed using evolutionary algorithms (EAs), have shown potential for automatic design. However, EA-based methods face the challenge of high computational cost when c…

    Submitted 24 July, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 36 total pages (27 pages, 9 supplementary pages), 5 Figures, 9 Supplementary figures. 1 Supplementary table

    Journal ref: Sci. Adv. 10, eadn6129 (2024)

  31. arXiv:2310.10056  [pdf, other]

    cs.LG

    Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction

    Authors: Han Qi, Xinyang Geng, Stefano Rando, Iku Ohama, Aviral Kumar, Sergey Levine

    Abstract: In computational chemistry, crystal structure prediction (CSP) is an optimization problem that involves discovering the lowest energy stable crystal structure for a given chemical formula. This problem is challenging as it requires discovering globally optimal designs with the lowest energies on complex manifolds. One approach to tackle this problem involves building simulators based on density fu…

    Submitted 16 October, 2023; originally announced October 2023.

  32. arXiv:2310.09511  [pdf, other]

    cs.GT cs.LG eess.SY

    Online Parameter Identification of Generalized Non-cooperative Game

    Authors: Jianguo Chen, Jinlong Lei, Hongsheng Qi, Yiguang Hong

    Abstract: This work studies the parameter identification problem of a generalized non-cooperative game, where each player's cost function is influenced by an observable signal and some unknown parameters. We consider the scenario where equilibrium of the game at some observable signals can be observed with noises, whereas our goal is to identify the unknown parameters with the observed data. Assuming that t…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: 10 pages, 5 figures

  33. arXiv:2309.16652  [pdf, other]

    cs.RO

    Perceiving Extrinsic Contacts from Touch Improves Learning Insertion Policies

    Authors: Carolina Higuera, Joseph Ortiz, Haozhi Qi, Luis Pineda, Byron Boots, Mustafa Mukadam

    Abstract: Robotic manipulation tasks such as object insertion typically involve interactions between object and environment, namely extrinsic contacts. Prior work on Neural Contact Fields (NCF) uses intrinsic tactile sensing between gripper and object to estimate extrinsic contacts in simulation. However, its effectiveness and utility in real-world tasks remain unknown. In this work, we improve NCF to ena…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Under review

  34. arXiv:2309.09979  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    General In-Hand Object Rotation with Vision and Touch

    Authors: Haozhi Qi, Brent Yi, Sudharshan Suresh, Mike Lambeta, Yi Ma, Roberto Calandra, Jitendra Malik

    Abstract: We introduce RotateIt, a system that enables fingertip-based object rotation along multiple axes by leveraging multimodal sensory inputs. Our system is trained in simulation, where it has access to ground-truth object shapes and physical properties. Then we distill it to operate on realistic yet noisy simulated visuotactile and proprioceptive sensory inputs. These multimodal inputs are fused via a…

    Submitted 28 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: CoRL 2023; Website: https://haozhi.io/rotateit/

  35. arXiv:2309.03564  [pdf, other]

    cs.CL cs.LG

    Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media

    Authors: Hongzhi Qi, Qing Zhao, Jianqiang Li, Changwei Song, Wei Zhai, Dan Luo, Shuo Liu, Yi Jing Yu, Fan Wang, Huijing Zou, Bing Xiang Yang, Guanghui Fu

    Abstract: On social media, users often express their personal feelings, which may exhibit cognitive distortions or even suicidal tendencies on certain specific topics. Early recognition of these signs is critical for effective psychological intervention. In this paper, we introduce two novel datasets from Chinese social media: SOS-HL-1K for suicidal risk classification and SocialCD-3K for cognitive distorti…

    Submitted 9 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: 10 pages

  36. arXiv:2308.04454  [pdf]

    cs.CY

    Sustainable development-oriented campus bike-sharing site evaluation model: A case study of Henan Polytechnic University

    Authors: Huimin Qi, Xianghong Li, Kai Yin, Xiangnan Song, Xufei Fang

    Abstract: Promoting sustainable transportation options is increasingly crucial in the pursuit of environmentally friendly and efficient campus mobility systems. Among these options, bike-sharing programs have garnered substantial attention for their capacity to mitigate traffic congestion, decrease carbon emissions, and enhance overall campus sustainability. However, improper selection of bike-sharing sites…

    Submitted 21 September, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

    Comments: 31 pages, 4 figures, 21 tables

  37. arXiv:2308.02958  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    K-band: Self-supervised MRI Reconstruction via Stochastic Gradient Descent over K-space Subsets

    Authors: Frederic Wang, Han Qi, Alfredo De Goyeneche, Reinhard Heckel, Michael Lustig, Efrat Shimron

    Abstract: Although deep learning (DL) methods are powerful for solving inverse problems, their reliance on high-quality training data is a major hurdle. This is significant in high-dimensional (dynamic/volumetric) magnetic resonance imaging (MRI), where acquisition of high-resolution fully sampled k-space data is impractical. We introduce a novel mathematical framework, dubbed k-band, that enables training…

    Submitted 23 May, 2024; v1 submitted 5 August, 2023; originally announced August 2023.

  38. arXiv:2308.01520  [pdf, other]

    cs.CV

    COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection

    Authors: Cong Zhang, Honggang Qi, Shuhui Wang, Yuezun Li, Siwei Lyu

    Abstract: DeepFakes have raised serious societal concerns, leading to a great surge in detection-based forensics methods in recent years. Face forgery recognition is a standard detection method that usually follows a two-phase pipeline. While those methods perform well in ideal experimental environments, they face challenges when dealing with DeepFakes in the wild involving complex background and multiple fa…

    Submitted 24 May, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

  39. arXiv:2307.14593  [pdf, other]

    cs.CV

    FakeTracer: Catching Face-swap DeepFakes via Implanting Traces in Training

    Authors: Pu Sun, Honggang Qi, Yuezun Li, Siwei Lyu

    Abstract: Face-swap DeepFake is an emerging AI-based face forgery technique that can replace the original face in a video with a generated face of the target identity while retaining consistent facial attributes such as expression and orientation. Due to the high privacy of faces, the misuse of this technique can raise severe social concerns, drawing tremendous attention to defend against DeepFakes recently…

    Submitted 21 April, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

  40. arXiv:2307.11470  [pdf, other]

    cs.CV

    Physics-Aware Semi-Supervised Underwater Image Enhancement

    Authors: Hao Qi, Xinghui Dong

    Abstract: Underwater images normally suffer from degradation due to the transmission medium of water bodies. Both traditional prior-based approaches and deep learning-based methods have been used to address this problem. However, the inflexible assumption of the former often impairs their effectiveness in handling diverse underwater scenes, while the generalization of the latter to unseen images is usually…

    Submitted 28 April, 2024; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: 12 pages, 5 figures

  41. Matching-based Hybrid Service Trading for Task Assignment over Dynamic Mobile Crowdsensing Networks

    Authors: Houyi Qi, Minghui Liwang, Seyyedali Hosseinalipour, Xiaoyu Xia, Zhipeng Cheng, Xianbin Wang, Zhenzhen Jiao

    Abstract: By opportunistically engaging mobile users (workers), mobile crowdsensing (MCS) networks have emerged as an important approach to facilitate sharing of sensed/gathered data of heterogeneous mobile devices. To assign tasks among workers and ensure low overheads, a series of stable matching mechanisms is introduced in this paper, which are integrated into a novel hybrid service trading paradigm consist…

    Submitted 17 November, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

    Journal ref: IEEE Transactions on Services Computing, 2023

  42. arXiv:2306.04718  [pdf, other]

    cs.LG

    Scalable Neural Symbolic Regression using Control Variables

    Authors: Xieting Chu, Hongjue Zhao, Enze Xu, Hairong Qi, Minghan Chen, Huajie Shao

    Abstract: Symbolic regression (SR) is a powerful technique for discovering the analytical mathematical expression from data, finding various applications in natural sciences due to its good interpretability of results. However, existing methods face scalability issues when dealing with complex equations involving multiple variables. To address this challenge, we propose ScaleSR, a scalable symbolic regressi…

    Submitted 9 July, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  43. arXiv:2305.10718  [pdf, other]

    cs.LG stat.ML

    Discounted Thompson Sampling for Non-Stationary Bandit Problems

    Authors: Han Qi, Yue Wang, Li Zhu

    Abstract: Non-stationary multi-armed bandit (NS-MAB) problems have recently received significant attention. NS-MAB are typically modelled in two scenarios: abruptly changing, where reward distributions remain constant for a certain period and change at unknown time steps, and smoothly changing, where reward distributions evolve smoothly based on unknown dynamics. In this paper, we propose Discounted Thompso…

    Submitted 22 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  44. arXiv:2304.07882  [pdf, other]

    cs.CV

    Federated Learning of Shareable Bases for Personalization-Friendly Image Classification

    Authors: Hong-You Chen, Jike Zhong, Mingda Zhang, Xuhui Jia, Hang Qi, Boqing Gong, Wei-Lun Chao, Li Zhang

    Abstract: Personalized federated learning (PFL) aims to harness the collective wisdom of clients' data while building personalized models tailored to individual clients' data distributions. Existing works offer personalization primarily to clients who participate in the FL process, making it hard to encompass new clients who were absent or newly show up. In this paper, we propose FedBasis, a novel PFL frame…

    Submitted 31 October, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: Preprint

  45. arXiv:2303.06286  [pdf, other]

    cs.SE

    NICHE: A Curated Dataset of Engineered Machine Learning Projects in Python

    Authors: Ratnadira Widyasari, Zhou Yang, Ferdian Thung, Sheng Qin Sim, Fiona Wee, Camellia Lok, Jack Phan, Haodi Qi, Constance Tan, Qijin Tay, David Lo

    Abstract: Machine learning (ML) has gained much attention and been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts in filtering those projects to curate ML projects of high quality. The limited availability of such a high-quality dataset poses an obstacle in understanding ML projects. To help…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: Accepted by MSR 2023

  46. arXiv:2303.05181  [pdf, other]

    cs.IT

    A Theory for Semantic Channel Coding With Many-to-one Source

    Authors: Shuai Ma, Huayan Qi, Hang Li, Guangming Shi, Yong Liang, Naofal Al-Dhahir

    Abstract: As one of the potential key technologies of 6G, semantic communication is still in its infancy and there are many open problems, such as semantic entropy definition and semantic channel coding theory. To address these challenges, we investigate semantic information measures and semantic channel coding theorem. Specifically, we propose a semantic entropy definition as the uncertainty in the semanti…

    Submitted 30 November, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  47. arXiv:2302.11396  [pdf, other]

    cs.AI

    KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks

    Authors: Zhizhi Yu, Di Jin, Cuiying Huo, Zhiqiang Wang, Xiulong Liu, Heng Qi, Jia Wu, Lingfei Wu

    Abstract: Social Internet of Things (SIoT) is a promising and emerging paradigm that injects the notion of social networking into smart objects (i.e., things), paving the way for the next generation of Internet of Things. However, due to the risks and uncertainty, a crucial and urgent problem to be settled is establishing reliable relationships within SIoT, that is, trust evaluation. Graph neural networks for…

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: Accepted by WWW-23

  48. arXiv:2302.08689  [pdf, other]

    cs.CV

    Dynamic Spatial-temporal Hypergraph Convolutional Network for Skeleton-based Action Recognition

    Authors: Shengqin Wang, Yongji Zhang, Hong Qi, Minghao Zhao, Yu Jiang

    Abstract: Skeleton-based action recognition relies on the extraction of spatial-temporal topological information. Hypergraphs can establish prior unnatural dependencies for the skeleton. However, the existing methods only focus on the construction of spatial topology and ignore the time-point dependence. This paper proposes a dynamic spatial-temporal hypergraph convolutional network (DST-HCN) to capture spa…

    Submitted 16 February, 2023; originally announced February 2023.

  49. arXiv:2301.12487  [pdf]

    cs.CR

    Mitigating Adversarial Effects of False Data Injection Attacks in Power Grid

    Authors: Farhin Farhad Riya, Shahinul Hoque, Jinyuan Stella Sun, Jiangnan Li, Hairong Qi

    Abstract: Deep Neural Networks have proven to be highly accurate at a variety of tasks in recent years. The benefits of Deep Neural Networks have also been embraced in power grids to detect False Data Injection Attacks (FDIA) while conducting critical tasks like state estimation. However, the vulnerabilities of DNNs along with the distinct infrastructure of cyber-physical-system (CPS) can favor the attacker…

    Submitted 5 February, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

  50. arXiv:2301.06544  [pdf, other]

    cs.CL

    Distinguish Sense from Nonsense: Out-of-Scope Detection for Virtual Assistants

    Authors: Cheng Qian, Haode Qi, Gengyu Wang, Ladislav Kunc, Saloni Potdar

    Abstract: Out of Scope (OOS) detection in Conversational AI solutions enables a chatbot to handle a conversation gracefully when it is unable to make sense of the end-user query. Accurately tagging a query as out-of-domain is particularly hard in scenarios when the chatbot is not equipped to handle a topic which has semantic overlap with an existing topic it is trained on. We propose a simple yet effective…

    Submitted 16 January, 2023; originally announced January 2023.

    Comments: Accepted to EMNLP 2022 Industry Track