Skip to main content

Showing 1–50 of 536 results for author: Park, H

  1. arXiv:2407.19193  [pdf

    cs.LG cs.AI cs.CR cs.DC

    A collaborative ensemble construction method for federated random forest

    Authors: Penjan Antonio Eng Lim, Cheong Hee Park

    Abstract: Random forests are considered a cornerstone in machine learning for their robustness and versatility. Despite these strengths, their conventional centralized training is ill-suited for the modern landscape of data that is often distributed, sensitive, and subject to privacy concerns. Federated learning (FL) provides a compelling solution to this problem, enabling models to be trained across a grou… ▽ More

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: This is the authors' accepted manuscript of an article published in the journal Expert Systems With Applications. Published version available at: https://www.sciencedirect.com/science/article/pii/S0957417424016099. 22 pages, 3 figures

    MSC Class: 68T05 (Primary); 68W40; 62H30 (Secondary) ACM Class: I.2.6; I.2.11; K.4.1

    Journal ref: Expert Systems with Applications, Volume 255, 2024, Article 124742

  2. arXiv:2407.18879  [pdf, other

    cs.SD cs.LG eess.AS

    Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model

    Authors: Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, Justin Chen, Harry Zhang, Pai Zhu, Jacob Bartel, Kyle Kastner, Gary Wang, Andrew Rosenberg, Quan Wang

    Abstract: This paper explores the use of TTS synthesized training data for KWS (keyword spotting) task while minimizing development cost and time. Keyword spotting models require a huge amount of training data to be accurate, and obtaining such training data can be costly. In the current state of the art, TTS models can generate large amounts of natural-sounding data, which can help reducing cost and time f… ▽ More

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: to be published in a Workshop at Interspeech 2024, Synthetic Data's Transformative Role in Foundational Speech Models

  3. arXiv:2407.18180  [pdf

    physics.bio-ph cs.RO

    Passive wing deployment and retraction in beetles and flapping microrobots

    Authors: Hoang-Vu Phan, Hoon Cheol Park, Dario Floreano

    Abstract: Birds, bats and many insects can tuck their wings against their bodies at rest and deploy them to power flight. Whereas birds and bats use well-developed pectoral and wing muscles and tendons, how insects control these movements remains unclear, as mechanisms of wing deployment and retraction vary among insect species. Beetles (Coleoptera) display one of the most complex wing mechanisms. For examp… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 20 pages, 10 figures

  4. arXiv:2407.16840  [pdf, other

    eess.AS cs.AI

    Synth4Kws: Synthesized Speech for User Defined Keyword Spotting in Low Resource Environments

    Authors: Pai Zhu, Dhruuv Agarwal, Jacob W. Bartel, Kurt Partridge, Hyun Jin Park, Quan Wang

    Abstract: One of the challenges in developing a high quality custom keyword spotting (KWS) model is the lengthy and expensive process of collecting training data covering a wide range of languages, phrases and speaking styles. We introduce Synth4Kws - a framework to leverage Text to Speech (TTS) synthesized data for custom KWS in different resource settings. With no real data, we found increasing TTS phrase… ▽ More

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 5 pages, 5 figures, 2 tables The paper is accepted in Interspeech SynData4GenAI 2024 Workshop - https://syndata4genai.org/#call-for-papers

  5. arXiv:2407.14020  [pdf, other

    q-bio.NC cs.LG

    NeuroBind: Towards Unified Multimodal Representations for Neural Signals

    Authors: Fengyu Yang, Chao Feng, Daniel Wang, Tianye Wang, Ziyao Zeng, Zhiyang Xu, Hyoungseob Park, Pengliang Ji, Hanbin Zhao, Yuanning Li, Alex Wong

    Abstract: Understanding neural activity and information representation is crucial for advancing knowledge of brain function and cognition. Neural activity, measured through techniques like electrophysiology and neuroimaging, reflects various aspects of information processing. Recent advances in deep neural networks offer new approaches to analyzing these signals using pre-trained models. However, challenges… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  6. arXiv:2407.13856  [pdf, other

    cs.RO cs.CV

    Simultaneous Localization and Affordance Prediction for Tasks in Egocentric Video

    Authors: Zachary Chavis, Hyun Soo Park, Stephen J. Guy

    Abstract: Vision-Language Models (VLMs) have shown great success as foundational models for downstream vision and natural language applications in a variety of domains. However, these models lack the spatial understanding necessary for robotics applications where the agent must reason about the affordances provided by the 3D world around them. We present a system which trains on spatially-localized egocentr… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

  7. arXiv:2407.12345  [pdf, other

    cs.CV

    VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions

    Authors: Seokha Moon, Hyun Woo, Hongbeen Park, Haeji Jung, Reza Mahjourian, Hyung-gun Chi, Hyerin Lim, Sangpil Kim, Jinkyu Kim

    Abstract: Predicting future trajectories for other road agents is an essential task for autonomous vehicles. Established trajectory prediction methods primarily use agent tracks generated by a detection and tracking system and HD map as inputs. In this work, we propose a novel method that also incorporates visual input from surround-view cameras, allowing the model to utilize visual cues such as human gazes… ▽ More

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  8. arXiv:2407.11299  [pdf, other

    cs.RO cs.CV

    FR-SLAM: A SLAM Improvement Method Based on Floor Plan Registration

    Authors: Jiantao Feng, Xinde Li, HyunCheol Park, Juan Liu, Zhentong Zhang

    Abstract: Simultaneous Localization and Mapping (SLAM) technology enables the construction of environmental maps and localization, serving as a key technique for indoor autonomous navigation of mobile robots. Traditional SLAM methods typically require exhaustive traversal of all rooms during indoor navigation to obtain a complete map, resulting in lengthy path planning times and prolonged time to reach targ… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  9. arXiv:2407.08245  [pdf, other

    cs.LG cs.CV

    Feature Diversification and Adaptation for Federated Domain Generalization

    Authors: Seunghan Yang, Seokeon Choi, Hyunsin Park, Sungha Choi, Simyung Chang, Sungrack Yun

    Abstract: Federated learning, a distributed learning paradigm, utilizes multiple clients to build a robust global model. In real-world applications, local clients often operate within their limited domains, leading to a `domain shift' across clients. Privacy concerns limit each client's learning to its own domain data, which increase the risk of overfitting. Moreover, the process of aggregating models train… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  10. arXiv:2407.05683  [pdf, other

    eess.IV cs.AI cs.CV

    RadiomicsFill-Mammo: Synthetic Mammogram Mass Manipulation with Radiomics Features

    Authors: Inye Na, Jonghun Kim, Eun Sook Ko, Hyunjin Park

    Abstract: Motivated by the question, "Can we generate tumors with desired attributes?'' this study leverages radiomics features to explore the feasibility of generating synthetic tumor images. Characterized by its low-dimensional yet biologically meaningful markers, radiomics bridges the gap between complex medical imaging data and actionable clinical insights. We present RadiomicsFill-Mammo, the first of t… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at MICCAI 2024

  11. arXiv:2407.02403  [pdf, other

    cs.CV cs.AI

    Face Reconstruction Transfer Attack as Out-of-Distribution Generalization

    Authors: Yoon Gyo Jung, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps

    Abstract: Understanding the vulnerability of face recognition systems to malicious attacks is of critical importance. Previous works have focused on reconstructing face images that can penetrate a targeted verification system. Even in the white-box scenario, however, naively reconstructed images misrepresent the identity information, hence the attacks are easily neutralized once the face system is updated o… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV2024

  12. arXiv:2407.00817  [pdf

    cs.AR

    Multi-Objective Optimization for Common-Centroid Placement of Analog Transistors

    Authors: Supriyo Maji, Hyungjoo Park, Gi moon Hong, Souradip Poddar, David Z. Pan

    Abstract: In analog circuits, process variation can cause unpredictability in circuit performance. Common-centroid (CC) type layouts have been shown to mitigate process-induced variations and are widely used to match circuit elements. Nevertheless, selecting the most suitable CC topology necessitates careful consideration of important layout constraints. Manual handling of these constraints becomes challeng… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  13. arXiv:2406.19502  [pdf, other

    cs.CL cs.AI

    Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning

    Authors: Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo

    Abstract: Despite significant advancements, there is a limited understanding of how large language models (LLMs) utilize knowledge for reasoning. To address this, we propose a method that deconstructs complex real-world questions into a graph, representing each question as a node with parent nodes of background knowledge needed to solve the question. We develop the DepthQA dataset, deconstructing questions… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Work in progress; code is available at https://github.com/kaistAI/knowledge-reasoning

  14. arXiv:2406.19135  [pdf, other

    eess.AS cs.AI

    DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability

    Authors: Hyun Joon Park, Jin Sob Kim, Wooseok Shin, Sung Won Han

    Abstract: Expressive Text-to-Speech (TTS) using reference speech has been studied extensively to synthesize natural speech, but there are limitations to obtaining well-represented styles and improving model generalization ability. In this study, we present Diffusion-based EXpressive TTS (DEX-TTS), an acoustic model designed for reference-based speech synthesis with enhanced style representations. Based on a… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  15. arXiv:2406.18388  [pdf, other

    cs.RO cs.AI

    SAM: Semi-Active Mechanism for Extensible Continuum Manipulator and Real-time Hysteresis Compensation Control Algorithm

    Authors: Junhyun Park, Seonghyeok Jang, Myeongbo Park, Hyojae Park, Jeonghyeon Yoon, Minho Hwang

    Abstract: Cable-Driven Continuum Manipulators (CDCMs) enable scar-free procedures via natural orifices and improve target lesion accessibility through curved paths. However, CDCMs face limitations in workspace and control accuracy due to non-linear cable effects causing hysteresis. This paper introduces an extensible CDCM with a Semi-active Mechanism (SAM) to expand the workspace via translational motion wi… ▽ More

    Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 12 pages, 14 figures, 6 tables

  16. arXiv:2406.16896  [pdf, other

    eess.SP cs.LG

    f-GAN: A frequency-domain-constrained generative adversarial network for PPG to ECG synthesis

    Authors: Nathan C. L. Kong, Dae Lee, Huyen Do, Dae Hoon Park, Cong Xu, Hongda Mao, Jonathan Chung

    Abstract: Electrocardiograms (ECGs) and photoplethysmograms (PPGs) are generally used to monitor an individual's cardiovascular health. In clinical settings, ECGs and fingertip PPGs are the main signals used for assessing cardiovascular health, but the equipment necessary for their collection precludes their use in daily monitoring. Although PPGs obtained from wrist-worn devices are susceptible to noise due… ▽ More

    Submitted 15 May, 2024; originally announced June 2024.

  17. arXiv:2406.09719  [pdf, other

    cs.CL cs.AI

    Self-Knowledge Distillation for Learning Ambiguity

    Authors: Hancheol Park, Soyeong Jeong, Sukmin Cho, Jong C. Park

    Abstract: Recent language models have shown remarkable performance on natural language understanding (NLU) tasks. However, they are often sub-optimal when faced with ambiguous samples that can be interpreted in multiple ways, over-confidently predicting a single label without consideration for its correctness. To address this issue, we propose a novel self-knowledge distillation method that enables models t… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures

  18. arXiv:2406.09117  [pdf, other

    cs.CV cs.AI

    PC-LoRA: Low-Rank Adaptation for Progressive Model Compression with Knowledge Distillation

    Authors: Injoon Hwang, Haewon Park, Youngwan Lee, Jooyoung Yang, SunJae Maeng

    Abstract: Low-rank adaption (LoRA) is a prominent method that adds a small number of learnable parameters to the frozen pre-trained weights for parameter-efficient fine-tuning. Prompted by the question, ``Can we make its representation enough with LoRA weights solely at the final phase of finetuning without the pre-trained weights?'' In this work, we introduce Progressive Compression LoRA~(PC-LoRA), which u… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted at T4V@CVPR

  19. arXiv:2406.07922  [pdf

    cs.CL

    Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA

    Authors: Dongsuk Jang, Hyeryun Park, Jiye Son, Hyeonuk Hwang, Sujin Kim, Jinwook Choi

    Abstract: In the rapidly evolving field of healthcare, the integration of artificial intelligence (AI) has become a pivotal component in the automation of clinical workflows, ushering in a new era of efficiency and accuracy. This study focuses on the transformative capabilities of the fine-tuned KoELECTRA model in comparison to the GPT-4 model, aiming to facilitate automated information extraction from thyr… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures, 3 tables

    Journal ref: AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 249-257

  20. arXiv:2406.07034  [pdf, other

    cs.AI cs.CL

    Improving Multi-hop Logical Reasoning in Knowledge Graphs with Context-Aware Query Representation Learning

    Authors: Jeonghoon Kim, Heesoo Jung, Hyeju Jang, Hogun Park

    Abstract: Multi-hop logical reasoning on knowledge graphs is a pivotal task in natural language processing, with numerous approaches aiming to answer First-Order Logic (FOL) queries. Recent geometry (e.g., box, cone) and probability (e.g., beta distribution)-based methodologies have effectively addressed complex FOL queries. However, a common challenge across these methods lies in determining accurate geome… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  21. arXiv:2406.07006  [pdf, other

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan , et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAWImage Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  22. arXiv:2406.06448  [pdf, other

    cs.HC

    How is the Pilot Doing: VTOL Pilot Workload Estimation by Multimodal Machine Learning on Psycho-physiological Signals

    Authors: Jong Hoon Park, Lawrence Chen, Ian Higgins, Zhaobo Zheng, Shashank Mehrotra, Kevin Salubre, Mohammadreza Mousaei, Steven Willits, Blain Levedahl, Timothy Buker, Eliot Xing, Teruhisa Misu, Sebastian Scherer, Jean Oh

    Abstract: Vertical take-off and landing (VTOL) aircraft do not require a prolonged runway, thus allowing them to land almost anywhere. In recent years, their flexibility has made them popular in development, research, and operation. When compared to traditional fixed-wing aircraft and rotorcraft, VTOLs bring unique challenges as they combine many maneuvers from both types of aircraft. Pilot workload is a cr… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 8 pages, 7 figures

  23. arXiv:2406.06287  [pdf, other

    math.NA cs.LG

    VS-PINN: A fast and efficient training of physics-informed neural networks using variable-scaling methods for solving PDEs with stiff behavior

    Authors: Seungchan Ko, Sang Hyeon Park

    Abstract: Physics-informed neural networks (PINNs) have recently emerged as a promising way to compute the solutions of partial differential equations (PDEs) using deep neural networks. However, despite their significant success in various fields, it remains unclear in many aspects how to effectively train PINNs if the solutions of PDEs exhibit stiff behaviors or high frequencies. In this paper, we propose… ▽ More

    Submitted 12 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  24. arXiv:2406.05761  [pdf, other

    cs.CL

    The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

    Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang , et al. (7 additional authors not shown)

    Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on spec… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  25. arXiv:2406.02936  [pdf

    eess.IV cs.CV

    Radiomics-guided Multimodal Self-attention Network for Predicting Pathological Complete Response in Breast MRI

    Authors: Jonghun Kim, Hyunjin Park

    Abstract: Breast cancer is the most prevalent cancer among women and predicting pathologic complete response (pCR) after anti-cancer treatment is crucial for patient prognosis and treatment customization. Deep learning has shown promise in medical imaging diagnosis, particularly when utilizing multiple imaging modalities to enhance accuracy. This study presents a model that predicts pCR in breast cancer pat… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 5 pages, 5 figures, IEEE ISBI 2024 proceedings

  26. arXiv:2406.00505  [pdf, other

    cs.CV

    Improving Text Generation on Images with Synthetic Captions

    Authors: Jun Young Koh, Sang Hyun Park, Joy Song

    Abstract: The recent emergence of latent diffusion models such as SDXL and SD 1.5 has shown significant capability in generating highly detailed and realistic images. Despite their remarkable ability to produce images, generating accurate text within images still remains a challenging task. In this paper, we examine the validity of fine-tuning approaches in generating legible text within the image. We propo… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 9 pages, 12 figures

  27. arXiv:2405.20610  [pdf, other

    cs.CV

    Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation

    Authors: Wooseok Shin, Hyun Joon Park, Jin Sob Kim, Sung Won Han

    Abstract: In semi-supervised semantic segmentation, the Mean Teacher- and co-training-based approaches are employed to mitigate confirmation bias and coupling problems. However, despite their high performance, these approaches frequently involve complex training pipelines and a substantial computational burden, limiting the scalability and compatibility of these methods. In this paper, we propose a PrevMatc… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, submitted to IEEE TPAMI. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  28. arXiv:2405.19346  [pdf, other

    eess.SP cs.AI cs.LG

    Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

    Authors: Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

    Abstract: Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In… ▽ More

    Submitted 9 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Early Accepted at MICCAI 2024

  29. arXiv:2405.17977  [pdf, other

    cs.CL

    Aligning to Thousands of Preferences via System Message Generalization

    Authors: Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo

    Abstract: Although humans inherently have diverse values, current large language model (LLM) alignment methods often assume that aligning LLMs with the general public's preferences is optimal. A major challenge in adopting a more individualized approach to LLM alignment is its lack of scalability, as it involves repeatedly acquiring preference data and training new reward models and LLMs for each individual… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress

  30. arXiv:2405.17633  [pdf, other

    cs.CL

    HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs

    Authors: Jocelyn Shen, Joel Mire, Hae Won Park, Cynthia Breazeal, Maarten Sap

    Abstract: Empathy serves as a cornerstone in enabling prosocial behaviors, and can be evoked through sharing of personal experiences in stories. While empathy is influenced by narrative content, intuitively, people respond to the way a story is told as well, through narrative style. Yet the relationship between empathy and narrative style is not fully understood. In this work, we empirically examine and qua… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  31. arXiv:2405.17315  [pdf, other

    cs.CV

    All-day Depth Completion

    Authors: Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

    Abstract: We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  32. arXiv:2405.15708  [pdf, other

    cs.CL

    EmpathicStories++: A Multimodal Dataset for Empathy towards Personal Experiences

    Authors: Jocelyn Shen, Yubin Kim, Mohit Hulse, Wazeer Zulfikar, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park

    Abstract: Modeling empathy is a complex endeavor that is rooted in interpersonal and experiential dimensions of human interaction, and remains an open problem within AI. Existing empathy datasets fall short in capturing the richness of empathy responses, often being confined to in-lab or acted scenarios, lacking longitudinal data, and missing self-reported labels. We introduce a new multimodal dataset for e… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 Findings

  33. arXiv:2405.13302  [pdf, other

    stat.ML cs.DM cs.LG math.OC

    Accelerated Evaluation of Ollivier-Ricci Curvature Lower Bounds: Bridging Theory and Computation

    Authors: Wonwoo Kang, Heehyun Park

    Abstract: Curvature serves as a potent and descriptive invariant, with its efficacy validated both theoretically and practically within graph theory. We employ a definition of generalized Ricci curvature proposed by Ollivier, which Lin and Yau later adapted to graph theory, known as Ollivier-Ricci curvature (ORC). ORC measures curvature using the Wasserstein distance, thereby integrating geometric concepts… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  34. arXiv:2405.11911  [pdf, other

    cs.AI cs.LG cs.SI

    PULL: PU-Learning-based Accurate Link Prediction

    Authors: Junghun Kim, Ka Hyun Park, Hoyoung Yoon, U Kang

    Abstract: Given an edge-incomplete graph, how can we accurately find the missing links? The link prediction in edge-incomplete graphs aims to discover the missing relations between entities when their relationships are represented as a graph. Edge-incomplete graphs are prevalent in real-world due to practical limitations, such as not checking all users when adding friends in a social network. Addressing the… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages

  35. arXiv:2405.07467  [pdf, other

    cs.CL

    MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation

    Authors: Dongjun Lee, Choongwon Park, Jaehyuk Kim, Heesoo Park

    Abstract: Recent advancements in large language models (LLMs) have enabled in-context learning (ICL)-based methods that significantly outperform fine-tuning approaches for text-to-SQL tasks. However, their performance is still considerably lower than that of human experts on benchmarks that include complex schemas and queries, such as BIRD. This study considers the sensitivity of LLMs to the prompts and int… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

  36. arXiv:2405.06216  [pdf, other

    cs.CV

    Event-based Structure-from-Orbit

    Authors: Ethan Elms, Yasir Latif, Tae Ha Park, Tat-Jun Chin

    Abstract: Event sensors offer high temporal resolution visual sensing, which makes them ideal for perceiving fast visual phenomena without suffering from motion blur. Certain applications in robotics and vision-based navigation require 3D perception of an object undergoing circular or spinning motion in front of a static camera, such as recovering the angular velocity and shape of the object. The setting is… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: This work will be published in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 2024

  37. arXiv:2404.17598  [pdf, other

    cs.IR cs.AI cs.LG cs.SI

    Revealing and Utilizing In-group Favoritism for Graph-based Collaborative Filtering

    Authors: Hoin Jung, Hyunsoo Cho, Myungje Choi, Joowon Lee, Jung Ho Park, Myungjoo Kang

    Abstract: When it comes to a personalized item recommendation system, It is essential to extract users' preferences and purchasing patterns. Assuming that users in the real world form a cluster and there is common favoritism in each cluster, in this work, we introduce Co-Clustering Wrapper (CCW). We compute co-clusters of users and items with co-clustering algorithms and add CF subnetworks for each cluster… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures

  38. arXiv:2404.16223  [pdf, other

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu , et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines, however, this problem is not as explored as in the RGB domain. Th goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  39. arXiv:2404.16069  [pdf, other

    cs.HC cs.AI

    Interactive Visual Learning for Stable Diffusion

    Authors: Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Polo Chau

    Abstract: Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex internal structures and operations often pose challenges for non-experts to grasp. We introduce Diffusion Explainer, the first interactive visualization tool designed to elucidate how Stable Diffusion transforms text prompts into images. It tightly integrates a vi… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 4 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2305.03509

  40. arXiv:2404.15155  [pdf, other

    cs.CL cs.AI cs.LG

    Adaptive Collaboration Strategy for LLMs in Medical Decision Making

    Authors: Yubin Kim, Chanwoo Park, Hyewon Jeong, Yik Siu Chan, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, Hae Won Park

    Abstract: Foundation models have become invaluable in advancing the medical field. Despite their promise, the strategic deployment of LLMs for effective utility in complex medical tasks remains an open question. Our novel framework, Medical Decision-making Agents (MDAgents) aims to address this gap by automatically assigning the effective collaboration structure for LLMs. Assigned solo or group collaboratio… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  41. arXiv:2404.13215  [pdf, other

    physics.app-ph cs.LG

    Machine Learning-Guided Design of Non-Reciprocal and Asymmetric Elastic Chiral Metamaterials

    Authors: Lingxiao Yuan, Emma Lejeune, Harold S. Park

    Abstract: There has been significant recent interest in the mechanics community to design structures that can either violate reciprocity, or exhibit elastic asymmetry or odd elasticity. While these properties are highly desirable to enable mechanical metamaterials to exhibit novel wave propagation phenomena, it remains an open question as to how to design passive structures that exhibit both significant non… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  42. arXiv:2404.12079  [pdf, other

    cs.RO

    Trajectory Planning for Autonomous Vehicle Using Iterative Reward Prediction in Reinforcement Learning

    Authors: Hyunwoo Park

    Abstract: Traditional trajectory planning methods for autonomous vehicles have several limitations. For example, heuristic and explicit simple rules limit generalizability and hinder complex motions. These limitations can be addressed using reinforcement learning-based trajectory planning. However, reinforcement learning suffers from unstable learning, and existing reinforcement learning-based trajectory pl… ▽ More

    Submitted 12 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures

  43. arXiv:2404.08330  [pdf, other

    cs.CV

    Emerging Property of Masked Token for Effective Pre-training

    Authors: Hyesong Choi, Hunsang Lee, Seyoung Joung, Hyejin Park, Jiyeong Kim, Dongbo Min

    Abstract: Driven by the success of Masked Language Modeling (MLM), the realm of self-supervised learning for computer vision has been invigorated by the central role of Masked Image Modeling (MIM) in driving recent breakthroughs. Notwithstanding the achievements of MIM across various downstream tasks, its overall efficiency is occasionally hampered by the lengthy duration of the pre-training phase. This pap… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  44. arXiv:2404.08327  [pdf, other

    cs.CV

    Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training

    Authors: Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min

    Abstract: In this paper, we introduce Saliency-Based Adaptive Masking (SBAM), a novel and cost-effective approach that significantly enhances the pre-training performance of Masked Image Modeling (MIM) approaches by prioritizing token salience. Our method provides robustness against variations in masking ratios, effectively mitigating the performance instability issues common in existing methods. This relax… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  45. arXiv:2404.07554  [pdf, other

    cs.CV cs.AI

    CAT: Contrastive Adapter Training for Personalized Image Generation

    Authors: Jae Wan Park, Sang Hyun Park, Jun Young Koh, Junha Lee, Min Song

    Abstract: The emergence of various adapters, including Low-Rank Adaptation (LoRA) applied from the field of natural language processing, has allowed diffusion models to personalize image generation at a low cost. However, due to the various challenges including limited datasets and shortage of regularization and computation resources, adapter training often results in unsatisfactory outcomes, leading to the… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: CVPRW 2024

  46. arXiv:2404.03635  [pdf, other

    cs.CV cs.AI cs.CL cs.LG cs.MM

    WorDepth: Variational Language Prior for Monocular Depth Estimation

    Authors: Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong

    Abstract: Three-dimensional (3D) reconstruction from a single image is an ill-posed problem with inherent ambiguities, i.e. scale. Predicting a 3D scene from text description(s) is similarly ill-posed, i.e. spatial arrangements of objects described. We investigate the question of whether two inherently ambiguous modalities can be used in conjunction to produce metric-scaled reconstructions. To test this, we… ▽ More

    Submitted 2 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  47. arXiv:2404.02387  [pdf, other

    physics.data-an cond-mat.str-el cs.LG physics.comp-ph

    An inversion problem for optical spectrum data via physics-guided machine learning

    Authors: Hwiwoo Park, Jun H. Park, Jungseek Hwang

    Abstract: We propose the regularized recurrent inference machine (rRIM), a novel machine-learning approach to solve the challenging problem of deriving the pairing glue function from measured optical spectra. The rRIM incorporates physical principles into both training and inference and affords noise robustness, flexibility with out-of-distribution data, and reduced data requirements. It effectively obtains… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 19 pages, 4 figures

  48. arXiv:2404.01954  [pdf, other

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t… ▽ More

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  49. arXiv:2404.01580  [pdf, other

    cs.CV

    Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection

    Authors: Seokha Moon, Hongbeen Park, Jungphil Kwon, Jaekoo Lee, Jinkyu Kim

    Abstract: In autonomous driving and robotics, there is a growing interest in utilizing short-term historical data to enhance multi-camera 3D object detection, leveraging the continuous and correlated nature of input video streams. Recent work has focused on spatially aligning BEV-based features over timesteps. However, this is often limited as its gain does not scale well with long-term past observations. T… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  50. arXiv:2403.19340  [pdf, other

    cs.CL cs.AI

    Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models

    Authors: Hyunbyung Park, Sukyung Lee, Gyoungjin Gim, Yungi Kim, Dahyun Kim, Chanjun Park

    Abstract: To address the challenges associated with data processing at scale, we propose Dataverse, a unified open-source Extract-Transform-Load (ETL) pipeline for large language models (LLMs) with a user-friendly design at its core. Easy addition of custom processors with block-based interface in Dataverse allows users to readily and efficiently use Dataverse to build their own ETL pipeline. We hope that D… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.