-
FACTS About Building Retrieval Augmented Generation-based Chatbots
Authors:
Rama Akkiraju,
Anbang Xu,
Deepak Bora,
Tan Yu,
Lu An,
Vishal Seth,
Aaditya Shukla,
Pritam Gundecha,
Hridhay Mehta,
Ashwin Jha,
Prithvi Raj,
Abhinav Balasubramanian,
Murali Maram,
Guru Muthusamy,
Shivakesh Reddy Annepally,
Sidney Knowles,
Min Du,
Nick Burnett,
Sean Javiya,
Ashok Marannan,
Mamta Kumari,
Surbhi Jha,
Ethan Dereszenski,
Anupam Chakraborty,
Subhash Ranjan,
et al. (13 additional authors not shown)
Abstract:
Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like Langchain and Llamaindex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.
Submitted 10 July, 2024;
originally announced July 2024.
-
Parallel Minimum Spanning Forest Computation using Sparse Matrix Kernels
Authors:
Tim Baer,
Raghavendra Kanakagiri,
Edgar Solomonik
Abstract:
Formulations of graph algorithms using sparse linear algebra have yielded highly scalable distributed algorithms for problems such as connectivity and shortest path computation. We develop the first formulation of the Awerbuch-Shiloach parallel minimum spanning forest (MSF) algorithm using linear algebra primitives. We introduce a multilinear kernel that operates on an adjacency matrix and two vectors. This kernel updates graph vertices by simultaneously using information from both adjacent edges and vertices. In addition, we explore optimizations to accelerate the shortcutting step in the Awerbuch-Shiloach algorithm. We implement this MSF algorithm with Cyclops, a distributed-memory library for generalized sparse tensor algebra. We analyze the parallel scalability of our implementation on the Stampede2 supercomputer.
Submitted 14 December, 2021; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Text-to-speech for the hearing impaired
Authors:
Josef Schlittenlacher,
Thomas Baer
Abstract:
Text-to-speech (TTS) systems offer the opportunity to compensate for a hearing loss at the source rather than correcting for it at the receiving end. This removes limitations such as time constraints for algorithms that amplify a sound in a hearing aid and can lead to higher speech quality. We propose an algorithm that restores loudness to normal perception at a high resolution in time, frequency and level, and embed it in a TTS system that uses Tacotron2 and WaveGlow to produce individually amplified speech. Subjective evaluations of speech quality showed that the proposed algorithm led to high-quality audio with sound quality similar to original or linearly amplified speech but considerably higher speech intelligibility in noise. Transfer learning led to a quick adaptation of the produced spectra from original speech to individually amplified speech, resulted in high speech quality and intelligibility, and thus gives us a way to train an individual TTS system efficiently.
Submitted 22 March, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Camera identification by grouping images from database, based on shared noise patterns
Authors:
Teun Baar,
Wiger van Houten,
Zeno Geradts
Abstract:
Previous research showed that camera-specific noise patterns, so-called PRNU patterns, can be extracted from images and used to find related images. This research focuses on grouping images from a database based on a shared noise pattern, as an identification method for cameras. Using the method described in this article, groups of images created with the same camera can be linked within a large database of images. Using MATLAB, relevant noise patterns are extracted from images much more quickly than with common methods, through faster noise extraction filters and improvements that reduce calculation costs. Related noise patterns, with a correlation above a certain threshold value, can be matched quickly. In this way, groups of related images in a database can be linked, and the method can be used to scan a large number of images for suspect noise patterns.
Submitted 12 July, 2012; v1 submitted 11 July, 2012;
originally announced July 2012.