-
FACTS About Building Retrieval Augmented Generation-based Chatbots
Authors:
Rama Akkiraju,
Anbang Xu,
Deepak Bora,
Tan Yu,
Lu An,
Vishal Seth,
Aaditya Shukla,
Pritam Gundecha,
Hridhay Mehta,
Ashwin Jha,
Prithvi Raj,
Abhinav Balasubramanian,
Murali Maram,
Guru Muthusamy,
Shivakesh Reddy Annepally,
Sidney Knowles,
Min Du,
Nick Burnett,
Sean Javiya,
Ashok Marannan,
Mamta Kumari,
Surbhi Jha,
Ethan Dereszenski,
Anupam Chakraborty,
Subhash Ranjan
, et al. (13 additional authors not shown)
Abstract:
Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like Langchain and Llamaindex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.
Submitted 10 July, 2024;
originally announced July 2024.
-
Enhanced Breast Cancer Tumor Classification using MobileNetV2: A Detailed Exploration on Image Intensity, Error Mitigation, and Streamlit-driven Real-time Deployment
Authors:
Aaditya Surya,
Aditya Shah,
Jarnell Kabore,
Subash Sasikumar
Abstract:
This research introduces a sophisticated transfer learning model based on Google's MobileNetV2 for breast cancer tumor classification into normal, benign, and malignant categories, utilizing a dataset of 1576 ultrasound images (265 normal, 891 benign, 420 malignant). The model achieves an accuracy of 0.82, precision of 0.83, recall of 0.81, ROC-AUC of 0.94, PR-AUC of 0.88, and MCC of 0.74. It examines image intensity distributions and misclassification errors, offering improvements for future applications. Addressing dataset imbalances, the study ensures a generalizable model. This work, using a dataset from Baheya Hospital, Cairo, Egypt, compiled by Walid Al-Dhabyani et al., emphasizes MobileNetV2's potential in medical imaging, aiming to improve diagnostic precision in oncology. Additionally, the paper explores Streamlit-based deployment for real-time tumor classification, demonstrating MobileNetV2's applicability in medical imaging and setting a benchmark for future research in oncology diagnostics.
Submitted 6 January, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
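The abstract above reports MCC alongside accuracy on an imbalanced three-class dataset. As a minimal sketch of why both are worth reporting, the following computes the multi-class Matthews correlation coefficient from a confusion matrix. The function name `multiclass_mcc` and the majority-class baseline are illustrative assumptions, not taken from the paper; only the class counts (265, 891, 420) come from the abstract.

```python
import math


def multiclass_mcc(confusion):
    """Multi-class Matthews correlation coefficient.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (square matrix).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    true_counts = [sum(confusion[k]) for k in range(n)]
    pred_counts = [sum(confusion[i][k] for i in range(n)) for k in range(n)]
    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    # By convention, return 0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


# Hypothetical baseline: always predict "benign" (the majority class).
# Rows are true classes in the order normal, benign, malignant.
baseline = [
    [0, 265, 0],
    [0, 891, 0],
    [0, 420, 0],
]
baseline_accuracy = 891 / 1576          # looks moderate (about 0.57)
baseline_mcc = multiclass_mcc(baseline)  # no better than chance
```

Under these assumptions the baseline reaches roughly 57% accuracy purely from class imbalance while its MCC is zero, which is why a chance-corrected score is informative here.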
-
Maximum likelihood smoothing estimation in state-space models: An incomplete-information based approach
Authors:
Budhi Arta Surya
Abstract:
This paper revisits the classical works of Rauch (1963) and Rauch et al. (1965) and develops a novel method for maximum likelihood (ML) smoothing estimation from incomplete information/data of stochastic state-space systems. The score function and conditional observed information matrices of incomplete data are introduced and their distributional identities are established. Using these identities, the ML smoother $\widehat{x}_{k\vert n}^s = \arg\max_{x_k} \log f(x_k,\widehat{x}_{k+1\vert n}^s, y_{0:n}\vert\theta)$, $k\leq n-1$, is presented. The result shows that the ML smoother gives an estimate of the state $x_k$ with greater adherence to the log-likelihood and smaller standard errors than those of the ML state estimator $\widehat{x}_k=\arg\max_{x_k} \log f(x_k,y_{0:k}\vert\theta)$, with $\widehat{x}_{n\vert n}^s=\widehat{x}_n$. Recursive estimation is given in terms of an EM-gradient-particle algorithm which extends the work of \cite{Lange} for ML smoothing estimation. The algorithm has an explicit iteration update, which the EM algorithm of \cite{Ramadan} for smoothing lacks. A sequential Monte Carlo method is developed for evaluation of the score function and observed information matrices. A recursive equation for the covariance matrix of the estimation error is developed to calculate the standard errors. In the case of linear systems, the method shows that the Rauch-Tung-Striebel (RTS) smoother is a fully efficient smoothing state estimator whose covariance matrix coincides with the Cramér-Rao lower bound, the inverse of the expected information matrix. Furthermore, the RTS smoother coincides with the Kalman filter while having a smaller covariance matrix. Numerical studies are performed, confirming the accuracy of the main results.
Submitted 28 March, 2023;
originally announced March 2023.
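For the linear case mentioned in the abstract above, the relationship between the Kalman filter and the RTS smoother can be illustrated with the classical scalar recursions. This is a minimal sketch of the textbook linear-Gaussian filter and smoother, not the paper's EM-gradient-particle algorithm; the model $x_{k+1} = a x_k + w_k$, $y_k = x_k + v_k$, the parameter values, and the observations are all hypothetical.

```python
def kalman_filter(ys, a, q, r, m0, p0):
    """Scalar Kalman filter for x_{k+1} = a*x_k + w, y_k = x_k + v.

    q and r are the process and measurement noise variances; m0 and p0
    are the prior mean and variance of the first state.
    Returns filtered means/variances and the one-step predictions.
    """
    means, variances, pred_means, pred_vars = [], [], [], []
    m_pred, p_pred = m0, p0
    for y in ys:
        pred_means.append(m_pred)
        pred_vars.append(p_pred)
        gain = p_pred / (p_pred + r)          # Kalman gain
        m = m_pred + gain * (y - m_pred)      # measurement update
        p = (1.0 - gain) * p_pred
        means.append(m)
        variances.append(p)
        m_pred, p_pred = a * m, a * a * p + q  # time update
    return means, variances, pred_means, pred_vars


def rts_smoother(means, variances, pred_means, pred_vars, a):
    """Rauch-Tung-Striebel backward pass over the filter output."""
    s_means, s_vars = means[:], variances[:]
    for k in range(len(means) - 2, -1, -1):
        c = variances[k] * a / pred_vars[k + 1]  # smoother gain
        s_means[k] = means[k] + c * (s_means[k + 1] - pred_means[k + 1])
        s_vars[k] = variances[k] + c * c * (s_vars[k + 1] - pred_vars[k + 1])
    return s_means, s_vars


# Hypothetical observations and parameters, for illustration only.
ys = [0.9, 1.3, 0.7, 1.1, 1.6, 1.2]
f_means, f_vars, p_means, p_vars = kalman_filter(
    ys, a=0.95, q=0.1, r=0.5, m0=0.0, p0=1.0
)
s_means, s_vars = rts_smoother(f_means, f_vars, p_means, p_vars, a=0.95)
```

In this sketch the smoothed and filtered estimates agree at the final time step, mirroring $\widehat{x}_{n\vert n}^s=\widehat{x}_n$, and the smoothed variance never exceeds the filtered variance at earlier steps.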
-
Maximum likelihood recursive state estimation in state-space models: A new approach based on statistical analysis of incomplete data
Authors:
Budhi Arta Surya
Abstract:
This paper revisits the work of Rauch et al. (1965) and develops a novel method for recursive maximum likelihood particle filtering for general state-space models. The new method is based on statistical analysis of incomplete observations of the systems. The score function and conditional observed information of the incomplete observations/data are introduced and their distributional properties are discussed. Some identities concerning the score function and information matrices of the incomplete data are derived. Maximum likelihood estimation of the state vector is presented in terms of the score function and observed information matrices. In particular, to deal with nonlinear state-space models, a sequential Monte Carlo method is developed. It is given recursively by an EM-gradient-particle filter, which extends the work of Lange (1995) for state estimation. To derive the covariance matrix of state-estimation errors, an explicit form of the observed information matrix is proposed. It extends Louis's (1982) general formula for the same matrix to state-vector estimation. Under (Neumann) boundary conditions on the state transition probability distribution, the inverse of this matrix coincides with the Cramér-Rao lower bound on the covariance matrix of estimation errors of an unbiased state estimator. In the case of linear models, the method shows that the Kalman filter is a fully efficient state estimator whose covariance matrix of estimation error coincides with the Cramér-Rao lower bound. Some numerical examples are discussed to exemplify the main results.
Submitted 8 November, 2022;
originally announced November 2022.
-
A Mosquito is Worth 16x16 Larvae: Evaluation of Deep Learning Architectures for Mosquito Larvae Classification
Authors:
Aswin Surya,
David B. Peral,
Austin VanLoon,
Akhila Rajesh
Abstract:
Mosquito-borne diseases (MBDs), such as dengue virus, chikungunya virus, and West Nile virus, cause over one million deaths globally every year. Because many such diseases are spread by the Aedes and Culex mosquitoes, tracking these larvae becomes critical in mitigating the spread of MBDs. Even as citizen science grows and obtains larger mosquito image datasets, the manual annotation of mosquito images becomes ever more time-consuming and inefficient. Previous research has used computer vision to identify mosquito species, and the Convolutional Neural Network (CNN) has become the de facto standard for image classification. However, these models typically require substantial computational resources. This research introduces the application of the Vision Transformer (ViT) in a comparative study to improve image classification on Aedes and Culex larvae. Two ViT models, ViT-Base and CvT-13, and two CNN models, ResNet-18 and ConvNeXT, were trained on mosquito larvae image data and compared to determine the most effective model to distinguish mosquito larvae as Aedes or Culex. Testing revealed that ConvNeXT obtained the greatest values across all classification metrics, demonstrating its viability for mosquito larvae classification. Based on these results, future research includes creating a model specifically designed for mosquito larvae classification by combining elements of CNN and transformer architecture.
Submitted 16 September, 2022;
originally announced September 2022.
-
Machine Learning and Ensemble Approach Onto Predicting Heart Disease
Authors:
Aaditya Surya
Abstract:
The four essential chambers of one's heart that lie in the thoracic cavity are crucial for one's survival, yet ironically prove to be the most vulnerable. Cardiovascular disease (CVD), also commonly referred to as heart disease, has steadily grown to become the leading cause of death amongst humans over the past few decades. Taking this concerning statistic into consideration, it is evident that patients suffering from CVDs need a quick and correct diagnosis in order to facilitate early treatment and lessen the chances of fatality. This paper attempts to utilize the data provided to train classification models such as Logistic Regression, K Nearest Neighbors, Support Vector Machine, Decision Tree, Gaussian Naive Bayes, Random Forest, and Multi-Layer Perceptron (Artificial Neural Network), and eventually uses a soft voting ensemble technique in order to attain as many correct diagnoses as possible.
Submitted 16 November, 2021;
originally announced November 2021.
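The soft voting ensemble named in the abstract above combines classifiers by averaging their predicted class probabilities rather than counting their hard votes. A minimal sketch of that rule follows; the function name `soft_vote`, the optional weights, and the three sets of probabilities are hypothetical and are not results from the paper.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    probabilities: one list per model, each holding that model's
    predicted probability for every class, in the same class order.
    Returns (index of the winning class, averaged probabilities).
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total_weight = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total_weight
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Hypothetical outputs of three models for one patient,
# with classes ordered as [no disease, disease].
model_probs = [
    [0.45, 0.55],  # weakly favors "disease"
    [0.90, 0.10],  # strongly favors "no disease"
    [0.40, 0.60],  # weakly favors "disease"
]
label, averaged = soft_vote(model_probs)

# Hard (majority) voting on the same outputs, for comparison.
hard_votes = [max(range(2), key=lambda c: p[c]) for p in model_probs]
hard_label = max(set(hard_votes), key=hard_votes.count)
```

With these illustrative numbers, majority voting picks "disease" (two of three models), while soft voting picks "no disease" because the one confident model outweighs two marginal ones. That sensitivity to confidence is the design difference between the two schemes.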
-
Some results on maximum likelihood from incomplete data: finite sample properties and improved M-estimator for resampling
Authors:
Budhi Arta Surya
Abstract:
This paper presents some results on maximum likelihood (ML) estimation from incomplete data. Finite sample properties of conditional observed information matrices are established. They possess positive definiteness and the same Loewner partial ordering as the expected information matrices do. An explicit form of the observed Fisher information (OFI) is derived for the calculation of standard errors of the ML estimates. It simplifies Louis's (1982) general formula for the OFI matrix. To avoid obtaining an incorrect inverse of the OFI matrix, which may be attributed to the lack of sparsity and large size of the matrix, a monotone convergent recursive equation for the inverse matrix is developed, which in turn generalizes the algorithm of Hero and Fessler (1994) for the Cramér-Rao lower bound. To improve the estimation, in particular when applying repeated sampling to incomplete data, a robust M-estimator is introduced. A closed-form sandwich estimator of the covariance matrix is proposed to provide the standard errors of the M-estimator. Owing to the loss of information present in finite-sample incomplete data, the sandwich estimator produces smaller standard errors for the M-estimator than for the ML estimates. In the case of complete information or absence of re-sampling, the M-estimator coincides with the ML estimates. Application to parameter estimation of a regime switching conditional Markov jump process is discussed to verify the results. The simulation study confirms the accuracy and asymptotic properties of the M-estimator.
Submitted 24 July, 2022; v1 submitted 2 August, 2021;
originally announced August 2021.