-
FACTS About Building Retrieval Augmented Generation-based Chatbots
Authors:
Rama Akkiraju,
Anbang Xu,
Deepak Bora,
Tan Yu,
Lu An,
Vishal Seth,
Aaditya Shukla,
Pritam Gundecha,
Hridhay Mehta,
Ashwin Jha,
Prithvi Raj,
Abhinav Balasubramanian,
Murali Maram,
Guru Muthusamy,
Shivakesh Reddy Annepally,
Sidney Knowles,
Min Du,
Nick Burnett,
Sean Javiya,
Ashok Marannan,
Mamta Kumari,
Surbhi Jha,
Ethan Dereszenski,
Anupam Chakraborty,
Subhash Ranjan
, et al. (13 additional authors not shown)
Abstract:
Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like LangChain and LlamaIndex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.
Submitted 10 July, 2024;
originally announced July 2024.
-
The sequence of higher order Mersenne numbers and associated binomial transforms
Authors:
Kalika Prasad,
Munesh Kumari,
Rabiranjan Mohanta,
Hrishikesh Mahato
Abstract:
In this article, we introduce and study a new integer sequence referred to as the higher order Mersenne sequence. The proposed sequence is analogous to the higher order Fibonacci numbers and closely associated with the Mersenne numbers. We discuss various algebraic properties of this new sequence, such as Binet's formula, Catalan's identity, d'Ocagne's identity, generating functions, and finite and binomial sums, as well as some inter-relations with Mersenne and Jacobsthal numbers. Moreover, we study the sequence generated from the binomial transforms of the higher order Mersenne numbers and present its recurrence relation and algebraic properties. Lastly, we give matrix generators and a tridiagonal matrix representation for higher order Mersenne numbers.
Submitted 16 July, 2023;
originally announced July 2023.
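For orientation on the entry above, here is a minimal sketch of a binomial transform, applied to the ordinary Mersenne numbers M_n = 2^n - 1 rather than to the higher order sequence the article introduces (whose definition the abstract does not give). The function names are illustrative assumptions, not taken from the article.

```python
from math import comb


def mersenne(n: int) -> int:
    """Ordinary Mersenne number M_n = 2^n - 1 (not the higher order variant)."""
    return 2**n - 1


def binomial_transform(seq):
    """Return b with b[n] = sum_{k=0..n} C(n, k) * seq[k]."""
    return [
        sum(comb(n, k) * seq[k] for k in range(n + 1))
        for n in range(len(seq))
    ]


m = [mersenne(n) for n in range(8)]
b = binomial_transform(m)

# The ordinary Mersenne numbers satisfy M_{n+2} = 3*M_{n+1} - 2*M_n,
# and their binomial transform has the closed form 3^n - 2^n.
print(m)
print(b)
```

The closed form follows from the binomial theorem: summing C(n, k) * 2^k gives 3^n, and summing C(n, k) gives 2^n.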
-
Awareness of Predatory Journals in Library and Information Science Faculties in India
Authors:
Madhuri Kumari,
Subaveerapandiyan A
Abstract:
Predatory journals pretend to resemble refereed journals but are used for money-making purposes. Predatory publishers produce lower-quality scientific and research papers, which poses a severe academic threat to scientific publishing. Researchers verify the quality of the journal and its peer-review process before submitting a manuscript. This paper aims to assess the awareness and knowledge of predatory journals among Indian Library and Information Science faculty members.
Submitted 14 October, 2022;
originally announced October 2022.
-
A novel public key cryptography based on generalized Lucas matrices
Authors:
Kalika Prasad,
Hrishikesh Mahato,
Munesh Kumari
Abstract:
In this article, we propose a generalized Lucas matrix (a recursive matrix of higher order) related to generalized Fibonacci sequences and establish many special properties in addition to the usual matrix algebra. Further, we propose a modified public key cryptography scheme that uses these matrices as keys in the Affine cipher, with key agreement for encryption-decryption based on a combination of terms of generalized Lucas sequences under residue operations. In this scheme, instead of exchanging the whole key matrix, only a pair of numbers (parameters) needs to be exchanged, which reduces the time and space complexity of the key transmission while providing a large key-space.
Submitted 16 February, 2022;
originally announced February 2022.
-
On the role of the Fibonacci matrix as key in modified ECC
Authors:
Munesh Kumari,
Jagmohan Tanti
Abstract:
In this paper, we propose a modified cryptographic scheme based on the application of recursive matrices as keys in ECC and ElGamal. For encryption, we consider a mapping analogous to the affine Hill cipher in which a plaintext matrix is constructed from points on elliptic curves corresponding to letters. In the formation of the key-space, the generalized Fibonacci matrices, which form a sequence of matrices, have been taken into account. The beauty of considering Fibonacci matrices is their construction, where we need only two parameters (integers) in place of $n^2$ elements. The use of a recursive matrix yields a large keyspace for our proposed scheme and increases its efficiency. Thus, it reduces time as well as space complexity, and its security and strength are based on the EC-DLP, which is a hard problem in number theory.
Submitted 21 December, 2021;
originally announced December 2021.
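To illustrate why a recursive matrix can be specified by a couple of integers rather than by all of its entries, here is a minimal sketch using the classical 2x2 Fibonacci Q-matrix over the integers modulo a prime. This is an assumption made for illustration only: the paper works with generalized Fibonacci matrices of higher order, which are not reproduced here, and the names `q_power`, `mat_mul`, and `fib` are hypothetical.

```python
def mat_mul(a, b, mod):
    """Multiply two 2x2 matrices with entries reduced modulo `mod`."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) % mod for j in range(2)]
        for i in range(2)
    ]


def q_power(n, mod):
    """Compute Q^n mod `mod` for Q = [[1, 1], [1, 0]] by repeated squaring."""
    result = [[1, 0], [0, 1]]  # identity matrix
    base = [[1, 1], [1, 0]]    # classical Fibonacci Q-matrix
    while n > 0:
        if n & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        n >>= 1
    return result


def fib(n):
    """Plain iterative Fibonacci number F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# The whole matrix is determined by the exponent n and the modulus:
# Q^n = [[F_{n+1}, F_n], [F_n, F_{n-1}]].
key = q_power(10, 101)
print(key)
```

In this toy setting, two parties who share the pair (exponent, modulus) can each rebuild the same matrix locally, which is the kind of saving the abstract describes.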
-
Sylvester Matrix Based Similarity Estimation Method for Automation of Defect Detection in Textile Fabrics
Authors:
R. M. L. N. Kumari,
G. A. C. T. Bandara,
Maheshi B. Dissanayake
Abstract:
Fabric defect detection is a crucial quality control step in the textile manufacturing industry. In this article, a machine vision system based on the Sylvester Matrix Based Similarity Method (SMBSM) is proposed to automate the defect detection process. The algorithm involves six phases, namely resolution matching, image enhancement using Histogram Specification and Median-Mean Based Sub-Image-Clipped Histogram Equalization, image registration through alignment and a hysteresis process, image subtraction, edge detection, and fault detection by means of the rank of the Sylvester matrix. The experimental results demonstrate that the proposed method is robust and yields an accuracy of 93.4% and a precision of 95.8%, with a computation time of 2275 ms.
Submitted 8 December, 2020;
originally announced December 2020.
-
A review on mathematical strength and analysis of Enigma
Authors:
Kalika Prasad,
Munesh Kumari
Abstract:
In this review article, we discuss the mathematics and mechanics behind the Enigma machine, with an analysis of its security strength. The German army used the Enigma machine during the Second World War to encrypt communications. Due to its complexity, the encryption done by the Enigma machine was assumed to be almost unbreakable. However, the Polish believed that people with a good background in and deep knowledge of science and mathematics would have a better chance of breaking the encryption done by Enigma. They appointed twenty mathematicians from Poznan University to work on this problem at the Polish Cipher Bureau. Three of those, Marian Rejewski, Jerzy Rozycki and Henryk Zygalski, were able to exploit certain flaws in the encryption and, by using permutation group theory, finally managed to decipher the Enigma messages. The mathematics discovered by them is presented here.
Submitted 17 April, 2020;
originally announced April 2020.
-
A public key cryptography using multinacci block matrices
Authors:
Munesh Kumari,
Jagmohan Tanti
Abstract:
In this paper, we propose a public key cryptography scheme using recursive block matrices involving generalized Fibonacci numbers over a finite field $F_p$. For this, we define multinacci block matrices, a type of upper triangular matrix with multinacci matrices on the diagonal, and obtain some of their algebraic properties. Moreover, we set up a method for key element agreement at the end users, which makes the cryptography more efficient. The proposed scheme comes with a large keyspace, and its security relies on the Discrete Logarithm Problem (DLP).
Submitted 19 April, 2022; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Reduction of Redundant Rules in Association Rule Mining-Based Bug Assignment
Authors:
Meera Sharma,
Abhishek Tandon,
Madhu Kumari,
V B Singh
Abstract:
Bug triaging is the process of deciding what to do with incoming bug reports. In this paper, we have mined association rules for predicting the assignee of a newly reported bug using different bug attributes, namely, severity, priority, component and operating system. To deal with the problem of large data sets, we have divided the large data set into subsets using the K-means clustering algorithm. We have used an Apriori algorithm in MATLAB to generate association rules. We have extracted the association rules for the top 5 assignees in each cluster. The proposed method has been empirically validated on 14696 bug reports from the Mozilla open source software projects, namely, Seamonkey, Firefox and Bugzilla. The proposed method provides an improvement over the existing techniques for the bug assignment problem.
Submitted 23 July, 2018;
originally announced July 2018.
-
JU_KS_Group@FIRE 2016: Consumer Health Information Search
Authors:
Kamal Sarkar,
Debanjan Das,
Indra Banerjee,
Mamta Kumari,
Prasenjit Biswas
Abstract:
In this paper, we describe the methodology we used and the results we obtained for the shared task on Consumer Health Information Search (CHIS), co-located with the Forum for Information Retrieval Evaluation (FIRE) 2016, ISI Kolkata. The shared task consists of two sub-tasks: (1) Task 1: given a query and a document or set of documents associated with that query, classify the sentences in the document as relevant to the query or not; and (2) Task 2: further classify the relevant sentences as supporting or opposing the claim made in the query. We participated in both sub-tasks. Our system achieved an accuracy of 73.39% on Task 1, the third highest among the 9 teams that participated in the shared task.
Submitted 24 December, 2016;
originally announced December 2016.
-
Non Binary Local Gradient Contours for Face Recognition
Authors:
Abdullah Gubbi,
Mohammad Fazle Azeem,
M Sharmila Kumari
Abstract:
As the features from the traditional Local Binary Patterns (LBP) and Local Directional Patterns (LDP) are found to be ineffective for face recognition, we propose a new approach based on information sets, whereby the loss of information that occurs during binarization is eliminated. Information sets expand the scope of fuzzy sets by connecting the attribute and the corresponding membership function value as a product. Since a face has smooth texture in a limited area, the extracted features must be highly discernible. To limit the number of features, we consider only non-overlapping windows. By applying information set theory, we can reduce the number of features of an image. The derived features are shown to work fairly well compared with the eigenface, fisherface and LBP methods.
Submitted 3 November, 2014;
originally announced November 2014.
-
3D Face Recognition using Significant Point based SULD Descriptor
Authors:
B. H. Shekar,
N. Harivinod,
M. Sharmila Kumari,
K. Raghurama Holla
Abstract:
In this work, we present a new 3D face recognition method based on the Speeded-Up Local Descriptor (SULD) of significant points extracted from the range images of faces. The proposed model consists of a method for extracting distinctive invariant features from range images of faces that can be used to perform reliable matching between different poses of range images of faces. For a given 3D face scan, range images are computed and the potential interest points are identified by searching at all scales. Based on the stability of the interest points, significant points are extracted. For each significant point, we compute the SULD descriptor, which consists of a vector of values from the convolved Haar wavelet responses located on concentric circles centred on the significant point, where the amount of Gaussian smoothing is proportional to the radii of the circles. Experimental results show that the proposed method provides a higher recognition rate than other existing contemporary models developed for 3D face recognition.
Submitted 26 October, 2012;
originally announced October 2012.
-
Primitive Polynomials, Singer Cycles, and Word-Oriented Linear Feedback Shift Registers
Authors:
Sudhir R. Ghorpade,
Sartaj Ul Hasan,
Meena Kumari
Abstract:
Using the structure of Singer cycles in general linear groups, we prove that a conjecture of Zeng, Han and He (2007) holds in the affirmative in a special case, and outline a plausible approach to prove it in the general case. This conjecture is about the number of primitive $σ$-LFSRs of a given order over a finite field, and it generalizes a known formula for the number of primitive LFSRs, which, in turn, is the number of primitive polynomials of a given degree over a finite field. Moreover, this conjecture is intimately related to an open question of Niederreiter (1995) on the enumeration of splitting subspaces of a given dimension.
Submitted 28 March, 2010; v1 submitted 8 April, 2009;
originally announced April 2009.
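As background for the entry above, the "known formula" for the number of primitive polynomials of degree n over a finite field with q elements is the standard count φ(q^n - 1)/n, where φ is Euler's totient function. The sketch below evaluates that classical formula only; it does not implement the conjectured count for primitive σ-LFSRs, and the function names are illustrative assumptions. The caller is assumed to pass a prime power q.

```python
def euler_phi(m: int) -> int:
    """Euler's totient function, computed by trial-division factorization."""
    result = m
    p = 2
    while p * p <= m:
        if m % p == 0:
            # Strip every factor of p, then apply the (1 - 1/p) factor once.
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        # Whatever remains is a single prime factor larger than sqrt(original m).
        result -= result // m
    return result


def count_primitive_polynomials(q: int, n: int) -> int:
    """Number of primitive polynomials of degree n over GF(q): phi(q^n - 1) / n.

    Assumes q is a prime power; this is not checked here.
    """
    return euler_phi(q**n - 1) // n


for q, n in [(2, 3), (2, 4), (2, 5), (3, 2)]:
    print(q, n, count_primitive_polynomials(q, n))
```

For q = 2 and n = 4, the two primitive polynomials counted are x^4 + x + 1 and x^4 + x^3 + 1.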