FACTS About Building Retrieval Augmented Generation-based Chatbots

Authors: Rama Akkiraju, Anbang Xu, Deepak Bora, Tan Yu, Lu An, Vishal Seth, Aaditya Shukla, Pritam Gundecha, Hridhay Mehta, Ashwin Jha, Prithvi Raj, Abhinav Balasubramanian, Murali Maram, Guru Muthusamy, Shivakesh Reddy Annepally, Sidney Knowles, Min Du, Nick Burnett, Sean Javiya, Ashok Marannan, Mamta Kumari, Surbhi Jha, Ethan Dereszenski, Anupam Chakraborty, Subhash Ranjan, et al. (13 additional authors not shown)

Abstract: Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like Langchain and Llamaindex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 8 pages, 6 figures, 2 tables, Preprint submission to ACM CIKM 2024

arXiv:2406.09443 [pdf, other]

Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness

Authors: Satyam Kumar, Sai Srujana Buddi, Utkarsh Oggy Sarawgi, Vineet Garg, Shivesh Ranjan, Ognjen Rudovic, Ahmed Hussen Abdelaziz, Saurabh Adya

Abstract: Voice activity detection (VAD) is a critical component in various applications such as speech recognition, speech enhancement, and hands-free communication systems. With the increasing demand for personalized and context-aware technologies, the need for effective personalized VAD systems has become paramount. In this paper, we present a comparative analysis of Personalized Voice Activity Detection (PVAD) systems to assess their real-world effectiveness. We introduce a comprehensive approach to assess PVAD systems, incorporating various performance metrics such as frame-level and utterance-level error rates, detection latency and accuracy, alongside user-level analysis. Through extensive experimentation and evaluation, we provide a thorough understanding of the strengths and limitations of various PVAD variants. This paper advances the understanding of PVAD technology by offering insights into its efficacy and viability in practical applications using a comprehensive set of metrics.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2405.07730 [pdf, other]

Does Dependency Locality Predict Non-canonical Word Order in Hindi?

Authors: Sidharth Ranjan, Marten van Schijndel

Abstract: Previous work has shown that isolated non-canonical sentences with Object-before-Subject (OSV) order are initially harder to process than their canonical counterparts with Subject-before-Object (SOV) order. Although this difficulty diminishes with appropriate discourse context, the underlying cognitive factors responsible for alleviating processing challenges in OSV sentences remain a question. In this work, we test the hypothesis that dependency length minimization is a significant predictor of non-canonical (OSV) syntactic choices, especially when controlling for information status such as givenness and surprisal measures. We extract sentences from the Hindi-Urdu Treebank corpus (HUTB) that contain clearly-defined subjects and objects, systematically permute the preverbal constituents of those sentences, and deploy a classifier to distinguish between original corpus sentences and artificially generated alternatives. The classifier leverages various discourse-based and cognitive features, including dependency length, surprisal, and information status, to inform its predictions. Our results suggest that, although there exists a preference for minimizing dependency length in non-canonical corpus sentences amidst the generated variants, this factor does not significantly contribute in identifying corpus sentences above and beyond surprisal and givenness measures. Notably, discourse predictability emerges as the primary determinant of constituent-order preferences. These findings are further supported by human evaluations involving 44 native Hindi speakers. Overall, this work sheds light on the role of expectation adaptation in word-ordering decisions. We conclude by situating our results within the theories of discourse production and information locality.

Submitted 13 May, 2024; originally announced May 2024.

Comments: Accepted at CogSci-2024 with full paper publication

arXiv:2404.18684 [pdf, other]

Work Smarter...Not Harder: Efficient Minimization of Dependency Length in SOV Languages

Authors: Sidharth Ranjan, Titus von der Malsburg

Abstract: Dependency length minimization is a universally observed quantitative property of natural languages. However, the extent of dependency length minimization, and the cognitive mechanisms through which the language processor achieves this minimization remain unclear. This research offers mechanistic insights by postulating that moving a short preverbal constituent next to the main verb explains preverbal constituent ordering decisions better than global minimization of dependency length in SOV languages. This approach constitutes a least-effort strategy because it's just one operation but simultaneously reduces the length of all preverbal dependencies linked to the main verb. We corroborate this strategy using large-scale corpus evidence across all seven SOV languages that are prominently represented in the Universal Dependency Treebank. These findings align with the concept of bounded rationality, where decision-making is influenced by 'quick-yet-economical' heuristics rather than exhaustive searches for optimal solutions. Overall, this work sheds light on the role of bounded rationality in linguistic decision-making and language evolution.

Submitted 10 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: Accepted at CogSci-2024 as talk with full paper publication

arXiv:2312.10092 [pdf, other]

Introspecting the Happiness amongst University Students using Machine Learning

Authors: Sakshi Ranjan, Pooja Priyadarshini, Subhankar Mishra

Abstract: Happiness underlines the intuitive constructs of a specified population based on positive psychological outcomes. It is the cornerstone of cognitive skills, and exploring university students' happiness has been a focus of researchers lately. In this study, we have analyzed university students' happiness and its facets using statistical distribution charts and by designing research questions. Furthermore, regression analysis, machine learning, and clustering algorithms were applied on the world happiness dataset and the university students' dataset for training and testing respectively. Philosophy was the happiest department while Sociology was the saddest, with average happiness scores of 2.8 and 2.44 respectively. The Pearson coefficient of correlation was 0.74 for Health. The predicted happiness score was 5.2 and the goodness of model fit was 51%. Train and test errors were 0.52 and 0.47 respectively. On a Confidence Interval (CI) of 5%, the p-value was least for Campus Environment (CE) and University Reputation (UR) and maximum for Extra-curricular Activities (ECA) and Work Balance (WB) (i.e. 0.184 and 0.228 respectively). RF with Clustering got the highest accuracy (89%) and F score (0.98) and the least error (17.91%), hence turned out to be best for our study.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 5 Figures, 10 tables, 12 pages. Accepted at Happiness Meet IIT Kharagpur-2022

arXiv:2312.06705 [pdf, other]

doi 10.1002/cpe.6800

Perceiving University Student's Opinions from Google App Reviews

Authors: Sakshi Ranjan, Subhankar Mishra

Abstract: Google app market captures the school of thought of users from every corner of the globe via ratings and text reviews, in a multilinguistic arena. The potential information from the reviews cannot be extracted manually, due to its exponential growth. So, sentiment analysis, by machine learning and deep learning algorithms employing NLP, explicitly uncovers and interprets the emotions. This study performs the sentiment classification of the app reviews and identifies university students' behavior towards the app market via exploratory analysis. We applied machine learning algorithms using the TP, TF, and TF-IDF text representation schemes and evaluated their performance on Bagging, an ensemble learning method. We used the word embedding GloVe on the deep learning paradigms. Our model was trained on Google app reviews and tested on Student's App Reviews (SAR). The various combinations of these algorithms were compared amongst each other using F score and accuracy, and inferences were highlighted graphically. SVM, amongst other classifiers, gave fruitful accuracy (93.41%) and F score (89%) on the bigram and TF-IDF scheme. Bagging enhanced the performance of LR and NB with accuracy of 87.88% and 86.69% and F score of 86% and 78% respectively. Overall, LSTM on GloVe embedding recorded the highest accuracy (95.2%) and F score (88%).

Submitted 10 December, 2023; originally announced December 2023.

Comments: Accepted in Concurrency and Computation Practice and Experience

Journal ref: Concurrency and Computation: Practice and Experience, 34(10), p.e6800 (2022)

arXiv:2306.04332 [pdf]

A Systematic Study Of Various Fingertip Detection Techniques For Air Writing Using Machine Learning

Authors: Heena, Sandeep Ranjan

Abstract: The recent advancement in technology breaks the barriers to communication between users and computers. The communication between humans and computers includes emotion and gesture recognition. Emotions can be recognized on the face of humans whereas gesture recognition includes hand and body gesture recognition. Fingertip detection is also part of it. Gesture recognition is the way of interaction that is used in air writing. Users can control the devices with simple gestures without touching them. It is how computers can understand human language which will reduce the interaction barriers between them. This paper discusses the different techniques that can be used for fingertip detection in air writing using machine learning.

Submitted 7 June, 2023; originally announced June 2023.

arXiv:2304.11410 [pdf, other]

A bounded rationality account of dependency length minimization in Hindi

Authors: Sidharth Ranjan, Titus von der Malsburg

Abstract: The principle of DEPENDENCY LENGTH MINIMIZATION, which seeks to keep syntactically related words close in a sentence, is thought to universally shape the structure of human languages for effective communication. However, the extent to which dependency length minimization is applied in human language systems is not yet fully understood. Preverbally, the placement of long-before-short constituents and postverbally, short-before-long constituents are known to minimize overall dependency length of a sentence. In this study, we test the hypothesis that placing only the shortest preverbal constituent next to the main-verb explains word order preferences in Hindi (a SOV language) as opposed to the global minimization of dependency length. We characterize this approach as a least-effort strategy because it is a cost-effective way to shorten all dependencies between the verb and its preverbal dependencies. As such, this approach is consistent with the bounded-rationality perspective according to which decision making is governed by "fast but frugal" heuristics rather than by a search for optimal solutions. Consistent with this idea, our results indicate that actual corpus sentences in the Hindi-Urdu Treebank corpus are better explained by the least effort strategy than by global minimization of dependency lengths. Additionally, for the task of distinguishing corpus sentences from counterfactual variants, we find that the dependency length and constituent length of the constituent closest to the main verb are much better predictors of whether a sentence appeared in the corpus than total dependency length. Overall, our findings suggest that cognitive resource constraints play a crucial role in shaping natural languages.

Submitted 22 April, 2023; originally announced April 2023.

Comments: Accepted at CogSci-2023

arXiv:2302.04577 [pdf, other]

Incorporating Total Variation Regularization in the design of an intelligent Query by Humming system

Authors: Shivangi Ranjan, Vishal Srivastava

Abstract: A Query-By-Humming (QBH) system constitutes a particular case of music information retrieval where the input is a user-hummed melody and the output is the original song which contains that melody. A typical QBH system consists of melody extraction and candidate melody retrieval. For melody extraction, accurate note transcription is the key enabling technology. However, current transcription methods are unable to definitively capture the melody and address inaccuracies in user-hummed queries. In this paper, we incorporate Total Variation Regularization (TVR) to denoise queries. This approach accounts for user error in humming without loss of meaningful data and reliably captures the underlying melody. For candidate melody retrieval, we employ a deep learning approach to time series classification using a Fully Convolutional Neural Network. The trained network classifies the incoming query as belonging to one of the target songs. For our experiments, we use Roger Jang's MIR-QBSH dataset which is the standard MIREX dataset. We demonstrate that inclusion of TVR denoised queries in the training set enhances the overall accuracy of the system to 93% which is higher than other state-of-the-art QBH systems.

Submitted 9 February, 2023; originally announced February 2023.

arXiv:2210.14380 [pdf, other]

Progressive Sentiment Analysis for Code-Switched Text Data

Authors: Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang

Abstract: Multilingual transformer language models have recently attracted much attention from researchers and are used in cross-lingual transfer learning for many NLP tasks such as text classification and named entity recognition. However, similar methods for transfer learning from monolingual text to code-switched text have not been extensively explored mainly due to the following challenges: (1) Code-switched corpus, unlike monolingual corpus, consists of more than one language and existing methods can't be applied efficiently, (2) Code-switched corpus is usually made of resource-rich and low-resource languages and upon using multilingual pre-trained language models, the final model might bias towards resource-rich language. In this paper, we focus on code-switched sentiment analysis where we have a labelled resource-rich language dataset and unlabelled code-switched data. We propose a framework that takes the distinction between resource-rich and low-resource language into account. Instead of training on the entire code-switched corpus at once, we create buckets based on the fraction of words in the resource-rich language and progressively train from resource-rich language dominated samples to low-resource language dominated samples. Extensive experiments across multiple language pairs demonstrate that progressive training helps low-resource language dominated samples.

Submitted 25 October, 2022; originally announced October 2022.

Comments: To appear in Findings of EMNLP 2022

arXiv:2210.13940 [pdf, other]

Discourse Context Predictability Effects in Hindi Word Order

Authors: Sidharth Ranjan, Marten van Schijndel, Sumeet Agarwal, Rajakrishnan Rajkumar

Abstract: We test the hypothesis that discourse predictability influences Hindi syntactic choice. While prior work has shown that a number of factors (e.g., information status, dependency length, and syntactic surprisal) influence Hindi word order preferences, the role of discourse predictability is underexplored in the literature. Inspired by prior work on syntactic priming, we investigate how the words and syntactic structures in a sentence influence the word order of the following sentences. Specifically, we extract sentences from the Hindi-Urdu Treebank corpus (HUTB), permute the preverbal constituents of those sentences, and build a classifier to predict which sentences actually occurred in the corpus against artificially generated distractors. The classifier uses a number of discourse-based features and cognitive features to make its predictions, including dependency length, surprisal, and information status. We find that information status and LSTM-based discourse predictability influence word order choices, especially for non-canonical object-fronted orders. We conclude by situating our results within the broader syntactic priming literature.

Submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted to EMNLP 2022

arXiv:2210.13938 [pdf, other]

Dual Mechanism Priming Effects in Hindi Word Order

Authors: Sidharth Ranjan, Marten van Schijndel, Sumeet Agarwal, Rajakrishnan Rajkumar

Abstract: Word order choices during sentence production can be primed by preceding sentences. In this work, we test the DUAL MECHANISM hypothesis that priming is driven by multiple different sources. Using a Hindi corpus of text productions, we model lexical priming with an n-gram cache model and we capture more abstract syntactic priming with an adaptive neural language model. We permute the preverbal constituents of corpus sentences, and then use a logistic regression model to predict which sentences actually occurred in the corpus against artificially generated meaning-equivalent variants. Our results indicate that lexical priming and lexically-independent syntactic priming affect complementary sets of verb classes. By showing that different priming influences are separable from one another, our results support the hypothesis that multiple different cognitive mechanisms underlie priming.

Submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted to AACL 2022

arXiv:2204.02455 [pdf, other]

Improving Voice Trigger Detection with Metric Learning

Authors: Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik

Abstract: Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task. However, such a speaker independent voice trigger detector typically suffers from performance degradation on speech from underrepresented groups, such as accented speakers. In this work, we propose a novel voice trigger detector that can use a small number of utterances from a target speaker to improve detection accuracy. Our proposed model employs an encoder-decoder architecture. While the encoder performs speaker independent voice trigger detection, similar to the conventional detector, the decoder predicts a personalized embedding for each utterance. A personalized voice trigger score is then obtained as a similarity score between the embeddings of enrollment utterances and a test utterance. The personalized embedding allows adapting to target speaker's speech when computing the voice trigger score, hence improving voice trigger detection accuracy. Experimental results show that the proposed approach achieves a 38% relative reduction in a false rejection rate (FRR) compared to a baseline speaker independent voice trigger model.

Submitted 13 September, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: Accepted at InterSpeech 2022

arXiv:2201.13029 [pdf, other]

A Flexible IAB Architecture for Beyond 5G Network

Authors: Shashi Ranjan, Pranav Jha, Abhay Karandikar, Prasanna Chaporkar

Abstract: IAB is an innovative wireless backhaul solution to provide cost-efficient deployment of small cells for successful 5G adoption. Besides, IAB can utilize the same spectrum for access and backhaul purposes. The 3GPP standardized IAB in Release 16 and would incorporate a few enhancements in the upcoming releases. The 3GPP IAB architecture, however, suffers from some limitations, such as it does not support mobile relays or dual-connectivity. This article presents a novel IAB architecture that addresses these limitations and is transparent to legacy operations of the 5G system. The architecture also supports multi-RAT coexistence where access and backhaul may belong to different RATs. These factors (and many others) enable operators to capitalize on the architecture for deploying IAB anywhere in a plug-and-play manner. We also show the merits of the architecture by evaluating its capacity and mobility robustness compared to the 3GPP architecture. Simulation results corroborate our design approach. Owing to its robust design, the architecture can contend for standardization in the B5G system.

Submitted 31 January, 2022; originally announced January 2022.

Comments: 7 pages, 5 figures, journal

arXiv:2112.15589 [pdf, ps, other]

3-D Material Style Transfer for Reconstructing Unknown Appearance in Complex Natural Materials

Authors: Shashank Ranjan, Corey Toler-Franklin

Abstract: We propose a 3-D material style transfer framework for reconstructing invisible (or faded) appearance properties in complex natural materials. Our algorithm addresses the technical challenge of transferring appearance properties from one object to another of the same material when both objects have intricate, noncorresponding color patterns. Eggshells, exoskeletons, and minerals, for example, have patterns composed of highly randomized layers of organic and inorganic compounds. These materials pose a challenge as the distribution of compounds that determine surface color changes from object to object and within local pattern regions. Our solution adapts appearance observations from a material property distribution in an exemplar to the material property distribution of a target object to reconstruct its unknown appearance. We use measured reflectance in 3-D bispectral textures to record changing material property distributions. Our novel implementation of spherical harmonics uses principles from chemistry and biology to learn relationships between color (hue and saturation) and material composition and concentration in an exemplar. The encoded relationships are transformed to the property distribution of a target for color recovery and material assignment. Quantitative and qualitative evaluation methods show that we replicate color patterns more accurately than methods that only rely on shape correspondences and coarse-level perceptual differences. We demonstrate applications of our work for reconstructing color in extinct fossils, restoring faded artifacts and generating synthetic textures.

Submitted 31 December, 2021; originally announced December 2021.

Comments: 15 pages, 22 figures

ACM Class: I.3; I.3.5; I.3.7; I.3.8

arXiv:2109.00780 [pdf, ps, other]

Non-Photorealistic Rendering of Layered Materials: A Multispectral Approach

Authors: Corey Toler-Franklin, Shashank Ranjan

Abstract: We present multispectral rendering techniques for visualizing layered materials found in biological specimens. We are the first to use acquired data from the near-infrared and ultraviolet spectra for non-photorealistic rendering (NPR). Several plant and animal species are more comprehensively understood by multispectral analysis. However, traditional NPR techniques ignore unique information outside the visible spectrum. We introduce algorithms and principles for processing wavelength dependent surface normals and reflectance. Our registration and feature detection methods are used to formulate stylization effects not considered by current NPR methods including: Spectral Band Shading which isolates and emphasizes shape features at specific wavelengths at multiple scales. Experts in our user study demonstrate the effectiveness of our system for applications in the biological sciences.

Submitted 2 September, 2021; originally announced September 2021.

Comments: 15 pages, 35 figures

ACM Class: I.3.3; I.3.8; I.4.0; I.4.1; I.4.3; I.4.8; I.4.9

arXiv:2101.02628 [pdf]

Analyzing the response to TV serials retelecast during COVID19 lockdown in India

Authors: Sandeep Ranjan

Abstract: TV serials are a popular source of entertainment. The ongoing COVID19 lockdown has a high probability of degrading the public's mental health. The Government of India started the retelecast of yesteryear's popular TV serials on public broadcaster Doordarshan from 28th March 2020 to 31st July 2020. Tweets corresponding to the Doordarshan hashtag were mined to create a dataset. The experiment aims to analyze the public's response to the retelecast of TV serials by calculating the sentiment score of the tweet dataset. The dataset's mean sentiment score of 0.65 and high share (64.58%) of positive tweets signify the acceptance of Doordarshan's retelecast decision. The sentiment analysis result also reflects the positive state of mind of the public.

Submitted 10 January, 2021; v1 submitted 22 December, 2020; originally announced January 2021.

arXiv:2011.05186 [pdf, other]

Pristine annotations-based multi-modal trained artificial intelligence solution to triage chest X-ray for COVID-19

Authors: Tao Tan, Bipul Das, Ravi Soni, Mate Fejes, Sohan Ranjan, Daniel Attila Szabo, Vikram Melapudi, K S Shriram, Utkarsh Agrawal, Laszlo Rusko, Zita Herczeg, Barbara Darazs, Pal Tegzes, Lehel Ferenczi, Rakesh Mullick, Gopal Avinash

Abstract: The COVID-19 pandemic continues to spread and impact the well-being of the global population. The front-line modalities including computed tomography (CT) and X-ray play an important role for triaging COVID patients. Considering the limited access to resources (both hardware and trained personnel) and decontamination considerations, CT may not be ideal for triaging suspected subjects. Artificial intelligence (AI) assisted X-ray based applications for triaging and monitoring, which require experienced radiologists to identify COVID patients in a timely manner and to further delineate the disease region boundary, are seen as a promising solution. Our proposed solution differs from existing solutions by industry and academic communities, and demonstrates a functional AI model to triage by inferencing using a single X-ray image, while the deep-learning model is trained using both X-ray and CT data. We report on how such a multi-modal training improves the solution compared to X-ray only training. The multi-modal solution increases the AUC (area under the receiver operating characteristic curve) from 0.89 to 0.93 and also positively impacts the Dice coefficient (0.59 to 0.62) for localizing the pathology. To the best of our knowledge, it is the first X-ray solution to leverage multi-modal information in its development.

Submitted 10 November, 2020; originally announced November 2020.

arXiv:2007.06230 [pdf, other]

doi 10.1088/1361-6587/ac234c

Using LSTM for the Prediction of Disruption in ADITYA Tokamak

Authors: Aman Agarwal, Aditya Mishra, Priyanka Sharma, Swati Jain, Sutapa Ranjan, Ranjana Manchanda

Abstract: Major disruptions in tokamaks pose a serious threat to the vessel and its surrounding pieces of equipment. The ability of the systems to detect any behavior that can lead to disruption can help in alerting the system beforehand and prevent its harmful effects. Many machine learning techniques have already been in use at large tokamaks like JET and ASDEX, but are not suitable for ADITYA, which is comparatively small. Through this work, we discuss a new real-time approach to predict the time of disruption in the ADITYA tokamak and validate the results on an experimental dataset. The system uses selected diagnostics from the tokamak and, after some pre-processing steps, sends them to a time-sequence Long Short-Term Memory (LSTM) network. The model can make its predictions 12 ms in advance at a reduced computational cost, which is quick enough for deployment in real-time applications.

Submitted 13 July, 2020; originally announced July 2020.

Comments: 7 pages, 4 figures

Journal ref: Plasma Physics and Controlled Fusion, Volume 63, Number 11, 2021

arXiv:2006.09739 [pdf, other]

Comparative Sentiment Analysis of App Reviews

Authors: Sakshi Ranjan, Subhankar Mishra

Abstract: Google app market captures the school of thought of users via ratings and text reviews. The critic's viewpoint regarding an app is proportional to their satisfaction level. Consequently, this helps other users to gain insights before downloading or purchasing the apps. The potential information from the reviews can't be extracted manually, due to its exponential growth. Sentiment analysis, by machine learning algorithms employing NLP, is used to explicitly uncover and interpret the emotions. This study aims to perform the sentiment classification of the app reviews and identify the university students' behavior towards the app market. We applied machine learning algorithms using the TF-IDF text representation scheme and the performance was evaluated on the ensemble learning method. Our model was trained on Google reviews and tested on students' reviews. SVM recorded the maximum accuracy (93.37%) and F-score (0.88) on the tri-gram + TF-IDF scheme. Bagging enhanced the performance of LR and NB, with accuracies of 87.80% and 85.5% respectively.

Submitted 17 June, 2020; originally announced June 2020.

Comments: 10 pages, 7 figures, Accepted to the 11th ICCCNT, 2020, IIT KGP

arXiv:1904.07386 [pdf, other]

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Authors: Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda , et al. (21 additional authors not shown)

Abstract: The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progress, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation.

Submitted 15 April, 2019; originally announced April 2019.

Comments: 5 pages

arXiv:1610.07651 [pdf, ps, other]

UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation

Authors: Chunlei Zhang, Fahimeh Bahmaninezhad, Shivesh Ranjan, Chengzhu Yu, Navid Shokouhi, John H. L. Hansen

Abstract: This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) to the 2016 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE). We developed several UBM and DNN i-Vector based speaker recognition systems with different data sets and feature representations. Given that the emphasis of the NIST SRE 2016 is on language mismatch between training and enrollment/test data, so-called domain mismatch, in our system development we focused on: (1) using unlabeled in-domain data for centralizing data to alleviate the domain mismatch problem, (2) finding the best data set for training LDA/PLDA, (3) using a newly proposed dimension reduction technique incorporating unlabeled in-domain data before PLDA training, (4) unsupervised speaker clustering of unlabeled data and using them alone or with previous SREs for PLDA training, (5) score calibration using only unlabeled data and a combination of unlabeled and development (Dev) data as separate experiments.

Submitted 24 October, 2016; originally announced October 2016.

Comments: 5 pages

arXiv:1311.4900 [pdf]

Query Interface Integrator For Domain Specific Hidden Web

Authors: Sudhakar Ranjan, Komal K. Bhatia

Abstract: Web is title admittance today mainly relies on search engines. A large amount of data is hidden in the databases behind the search interfaces, referred to as the Hidden web, which needs to be indexed in order to serve user queries. In this paper, database and data mining techniques are used for query interface integration. The query interface must resemble the look and feel of the local interface as much as possible despite being automatically generated without human support. This technique keeps the related documents in the same domain so that searching of documents becomes more efficient in terms of time complexity.

Submitted 16 November, 2013; originally announced November 2013.

Comments: 8 Pages. International Journal of Computer Engineering and Applications, 2013

Showing 1–23 of 23 results for author: Ranjan, S