
NLP Collective

Questions

Browse questions with relevant NLP tags

7,946 questions

No answer
0 votes
0 answers
6 views

GGUF model in LM Studio returns broken answer

I am trying to run the GGUF LLM QuantFactory/T-lite-instruct-0.1-GGUF, specifically its quantized version T-lite-instruct-0.1.Q2_K.gguf, in LM Studio. Sometimes it works fine. But sometimes it returns "...
pav's user avatar
  • 99
1 vote
0 answers
13 views

LDA is predicting same topics for all data

I'm using the German political speech dataset to train the LDA model. My goal here is to categorize each speech into some topics. But the problem is that the generated topics are too similar, and all ...
Ryu Ahmed's user avatar
0 votes
0 answers
5 views

RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch

I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...
suri's user avatar
  • 21
-1 votes
0 answers
13 views

How can I use Word Embeddings for Sentiment Analysis?

I have a project where I've created a classifier but I've learned that word embeddings are a better approach. From my search, I found that CBOW and Skip-grams are the methods to use with Word2Vec. I ...
LoukasPap's user avatar
  • 1,350
1 vote
0 answers
14 views

CPU Memory Leak While Running Model Inference in an Infinite Loop

I'm experiencing a CPU memory leak while running a Python script that processes text using various NLP models in an infinite loop. The script includes language translation, sentiment analysis, and ...
Amritesh Nandan's user avatar
-2 votes
0 answers
28 views

Divide a text based on Intent Analysis with NLP

I have this input from a chat: "Set an alarm for 7:00 am and play a song by Caparezza on Spotify." The input may contain multiple actions to do on the back-end. I want to divide a text based ...
flowibbia's user avatar
0 votes
0 answers
26 views

CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions

I am trying to convert my text into embeddings using a BERT model. When I apply this to my dataset, it works fine for some of my inputs, then stops and gives that error. I have set ...
Gaurav B.V's user avatar
-1 votes
0 answers
24 views

Poor Performance and Signs of Overfitting When Fine-Tuning BART with Adapters on CNN/DailyMail Dataset

I am currently fine-tuning the BART model with adapters for a summarization task using the CNN/DailyMail dataset. I've noticed that the model shows poor performance and signs of overfitting. Below is ...
Emilia Delizia's user avatar
1 vote
0 answers
15 views

Execute Lucene query in multiple languages utilizing an AI model

We have a requirement to support multi-language search on the same field. For example, title is "Badminton" and subject is "sports". I want to search in Solr like title:Badminton ...
Jigar Gajjar's user avatar
-1 votes
0 answers
11 views

Hybridized collaborative filtering and sentence similarity-based system for doctor recommendation based on user input of symptoms and location

I'm trying to solve a problem of recommending a doctor based on a user's symptoms and location using a hybridized collaborative filtering and sentence similarity-based recommender system that follow ...
Sadura Akinrinwa's user avatar
1 vote
0 answers
22 views

Multitasking bert for multilabel classification of 5 categories

I built and finetuned 5 BioClinicalBERT-based models (finetuned bert) to predict labels for medical records for the following categories: specialties = ["aud","den","oph",...
FATMA HAMZA's user avatar
1 vote
0 answers
8 views

Hugging Face pipeline vs manual processing produces different embeddings for Vision Transformers

I am using the transformers library with the ViTForImageClassification model ('google/vit-base-patch16-224') to extract embeddings from images. However, I am observing different embeddings when I use ...
martinelliadr's user avatar
0 votes
0 answers
14 views

RuntimeError: Failed to import transformers.training_args

I am trying to use transformers in a task of building a chatbot from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, trainer import torch import time ...
Chawki.Hjaiji's user avatar
0 votes
0 answers
33 views

How do I run this model in HuggingFace from Nvidia and Mistral?

The model is: nvidia/Mistral-NeMo-12B-Instruct And the link in HuggingFace nvidia/Mistral-NeMo-12B-Instruct Most model pages in HuggingFace have example Python code. But this model page doesn't have ...
abbas-h's user avatar
  • 420
0 votes
0 answers
13 views

BPE tokenizer add_tokens overlap with trained tokens

I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset. from datasets import load_dataset from tokenizers import models,...
meliksahturker's user avatar
0 votes
0 answers
35 views

Separating text into smaller chunks based on meaning

I am working on a project involving approximately 8,000 job advertisements in CSV format. I have extracted job titles, IDs, descriptions, and other relevant information and saved it in a PostgreSQL ...
Ameya's user avatar
  • 1
0 votes
0 answers
18 views

Transformer models for contextual word embedding in large datasets

I'm interested in using contextual word embeddings generated by a transformer-based model to explore the similarity of certain words in a large dataset. Most transformer models only allow up to 512 ...
C_B's user avatar
  • 13
0 votes
0 answers
66 views

CUDA out of memory when training Llama-2-7b-hf model locally

I want to fine-tune meta-llama/Llama-2-7b-hf locally on my laptop. I am running out of CUDA memory when instantiating the Trainer class. I have 16 GB of system RAM and a GTX 1060 with 6 GB of GPU memory. I ...
Vinmean's user avatar
  • 113
0 votes
0 answers
8 views

Fine-Tuning T5 for Question Answering using HuggingFace Transformers, Pytorch Lightning & Python

I am trying to follow a video on fine-tuning T5 for question answering (link: https://www.youtube.com/watch?v=r6XY80Z9eSA&list=RDCMUCoW_WzQNJVAjxo4osNAxd_g&index=1). When I run 53 trainer.fit(model,...
Nhất Duy Nguyễn Trần's user avatar
0 votes
0 answers
22 views

Is updating points in Qdrant vectordb without re-embedding the data safe?

I'm building a RAG chatbot using Langchain, using the data I've stored in a Qdrant vector database. I wanted to change the metadata of a few documents in my qdrant vector database. For this, I stored ...
Akshitha Rao's user avatar
0 votes
0 answers
17 views

Transformer Model Repeating Same Codon During Inference Despite High Training Accuracy

I'm working on a transformer-based model to translate amino acids to codons. During training and validation, my model achieves 95-98% accuracy. However, during inference, I encounter an issue where ...
Farshid B's user avatar
-1 votes
0 answers
26 views

How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?

This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it. Today, I faced this question during an interview for an ML Engineer ...
maplemaple's user avatar
  • 1,435
-1 votes
0 answers
38 views
+50

How to use HuggingFace's run_translation.py script to train a translation from scratch?

I tried various HuggingFace scripts to build language models, such as run_mlm.py (link), run_clm.py (link) and run_translation.py (link). For the former 2 scripts, it can train a language model from ...
Raptor's user avatar
  • 53.6k
0 votes
0 answers
18 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with a self-implemented training loop. A 1.5B Qwen2 occupies 40 GB of GPU memory. When I did the same training using llama factory, it only took about 24 GB. I tried to delete some ...
StaEx_G's user avatar
  • 13
-1 votes
0 answers
19 views

What kind of pre-processing should be applied to a sentence before passing it to a dependency parser?

I'm trying out sentiment analysis where I convert the sentence into a Graph with nodes being word embedding and edges being dependency between the two words. I'm still confused how exactly should I ...
Harsh Chauhan's user avatar
0 votes
0 answers
18 views

Finetuning BERT on classification task, tensor device mismatch error

I'm having trouble fine-tuning a BERT model on a classification task, as I'm quite new to this. My data is composed of two columns, "item_title" (my input) and "meta_categ_id" (...
Jerry Zhu's user avatar
-1 votes
0 answers
48 views

cleaning list object containing text and creating new variables using Python

I am trying to create a data frame running the following code - # pip install edgartools import pandas as pd from edgar import * # Tell the SEC who you are set_identity("Your Name youremail@...
Sharif's user avatar
  • 177
0 votes
0 answers
37 views

ValueError: expected sequence of length 129 at dim 1 (got 46)

I was trying to fine-tune an image-to-text model using the following code: import json import torch from torch.utils.data import DataLoader import io from transformers import VisionEncoderDecoderModel,...
demostene's user avatar
0 votes
0 answers
22 views

Huggingface Trainer CUDA Out Of Memory for 500M Model

I'm training MobiLLama for classification. This model has just 500 million parameters, and when I fine-tune it for downstream tasks, the Trainer keeps giving me the CUDA out of memory error. I faced ...
Hoangdz's user avatar
  • 187
-1 votes
0 answers
9 views

I want to evaluate three models (LDA, LSA, and CTM) on my data based on coherence score

My name is Phani. I want to choose the best model among Latent Dirichlet Allocation, Latent Semantic Analysis, and Correlated Topic Model for my data. I already preprocessed the data but I want ...
Phaneswar Manchina's user avatar
0 votes
0 answers
25 views

special_tokens parameter of SentencePieceBPETokenizer.train_from_iterator()

I want to train a custom tokenizer from scratch. Following some online tutorials, they suggest adding a series of special tokens to the train_from_iterator() function: special_tokens = ["<unk&...
Raptor's user avatar
  • 53.6k
-1 votes
0 answers
18 views

Got `disk_offload` error while trying to get the Llama 3 model from Hugging Face

import torch from transformers import AutoModelForCausalLM,AutoTokenizer from llama_index.llms.huggingface import HuggingFaceLLM from accelerate import disk_offload tokenizer = AutoTokenizer....
Vins Shaji's user avatar
0 votes
0 answers
20 views

Huggingface trainer with 2 optimizers

Is there any way to use the Hugging Face Trainer with 2 optimizers? I need to train 2 parts of my model iteratively, but the Trainer object seems to only take one optimizer. Thanks!
Sandy's user avatar
  • 143
0 votes
0 answers
14 views

How does the transformer model's attention mechanism deal with differing sequence lengths?

I am going through the architecture of the transformer and its attention mechanism. The thing I don't get about this mechanism is how it handles sequences of different lengths. For example: How does ...
Syed Mustaqhim's user avatar
-1 votes
0 answers
7 views

Does DBCV score for density based clustering algorithms reward more granular clusters?

I am trying to run a hyperparameter search for HDBSCAN based on the DBCV scores. From what I observe, the DBCV score is generally higher for more granular clusters. Is it because DBCV rewards granular ...
Tanay's user avatar
  • 179
0 votes
0 answers
19 views

Knowing the format of dataset a pretrained model was trained on

I am working on a multilingual TTS project, developing a TTS system for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...
Injila's user avatar
  • 1
2 votes
0 answers
30 views

DSPy can't retrieve passage with text embeddings in ChromaDB

I am working on a RAG application using DSPy and ChromaDB for PDF files. First I extracted the text from the PDF and added it to ChromaDB as chunks, along with the embeddings of the chunks. And ...
Anandu Aji's user avatar
0 votes
0 answers
32 views

How to use HuggingFace's Transformers.js to distill messy dictionary definitions down to a clean array of 1-3 word definitions?

Background What pieces would need to be involved using Transformers.js to distill/summarize/clean dictionary definitions which are messy and full of "junk", and return a JSON array of short, ...
Lance's user avatar
  • 77.9k
0 votes
0 answers
10 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...
dw26's user avatar
  • 1
-1 votes
0 answers
38 views

Group similar vectors in high-dimensional vector space into "spaces/partitions" with unique IDs being assigned per similar group

Clarifying Example As a contrived example, let's say I have vectors in some R^3 vector space: A: [1, 2, 3] B: [1.02, 2.5, 3] C: [1512, 123, 51] I'd like to partition this space into N "slices/...
Matthew Trent's user avatar
1 vote
0 answers
32 views

How to load LoRA weights for image classification model

I trained a model like below. model_name = 'owkin/phikon' model = AutoModelForImageClassification.from_pretrained( model_name, label2id=label2id, id2label=id2label, ...
Wtow's user avatar
  • 108
0 votes
0 answers
12 views

IndexError: string index out of range in bert NER model

I have this code: !pip install datasets !pip install transformers from datasets import load_dataset raw_dataset= load_dataset("Amir13/wnut2017-persian") print(raw_dataset) print(...
abbas mahmudi's user avatar
0 votes
0 answers
36 views

ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.21.0`

I am trying to fine-tune a pretrained BERT model using the Transformers Trainer, and I use TrainingArguments to tune some hyperparameters: training_args = TrainingArguments( output_dir='/...
Chawki.Hjaiji's user avatar
0 votes
0 answers
16 views

Video Object Segmentation Accuracy

I have a question about measuring the accuracy of video segmentation models. I see that these models generally use 2 measures of success: Jaccard Index and Contour Accuracy. However, I don't really ever see ...
Joe Tsai's user avatar
0 votes
0 answers
77 views

Can't install spaCy (and thinc) on Python

When I try to install spaCy to use with chatterBot (somehow it didn't download with ChatterBot), first I got an error because I did not have the Cython module installed into my virtual environment, ...
Tio Zuca's user avatar
  • 109
0 votes
0 answers
20 views

OOM Error using PPO Trainer to LoRA-tune 4-bit Llama-3-8B Model (TRL Hugging Face Library)

Following standard practice for PPO training (supervised fine-tuning before running the PPO algorithm), I did a QLoRA fine-tuning of the Llama-3-8B instruct model using my own custom data and ...
Aryaman Jaggi's user avatar
0 votes
0 answers
42 views

BERT embedding cosine similarities look very random and useless

I thought you can use BERT embeddings to determine semantic similarity. I was trying to group some words in categories using this, but the results were very bad. E.g. here is a small example with ...
mihovg93's user avatar
0 votes
0 answers
24 views

How do I install language model for spacy on Kaggle?

Aloha! Everybody knows how to install a model at home: python -m spacy download ru_core_news_md But since a Python notebook on Kaggle is isolated from the internet, it does not seem possible to do so. ...
Dimas del Pablo's user avatar
0 votes
0 answers
38 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models'. I tried to install the updated version with pip install -U ...
Mody's user avatar
  • 1
0 votes
0 answers
25 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...
Pratheesh Kumar's user avatar

