Questions
Browse questions with relevant NLP tags
7,946 questions
No answer
0
votes
0
answers
6
views
GGUF model in LM Studio returns broken answer
I am trying to run the GGUF model QuantFactory/T-lite-instruct-0.1-GGUF, specifically its quantized version T-lite-instruct-0.1.Q2_K.gguf, in LM Studio.
Sometimes it works fine. But sometimes it returns "...
1
vote
0
answers
13
views
LDA is predicting the same topics for all data
I'm using the German political speech dataset to train the LDA model. My goal here is to categorize each speech into some topics. But the problem is that the generated topics are too similar, and all ...
0
votes
0
answers
5
views
RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch
I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...
-1
votes
0
answers
13
views
How can I use Word Embeddings for Sentiment Analysis?
I have a project where I've created a classifier but I've learned that word embeddings are a better approach.
From my search, I found that CBOW and Skip-grams are the methods to use with Word2Vec. I ...
1
vote
0
answers
14
views
CPU Memory Leak While Running Model Inference in an Infinite Loop
I'm experiencing a CPU memory leak while running a Python script that processes text using various NLP models in an infinite loop. The script includes language translation, sentiment analysis, and ...
-2
votes
0
answers
28
views
Divide a text based on Intent Analysis with NLP
I have this input from a chat:
"Set an alarm for 7:00 am and play a song by Caparezza on Spotify."
The input may contain multiple actions to do on the back-end.
I want to divide a text based ...
0
votes
0
answers
26
views
CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions
I am trying to convert my text into embeddings using a BERT model. When I apply this to my dataset, it works fine for some of my inputs, then stops and gives that error.
I have set ...
-1
votes
0
answers
24
views
Poor Performance and Signs of Overfitting When Fine-Tuning BART with Adapters on CNN/DailyMail Dataset
I am currently fine-tuning the BART model with adapters for a summarization task using the CNN/DailyMail dataset. I've noticed that the model shows poor performance and signs of overfitting. Below is ...
1
vote
0
answers
15
views
Execute a Lucene query in multiple languages using an AI model
We have a requirement to support multi-language search on the same field. For example, the title is "Badminton" and the subject is "sports". I want to search in Solr like title:Badminton ...
-1
votes
0
answers
11
views
Hybridized collaborative filtering and sentence similarity-based system for doctor recommendation based on user input of symptoms and location
I'm trying to solve a problem of recommending a doctor based on a user's symptoms and location using a hybridized collaborative filtering and sentence similarity-based recommender system that follow ...
1
vote
0
answers
22
views
Multitasking bert for multilabel classification of 5 categories
I built and finetuned 5 BioClinicalBERT-based models (finetuned bert) to predict labels for medical records for the following categories:
specialties = ["aud","den","oph",...
1
vote
0
answers
8
views
Hugging Face pipeline vs manual processing produces different embeddings for Vision Transformers
I am using the transformers library with the ViTForImageClassification model ('google/vit-base-patch16-224') to extract embeddings from images. However, I am observing different embeddings when I use ...
0
votes
0
answers
14
views
RuntimeError: Failed to import transformers.training_args
I am trying to use transformers in a task of building a chatbot
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, trainer
import torch
import time
...
0
votes
0
answers
33
views
How do I run this model in HuggingFace from Nvidia and Mistral?
The model is:
nvidia/Mistral-NeMo-12B-Instruct
And the link in HuggingFace nvidia/Mistral-NeMo-12B-Instruct
Most model pages in HuggingFace have example Python code.
But this model page doesn't have ...
0
votes
0
answers
13
views
BPE tokenizer add_tokens overlap with trained tokens
I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset.
from datasets import load_dataset
from tokenizers import models,...
0
votes
0
answers
35
views
Separating text into smaller chunks based on meaning
I am working on a project involving approximately 8,000 job advertisements in CSV format. I have extracted job titles, IDs, descriptions, and other relevant information and saved it in a PostgreSQL ...
0
votes
0
answers
18
views
Transformer models for contextual word embedding in large datasets
I'm interested in using contextual word embeddings generated by a transformer-based model to explore the similarity of certain words in a large dataset.
Most transformer models only allow up to 512 ...
0
votes
0
answers
66
views
CUDA out of memory when training Llama-2-7b-hf model locally
I want to finetune meta-llama/Llama-2-7b-hf locally on my laptop. I am running out of CUDA memory when instantiating the Trainer class. I have 16Gb system RAM and a GTX 1060 with 6 Gb of GPU memory. I ...
0
votes
0
answers
8
views
Fine-Tuning T5 for Question Answering using HuggingFace Transformers, Pytorch Lightning & Python
When I try to follow a video on fine-tuning T5 for question answering
link: https://www.youtube.com/watch?v=r6XY80Z9eSA&list=RDCMUCoW_WzQNJVAjxo4osNAxd_g&index=1
When I run
53 trainer.fit(model,...
0
votes
0
answers
22
views
Is updating points in Qdrant vectordb without re-embedding the data safe?
I'm building a RAG chatbot using Langchain, using the data I've stored in a Qdrant vector database.
I wanted to change the metadata of a few documents in my Qdrant vector database.
For this, I stored ...
0
votes
0
answers
17
views
Transformer Model Repeating Same Codon During Inference Despite High Training Accuracy
I'm working on a transformer-based model to translate amino acids to codons. During training and validation, my model achieves 95-98% accuracy. However, during inference, I encounter an issue where ...
-1
votes
0
answers
26
views
How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?
This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it.
Today, I faced this question during an interview for an ML Engineer ...
-1
votes
0
answers
38
views
+50
How to use HuggingFace's run_translation.py script to train a translation from scratch?
I tried various HuggingFace scripts to build language models, such as run_mlm.py (link), run_clm.py (link) and run_translation.py (link). For the former 2 scripts, it can train a language model from ...
0
votes
0
answers
18
views
Training LLM uses unexpected amount of GPU memory
I'm training a model with a self-implemented training loop. A 1.5B Qwen2 occupies 40G of GPU memory. When I did the same training using llama factory, it only took about 24G.
I tried to delete some ...
-1
votes
0
answers
19
views
What kind of pre-processing should be applied to a sentence before passing it to a dependency parser?
I'm trying out sentiment analysis where I convert the sentence into a graph, with nodes being word embeddings and edges being the dependencies between pairs of words. I'm still confused how exactly should I ...
0
votes
0
answers
18
views
Finetuning BERT on classification task, tensor device mismatch error
I'm having trouble fine-tuning a BERT model on a classification task, as I'm quite new to this. My data is composed of two columns, "item_title" (my input) and "meta_categ_id" (...
-1
votes
0
answers
48
views
Cleaning a list object containing text and creating new variables using Python
I am trying to create a data frame running the following code -
# pip install edgartools
import pandas as pd
from edgar import *
# Tell the SEC who you are
set_identity("Your Name youremail@...
0
votes
0
answers
37
views
ValueError: expected sequence of length 129 at dim 1 (got 46)
I was trying to fine-tune an image-to-text model using the following code:
import json
import torch
from torch.utils.data import DataLoader
import io
from transformers import VisionEncoderDecoderModel,...
0
votes
0
answers
22
views
Huggingface Trainer CUDA Out Of Memory for 500M Model
I'm training MobiLLama for classification. This model has just 500 million parameters, and when I fine-tune it for downstream tasks, the trainer keeps giving me the CUDA out of memory error.
I faced ...
-1
votes
0
answers
9
views
How do I evaluate three models (LDA, LSA and CTM) on my data based on coherence score?
My name is Phani. I want to choose the best model for my data among Latent Dirichlet Allocation, Latent Semantic Analysis and Correlated Topic Model. I already preprocessed the data but I want ...
0
votes
0
answers
25
views
special_tokens parameter of SentencePieceBPETokenizer.train_from_iterator()
I want to train a custom tokenizer from scratch. Following some online tutorials, they suggest adding a series of special tokens to the train_from_iterator() function:
special_tokens = ["<unk&...
-1
votes
0
answers
18
views
Got `disk_offload` error while trying to load the Llama 3 model from Hugging Face
import torch
from transformers import AutoModelForCausalLM,AutoTokenizer
from llama_index.llms.huggingface import HuggingFaceLLM
from accelerate import disk_offload
tokenizer = AutoTokenizer....
0
votes
0
answers
20
views
Huggingface trainer with 2 optimizers
Is there any way to use the Hugging Face Trainer with 2 optimizers? I need to train 2 parts of my model iteratively, but the Trainer object seems to only take one optimizer.
Thanks!
0
votes
0
answers
14
views
How does the transformer model's attention mechanism deal with differing sequence lengths?
I am going through the architecture of the transformer and its attention mechanism. The thing I don't get about this mechanism is how it handles sequences of different lengths. For example:
How does ...
-1
votes
0
answers
7
views
Does DBCV score for density based clustering algorithms reward more granular clusters?
I am trying to run a hyperparameter search for HDBSCAN based on the DBCV scores. From what I observe, the DBCV score is generally higher for more granular clusters. Is it because DBCV rewards granular ...
0
votes
0
answers
19
views
Knowing the format of dataset a pretrained model was trained on
I am working on a multilingual TTS project and developing a TTS system for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...
2
votes
0
answers
30
views
DSPy can't retrieve passage with text embeddings in ChromaDB
I am working on a RAG application using DSPy and ChromaDB for pdf files.
First I extracted the text from the PDF and added it to ChromaDB as chunks. I also added the embeddings of the chunks. And ...
0
votes
0
answers
32
views
How to use HuggingFace's Transformers.js to distill messy dictionary definitions down to a clean array of 1-3 word definitions?
Background
What pieces would need to be involved using Transformers.js to distill/summarize/clean dictionary definitions which are messy and full of "junk", and return a JSON array of short, ...
0
votes
0
answers
10
views
Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training
I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...
-1
votes
0
answers
38
views
Group similar vectors in high-dimensional vector space into "spaces/partitions" with unique IDs being assigned per similar group
Clarifying Example
As a contrived example, let's say I have vectors in some R^3 vector space:
A: [1, 2, 3]
B: [1.02, 2.5, 3]
C: [1512, 123, 51]
I'd like to partition this space into N "slices/...
1
vote
0
answers
32
views
How to load LoRA weights for image classification model
I trained a model like below.
model_name = 'owkin/phikon'
model = AutoModelForImageClassification.from_pretrained(
model_name,
label2id=label2id,
id2label=id2label,
...
0
votes
0
answers
12
views
IndexError: string index out of range in BERT NER model
I have this code:
!pip install datasets
!pip install transformers
from datasets import load_dataset
raw_dataset= load_dataset("Amir13/wnut2017-persian")
print(raw_dataset)
print(...
0
votes
0
answers
36
views
ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.21.0`
I am trying to fine-tune a pretrained BERT model using the Transformers Trainer, and I use TrainingArguments to tune some hyperparameters:
training_args = TrainingArguments(
output_dir='/...
0
votes
0
answers
16
views
Video Object Segmentation Accuracy
I have a question about measuring the accuracy of video segmentation models. I see that these models generally use 2 measures of success: Jaccard Index and Contour Accuracy. However, I don't really ever see ...
0
votes
0
answers
77
views
Can't install spaCy (and thinc) on Python
When I try to install spaCy to use with chatterBot (somehow it didn't download with ChatterBot), first I got an error because I did not have the Cython module installed into my virtual environment, ...
0
votes
0
answers
20
views
OOM Error using PPO Trainer to LoRA-tune 4-bit Llama-3-8B Model (TRL Hugging Face Library)
Following the standard practice for PPO training (supervised fine-tuning before running the PPO algorithm), I did a QLoRA fine-tuning of the Llama-3-8B instruct model using my own custom data and ...
0
votes
0
answers
42
views
BERT embedding cosine similarities look very random and useless
I thought you can use BERT embeddings to determine semantic similarity. I was trying to group some words in categories using this, but the results were very bad.
E.g. here is a small example with ...
0
votes
0
answers
24
views
How do I install language model for spacy on Kaggle?
Aloha! Everybody knows how to install a model locally:
python -m spacy download ru_core_news_md
But since a Python notebook on Kaggle is isolated from the internet, it does not seem possible to do so.
...
0
votes
0
answers
38
views
module 'keras_nlp' has no attribute 'models'
Has anyone else experienced the same error when running it locally? It runs correctly on Colab.
module 'keras_nlp' has no attribute 'models'
I tried to install the updated version of
pip install -U ...
0
votes
0
answers
25
views
Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS
I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...