NLP Collective

0 votes

0 answers

6 views

GGUF model in LM Studio returns broken answer

I am trying to run the GGUF model QuantFactory/T-lite-instruct-0.1-GGUF, specifically its quantized version T-lite-instruct-0.1.Q2_K.gguf, in LM Studio. Sometimes it works fine, but sometimes it returns "...

pav

99

asked 13 hours ago

1 vote

0 answers

13 views

LDA is predicting same topics for all data

I'm using the German political speech dataset to train the LDA model. My goal here is to categorize each speech into some topics. But the problem is that the generated topics are too similar, and all ...

Ryu Ahmed

11

asked 13 hours ago

0 votes

0 answers

5 views

RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch

I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...

suri

21

asked 14 hours ago

-1 votes

0 answers

13 views

How can I use Word Embeddings for Sentiment Analysis?

I have a project where I've created a classifier but I've learned that word embeddings are a better approach. From my search, I found that CBOW and Skip-grams are the methods to use with Word2Vec. I ...

LoukasPap

1,350

asked 15 hours ago

1 vote

0 answers

14 views

CPU Memory Leak While Running Model Inference in an Infinite Loop

I'm experiencing a CPU memory leak while running a Python script that processes text using various NLP models in an infinite loop. The script includes language translation, sentiment analysis, and ...

Amritesh Nandan

41

asked 18 hours ago

-2 votes

0 answers

28 views

Divide a text based on Intent Analysis with NLP

I have this input from a chat: "Set an alarm for 7:00 am and play a song by Caparezza on Spotify." The input may contain multiple actions to do on the back-end. I want to divide a text based ...

flowibbia

17

asked 23 hours ago

0 votes

0 answers

26 views

CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions

I am trying to convert my text into embeddings using a BERT model. When I apply this to my dataset, it works fine for some of my inputs, then stops and gives that error. I have set ...

Gaurav B.V

1

asked yesterday

-1 votes

0 answers

24 views

Poor Performance and Signs of Overfitting When Fine-Tuning BART with Adapters on CNN/DailyMail Dataset

I am currently fine-tuning the BART model with adapters for a summarization task using the CNN/DailyMail dataset. I've noticed that the model shows poor performance and signs of overfitting. Below is ...

Emilia Delizia

349

asked yesterday

1 vote

0 answers

15 views

Execute a Lucene query in multiple languages using an AI model

We have a requirement to support multi-language search on the same field. For example, the title is "Badminton" and the subject is "sports". I want to search in Solr like title:Badminton ...

Jigar Gajjar

333

asked 2 days ago

-1 votes

0 answers

11 views

Hybridized collaborative filtering and sentence similarity-based system for doctor recommendation based on user input of symptoms and location

I'm trying to solve a problem of recommending a doctor based on a user's symptoms and location using a hybridized collaborative filtering and sentence similarity-based recommender system that follow ...

Sadura Akinrinwa

1

asked 2 days ago

1 vote

0 answers

22 views

Multitasking bert for multilabel classification of 5 categories

I built and finetuned 5 BioClinicalBERT-based models (finetuned bert) to predict labels for medical records for the following categories: specialties = ["aud","den","oph",...

FATMA HAMZA

9

asked 2 days ago

1 vote

0 answers

8 views

Hugging Face pipeline vs manual processing produces different embeddings for Vision Transformers

I am using the transformers library with the ViTForImageClassification model ('google/vit-base-patch16-224') to extract embeddings from images. However, I am observing different embeddings when I use ...

martinelliadr

11

asked 2 days ago

0 votes

0 answers

14 views

RuntimeError: Failed to import transformers.training_args

I am trying to use transformers in a task of building a chatbot from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, trainer import torch import time ...

Chawki.Hjaiji

1

asked 2 days ago

0 votes

0 answers

33 views

How do I run this model in HuggingFace from Nvidia and Mistral?

The model is nvidia/Mistral-NeMo-12B-Instruct, and the link on HuggingFace is nvidia/Mistral-NeMo-12B-Instruct. Most model pages on HuggingFace have example Python code, but this model page doesn't have ...

abbas-h

420

asked Jul 23 at 7:21

0 votes

0 answers

13 views

BPE tokenizer add_tokens overlap with trained tokens

I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset. from datasets import load_dataset from tokenizers import models,...

meliksahturker

1,404

asked Jul 22 at 19:11

0 votes

0 answers

35 views

Separating text into smaller chunks based on meaning

I am working on a project involving approximately 8,000 job advertisements in CSV format. I have extracted job titles, IDs, descriptions, and other relevant information and saved it in a PostgreSQL ...

Ameya

1

asked Jul 22 at 15:21

0 votes

0 answers

18 views

Transformer models for contextual word embedding in large datasets

I'm interested in using contextual word embeddings generated by a transformer-based model to explore the similarity of certain words in a large dataset. Most transformer models only allow up to 512 ...

C_B

13

asked Jul 22 at 13:48

0 votes

0 answers

66 views

CUDA out of memory when training Llama-2-7b-hf model locally

I want to fine-tune meta-llama/Llama-2-7b-hf locally on my laptop. I am running out of CUDA memory when instantiating the Trainer class. I have 16 GB of system RAM and a GTX 1060 with 6 GB of GPU memory. I ...

Vinmean

113

asked Jul 22 at 2:27

0 votes

0 answers

8 views

Fine-Tuning T5 for Question Answering using HuggingFace Transformers, Pytorch Lightning & Python

I am trying to follow a video on fine-tuning T5 for question answering (link: https://www.youtube.com/watch?v=r6XY80Z9eSA&list=RDCMUCoW_WzQNJVAjxo4osNAxd_g&index=1). When I run 53 trainer.fit(model,...

Nhất Duy Nguyễn Trần

1

asked Jul 21 at 22:03

0 votes

0 answers

22 views

Is updating points in Qdrant vectordb without re-embedding the data safe?

I'm building a RAG chatbot using Langchain, using the data I've stored in a Qdrant vector database. I wanted to change the metadata of a few documents in my Qdrant vector database. For this, I stored ...

Akshitha Rao

11

asked Jul 21 at 21:14

0 votes

0 answers

17 views

Transformer Model Repeating Same Codon During Inference Despite High Training Accuracy

I'm working on a transformer-based model to translate amino acids to codons. During training and validation, my model achieves 95-98% accuracy. However, during inference, I encounter an issue where ...

Farshid B

1

asked Jul 21 at 11:06

-1 votes

0 answers

26 views

How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?

This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it. Today, I faced this question during an interview for an ML Engineer ...

maplemaple

1,435

asked Jul 20 at 7:32

-1 votes

0 answers

38 views

+50

How to use HuggingFace's run_translation.py script to train a translation from scratch?

I tried various HuggingFace scripts to build language models, such as run_mlm.py (link), run_clm.py (link) and run_translation.py (link). The first two scripts can train a language model from ...

Raptor

53.6k

asked Jul 19 at 14:53

0 votes

0 answers

18 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with a self-implemented training loop. A 1.5B Qwen2 occupies 40 GB of GPU memory. When I did the same training using LLaMA Factory, it only took about 24 GB. I tried to delete some ...

StaEx_G

13

asked Jul 19 at 10:02

-1 votes

0 answers

19 views

What kind of pre-processing should be applied to a sentence before passing it to a dependency parser?

I'm trying out sentiment analysis where I convert the sentence into a graph, with nodes being word embeddings and edges being dependencies between pairs of words. I'm still confused about how exactly I should ...

Harsh Chauhan

1

asked Jul 19 at 6:57

0 votes

0 answers

18 views

Finetuning BERT on classification task, tensor device mismatch error

I'm having trouble fine-tuning a BERT model on a classification task, as I'm quite new to this. My data is composed of two columns, "item_title" (my input) and "meta_categ_id" (...

Jerry Zhu

1

asked Jul 18 at 20:01

-1 votes

0 answers

48 views

cleaning list object containing text and creating new variables using Python

I am trying to create a data frame running the following code - # pip install edgartools import pandas as pd from edgar import * # Tell the SEC who you are set_identity("Your Name youremail@...

Sharif

177

asked Jul 18 at 19:23

0 votes

0 answers

37 views

ValueError: expected sequence of length 129 at dim 1 (got 46)

I was trying to fine-tune an image-to-text model using the following code: import json import torch from torch.utils.data import DataLoader import io from transformers import VisionEncoderDecoderModel,...

demostene

1

asked Jul 18 at 18:40

0 votes

0 answers

22 views

Huggingface Trainer CUDA Out Of Memory for 500M Model

I'm training MobiLLama for classification. This model has just 500 million parameters, and when I fine-tune it for downstream tasks, the trainer keeps giving me the CUDA out of memory error. I faced ...

Hoangdz

187

asked Jul 18 at 16:28

-1 votes

0 answers

9 views

How do I evaluate three models (LDA, LSA and CTM) on my data based on coherence score?

My name is Phani. I want to choose the best model for my data among Latent Dirichlet Allocation, Latent Semantic Analysis and Correlated Topic Model. I have already preprocessed the data, but I want ...

Phaneswar Manchina

1

asked Jul 18 at 14:19

0 votes

0 answers

25 views

special_tokens parameter of SentencePieceBPETokenizer.train_from_iterator()

I want to train a custom tokenizer from scratch. Following some online tutorials, they suggest adding a series of special tokens to the train_from_iterator() function: special_tokens = ["<unk...

Raptor

53.6k

asked Jul 18 at 9:01

-1 votes

0 answers

18 views

Got `disk_offload` error while trying to get the Llama 3 model from Hugging Face

import torch from transformers import AutoModelForCausalLM,AutoTokenizer from llama_index.llms.huggingface import HuggingFaceLLM from accelerate import disk_offload tokenizer = AutoTokenizer....

Vins Shaji

1

asked Jul 18 at 6:57

0 votes

0 answers

20 views

Hugging Face Trainer with 2 optimizers

Is there any way to use the Hugging Face Trainer with 2 optimizers? I need to train 2 parts of my model iteratively, but the Trainer object seems to take only one optimizer. Thanks!

Sandy

143

asked Jul 17 at 22:34

0 votes

0 answers

14 views

How does the transformer model's attention mechanism deal with differing sequence lengths?

I am going through the architecture of the transformer and its attention mechanism. The thing I don't get about this mechanism is how it handles sequences of different lengths. For example: How does ...

Syed Mustaqhim

466

asked Jul 17 at 17:29

-1 votes

0 answers

7 views

Does DBCV score for density based clustering algorithms reward more granular clusters?

I am trying to run a hyperparameter search for HDBSCAN based on the DBCV scores. From what I observe, the DBCV score is generally higher for more granular clusters. Is it because DBCV rewards granular ...

Tanay

179

asked Jul 17 at 16:40

0 votes

0 answers

19 views

Knowing the format of dataset a pretrained model was trained on

I am working on a multilingual TTS project and developing a TTS system for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...

Injila

1

asked Jul 17 at 11:34

2 votes

0 answers

30 views

DSPy can't retrieve passage with text embeddings in ChromaDB

I am working on a RAG application using DSPy and ChromaDB for PDF files. First I extracted the text from the PDF and added it to ChromaDB as chunks, along with the embeddings of the chunks. And ...

Anandu Aji

41

asked Jul 17 at 8:03

0 votes

0 answers

32 views

How to use HuggingFace's Transformers.js to distill messy dictionary definitions down to a clean array of 1-3 word definitions?

Background: What pieces would need to be involved using Transformers.js to distill/summarize/clean dictionary definitions which are messy and full of "junk", and return a JSON array of short, ...

Lance

77.9k

asked Jul 17 at 1:56

0 votes

0 answers

10 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...

dw26

1

asked Jul 17 at 1:45

-1 votes

0 answers

38 views

Group similar vectors in high-dimensional vector space into "spaces/partitions" with unique IDs being assigned per similar group

Clarifying Example: As a contrived example, let's say I have vectors in some R^3 vector space: A: [1, 2, 3], B: [1.02, 2.5, 3], C: [1512, 123, 51]. I'd like to partition this space into N "slices/...

Matthew Trent

3,134

asked Jul 16 at 21:22

1 vote

0 answers

32 views

How to load LoRA weights for image classification model

I trained a model like below. model_name = 'owkin/phikon' model = AutoModelForImageClassification.from_pretrained( model_name, label2id=label2id, id2label=id2label, ...

Wtow

108

asked Jul 16 at 15:53

0 votes

0 answers

12 views

IndexError: string index out of range in bert NER model

I have this code: !pip install datasets !pip install transformers from datasets import load_dataset raw_dataset= load_dataset("Amir13/wnut2017-persian") print(raw_dataset) print(...

abbas mahmudi

175

asked Jul 16 at 15:06

0 votes

0 answers

36 views

ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.21.0`

I am trying to fine-tune a pretrained BERT model using the Transformers Trainer, and I use TrainingArguments to set some hyperparameters: training_args = TrainingArguments( output_dir='/...

Chawki.Hjaiji

1

asked Jul 16 at 10:49

0 votes

0 answers

16 views

Video Object Segmentation Accuracy

I have a question about measuring the accuracy of video segmentation models. I see that these models generally use 2 measures of success: Jaccard Index and Contour Accuracy. However, I don't really ever see ...

Joe Tsai

1

asked Jul 16 at 4:51

0 votes

0 answers

77 views

Can't install spaCy (and thinc) on Python

When I try to install spaCy to use with ChatterBot (somehow it didn't download with ChatterBot), first I got an error because I did not have the Cython module installed in my virtual environment, ...

Tio Zuca

109

asked Jul 15 at 23:45

0 votes

0 answers

20 views

OOM Error using PPO Trainer to LoRa-tune 4-bit Llama-3-8B Model (TRL Hugging Face Library)

As per the standard for PPO training (which is to do supervised fine-tuning before running the PPO algorithm), I did a QLoRA fine-tuning of the Llama-3-8B instruct model using my own custom data and ...

Aryaman Jaggi

1

asked Jul 15 at 2:45

0 votes

0 answers

42 views

BERT embedding cosine similarities look very random and useless

I thought you could use BERT embeddings to determine semantic similarity. I was trying to group some words into categories using this, but the results were very bad. E.g. here is a small example with ...

mihovg93

93

asked Jul 13 at 20:58

0 votes

0 answers

24 views

How do I install language model for spacy on Kaggle?

Aloha! Everybody knows how to install a model at home: python -m spacy download ru_core_news_md. But since a Python notebook on Kaggle is isolated from the internet, it does not seem possible to do so. ...

Dimas del Pablo

19

asked Jul 13 at 14:00

0 votes

0 answers

38 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models'. I tried to install the updated version with pip install -U ...

Mody

1

asked Jul 13 at 10:27

0 votes

0 answers

25 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...

Pratheesh Kumar

1

asked Jul 13 at 6:23

Questions

7,946 questions