Questions tagged [huggingface-transformers]

Transformers is a Python library that implements various transformer NLP models in PyTorch and TensorFlow.

0 votes
0 answers
5 views

BertTokenizer vocab_size remains unchanged after adding tokens

I am using the Hugging Face BertTokenizer and adding some tokens to it. Here is the code: from transformers import BertTokenizer tokenizer = BertTokenizer.from_pretrained('fnlp/bart-base-chinese') print(...
Raptor
  • 53.6k
0 votes
1 answer
10 views

SageMaker training: what's the correct regex pattern to capture metrics?

This is the pattern I've seen suggested in a few different posts on SO: metric_definitions = [ {'Name': 'loss', 'Regex': "'loss': ([0-9]+(.|e\-)[0-9]+),?"}, {'Name': 'learning_rate', ...
Yoan B. M.Sc
  • 1,503
0 votes
0 answers
5 views

RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch

I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...
suri
  • 21
1 vote
0 answers
18 views

CPU Memory Leak While Running Model Inference in an Infinite Loop

I'm experiencing a CPU memory leak while running a Python script that processes text using various NLP models in an infinite loop. The script includes language translation, sentiment analysis, and ...
Amritesh Nandan
1 vote
0 answers
8 views

Hugging Face pipeline vs manual processing produces different embeddings for Vision Transformers

I am using the transformers library with the ViTForImageClassification model ('google/vit-base-patch16-224') to extract embeddings from images. However, I am observing different embeddings when I use ...
martinelliadr
0 votes
0 answers
15 views

RuntimeError: Failed to import transformers.training_args

I am trying to use transformers to build a chatbot: from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, trainer import torch import time ...
Chawki.Hjaiji
0 votes
0 answers
34 views

How do I run this NVIDIA and Mistral model from Hugging Face?

The model is nvidia/Mistral-NeMo-12B-Instruct, and its Hugging Face page is nvidia/Mistral-NeMo-12B-Instruct. Most model pages on Hugging Face have example Python code, but this model page doesn't have ...
abbas-h
  • 420
0 votes
1 answer
18 views

HF transformers: ValueError: Unable to create tensor

I was following this guide for text classification and I got an error: ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=...
Ryan
  • 402
0 votes
0 answers
13 views

BPE tokenizer add_tokens overlap with trained tokens

I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset. from datasets import load_dataset from tokenizers import models,...
meliksahturker
0 votes
0 answers
66 views

CUDA out of memory when training Llama-2-7b-hf model locally

I want to fine-tune meta-llama/Llama-2-7b-hf locally on my laptop. I am running out of CUDA memory when instantiating the Trainer class. I have 16 GB of system RAM and a GTX 1060 with 6 GB of GPU memory. I ...
Vinmean
  • 113
-1 votes
1 answer
34 views

IndexError: list index out of range when trying to predict from the fine-tuned model using Hugging Face

I am trying to learn how to fine-tune a pretrained model and use it. This is my code: from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer from ...
Lijin Durairaj
-1 votes
0 answers
40 views
+50

How to use Hugging Face's run_translation.py script to train a translation model from scratch?

I tried various Hugging Face scripts to build language models, such as run_mlm.py (link), run_clm.py (link) and run_translation.py (link). The first two scripts can train a language model from ...
Raptor
  • 53.6k
0 votes
0 answers
18 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with a self-implemented training loop. A 1.5B Qwen2 model occupies 40 GB of GPU memory. When I ran the same training using llama factory, it took only about 24 GB. I tried to delete some ...
StaEx_G
  • 13
0 votes
0 answers
18 views

Fine-tuning BERT on a classification task: tensor device mismatch error

I'm having trouble fine-tuning a BERT model on a classification task, as I'm quite new to this. My data is composed of two columns, "item_title" (my input) and "meta_categ_id" (...
Jerry Zhu
0 votes
0 answers
37 views

ValueError: expected sequence of length 129 at dim 1 (got 46)

I was trying to fine-tune an image-to-text model using the following code: import json import torch from torch.utils.data import DataLoader import io from transformers import VisionEncoderDecoderModel,...
demostene