Questions tagged [huggingface-transformers]
Transformers is a Python library that implements various transformer NLP models in PyTorch and TensorFlow.
3,449
questions
0
votes
0
answers
5
views
BertTokenizer vocab_size remains unchanged after adding tokens
I am using the HuggingFace BertTokenizer and adding some tokens to it. Here is the code:
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('fnlp/bart-base-chinese')
print(...
0
votes
1
answer
10
views
SageMaker training: what's the correct regex pattern to capture metrics?
This is the pattern I've seen suggested in a few different posts on SO:
metric_definitions = [
{'Name': 'loss', 'Regex': "'loss': ([0-9]+(.|e\-)[0-9]+),?"},
{'Name': 'learning_rate', ...
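One way to sanity-check a pattern like this before launching a training job is to run it locally against a sample log line. The sketch below uses Python's `re` module, which may not behave identically to the regex engine SageMaker applies, so treat it as a first check only. The sample line, the `NUMBER` pattern, and the `extract_metrics` helper are assumptions made for illustration and are not taken from the question. Note that an unescaped `.` matches any character, so the sketch escapes it and makes the fractional and exponent parts optional.

```python
import re

# Hypothetical log line in the dict style a training loop might print (assumption).
sample = "{'loss': 0.6931, 'learning_rate': 5e-05, 'epoch': 1.0}"

# Candidate number pattern: digits, optional fraction, optional exponent.
NUMBER = r"([0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?)"

metric_definitions = [
    {"Name": "loss", "Regex": r"'loss': " + NUMBER},
    {"Name": "learning_rate", "Regex": r"'learning_rate': " + NUMBER},
]


def extract_metrics(line, definitions):
    """Apply each definition's regex to one line and collect the captured values."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


metrics = extract_metrics(sample, metric_definitions)
print(metrics)
```

If a metric is missing from the printed dict, the pattern did not match the line, which is cheaper to discover here than after a full training run.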
0
votes
0
answers
5
views
RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch
I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...
1
vote
0
answers
18
views
CPU Memory Leak While Running Model Inference in an Infinite Loop
I'm experiencing a CPU memory leak while running a Python script that processes text using various NLP models in an infinite loop. The script includes language translation, sentiment analysis, and ...
1
vote
0
answers
8
views
Hugging Face pipeline vs manual processing produces different embeddings for Vision Transformers
I am using the transformers library with the ViTForImageClassification model ('google/vit-base-patch16-224') to extract embeddings from images. However, I am observing different embeddings when I use ...
0
votes
0
answers
15
views
RuntimeError: Failed to import transformers.training_args
I am trying to use transformers in a task of building a chatbot
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, trainer
import torch
import time
...
0
votes
0
answers
34
views
How do I run this model in HuggingFace from Nvidia and Mistral?
The model is:
nvidia/Mistral-NeMo-12B-Instruct
And the link on HuggingFace: nvidia/Mistral-NeMo-12B-Instruct
Most model pages in HuggingFace have example Python code.
But this model page doesn't have ...
0
votes
1
answer
18
views
HF transformers: ValueError: Unable to create tensor
I was following this guide for text classification
and I got an error:
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=...
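The error message points at padding and truncation because a tensor needs every row in a batch to have the same length. The library-free sketch below illustrates that idea on plain Python lists. It is not the tokenizer's actual implementation: `pad_and_truncate`, `pad_id`, and the token ids are hypothetical, and real tokenizers also handle special tokens when truncating.

```python
def pad_and_truncate(batch, max_length, pad_id=0):
    """Return (input_ids, attention_mask) with every row the same length."""
    # Truncate first so that no row exceeds max_length.
    # (Simplification: this may cut off a trailing special token.)
    truncated = [ids[:max_length] for ids in batch]
    # Then pad every row up to the longest remaining row.
    width = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in truncated]
    # The mask marks real tokens with 1 and padding with 0.
    attention_mask = [
        [1] * len(ids) + [0] * (width - len(ids)) for ids in truncated
    ]
    return input_ids, attention_mask


# Hypothetical token ids with ragged lengths 3, 5 and 2.
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102], [101, 102]]
input_ids, attention_mask = pad_and_truncate(batch, max_length=4)

for row, mask in zip(input_ids, attention_mask):
    print(row, mask)
```

After this step every row has length 4, so the batch is rectangular and can be converted to a tensor.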
0
votes
0
answers
13
views
BPE tokenizer add_tokens overlap with trained tokens
I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset.
from datasets import load_dataset
from tokenizers import models,...
0
votes
0
answers
66
views
CUDA out of memory when training Llama-2-7b-hf model locally
I want to fine-tune meta-llama/Llama-2-7b-hf locally on my laptop. I am running out of CUDA memory when instantiating the Trainer class. I have 16 GB of system RAM and a GTX 1060 with 6 GB of GPU memory. I ...
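A rough back-of-the-envelope estimate shows why this setup runs out of memory. The sketch below assumes about 7 billion parameters for a "7b" model and uses the common rule of thumb of roughly 16 bytes per parameter for full fine-tuning with Adam in mixed precision (fp16 weights and gradients, plus fp32 master weights and two optimizer states). Activations and framework overhead are ignored, so real usage would be higher. The variable names are illustrative only.

```python
GIB = 2**30


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


n_params = 7e9  # assumption: "7b" taken as roughly 7 billion parameters

weights_fp16 = n_params * 2    # 2 bytes per parameter in half precision
full_finetune = n_params * 16  # rule-of-thumb estimate: weights + grads + Adam states
weights_4bit = n_params * 0.5  # 4-bit quantized weights, for comparison
gpu_memory = 6 * GIB           # the 6 GB card mentioned in the question

for label, n_bytes in [
    ("fp16 weights only", weights_fp16),
    ("full fine-tune (estimate)", full_finetune),
    ("4-bit weights only", weights_4bit),
    ("available GPU memory", gpu_memory),
]:
    print(f"{label}: {gib(n_bytes):.1f} GiB")
```

Under these assumptions the half-precision weights alone are more than twice the size of the GPU's memory, before any gradients or optimizer state are allocated.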
-1
votes
1
answer
34
views
IndexError: list index out of range when trying to predict from the fine-tuned model using Hugging Face
I am trying to learn how to fine-tune a pretrained model and use it. This is my code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
from ...
-1
votes
0
answers
40
views
+50
How to use HuggingFace's run_translation.py script to train a translation from scratch?
I tried various HuggingFace scripts to build language models, such as run_mlm.py (link), run_clm.py (link) and run_translation.py (link). The former two scripts can train a language model from ...
0
votes
0
answers
18
views
Training LLM uses unexpected amount of GPU memory
I'm training a model with a self-implemented training loop. A 1.5B Qwen2 model occupies 40 GB of GPU memory. When I ran the same training with LLaMA-Factory, it took only about 24 GB.
I tried to delete some ...
0
votes
0
answers
18
views
Finetuning BERT on classification task, tensor device mismatch error
I'm having trouble fine-tuning a BERT model on a classification task, as I'm quite new to this. My data is composed of two columns, "item_title" (my input) and "meta_categ_id" (...
0
votes
0
answers
37
views
ValueError: expected sequence of length 129 at dim 1 (got 46)
I was trying to fine-tune an image-to-text model using the following code:
import json
import torch
from torch.utils.data import DataLoader
import io
from transformers import VisionEncoderDecoderModel,...