
All Questions

0 votes
0 answers
9 views

CUDA Out of Memory Error Despite Having Multiple GPUs

I'm encountering a CUDA out-of-memory error while trying to run a PyTorch model, even though my system has multiple NVIDIA GPUs. # Load the tokenizer and model tokenizer = AutoTokenizer....
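Worth noting for questions like this: by default PyTorch allocates everything on `cuda:0`, so extra GPUs sit idle unless tensors are placed on them explicitly (or the model is sharded, e.g. with `accelerate`'s `device_map="auto"`). A minimal sketch with a tiny stand-in model:

```python
import torch
import torch.nn as nn

# A second GPU does not help unless tensors are placed on it explicitly;
# by default everything lands on cuda:0. Pick a device up front:
device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

model = nn.Linear(8, 2).to(device)    # tiny stand-in for the real model
x = torch.randn(4, 8, device=device)  # inputs must live on the same device
out = model(x)
print(out.shape)  # torch.Size([4, 2])
```

The same idea scales up: every tensor the forward pass touches has to sit on the device you chose, or PyTorch raises a device-mismatch error instead of spreading the load.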
Flying-Meta
0 votes
0 answers
6 views

RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch

I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...
suri • 21
0 votes
0 answers
20 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with self-implemented training loops. A 1.5B Qwen2 model occupies 40 GB of GPU memory. When I ran the same training using LLaMA Factory, it took only about 24 GB. I tried to delete some ...
StaEx_G • 13
0 votes
0 answers
23 views

Huggingface Trainer CUDA Out Of Memory for 500M Model

I'm training MobiLLama for classification. This model has just 500 million parameters, yet when I fine-tune it for downstream tasks, the trainer keeps giving me a CUDA out-of-memory error. I faced ...
Hoangdz • 187
0 votes
0 answers
14 views

How does the transformer model's attention mechanism deal with differing sequence lengths?

I am going through the architecture of the transformer and its attention mechanism. The thing I don't get about this mechanism is how it handles sequences of different lengths. For example: How does ...
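For what it's worth, the standard answer is padding plus an attention mask: shorter sequences are padded to a common length, and the padded key positions are set to -inf before the softmax so they receive zero attention weight. A minimal sketch in plain PyTorch (batch size, lengths, and scores are illustrative):

```python
import torch

# Two sequences padded to length 4; the second is really only 2 tokens long.
scores = torch.randn(2, 4, 4)           # (batch, query_len, key_len)
lengths = torch.tensor([4, 2])          # true sequence lengths

# True where a key position is padding
key_pos = torch.arange(4)
pad_mask = key_pos[None, :] >= lengths[:, None]       # (batch, key_len)

# Mask padded keys with -inf so softmax assigns them zero weight
scores = scores.masked_fill(pad_mask[:, None, :], float("-inf"))
weights = torch.softmax(scores, dim=-1)

print(weights[1, 0, 2:])  # tensor([0., 0.]) -- padded keys get no attention
```

This is exactly what the `attention_mask` returned by Hugging Face tokenizers encodes: attention itself is length-agnostic, and the mask just hides the padding.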
Syed Mustaqhim
2 votes
1 answer
139 views

Saving Fine-tune Falcon HuggingFace LLM Model

I'm trying to save my model so it won't need to re-download the base model every time I want to use it, but nothing seems to work for me; I would love your help with it. The following parameters are ...
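The usual pattern here is `save_pretrained` / `from_pretrained`, which writes the weights and config to a local directory so nothing is re-downloaded afterwards. A sketch with a tiny randomly initialized GPT-2 standing in for the fine-tuned model (no download; the config values and directory name are arbitrary):

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny randomly initialized model as a stand-in for a fine-tuned LLM
config = GPT2Config(n_layer=1, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)

model.save_pretrained("my_finetuned_model")  # writes config + weights locally
reloaded = GPT2LMHeadModel.from_pretrained("my_finetuned_model")
print(reloaded.config.n_layer)  # 1
```

The tokenizer has its own `save_pretrained` and should be saved alongside the model so both load from the same path.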
Lidor Eliyahu Shelef
0 votes
1 answer
53 views

GPT-2 model from Hugging Face always generates the same result

Why were all the results I got from the GPT-2 model the same no matter what I fed into it? The following are my operating details. First I download the needed files from the official website. These ...
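A frequent cause of identical generations is greedy decoding: `generate()` defaults to `do_sample=False`, which is fully deterministic, so passing `do_sample=True` (optionally with `temperature` or `top_p`) restores variety. The difference, sketched with a toy next-token distribution rather than the real model:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5])  # toy next-token logits

# Greedy decoding: argmax is deterministic -- the same token every call
greedy = [int(logits.argmax()) for _ in range(3)]
print(greedy)  # [0, 0, 0]

# Sampling: draws from the softmax distribution, so results vary across runs
probs = torch.softmax(logits, dim=-1)
sampled = [int(torch.multinomial(probs, 1)) for _ in range(3)]
print(sampled)
```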
zhangtianpu
0 votes
0 answers
20 views

OOM Error using PPO Trainer to LoRa-tune 4-bit Llama-3-8B Model (TRL Hugging Face Library)

As per the standard for PPO Training (which is to do supervised-fine tuning before running the PPO Algorithm) I did a QLoRa fine-tuning of the Llama-3-8B instruct model using my own custom data and ...
Aryaman Jaggi
0 votes
0 answers
26 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...
Pratheesh Kumar
0 votes
0 answers
53 views

Llama-3-70B with pipeline cannot generate new tokens (texts)

I have successfully downloaded Llama-3-70B, but when I test its "text-generation" ability, it always outputs my prompt and nothing more. Here is my demo code (copied from ...
Martin • 11
0 votes
1 answer
97 views

Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch

I want to fine-tune an LLM. I am able to fine-tune it successfully, but when I reload the model after saving, I get an error. Below is the code: import argparse import numpy as np import torch from datasets ...
Masthan • 685
0 votes
1 answer
112 views

Deepspeed : AttributeError: 'DummyOptim' object has no attribute 'step'

I want to use DeepSpeed for training LLMs along with the Hugging Face Trainer. But when I use DeepSpeed with the Trainer, I get the error "AttributeError: 'DummyOptim' object has no attribute 'step'" ...
Masthan • 685
1 vote
0 answers
43 views

HuggingFace pipeline doesn't use multiple GPUs

I made a RAG app that answers user questions based on provided data; it works fine on a single GPU. I want to deploy it on multiple GPUs (4 T4s), but I always get CUDA out of memory ...
Cihan Yalçın
1 vote
0 answers
113 views

How to fine-tune merlinite 7B model in Python

I am new to LLM programming in Python and I am trying to fine-tune the instructlab/merlinite-7b-lab model on my Mac M1. My goal is to teach this model about a new music composer, Xenobi Amilen. I have ...
Salvatore D'angelo
0 votes
0 answers
15 views

Huggingface Trainer logs different sample size than actual

I am trying to fine-tune a model. Here is the train-test split of my dataset: Train - 4746 (80%), Test - 1188 (20%). Here is my code snippet: training_args = TrainingArguments( bf16=True, # specify ...
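One likely explanation (assuming multiple GPUs and/or gradient accumulation are in play): the Trainer logs optimizer steps, not raw samples, and steps per epoch is the sample count divided by the effective batch size. The arithmetic, with hypothetical settings:

```python
import math

# Why the Trainer's logged step count differs from the raw sample count:
# steps_per_epoch = ceil(samples / effective_batch_size)
train_samples = 4746          # from the question
per_device_batch_size = 8     # hypothetical
num_gpus = 2                  # hypothetical
grad_accum_steps = 4          # hypothetical

effective_batch = per_device_batch_size * num_gpus * grad_accum_steps
steps_per_epoch = math.ceil(train_samples / effective_batch)
print(effective_batch, steps_per_epoch)  # 64 75
```

So 4746 training samples can legitimately show up as only 75 logged steps per epoch under these settings.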
quick_silver009
