All Questions

0 votes · 0 answers · 16 views

RuntimeError: The size of tensor a (128) must match the size of tensor b (122) at non-singleton dimension 2

Description: Error during fine-tuning of the NVIDIA TTS FastPitch model with a custom dataset. I am currently trying to fine-tune the FastPitch model from NVIDIA NeMo on a custom dataset but encountered the ...
asked by Hasan Maqsood
0 votes · 0 answers · 41 views

Hyperparameter tuning for detectron2 Mask R-CNN

For the hyperparameter tuning, the code below shows the configuration of the model. I can change its learning rate, iterations, and batch size, but I am stuck on changing its filter size, activation function, ... (a hedged config sketch follows below)
asked by Cheng Kai Chew
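
The usual detectron2 knobs live on the config object. A minimal sketch, assuming a Model Zoo Mask R-CNN config; the dataset name "my_train" is a hypothetical placeholder:

```python
# A minimal sketch of standard detectron2 config tuning; the dataset
# name "my_train" is a hypothetical placeholder.
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_train",)
cfg.SOLVER.BASE_LR = 0.00025                      # learning rate
cfg.SOLVER.MAX_ITER = 3000                        # iterations
cfg.SOLVER.IMS_PER_BATCH = 4                      # images per batch
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128    # RoI minibatch size
```

Filter sizes and activation functions are generally not exposed as plain cfg keys; they are baked into the backbone code, so changing them typically means registering a modified backbone rather than editing the config.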
1 vote · 0 answers · 114 views

PEFT model from checkpoint leading to size mismatch

I have trained a PEFT model and saved it on the Hugging Face Hub. Now I want to merge it with the base model. I have used the following code: from peft import PeftModel, PeftConfig, AutoPeftModelForCausalLM; from ... (a merge sketch follows below)
asked by Sandun Tharaka
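
For the merge step itself, a minimal sketch assuming the adapter was pushed to the Hub; both repo ids are placeholders:

```python
# A minimal sketch of merging a LoRA/PEFT adapter into its base model;
# "base/model-name" and "user/adapter-repo" are hypothetical repo ids.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base/model-name")
model = PeftModel.from_pretrained(base, "user/adapter-repo")
merged = model.merge_and_unload()   # fold adapter weights into the base
merged.save_pretrained("merged-model")
```

A size mismatch at this point usually means the base checkpoint differs from the one the adapter was trained on, e.g. embeddings resized after adding tokens.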
0 votes · 0 answers · 20 views

Expected input batch_size (3) to match target batch_size (1)

Context: image captioning model via PyTorch; fine-tuning with 1 image and a list of captions; converting the image; error, with prints that show that the size and shape are the same. Before: Outputs size: ...
asked by Bolofo
0 votes · 0 answers · 301 views

Error while loading Mistral LLM for fine-tuning. QLoRA doesn't work but full precision works

If I try to load the model in this way: bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True); model = ... (a hedged completion follows below)
asked by Antonio
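
A hedged completion of the truncated snippet above, assuming bitsandbytes and accelerate are installed and the target is the public Mistral-7B checkpoint; the compute dtype is an assumption:

```python
# Sketch of 4-bit (QLoRA-style) loading; bnb_4bit_compute_dtype is an
# assumed addition, not part of the asker's original snippet.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",   # requires the accelerate package
)
```

4-bit loading also needs a CUDA-capable GPU, which is a common reason QLoRA fails where full-precision loading works.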
0 votes · 0 answers · 109 views

How can I use the encoder part of the Whisper model and send the encoder output to a classification head?

I want to use Whisper for speech emotion recognition, and since Whisper is an encoder-decoder architecture model, I only want to leverage the encoder part and add a classification head on top of it to ... (a minimal sketch follows below)
asked by stanley101
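
A minimal sketch of one way to do this with the transformers library; the mean-pooled linear head, the checkpoint, and the label count are assumptions:

```python
# Sketch: wrap Whisper's encoder and add an assumed classification head.
import torch.nn as nn
from transformers import WhisperModel

class WhisperEmotionClassifier(nn.Module):
    def __init__(self, num_labels: int = 7):
        super().__init__()
        # Keep only the encoder half of the encoder-decoder model.
        self.encoder = WhisperModel.from_pretrained("openai/whisper-base").encoder
        self.head = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_features):
        hidden = self.encoder(input_features).last_hidden_state  # (B, T, d_model)
        pooled = hidden.mean(dim=1)   # mean-pool over time
        return self.head(pooled)      # (B, num_labels)
```

Note that transformers also ships WhisperForAudioClassification, which wires up an encoder-only classifier in essentially this way.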
0 votes · 0 answers · 14 views

Why does the loss return a negative value?

for _ in range(int(args["num_train_epochs"])): for step, batch in enumerate(train_dataloader): model.train(); inputs = {"input_ids": batch[0].to(args["device"]), "... (a sketch of one common cause follows below)
asked by Minh Trần Tuyết
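
One common cause, offered as an assumption about the truncated code above: nn.NLLLoss expects log-probabilities, so feeding it raw logits can produce negative losses:

```python
# Contrast of wrong vs. correct NLL usage; illustrative data only.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)           # (batch, classes)
target = torch.randint(0, 10, (4,))

bad = F.nll_loss(logits, target)      # raw logits -> loss can go negative
good = F.nll_loss(F.log_softmax(logits, dim=-1), target)  # always >= 0
same = F.cross_entropy(logits, target)  # equivalent: applies log_softmax itself
assert torch.allclose(good, same)
```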
0 votes · 0 answers · 41 views

Which mixed precision setting is in effect when it is set in both the accelerate config and TrainingArguments?

I am using both Hugging Face's transformers library and the accelerate library. For the mixed precision setting, there are two places where I can set it: in the accelerate config, in the interface below ... (a sketch of both knobs follows below)
asked by Forrest Sheng bao
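
The two knobs in question, side by side. Rather than rely on a precedence rule, the safer practice is to make them agree; the values here are assumptions:

```python
# Trainer-level mixed precision; keeping this consistent with the
# accelerate config avoids any ambiguity about which setting wins.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    bf16=True,   # or fp16=True
)
# accelerate-level setting, in the YAML written by `accelerate config`
# (typically ~/.cache/huggingface/accelerate/default_config.yaml):
#   mixed_precision: bf16
```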
0 votes · 0 answers · 41 views

Failed to execute (TrainDeepLearningModel) ArcGIS Pro

I was using a pre-trained human-settlement Landsat-8 model to collect residential areas in ArcGIS Pro. Because the result was not good, I tried to follow this article (https://doc.arcgis.com/en/...
asked by Aden Muflih
0 votes · 0 answers · 134 views

Most efficient way to fine-tune SDXL for a range of products

I want to fine-tune SDXL using LoRA on a range of products so that SDXL can generate images of those products later on. I have many products. What is the most efficient way to fine-tune? Do I just train ...
asked by DevEnma
1 vote · 1 answer · 2k views

Implement Dropout in a pretrained ResNet model in PyTorch

I am trying to add Dropout to a pretrained ResNet model in PyTorch, and here is my code: feats_list = []; for key, value in model._modules.items(): feats_list.append(value); for ... (a simpler sketch follows below)
asked by sharon shen
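
A simpler route than iterating over model._modules, as a minimal sketch; the dropout rate and class count are assumptions:

```python
# Replace the final fc layer of a pretrained ResNet with Dropout + Linear.
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT)
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),                    # assumed dropout rate
    nn.Linear(model.fc.in_features, 10),  # assumed 10 target classes
)
```

The same pattern (swap a named child module in place) also works for inserting Dropout deeper in the network without rebuilding it layer by layer.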
0 votes · 0 answers · 403 views

After training the model using SFT, how do I load the model?

I have trained the model with the following code: from datasets import load_dataset; from trl import SFTTrainer; from transformers import AutoModel, DataCollatorForLanguageModeling, AutoTokenizer, ... (a loading sketch follows below)
asked by 金坤东
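
A minimal loading sketch, assuming the trainer saved its final checkpoint to output_dir (the path is a placeholder):

```python
# Reload an SFT-trained checkpoint from disk; "output_dir" is whatever
# directory the trainer saved to.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("output_dir")
tokenizer = AutoTokenizer.from_pretrained("output_dir")
```

Note that the import list above uses AutoModel, which returns the bare transformer; AutoModelForCausalLM restores the language-modeling head as well, which you need for generation.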
0 votes · 0 answers · 295 views

Fine-tune Wav2Vec2 for downstream speech classification

I want to fine-tune a Wav2Vec2 model by adding some custom layers of my own on top of it for a downstream task. Is there an easier way to do this, like just calling the model without ... (a sketch follows below)
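
There is a shortcut in transformers itself; a minimal sketch, where the checkpoint and label count are assumptions:

```python
# Wav2Vec2ForSequenceClassification adds a pooled classification head on
# top of the wav2vec2 encoder, so no custom wrapper module is needed.
from transformers import Wav2Vec2ForSequenceClassification

model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base",   # assumed checkpoint
    num_labels=4,               # assumed number of target classes
)
```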
1 vote · 0 answers · 158 views

Low GPU utilization when training PyTorch Model on HPC server, but no issues on personal computer

I'm currently retraining/fine-tuning a vision transformer model (pretrained on ImageNet) on CIFAR-10. Unfortunately, I have problems understanding the performance of the system. On my personal ... (a data-loading sketch follows below)
asked by lukasbm
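
A hedged first check for this symptom: on HPC nodes the GPU is often starved by the input pipeline rather than slowed by compute, so it is worth confirming the DataLoader has real workers. The worker count here is an assumption and should match the cores the scheduler actually allocated:

```python
# Give the input pipeline enough parallelism to keep the GPU fed.
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(
    root="data", train=True, download=True, transform=T.ToTensor())
loader = DataLoader(
    dataset,
    batch_size=128,
    num_workers=8,            # match the CPU cores allocated to the job
    pin_memory=True,          # faster host-to-GPU copies
    persistent_workers=True,  # avoid re-forking workers every epoch
)
```

Reading the dataset from a slow networked filesystem is another common culprit on clusters that never appears on a personal machine.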
-1 votes · 1 answer · 68 views

Determining the Optimal Approach for Fine-tuning a Pre-trained Neural Network on Images of Varying Sizes

In the context of fine-tuning a pre-trained neural network initially trained on 1024x1024 images, which method is more suitable for adapting a dataset containing images ranging from 320x120 to 320x320? ... (one candidate sketch follows below)
asked by morteza eskandarian
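
One candidate approach, as a minimal sketch rather than the accepted method: pad each image to a square before scaling, so the 320x120 to 320x320 inputs reach the network's 1024x1024 resolution without aspect-ratio distortion:

```python
# Letterbox-style preprocessing: center the image on a square canvas,
# then resize to the pretrained network's input resolution.
from PIL import Image

def letterbox(img: Image.Image, size: int = 1024) -> Image.Image:
    side = max(img.size)                     # img.size is (width, height)
    canvas = Image.new("RGB", (side, side))  # black square canvas
    canvas.paste(img, ((side - img.width) // 2,
                       (side - img.height) // 2))
    return canvas.resize((size, size))
```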
