All Questions

0 votes · 0 answers · 16 views

RuntimeError: The size of tensor a (128) must match the size of tensor b (122) at non-singleton dimension 2

Description: Error during fine-tuning of the NVIDIA TTS FastPitch model with a custom dataset. I am currently trying to fine-tune the FastPitch model from NVIDIA NeMo on a custom dataset but encountered the ...
asked by Hasan Maqsood
0 votes · 0 answers · 41 views

Hyperparameter tuning for detectron2 Mask R-CNN

For the hyperparameter tuning, the code below shows the configuration of the model. I can change its learning rate, iterations, and batch size, but I am stuck on changing its filter size, activation function, ... (a hedged config sketch follows below)
asked by Cheng Kai Chew
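
The usual detectron2 knobs live on the config object. A minimal sketch, assuming a Model Zoo Mask R-CNN config; the dataset name "my_train" is a hypothetical placeholder:

```python
# A minimal sketch of standard detectron2 config tuning; the dataset
# name "my_train" is a hypothetical placeholder.
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_train",)
cfg.SOLVER.BASE_LR = 0.00025                      # learning rate
cfg.SOLVER.MAX_ITER = 3000                        # iterations
cfg.SOLVER.IMS_PER_BATCH = 4                      # images per batch
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128    # RoI minibatch size
```

Filter sizes and activation functions are generally not exposed as plain cfg keys; they are baked into the backbone code, so changing them typically means registering a modified backbone rather than editing the config.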
1 vote · 0 answers · 114 views

PEFT model from checkpoint leading to size mismatch

I have trained a PEFT model and saved it on the Hugging Face Hub. Now I want to merge it with the base model. I have used the following code: from peft import PeftModel, PeftConfig, AutoPeftModelForCausalLM; from ... (a merge sketch follows below)
asked by Sandun Tharaka
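
For the merge step itself, a minimal sketch assuming the adapter was pushed to the Hub; both repo ids are placeholders:

```python
# A minimal sketch of merging a LoRA/PEFT adapter into its base model;
# "base/model-name" and "user/adapter-repo" are hypothetical repo ids.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base/model-name")
model = PeftModel.from_pretrained(base, "user/adapter-repo")
merged = model.merge_and_unload()   # fold adapter weights into the base
merged.save_pretrained("merged-model")
```

A size mismatch at this point usually means the base checkpoint differs from the one the adapter was trained on, e.g. embeddings resized after adding tokens.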
0 votes · 0 answers · 20 views

Expected input batch_size (3) to match target batch_size (1)

Context: image captioning model via PyTorch; fine-tuning with 1 image and a list of captions; converting the image; error, with prints that show that the size and shape are the same. Before: Outputs size: ...
asked by Bolofo
0 votes · 0 answers · 301 views

Error while loading Mistral LLM for fine-tuning. QLoRA doesn't work but full precision works

If I try to load the model in this way: bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True); model = ... (a hedged completion follows below)
asked by Antonio
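
A hedged completion of the truncated snippet above, assuming bitsandbytes and accelerate are installed and the target is the public Mistral-7B checkpoint; the compute dtype is an assumption:

```python
# Sketch of 4-bit (QLoRA-style) loading; bnb_4bit_compute_dtype is an
# assumed addition, not part of the asker's original snippet.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",   # requires the accelerate package
)
```

4-bit loading also needs a CUDA-capable GPU, which is a common reason QLoRA fails where full-precision loading works.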
0 votes · 0 answers · 109 views

How can I use the encoder part of the Whisper model and send the encoder output to a classification head?

I want to use Whisper for speech emotion recognition, and since Whisper is an encoder-decoder architecture model, I only want to leverage the encoder part and add a classification head on top of it to ... (a minimal sketch follows below)
asked by stanley101
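
A minimal sketch of one way to do this with the transformers library; the mean-pooled linear head, the checkpoint, and the label count are assumptions:

```python
# Sketch: wrap Whisper's encoder and add an assumed classification head.
import torch.nn as nn
from transformers import WhisperModel

class WhisperEmotionClassifier(nn.Module):
    def __init__(self, num_labels: int = 7):
        super().__init__()
        # Keep only the encoder half of the encoder-decoder model.
        self.encoder = WhisperModel.from_pretrained("openai/whisper-base").encoder
        self.head = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_features):
        hidden = self.encoder(input_features).last_hidden_state  # (B, T, d_model)
        pooled = hidden.mean(dim=1)   # mean-pool over time
        return self.head(pooled)      # (B, num_labels)
```

Note that transformers also ships WhisperForAudioClassification, which wires up an encoder-only classifier in essentially this way.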
0 votes · 0 answers · 14 views

Why does the loss return a negative value?

for _ in range(int(args["num_train_epochs"])): for step, batch in enumerate(train_dataloader): model.train(); inputs = {"input_ids": batch[0].to(args["device"]), "... (a sketch of one common cause follows below)
asked by Minh Trần Tuyết
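
One common cause, offered as an assumption about the truncated code above: nn.NLLLoss expects log-probabilities, so feeding it raw logits can produce negative losses:

```python
# Contrast of wrong vs. correct NLL usage; illustrative data only.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)           # (batch, classes)
target = torch.randint(0, 10, (4,))

bad = F.nll_loss(logits, target)      # raw logits -> loss can go negative
good = F.nll_loss(F.log_softmax(logits, dim=-1), target)  # always >= 0
same = F.cross_entropy(logits, target)  # equivalent: applies log_softmax itself
assert torch.allclose(good, same)
```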
0 votes · 0 answers · 41 views

Which mixed precision setting is in effect when it is set in both the accelerate config and TrainingArguments?

I am using both Hugging Face's transformers library and the accelerate library. For the mixed precision setting, there are two places where I can set it: in the accelerate config, in the interface below ... (a sketch of both knobs follows below)
asked by Forrest Sheng bao
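
The two knobs in question, side by side. Rather than rely on a precedence rule, the safer practice is to make them agree; the values here are assumptions:

```python
# Trainer-level mixed precision; keeping this consistent with the
# accelerate config avoids any ambiguity about which setting wins.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    bf16=True,   # or fp16=True
)
# accelerate-level setting, in the YAML written by `accelerate config`
# (typically ~/.cache/huggingface/accelerate/default_config.yaml):
#   mixed_precision: bf16
```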
0 votes · 0 answers · 41 views

Failed to execute (TrainDeepLearningModel) ArcGIS Pro

I was using a pre-trained human-settlement Landsat-8 model to collect residential areas in ArcGIS Pro. Because the result was not good, I tried to follow this article (https://doc.arcgis.com/en/...
asked by Aden Muflih
0 votes · 0 answers · 134 views

Most efficient way to fine-tune SDXL for a range of products

I want to fine-tune SDXL using LoRA on a range of products so that SDXL can generate images of those products later on. I have many products. What is the most efficient way to fine-tune? Do I just train ...
asked by DevEnma
1 vote · 1 answer · 2k views

Implement Dropout in a pretrained ResNet model in PyTorch

I am trying to add Dropout to a pretrained ResNet model in PyTorch, and here is my code: feats_list = []; for key, value in model._modules.items(): feats_list.append(value); for ... (a simpler sketch follows below)
asked by sharon shen
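
A simpler route than iterating over model._modules, as a minimal sketch; the dropout rate and class count are assumptions:

```python
# Replace the final fc layer of a pretrained ResNet with Dropout + Linear.
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT)
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),                    # assumed dropout rate
    nn.Linear(model.fc.in_features, 10),  # assumed 10 target classes
)
```

The same pattern (swap a named child module in place) also works for inserting Dropout deeper in the network without rebuilding it layer by layer.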
0 votes · 0 answers · 403 views

After training the model using SFT, how do I load the model?

I have trained the model with the following code: from datasets import load_dataset; from trl import SFTTrainer; from transformers import AutoModel, DataCollatorForLanguageModeling, AutoTokenizer, ... (a loading sketch follows below)
asked by 金坤东
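
A minimal loading sketch, assuming the trainer saved its final checkpoint to output_dir (the path is a placeholder):

```python
# Reload an SFT-trained checkpoint from disk; "output_dir" is whatever
# directory the trainer saved to.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("output_dir")
tokenizer = AutoTokenizer.from_pretrained("output_dir")
```

Note that the import list above uses AutoModel, which returns the bare transformer; AutoModelForCausalLM restores the language-modeling head as well, which you need for generation.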
0 votes · 0 answers · 295 views

Fine-tune Wav2Vec2 for downstream speech classification

I want to fine-tune a Wav2Vec2 model by adding some custom layers of my own on top of it for a downstream task. Is there an easier way to do this, like just calling the model without ... (a sketch follows below)
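
There is a shortcut in transformers itself; a minimal sketch, where the checkpoint and label count are assumptions:

```python
# Wav2Vec2ForSequenceClassification adds a pooled classification head on
# top of the wav2vec2 encoder, so no custom wrapper module is needed.
from transformers import Wav2Vec2ForSequenceClassification

model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base",   # assumed checkpoint
    num_labels=4,               # assumed number of target classes
)
```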
1 vote · 0 answers · 158 views

Low GPU utilization when training PyTorch Model on HPC server, but no issues on personal computer

I'm currently retraining/fine-tuning a vision transformer model (pretrained on ImageNet) on CIFAR-10. Unfortunately, I have problems understanding the performance of the system. On my personal ... (a data-loading sketch follows below)
asked by lukasbm
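
A hedged first check for this symptom: on HPC nodes the GPU is often starved by the input pipeline rather than slowed by compute, so it is worth confirming the DataLoader has real workers. The worker count here is an assumption and should match the cores the scheduler actually allocated:

```python
# Give the input pipeline enough parallelism to keep the GPU fed.
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(
    root="data", train=True, download=True, transform=T.ToTensor())
loader = DataLoader(
    dataset,
    batch_size=128,
    num_workers=8,            # match the CPU cores allocated to the job
    pin_memory=True,          # faster host-to-GPU copies
    persistent_workers=True,  # avoid re-forking workers every epoch
)
```

Reading the dataset from a slow networked filesystem is another common culprit on clusters that never appears on a personal machine.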
-1 votes · 1 answer · 68 views

Determining the Optimal Approach for Fine-tuning a Pre-trained Neural Network on Images of Varying Sizes

In the context of fine-tuning a pre-trained neural network initially trained on 1024x1024 images, which method is more suitable for adapting a dataset containing images ranging from 320x120 to 320x320? ... (one candidate sketch follows below)
asked by morteza eskandarian
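
One candidate approach, as a minimal sketch rather than the accepted method: pad each image to a square before scaling, so the 320x120 to 320x320 inputs reach the network's 1024x1024 resolution without aspect-ratio distortion:

```python
# Letterbox-style preprocessing: center the image on a square canvas,
# then resize to the pretrained network's input resolution.
from PIL import Image

def letterbox(img: Image.Image, size: int = 1024) -> Image.Image:
    side = max(img.size)                     # img.size is (width, height)
    canvas = Image.new("RGB", (side, side))  # black square canvas
    canvas.paste(img, ((side - img.width) // 2,
                       (side - img.height) // 2))
    return canvas.resize((size, size))
```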
