Newest 'fine-tuning' Questions

0 votes

1 answer

29 views

Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match

I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I am getting this error, which says that it cannot load the parameters and that the model architectures should match ...

KRISH MANTRI

1

asked Jul 7 at 13:00

-2 votes

0 answers

25 views

Need Recommendations for Fine-Tuning LLMs on Azure ML: Best Practices [closed]

Context: I am using Azure ML and I want to fine-tune a Large Language Model (LLM) on a dataset. Integration and Deployment: Should I commit my code from VS Code to GitHub / Azure DevOps, then retrieve ...

Anas_LA

1

asked Jul 5 at 14:59

-1 votes

1 answer

66 views

How to prepare data for batch-inference in Azure ML?

The data format (.csv) that I am using for inference produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...

S R

9

asked Jul 5 at 4:55

0 votes

0 answers

36 views

The issue of bitsandbytes package supporting CUDA 12.4 version

When running the PEFT fine-tuning program, executing the following code: model = get_peft_model(model, peft_config) reports the error: Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...

paul qin

1

asked Jul 1 at 8:19

1 vote

0 answers

66 views

Fine tune llama3 with message replies like dataset (slack)

I want to fine-tune llama3 on a dataset in which the data structure is a list of messages, subject to the rules below: There are channels. In each channel there are messages from all sorts of users. ...

Ben

423

asked Jun 29 at 20:35

0 votes

0 answers

11 views

Fine-tuning a model vs training from scratch

I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...

Krilaria

21

asked Jun 27 at 15:46

-1 votes

0 answers

15 views

Model Training Does Not Update .bin File Size Despite Training

I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...

Baskan Aqua

1

asked Jun 26 at 10:49

0 votes

0 answers

12 views

Layer "sequential_29" expects 1 input(s), but it received 3 input tensors

I am trying to use GridSearchCV on a trained model. But the following error occurs: Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....

Adriana

1

asked Jun 25 at 16:14

0 votes

1 answer

42 views

Different results for the same epoch using different number of total epochs

I am training a machine learning model for an STS task using the Sentence Transformers library. When I was testing it, I noticed that my model generated different results for the same number of epochs ...

Hígor Hahn

1

asked Jun 24 at 22:35

0 votes

0 answers

16 views

Pretrained Model Weights Not Updating During DPO Training

I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...

jeash

1

asked Jun 24 at 19:48

0 votes

0 answers

33 views

Fine tune Llama 2 model with custom dataset but getting zero training loss and validation loss

My problem is that the training loss and validation loss are both 0 for all 3 epochs. Here I am using a Kaggle notebook: !pip install transformers datasets torch bitsandbytes peft accelerate import ...

Menna Bakry

1

asked Jun 24 at 14:08

0 votes

0 answers

27 views

What's the correct data structure and format to fine-tune OpenAI assistant as a vector file?

I'm trying to fine-tune an assistant that depends on gpt-3.5-turbo model with numbers and bullets but bullets never show up. I created both docx and txt files with different data format like: Title: ...

PHP User

2,412

asked Jun 23 at 11:52

1 vote

0 answers

29 views

Fine-tuned LLaMA-2-Chat-HF Model Generates Same Responses as Pre-trained Model and Suitability for Retrieval-based Task

I am working on building a chatbot for substance abuse support. My approach involves two main steps: Fine-tuning the LLaMA-2-Chat-HF model: I have fine-tuned the LLaMA-2-Chat-HF model using a dataset ...

Hannah Mariam John

11

asked Jun 20 at 14:50

0 votes

0 answers

15 views

RuntimeError: The size of tensor a (128) must match the size of tensor b (122) at non-singleton dimension 2

Description: Error during fine-tuning of the NVIDIA TTS FastPitch model with a custom dataset. I am currently trying to fine-tune the FastPitch model from NVIDIA NeMo on a custom dataset but encountered the ...

Hasan Maqsood

1

asked Jun 20 at 12:38

0 votes

0 answers

21 views

Formatting .lstmf for tesseract fine tuning (Windows11) Deserialize header failed: C:\Users\Dell7420\Desktop\KerasOCR\KerasOCR\tesstrain\data\AW.lstmf

I am fine tuning a tesseract-best model on some handwritten images. I am trying to run the following command & "C:\Program Files\Tesseract-OCR\lstmtraining.exe" ` >> --...

Henry

1

asked Jun 19 at 6:20
