Questions tagged [fine-tuning]
258
questions
0
votes
1
answer
29
views
Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match
I am fine-tuning the M2M model, using the 1.2B model as the last checkpoint. While training the model, I get this error saying that it cannot load the parameters and that the model architectures should match
...
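An error like this usually means that the parameter names or shapes stored in the checkpoint differ from those of the model being built. A minimal sketch of how one might diff the two, assuming both have already been reduced to plain name-to-shape dictionaries. The `diff_shapes` helper and the layer names and sizes are hypothetical and not part of any library:

```python
def diff_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} dicts.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names only the checkpoint has
      mismatched - names present in both but with different shapes
    """
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    mismatched = {
        name: (tuple(model_shapes[name]), tuple(ckpt_shapes[name]))
        for name in sorted(set(model_shapes) & set(ckpt_shapes))
        if tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, unexpected, mismatched


# Hypothetical shapes, for illustration only.
model_shapes = {
    "encoder.embed.weight": (1000, 16),
    "encoder.layer0.fc1.weight": (64, 16),
}
ckpt_shapes = {
    "encoder.embed.weight": (1000, 16),
    "encoder.layer0.fc1.weight": (128, 16),
    "decoder.extra.bias": (16,),
}

missing, unexpected, mismatched = diff_shapes(model_shapes, ckpt_shapes)
print(missing, unexpected, mismatched)
```

With PyTorch, the model-side dictionary could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`, and the checkpoint side in the same way from its loaded state dictionary.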
-2
votes
0
answers
25
views
Need Recommendations for Fine-Tuning LLMs on Azure ML: Best Practices [closed]
Context
I am using Azure ML and I want to fine-tune a Large Language Model (LLM) on a dataset.
Integration and Deployment
Should I commit my code from VS Code to GitHub / Azure DevOps, then retrieve ...
-1
votes
1
answer
66
views
How to prepare data for batch-inference in Azure ML?
The data format (.csv) that I am using for inference produces the error:
"each data point should be a conversation array" when running the batch scoring job. All the documentation ...
0
votes
0
answers
36
views
bitsandbytes package support for CUDA 12.4
When running the PEFT fine-tuning program, I execute the following code:
model = get_peft_model(model, peft_config)
It reports the error:
Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
1
vote
0
answers
66
views
Fine-tune Llama 3 on a Slack-like dataset of message replies
I want to fine-tune Llama 3 on a dataset whose structure is a list of messages, subject to the rules below:
There are channels.
In each channel there are messages from all sorts of users.
...
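One way to approach data of this kind is to group messages by channel, order them by time, and turn each reply into a training example whose context is the preceding messages. A minimal sketch, assuming each message is a dict with hypothetical `channel`, `user`, `ts`, and `text` fields (the question's actual schema and remaining rules are not shown), with a hypothetical `build_examples` helper:

```python
from collections import defaultdict


def build_examples(messages, context_size=3):
    """Turn flat messages into (context, reply) examples, one channel at a time."""
    by_channel = defaultdict(list)
    for msg in messages:
        by_channel[msg["channel"]].append(msg)

    examples = []
    for channel, msgs in by_channel.items():
        # Order each channel's messages chronologically.
        msgs.sort(key=lambda m: m["ts"])
        # Every message after the first becomes a reply to the messages before it.
        for i in range(1, len(msgs)):
            context = msgs[max(0, i - context_size):i]
            examples.append({
                "channel": channel,
                "context": [f'{m["user"]}: {m["text"]}' for m in context],
                "reply": f'{msgs[i]["user"]}: {msgs[i]["text"]}',
            })
    return examples


# Hypothetical sample data, for illustration only.
messages = [
    {"channel": "general", "user": "ana", "ts": 2, "text": "Try restarting it."},
    {"channel": "general", "user": "bob", "ts": 1, "text": "The build is failing."},
    {"channel": "random", "user": "cy", "ts": 5, "text": "Lunch?"},
    {"channel": "general", "user": "bob", "ts": 3, "text": "That worked, thanks."},
]
examples = build_examples(messages)
```

Each example could then be rendered with the chat template of the model being fine-tuned. Keeping channels separate avoids mixing unrelated conversations into one context.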
0
votes
0
answers
11
views
Fine-tuning a model vs. training from scratch
I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...
-1
votes
0
answers
15
views
Model Training Does Not Update .bin File Size Despite Training
I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...
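A constant file size does not by itself show that the weights are unchanged: the size of a serialized checkpoint depends on the number of parameters and their data type, not on their values. A minimal sketch illustrating this with plain float32 serialization and comparing content hashes instead of sizes. The `serialize` helper and the toy weights are hypothetical, and real `.bin` checkpoints use a different container format:

```python
import hashlib
import struct


def serialize(weights):
    """Pack a list of floats as little-endian float32 (4 bytes per value)."""
    return struct.pack(f"<{len(weights)}f", *weights)


before = serialize([0.10, -0.25, 0.50, 1.00])   # weights before training
after = serialize([0.12, -0.20, 0.47, 0.95])    # weights after some updates

# Same parameter count and dtype, so the byte length is identical.
same_size = len(before) == len(after)
# The values differ, so the content hashes differ.
same_content = hashlib.sha256(before).digest() == hashlib.sha256(after).digest()

print(same_size, same_content)
```

Comparing a hash of the file, or a few tensor values, before and after training is therefore a more reliable check than comparing file sizes.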
0
votes
0
answers
12
views
Layer "sequential_29" expects 1 input(s), but it received 3 input tensors
I am trying to use GridSearchCV on a trained model. But the following error occurs:
Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....
0
votes
1
answer
42
views
Different results for the same epoch using different number of total epochs
I am training a machine learning model for an STS task using the Sentence Transformers library.
When I was testing it, I noticed that my model generated different results for the same number of epochs ...
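One possible cause, offered as an assumption because the training code is not shown: if the learning-rate schedule decays over the total number of training steps, the learning rate at a given epoch depends on how many epochs were requested, so the model after epoch N differs between runs. A minimal sketch of a linear warmup and decay schedule, where the `linear_schedule_lr` helper and all numbers are hypothetical:

```python
def linear_schedule_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 100
step = 2 * steps_per_epoch  # the end of epoch 2 in both runs

# Same point in training, different total number of epochs.
lr_when_total_is_4 = linear_schedule_lr(step, 4 * steps_per_epoch, 0, 2e-5)
lr_when_total_is_10 = linear_schedule_lr(step, 10 * steps_per_epoch, 0, 2e-5)

print(lr_when_total_is_4, lr_when_total_is_10)
```

In this sketch the shorter run has already decayed to half of the base rate by the end of epoch 2, while the longer run is still at 80% of it. Using a constant schedule, or fixing the total step count, would make the two runs comparable.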
0
votes
0
answers
16
views
Pretrained Model Weights Not Updating During DPO Training
I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...
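For reference, the standard DPO loss for one preference pair is -log sigmoid(beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))), computed on sequence log-probabilities. When the policy and the reference model assign identical log-probabilities, this evaluates to log 2 (about 0.693) for any beta, so a loss that stays at that value is consistent with weights that are not being updated. A minimal sketch, where the `dpo_loss` helper and the log-probability values are hypothetical:

```python
import math


def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities."""
    logits = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # -log(sigmoid(x)) is equal to log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss is exactly log(2).
stuck = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy has moved toward the chosen answer and away from the rejected one.
improved = dpo_loss(-10.0, -16.0, -12.0, -15.0)

print(stuck, improved)
```

If the logged loss stays at this value, it may be worth checking that the trainable parameters have `requires_grad` enabled and that the policy and reference are not the same frozen object.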
0
votes
0
answers
33
views
Fine-tuning Llama 2 with a custom dataset gives zero training loss and validation loss
My problem is that the training loss and validation loss are both reported as 0 for the 3 epochs.
I am using a Kaggle notebook:
!pip install transformers datasets torch bitsandbytes peft accelerate
import ...
0
votes
0
answers
27
views
What's the correct data structure and format to fine-tune OpenAI assistant as a vector file?
I'm trying to fine-tune an assistant that depends on the gpt-3.5-turbo model with numbers and bullets, but the bullets never show up. I created both docx and txt files with different data formats, like:
Title: ...
1
vote
0
answers
29
views
Fine-tuned LLaMA-2-Chat-HF Model Generates Same Responses as Pre-trained Model and Suitability for Retrieval-based Task
I am working on building a chatbot for substance abuse support. My approach involves two main steps:
Fine-tuning the LLaMA-2-Chat-HF model: I have fine-tuned the LLaMA-2-Chat-HF model using a dataset ...
0
votes
0
answers
15
views
RuntimeError: The size of tensor a (128) must match the size of tensor b (122) at non-singleton dimension 2
Description
Error During Fine-Tuning Nvidia TTS Fastpitch Model with Custom Dataset
I am currently trying to fine-tune the FastPitch model from NVIDIA NeMo on a custom dataset but encountered the ...
0
votes
0
answers
21
views
Formatting .lstmf for Tesseract fine-tuning (Windows 11): Deserialize header failed: C:\Users\Dell7420\Desktop\KerasOCR\KerasOCR\tesstrain\data\AW.lstmf
I am fine-tuning a tesseract-best model on some handwritten images. I am trying to run the following command:
& "C:\Program Files\Tesseract-OCR\lstmtraining.exe" `
>> --...