Questions tagged [fine-tuning]

The tag has no usage guidance.

0 votes
1 answer
29 views

Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match

I am fine-tuning the M2M model, using the 1.2B model as the last checkpoint. While training the model, I get this error saying that it cannot load the parameters and that the model architectures should match ...
KRISH MANTRI
-2 votes
0 answers
25 views

Need Recommendations for Fine-Tuning LLMs on Azure ML: Best Practices [closed]

Context: I am using Azure ML and I want to fine-tune a Large Language Model (LLM) on a dataset. Integration and Deployment: Should I commit my code from VS Code to GitHub / Azure DevOps, then retrieve ...
Anas_LA
-1 votes
1 answer
66 views

How to prepare data for batch-inference in Azure ML?

The data format (.csv) that I am using for inference produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...
S R
  • 9
0 votes
0 answers
36 views

bitsandbytes package support for CUDA 12.4

When running a PEFT fine-tuning program, the following code: model = get_peft_model(model, peft_config) reports an error: Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
paul qin
1 vote
0 answers
66 views

Fine-tune Llama 3 on a dataset of Slack-like messages and replies

I want to fine-tune Llama 3 on a dataset whose structure is a list of messages, subject to the rules below: there are channels. In each channel there are messages from all sorts of users. ...
Ben
  • 423
0 votes
0 answers
11 views

Fine-tuning a model vs training from scratch

I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...
Krilaria
-1 votes
0 answers
15 views

Model Training Does Not Update .bin File Size Despite Training

I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...
Baskan Aqua
0 votes
0 answers
12 views

Layer "sequential_29" expects 1 input(s), but it received 3 input tensors

I am trying to use GridSearchCV on a trained model. But the following error occurs: Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....
Adriana
0 votes
1 answer
42 views

Different results for the same epoch using different number of total epochs

I am training a machine learning model for an STS task using the Sentence Transformers library. When I was testing it, I noticed that my model generated different results for the same number of epochs ...
Hígor Hahn
0 votes
0 answers
16 views

Pretrained Model Weights Not Updating During DPO Training

I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...
jeash
  • 1
0 votes
0 answers
33 views

Fine-tuning a Llama 2 model with a custom dataset but getting zero training loss and validation loss

My problem is that the training loss and validation loss are reported as 0 for the 3 epochs. I am using a Kaggle notebook: !pip install transformers datasets torch bitsandbytes peft accelerate import ...
Menna Bakry
0 votes
0 answers
27 views

What's the correct data structure and format to fine-tune OpenAI assistant as a vector file?

I'm trying to fine-tune an assistant that depends on the gpt-3.5-turbo model, with numbers and bullets, but the bullets never show up. I created both docx and txt files with different data formats, like: Title: ...
PHP User
  • 2,412
1 vote
0 answers
29 views

Fine-tuned LLaMA-2-Chat-HF Model Generates Same Responses as Pre-trained Model and Suitability for Retrieval-based Task

I am working on building a chatbot for substance abuse support. My approach involves two main steps: Fine-tuning the LLaMA-2-Chat-HF model: I have fine-tuned the LLaMA-2-Chat-HF model using a dataset ...
Hannah Mariam John
0 votes
0 answers
15 views

RuntimeError: The size of tensor a (128) must match the size of tensor b (122) at non-singleton dimension 2

Description: Error during fine-tuning of the NVIDIA TTS FastPitch model with a custom dataset. I am currently trying to fine-tune the FastPitch model from NVIDIA NeMo on a custom dataset but encountered the ...
Hasan Maqsood
0 votes
0 answers
21 views

Formatting .lstmf for tesseract fine tuning (Windows11) Deserialize header failed: C:\Users\Dell7420\Desktop\KerasOCR\KerasOCR\tesstrain\data\AW.lstmf

I am fine-tuning a tesseract-best model on some handwritten images. I am trying to run the following command: & "C:\Program Files\Tesseract-OCR\lstmtraining.exe" ` >> --...
Henry
  • 1
