Questions tagged [fine-tuning]
The fine-tuning tag has no usage guidance.
257 questions
0 votes · 2 answers · 26 views
isinstance() arg 2 must be a type, a tuple of types, or a union
I'm getting an error message when trying to train my model, but for some reason it's giving me the same message every time I alter it.
Here is the code:
# Define training arguments
training_args = ...
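The excerpt above is cut off, but the error in the title has a well-known cause: `isinstance()`'s second argument must be a class (or tuple/union of classes), and the error fires when an instance or other non-type sneaks in. A minimal illustration with a hypothetical `TrainingConfig` class, not the asker's actual code:

```python
# Common cause of "isinstance() arg 2 must be a type, a tuple of types, or
# a union": the second argument is an instance, not a class.
# `TrainingConfig` is a hypothetical stand-in for illustration.

class TrainingConfig:
    pass

cfg = TrainingConfig()

# Wrong: the second argument is an instance, so Python raises TypeError.
try:
    isinstance(cfg, cfg)
except TypeError as e:
    print(f"TypeError: {e}")

# Correct: the second argument is the class itself.
print(isinstance(cfg, TrainingConfig))  # True
```

When this error appears deep inside a training library, it usually means a config object was passed where the library expected a class, or vice versa.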
0 votes · 0 answers · 30 views
Unable to solve dtype issue using UnslothAI fine-tuning for the Llama 3.1 8B model
I am new to fine-tuning LLMs and have been trying to run the notebooks provided by UnslothAI. For this question, I am running the code for fine-tuning the Llama 3.1 8B model as posted here.
This colab ...
0 votes · 0 answers · 24 views
Fine-tune an LLM on a custom schema to be used in sqlcoder, an Ollama-based LLM
I am working on a POC to convert natural language to SQL. I have used phi3 and am now planning to use sqlcoder as part of the LLM. All of this is set up via Ollama, which I am running in Docker.
The one ...
0 votes · 0 answers · 21 views
Optimal hyperparameters for fine tuning LLM
Could I ask you for help? I am fine-tuning the Llama 3 8B model (with LoRA) for text classification, using the Trainer from Hugging Face. I am looking for the optimal ...
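A common starting point for a question like this is a small grid search over the LoRA knobs that matter most. The grid below is a hedged sketch: the parameter names and values are typical defaults chosen for illustration, not recommendations from the question.

```python
import itertools

# Hypothetical hyperparameter grid for a LoRA fine-tune; values are common
# starting points, not tuned recommendations.
grid = {
    "learning_rate": [1e-4, 2e-4],
    "lora_r": [8, 16],
    "lora_alpha": [16, 32],
}

# Enumerate every combination; each `trial` dict would be passed to one
# training run and scored on a held-out validation set.
trials = [
    dict(zip(grid, values))
    for values in itertools.product(*grid.values())
]
print(len(trials))  # 8 combinations
```

In practice a random or Bayesian search over the same space scales better than a full grid once more than two or three knobs are involved.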
-1 votes · 1 answer · 35 views
IndexError: list index out of range when trying to predict from the fine-tuned model using Huggingface
I am trying to learn how to fine-tune a pretrained model and use it. This is my code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
from ...
0 votes · 0 answers · 50 views
'LlamaForCausalLM' object has no attribute 'max_seq_length'
I'm fine-tuning llama3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
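The AttributeError in the title suggests the reloaded object is a plain transformers model, while `max_seq_length` is an attribute Unsloth attaches at its own load time. One defensive pattern is to read the attribute with a fallback; the classes below are dummies for illustration, not the real model objects:

```python
# 'LlamaForCausalLM' object has no attribute 'max_seq_length': the attribute
# only exists on models loaded through Unsloth's wrapper. Illustrative dummy
# classes stand in for the two cases.

class PlainReloadedModel:
    """Stands in for a model reloaded without Unsloth's extra attributes."""

class UnslothStyleModel:
    """Stands in for a model that does carry the attribute."""
    max_seq_length = 4096

for model in (PlainReloadedModel(), UnslothStyleModel()):
    # Fall back to a default when the attribute was never attached.
    max_len = getattr(model, "max_seq_length", 2048)
    print(type(model).__name__, max_len)
```

This only works around the symptom; the cleaner fix is usually to reload the model through the same library that saved it.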
0 votes · 0 answers · 21 views
Knowing the format of the dataset a pretrained model was trained on
I am working on a multilingual TTS project, developing a TTS for my regional language using a pretrained model from the Hugging Face hub. The model I am trying to fine-tune is facebook-mms-tts ...
0 votes · 1 answer · 26 views
Difference between batch size, train batch size, and validation batch size
The fine-tuning job creation required us to specify values for these three kinds of batch sizes in Azure ML Studio, but I do not understand why we are specifying the batch size initially if we ...
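The distinction the excerpt is asking about can be made concrete with a little arithmetic: the training batch size changes how many gradient updates happen per epoch (and so affects optimization), while the validation batch size only changes how many forward passes evaluation takes. A small sketch with illustrative numbers, not values from the question:

```python
import math

def steps_per_epoch(n_examples: int, batch_size: int) -> int:
    """Number of batches needed to cover the dataset once."""
    return math.ceil(n_examples / batch_size)

# Illustrative numbers: a 10k-example training set and a 2k-example
# validation set with different batch sizes.
print(steps_per_epoch(10_000, batch_size=16))  # 625 gradient updates/epoch
print(steps_per_epoch(2_000, batch_size=64))   # 32 evaluation batches
```

Because evaluation computes no gradients, the validation batch size can usually be set larger than the training one, limited only by memory.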
0 votes · 0 answers · 24 views
Vertex AI Studio: Fine-tuned chat-bison@002 returns results that are not in the training data
I have a training dataset of about 1500 samples in one JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to the test prompts is as desired. Even when I try to copy a short ...
0 votes · 1 answer · 91 views
RuntimeError: Placeholder storage has not been allocated on MPS device while fine-tuning model on MacBook Pro M2
I'm trying to create a proof of concept (PoC) for a local code assistant by fine-tuning the tiny_starcoder_py-vi06 model on my MacBook Pro with an M2 chip. My dataset looks like this:
[
{ "...
0 votes · 1 answer · 43 views
Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match
I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I am getting this error that it cannot load the parameters and that the model architectures should match.
...
-1 votes · 1 answer · 82 views
How to prepare data for batch-inference in Azure ML?
The data format (.csv) that I am using for inferencing produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...
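The error text suggests the scoring endpoint expects each data point as a chat-style array of role/content messages rather than a flat CSV row. A hedged sketch of converting CSV rows into that shape with the standard library; the column name and the exact schema are assumptions to be checked against the Azure ML batch-scoring documentation:

```python
import csv
import io
import json

# Hypothetical CSV with a "question" column; the real column names and the
# exact payload schema should come from the Azure ML docs.
raw = "question\nWhat is fine-tuning?\nHow do I pick a batch size?\n"

lines = []
for row in csv.DictReader(io.StringIO(raw)):
    # Wrap each row as a conversation array: a list of role/content messages.
    conversation = [{"role": "user", "content": row["question"]}]
    lines.append(json.dumps({"messages": conversation}))

# One JSON object per line, i.e. a JSONL payload.
print("\n".join(lines))
```

Writing the result to a `.jsonl` file and pointing the batch job at it is the usual next step.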
0 votes · 0 answers · 94 views
The issue of the bitsandbytes package supporting CUDA 12.4
When running the PEFT fine-tuning program, executing the following code:
model = get_peft_model(model, peft_config)
reports the error:
Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
1 vote · 0 answers · 82 views
Fine-tune llama3 with a message-replies-style dataset (Slack)
I want to fine-tune llama3 on a dataset whose structure is a list of messages, considering the rules below:
there are channels.
in each channel there are messages from all sorts of users.
...
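One hedged way to turn channel data like that into training examples is to pair each message with the reply that follows it in the same channel. The data structure below is assumed from the rules in the excerpt, not from the real Slack export format:

```python
# Hypothetical channel/message structure; a real Slack export has more
# fields (timestamps, thread_ts, etc.) that a production pipeline would use.
channels = {
    "general": [
        {"user": "alice", "text": "How do I reset my token?"},
        {"user": "bob", "text": "Use the settings page."},
    ],
}

# Pair each message with the next one in the same channel, producing
# (prompt, response) examples suitable for supervised fine-tuning.
pairs = []
for channel, messages in channels.items():
    for prev, nxt in zip(messages, messages[1:]):
        pairs.append({"prompt": prev["text"], "response": nxt["text"]})

print(pairs)
```

Real Slack data would also need thread handling, since adjacent messages in a channel are not always a question and its answer.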
0 votes · 0 answers · 15 views
Fine-tuning a model vs training from scratch
I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...