
Questions tagged [fine-tuning]

The tag has no usage guidance.

0 votes
2 answers
26 views

isinstance() arg 2 must be a type, a tuple of types, or a union

I'm getting an error message when trying to train my model, but for some reason it gives me the same message every time I alter the code. Here is the code: # Define training arguments training_args = ...
asked by Jay-Tech456
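The error in the title means the second argument to isinstance() was an instance rather than a class; in Trainer setups this often happens when a value (e.g. a dtype object or string) is passed where a type is expected. A minimal plain-Python illustration:

```python
# isinstance() requires a type, a tuple of types, or a union as its
# second argument.
ok = isinstance(3, int)  # True: int is a type

# Passing an instance (here the integer 5) instead of a type raises the
# exact TypeError quoted in the question title.
try:
    isinstance(3, 5)
except TypeError as exc:
    message = str(exc)  # "isinstance() arg 2 must be a type ..."
```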
0 votes
0 answers
30 views

Unable to solve dtype issue using UnslothAI fine-tuning for the Llama 3.1 8B model

I am new to fine-tuning LLMs and have been trying to run the notebooks provided by UnslothAI. For this question, I am running the code for fine-tuning the LLaMA 3.1 8B model as posted here. This colab ...
asked by adhok • 411
0 votes
0 answers
24 views

Fine-tune LLM on custom schema to be used in sqlcoder, an Ollama-based LLM

I am working on a POC to convert natural language to SQL. I have used phi3 and am now planning to use sqlcoder as part of the LLM. All of this is set up via Ollama, which I am running on Docker. The one ...
asked by Srikant Sahu
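Before fine-tuning on a custom schema, it is common with sqlcoder-style models to pass the schema in the prompt itself. A hedged sketch of building a request payload for Ollama's /api/generate endpoint — the schema, question, and prompt template below are illustrative assumptions, not the model's required format:

```python
import json

# Hypothetical table schema; in practice this comes from your own database.
schema = "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL);"
question = "What is the total revenue per customer?"

# Pass the schema in the prompt rather than (or in addition to) fine-tuning
# on it. Payload shape for Ollama's /api/generate endpoint.
payload = {
    "model": "sqlcoder",
    "prompt": f"### Schema:\n{schema}\n### Question:\n{question}\n### SQL:\n",
    "stream": False,
}
body = json.dumps(payload)  # ready to POST to http://localhost:11434/api/generate
```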
0 votes
0 answers
21 views

Optimal hyperparameters for fine-tuning an LLM

Could I ask for help? I am fine-tuning the Llama3 8B model (with LoRA) for text classification. I am using the Trainer from Huggingface. I am looking for the optimal ...
asked by Roman Frič
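There is no single optimum; the values below are commonly cited starting points for LoRA fine-tuning of an ~8B model, stated here as assumptions and rules of thumb rather than guarantees. A small search space like this is a reasonable place to begin:

```python
# Commonly cited starting points for LoRA fine-tuning with the Hugging Face
# Trainer. These are rules of thumb, not guarantees; the optimum depends on
# the dataset and the task.
search_space = {
    "lora_r": [8, 16, 32],                # LoRA rank
    "lora_alpha": [16, 32],               # scaling; often 1-2x the rank
    "lora_dropout": [0.0, 0.05, 0.1],
    "learning_rate": [1e-4, 2e-4, 3e-4],
    "num_train_epochs": [1, 2, 3],
}

# Size of a full grid search over this space (why sweeps are usually
# restricted to a subset of these axes).
grid_size = 1
for values in search_space.values():
    grid_size *= len(values)
```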
-1 votes
1 answer
35 views

IndexError: list index out of range, when trying to predict from the fine-tuned model using Huggingface

I am trying to learn how to fine-tune a pretrained model and use it. This is my code: from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer from ...
asked by Lijin Durairaj
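One common cause of this IndexError after fine-tuning a sequence-classification model is configuring the model with fewer labels than the dataset actually contains. A plain-Python illustration of the mismatch (the real fix is passing the correct num_labels to from_pretrained):

```python
# The model's head was set up for 2 classes ...
label_names = ["negative", "positive"]
# ... but the dataset contained a 3rd class, so argmax can return 2.
predicted_class_id = 2

try:
    predicted_label = label_names[predicted_class_id]
except IndexError:
    # "list index out of range": label list and class count disagree.
    predicted_label = None
```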
0 votes
0 answers
50 views

'LlamaForCausalLM' object has no attribute 'max_seq_length'

I'm fine-tuning Llama3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
asked by Sarra Ben Messaoud
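Attributes like max_seq_length are typically added by Unsloth's training wrapper; a plain from_pretrained reload returns a vanilla model without them. A defensive sketch (the stand-in class and the 2048 default are illustrative assumptions):

```python
class PlainCausalLM:
    """Stand-in for a model reloaded without the training wrapper."""

# Wrapper-added attributes (such as max_seq_length here) are not part of the
# saved checkpoint, so they disappear on a plain reload. Guard the access
# with getattr and an explicit default instead of assuming they exist.
model = PlainCausalLM()
max_seq_length = getattr(model, "max_seq_length", 2048)  # default is an assumption
```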
0 votes
0 answers
21 views

Knowing the format of the dataset a pretrained model was trained on

I am working on a multilingual TTS project, developing a TTS for my regional language using a pretrained model from the Hugging Face hub. The model I am trying to fine-tune is facebook-mms-tts ...
asked by Injila • 1
0 votes
1 answer
26 views

Difference between batch size, train batch size, and validation batch size

The fine-tuning job creation required us to specify values for these three kinds of batch sizes in Azure ML Studio, but I do not understand why we are specifying the batch size initially if we ...
asked by S R • 11
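In most training stacks the train and validation batch sizes only control per-step memory on each split, while the batch size that matters for optimization also folds in gradient accumulation and device count. A sketch of how the three typically relate (the variable names are illustrative, not the exact Azure ML Studio field names):

```python
# Illustrative names, not the exact Azure ML Studio fields:
train_batch_size = 8             # examples per device per training step
validation_batch_size = 16       # examples per device per eval step (no gradients,
                                 # so it can usually be larger)
gradient_accumulation_steps = 4
num_devices = 2

# The batch size the optimizer effectively sees per update:
effective_train_batch = (
    train_batch_size * gradient_accumulation_steps * num_devices
)
```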
0 votes
0 answers
24 views

Vertex AI Studio: Fine-tuned chat-bison@002 returns results are not in training data

I have a training dataset of about 1500 samples in 1 JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to the test prompts is as desired. Even when I try to copy a short ...
asked by nogias • 583
0 votes
1 answer
91 views

RuntimeError: Placeholder storage has not been allocated on MPS device while fine-tuning model on MacBook Pro M2

I'm trying to create a proof of concept (PoC) for a local code assistant by fine-tuning the tiny_starcoder_py-vi06 model on my MacBook Pro with an M2 chip. My dataset looks like this: [ { "...
asked by Varuzhan
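This MPS error usually means the model and the input tensors ended up on different devices. A device-selection sketch; the torch calls are shown as comments so the snippet stays self-contained, and the helper function is illustrative:

```python
# "Placeholder storage has not been allocated on MPS device" typically occurs
# when the model was moved to "mps" but the inputs were not (or vice versa).
def pick_device(mps_available: bool) -> str:
    """Illustrative helper: choose Apple-silicon GPU when available."""
    return "mps" if mps_available else "cpu"

device = pick_device(mps_available=False)
# In a real script (torch assumed available):
#   device = pick_device(torch.backends.mps.is_available())
#   model = model.to(device)
#   batch = {k: v.to(device) for k, v in batch.items()}  # inputs must follow
```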
0 votes
1 answer
43 views

Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match

I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I get this error that it cannot load the parameters and the model architecture should match ...
asked by KRISH MANTRI
-1 votes
1 answer
82 views

How to prepare data for batch-inference in Azure ML?

The data format (.csv) that I am using for inferencing produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...
asked by S R • 11
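The error message suggests each data point must be a conversation (a list of role/content messages) rather than a flat CSV row. A hedged sketch converting CSV rows into chat-style JSONL; the "messages" shape below is an assumption modeled on common chat schemas, so verify the exact field names against the Azure ML documentation for your model:

```python
import csv
import io
import json

# A flat CSV input (one column, one question per row).
csv_text = "question\nWhat is our refund policy?\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Re-shape each row into a conversation array. The role/content layout is an
# assumption; check the schema your deployment expects.
jsonl_lines = [
    json.dumps({"messages": [{"role": "user", "content": r["question"]}]})
    for r in rows
]
```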
0 votes
0 answers
94 views

Issue with the bitsandbytes package supporting CUDA 12.4

When running the PEFT fine-tuning program and executing the following code: model = get_peft_model(model, peft_config), it reports the error: Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
asked by paul qin
1 vote
0 answers
82 views

Fine-tune Llama3 with a message-replies dataset (Slack)

I want to fine-tune Llama3 on a dataset in which the data structure is a list of messages, considering the rules below: there are channels; in each channel there are messages from all sorts of users. ...
asked by Ben • 423
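A common preprocessing step for reply-structured chat data is to flatten threads into prompt/response pairs before building the fine-tuning dataset. A sketch under assumed field names (channel/thread/user/text are illustrative, not Slack's export schema):

```python
# Turn channel messages into prompt/response pairs by pairing each message
# with the next reply in the same thread from a different user.
messages = [
    {"channel": "general", "thread": 1, "user": "alice", "text": "How do I deploy?"},
    {"channel": "general", "thread": 1, "user": "bob",   "text": "Use the CI job."},
    {"channel": "random",  "thread": 2, "user": "carol", "text": "Lunch?"},
]

pairs = [
    {"prompt": a["text"], "response": b["text"]}
    for a, b in zip(messages, messages[1:])
    if a["thread"] == b["thread"] and a["user"] != b["user"]
]
```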
0 votes
0 answers
15 views

Fine-tuning a model vs training from scratch

I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...
asked by Krilaria
