Questions tagged [fine-tuning]
255
questions
24
votes
3
answers
24k
views
Target modules for applying PEFT / LoRA on different models
I am looking at a few different examples of using PEFT on different models. The LoraConfig object contains a target_modules array. In some examples, the target modules are ["query_key_value"]...
22
votes
4
answers
20k
views
Difference between Instruction Tuning vs Non Instruction Tuning Large Language Models
What is the difference between instruction tuning and normal fine-tuning for large language models?
Also the instruction-tuning I'm referring to isn't the in-context/prompt one.
All the recent papers ...
9
votes
3
answers
18k
views
Fine-Tuning GPT2 - attention mask and pad token id errors
I have been trying to fine-tune GPT2 on the wikitext-2 dataset (just to help myself learn the process) and I am running into a warning message that I have not seen before:
"The attention mask and ...
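For context on what the warning refers to, here is a minimal pure-Python sketch of padding a batch and building the matching attention mask. It is not the asker's code and not the Transformers implementation; `pad_batch` is a hypothetical helper, and reusing GPT-2's end-of-text id (50256) as the pad id is an assumption based on the common workaround for GPT-2 having no dedicated pad token.

```python
from typing import List, Sequence, Tuple

GPT2_EOS_ID = 50256  # GPT-2's end-of-text token id, reused here as the pad id (assumption)


def pad_batch(
    sequences: Sequence[Sequence[int]], pad_id: int = GPT2_EOS_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token id sequences to equal length and build the attention mask.

    Mask values: 1 marks a real token, 0 marks padding the model should ignore.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Hypothetical token ids, for illustration only.
ids, mask = pad_batch([[10, 11, 12], [20]])
print(ids)   # [[10, 11, 12], [20, 50256, 50256]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```

Right-padding is used here for simplicity; for batched generation with decoder-only models, left-padding is usually the safer choice.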
8
votes
1
answer
3k
views
OpenAI GPT-3 API: Fine tune a fine tuned model? [closed]
The OpenAI documentation for the model attribute in the fine-tune API states a bit confusingly:
model
The name of the base model to fine-tune. You can select one of "ada", "babbage"...
6
votes
1
answer
4k
views
Fine-tuning BERT sentence transformer model
I am using a pre-trained BERT sentence transformer model, as described here https://www.sbert.net/docs/training/overview.html, to get embeddings for sentences.
I want to fine-tune these pre-trained ...
5
votes
3
answers
8k
views
What are the differences between fine tuning and few shot learning?
I am trying to understand the concept of fine-tuning and few-shot learning.
I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task. However, ...
5
votes
2
answers
11k
views
Can I clear up GPU VRAM in Colab?
I'm trying to use aitextgen to fine-tune the 774M GPT-2 model on a dataset. Unfortunately, no matter what I do, training fails because only 80 MB of VRAM is available. How can I clear the VRAM without ...
5
votes
2
answers
8k
views
Fine-tuning TheBloke/Llama-2-13B-chat-GPTQ model with Hugging Face Transformers library throws Exllama error
I am trying to fine-tune the TheBloke/Llama-2-13B-chat-GPTQ model using the Hugging Face Transformers library. I am using a JSON file for the training and validation datasets. However, I am ...
4
votes
1
answer
4k
views
How to train a BERT model from scratch with Hugging Face?
I found an answer about training a model from scratch in this question:
How to train BERT from scratch on a new domain for both MLM and NSP?
One answer uses Trainer and TrainingArguments like this:
from ...
4
votes
2
answers
2k
views
What are the differences between adapter tuning and prefix tuning? [closed]
I am trying to understand the concept of adapter-tuning, prompt-tuning, and prefix-tuning in the context of few-shot learning.
It appears to me that I can apply prompt tuning to a black box language ...
4
votes
0
answers
299
views
Huggingface: Fine-tuning (not enough values to unpack (expected 2, got 1))
I'm trying to fine-tune the erfan226/persian-t5-paraphraser paraphrase generator model for Persian sentences. I used the Persian dataset of TaPaCo and reformatted it to match the GLUE (MRPC) dataset, which ...
4
votes
3
answers
3k
views
Can I create a fine-tuned model for OpenAI API Codex models?
I'd like to translate user requests into tickets in some sort of structured data format, e.g. JSON. For example:
User: I want to order two chairs and a desk with three drawers on the left side.
...
3
votes
1
answer
7k
views
Fine-tuning a pre-trained LLM for question-answering
Objective
My goal is to fine-tune a pre-trained LLM on a dataset about Manchester United's (MU's) 2021/22 season (they had a poor season). I want to be able to prompt the fine-tuned model with ...
3
votes
2
answers
6k
views
Expected file to have JSONL format, where every line is a JSON dictionary. openai createFile for fine tune
I created a file named mydata.jsonl and put these lines in it:
{
"prompt": "aa",
"completion": "bb"
}
{
"prompt"...
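The error message in the title says every line must be a JSON dictionary, while the excerpt shows each record spread over several lines. A minimal sketch of collapsing such a file into one object per line, using only the standard library, might look like this. `pretty_json_to_jsonl` is a hypothetical helper, and the second record in the sample is invented for illustration because the original excerpt is truncated.

```python
import json


def pretty_json_to_jsonl(text: str) -> str:
    """Re-serialize concatenated JSON objects so each one sits on a single line."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between objects before handing off to raw_decode.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        if not isinstance(obj, dict):
            raise ValueError("every record must be a JSON object")
        records.append(json.dumps(obj, ensure_ascii=False))
    return "\n".join(records) + "\n"


# First record matches the excerpt; the second is a hypothetical placeholder.
sample = """{
"prompt": "aa",
"completion": "bb"
}
{
"prompt": "cc",
"completion": "dd"
}
"""

print(pretty_json_to_jsonl(sample), end="")
# {"prompt": "aa", "completion": "bb"}
# {"prompt": "cc", "completion": "dd"}
```

Writing the returned string to mydata.jsonl gives a file in which each line parses on its own with `json.loads`.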
3
votes
1
answer
7k
views
EasyOCR - Table extraction
I use EasyOCR to extract a table from a photo or scanned PDF, but I have a problem fine-tuning the extracted data into a table.
I tried to make a searchable PDF from the extracted coordinates, but when I ...