
Questions tagged [fine-tuning]

The tag has no usage guidance.

24 votes
3 answers
24k views

Target modules for applying PEFT / LoRA on different models

I am looking at a few different examples of using PEFT on different models. The LoraConfig object contains a target_modules array. In some examples, the target modules are ["query_key_value"]...
ahron • 1,153
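The reason the examples differ is that target_modules must name the attention submodules of the specific architecture, and each model family names them differently. A minimal sketch of the common conventions, assuming the usual Hugging Face implementations (the dict below is illustrative; always confirm the names with `model.named_modules()` for your checkpoint):

```python
# Typical attention-projection names per model family, as they appear in
# the Hugging Face implementations (an assumption -- verify against
# `model.named_modules()` before building the config):
COMMON_TARGET_MODULES = {
    "gpt-neox": ["query_key_value"],   # fused Q/K/V projection
    "falcon":   ["query_key_value"],   # also fused
    "llama":    ["q_proj", "v_proj"],  # separate per-head projections
    "gpt2":     ["c_attn"],            # fused, Conv1D-based
    "t5":       ["q", "v"],
}

def lora_targets(family: str) -> list[str]:
    """Look up the conventional LoRA target modules for a model family."""
    return COMMON_TARGET_MODULES[family]

# With peft installed, the list plugs straight into the config:
#   from peft import LoraConfig
#   config = LoraConfig(r=8, target_modules=lora_targets("llama"))
```

The fused-vs-separate distinction explains the `["query_key_value"]` examples: GPT-NeoX-style models project Q, K, and V in one matrix, so one module name covers all three.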
22 votes
4 answers
20k views

Difference between Instruction Tuning vs Non Instruction Tuning Large Language Models

What is the difference between instruction tuning and normal fine-tuning for large language models? Also, the instruction tuning I'm referring to isn't the in-context/prompt kind. All the recent papers ...
Flo • 291
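The distinction the question asks about lives in the training data, not the training loop: instruction tuning is ordinary supervised fine-tuning on examples shaped as (instruction, input, output). A sketch with illustrative field names (the exact schema varies by dataset; these are not a fixed standard):

```python
# Ordinary fine-tuning trains the LM on raw text:
plain_record = {"text": "The Eiffel Tower is in Paris."}

# Instruction tuning uses the same objective, but each example is an
# explicit task description plus its answer:
instruction_record = {
    "instruction": "Answer the question.",
    "input": "Where is the Eiffel Tower?",
    "output": "The Eiffel Tower is in Paris.",
}

def to_training_text(rec: dict) -> str:
    """Flatten an instruction record into the single string the LM is trained on."""
    return f"{rec['instruction']}\n{rec['input']}\n{rec['output']}"
```

Training on many such flattened records is what teaches the model to follow unseen instructions at inference time.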
9 votes
3 answers
18k views

Fine-Tuning GPT2 - attention mask and pad token id errors

I have been trying to fine-tune GPT2 on the wikitext-2 dataset (just to help myself learn the process) and I am running into a warning message that I have not seen before: "The attention mask and ...
Toakley • 352
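That warning appears because GPT-2 ships no pad token, so when eos is reused as padding the model cannot tell real tokens from filler unless you pass an explicit attention mask. A pure-Python sketch of what that mask encodes (50256 is GPT-2's eos id; the helper name is mine):

```python
GPT2_EOS_ID = 50256  # GPT-2 has no pad token; eos is commonly reused as pad

def build_attention_mask(input_ids: list[int], pad_id: int = GPT2_EOS_ID) -> list[int]:
    """Return 1 for positions to attend to and 0 for positions to ignore.
    Only the *trailing* run of pad ids is masked; note that if a sequence
    legitimately ends in eos, that eos is indistinguishable from padding,
    which is exactly the ambiguity the warning is about."""
    mask = [1] * len(input_ids)
    i = len(input_ids) - 1
    while i >= 0 and input_ids[i] == pad_id:
        mask[i] = 0
        i -= 1
    return mask
```

In practice you let the tokenizer build this (e.g. set `tokenizer.pad_token = tokenizer.eos_token` and pass `attention_mask` from its output) rather than computing it by hand; the sketch just shows what the model needs to receive.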
8 votes
1 answer
3k views

OpenAI GPT-3 API: Fine tune a fine tuned model? [closed]

The OpenAI documentation for the model attribute in the fine-tune API states, a bit confusingly: model The name of the base model to fine-tune. You can select one of "ada", "babbage" ...
ImitationGamer
6 votes
1 answer
4k views

Fine-tuning BERT sentence transformer model

I am using a pre-trained BERT sentence transformer model, as described here https://www.sbert.net/docs/training/overview.html , to get embeddings for sentences. I want to fine-tune these pre-trained ...
Fiori • 301
5 votes
3 answers
8k views

What are the differences between fine tuning and few shot learning?

I am trying to understand the concept of fine-tuning and few-shot learning. I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task. However, ...
Exploring • 3,041
5 votes
2 answers
11k views

Can I clear up GPU VRAM in Colab?

I'm trying to use aitextgen to fine-tune the 774M GPT-2 on a dataset. Unfortunately, no matter what I do, training fails because there are only 80 MB of VRAM available. How can I clear the VRAM without ...
Blazeolmo 343
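A frequent cause of numbers like "80 MB free" is a previous model still being referenced in the notebook. A hedged sketch, assuming PyTorch (which aitextgen builds on): drop your own references first, then run garbage collection and ask PyTorch to release its cache, since `empty_cache()` cannot free memory that live objects still hold.

```python
import gc

def release_cuda_cache() -> None:
    """Collect unreachable Python objects, then return PyTorch's cached
    CUDA blocks to the driver. Call this *after* `del model` / `del trainer`
    in your own code; references you still hold cannot be freed here."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # no PyTorch available: nothing is cached on the GPU

# Usage in a Colab cell:
#   del model            # drop the reference you hold
#   release_cuda_cache()
```

If even this does not help, restarting the Colab runtime is the reliable fallback, since it tears down the whole CUDA context.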
5 votes
2 answers
8k views

Fine-tuning TheBloke/Llama-2-13B-chat-GPTQ model with Hugging Face Transformers library throws Exllama error

I am trying to fine-tune the TheBloke/Llama-2-13B-chat-GPTQ model using the Hugging Face Transformers library. I am using a JSON file for the training and validation datasets. However, I am ...
Patryk Wawryniuk
4 votes
1 answer
4k views

How to train a BERT model from scratch with Hugging Face?

I found an answer about training a model from scratch in this question: How to train BERT from scratch on a new domain for both MLM and NSP? One answer uses Trainer and TrainingArguments like this: from ...
Jack.Sparrow
4 votes
2 answers
2k views

What are the differences between adapter tuning and prefix tuning? [closed]

I am trying to understand the concept of adapter-tuning, prompt-tuning, and prefix-tuning in the context of few-shot learning. It appears to me that I can apply prompt tuning to a black box language ...
Exploring • 3,041
4 votes
0 answers
299 views

Huggingface: Fine-tuning (not enough values to unpack (expected 2, got 1))

I'm trying to fine-tune the erfan226/persian-t5-paraphraser paraphrase-generation model for Persian sentences. I used the Persian subset of the TaPaCo dataset and reformatted it to match the GLUE (MRPC) dataset, which ...
Ali Ghasemi
4 votes
3 answers
3k views

Can I create a fine-tuned model for OpenAI API Codex models?

I'd like to translate user requests into tickets in some sort of structured data format, e.g. JSON. For example: User: I want to order two chairs and a desk with three drawers on the left side. ...
xaxa • 1,099
3 votes
1 answer
7k views

Fine-tuning a pre-trained LLM for question-answering

Objective: My goal is to fine-tune a pre-trained LLM on a dataset about Manchester United's (MU's) 2021/22 season (they had a poor season). I want to be able to prompt the fine-tuned model with ...
Tom Bomer • 103
3 votes
2 answers
6k views

Expected file to have JSONL format, where every line is a JSON dictionary. openai createFile for fine tune

I created a file named mydata.jsonl and put these lines in it: { "prompt": "aa", "completion": "bb" } { "prompt"...
Fatma Mahmoud
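The error in the question above typically comes from pretty-printed, multi-line JSON objects: JSONL requires each record to be one compact JSON dictionary on its own line. A minimal writer-plus-validator sketch (the file name is from the question; the field values are placeholders):

```python
import json

records = [
    {"prompt": "aa", "completion": "bb"},
    {"prompt": "cc", "completion": "dd"},
]

# Write strict JSONL: one compact JSON dictionary per line, no blank lines,
# no indentation spanning multiple lines.
with open("mydata.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Validate the same way the upload endpoint reads it: every line must
# parse as a complete JSON dictionary on its own.
with open("mydata.jsonl", encoding="utf-8") as f:
    for lineno, line in enumerate(f, 1):
        obj = json.loads(line)  # raises if a record spans multiple lines
        assert isinstance(obj, dict), f"line {lineno} is not a JSON dictionary"
```

Running the validation loop over a hand-edited file catches the "expected JSONL" rejection before the file ever leaves your machine.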
3 votes
1 answer
7k views

EasyOCR - Table extraction

I use EasyOCR to extract tables from a photo or scanned PDF, but I have a problem fine-tuning the extracted data as a table. I try to make a searchable PDF according to the extracted coordinates, but when I ...
mahya • 31
