
Questions tagged [fine-tuning]

The tag has no usage guidance.

24 votes
3 answers
24k views

Target modules for applying PEFT / LoRA on different models

I am looking at a few different examples of using PEFT on different models. The LoraConfig object contains a target_modules array. In some examples, the target modules are ["query_key_value"]...
ahron • 1,153
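The reason the examples differ is that target_modules must name the attention submodules of the specific architecture, and each model family names them differently. A minimal sketch of the common conventions, assuming the usual Hugging Face implementations (the dict below is illustrative; always confirm the names with `model.named_modules()` for your checkpoint):

```python
# Typical attention-projection names per model family, as they appear in
# the Hugging Face implementations (an assumption -- verify against
# `model.named_modules()` before building the config):
COMMON_TARGET_MODULES = {
    "gpt-neox": ["query_key_value"],   # fused Q/K/V projection
    "falcon":   ["query_key_value"],   # also fused
    "llama":    ["q_proj", "v_proj"],  # separate per-head projections
    "gpt2":     ["c_attn"],            # fused, Conv1D-based
    "t5":       ["q", "v"],
}

def lora_targets(family: str) -> list[str]:
    """Look up the conventional LoRA target modules for a model family."""
    return COMMON_TARGET_MODULES[family]

# With peft installed, the list plugs straight into the config:
#   from peft import LoraConfig
#   config = LoraConfig(r=8, target_modules=lora_targets("llama"))
```

The fused-vs-separate distinction explains the `["query_key_value"]` examples: GPT-NeoX-style models project Q, K, and V in one matrix, so one module name covers all three.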
22 votes
4 answers
20k views

Difference between Instruction Tuning vs Non Instruction Tuning Large Language Models

What is the difference between instruction tuning and normal fine-tuning for large language models? Also, the instruction tuning I'm referring to isn't the in-context/prompt kind. All the recent papers ...
Flo • 291
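The distinction the question asks about lives in the training data, not the training loop: instruction tuning is ordinary supervised fine-tuning on examples shaped as (instruction, input, output). A sketch with illustrative field names (the exact schema varies by dataset; these are not a fixed standard):

```python
# Ordinary fine-tuning trains the LM on raw text:
plain_record = {"text": "The Eiffel Tower is in Paris."}

# Instruction tuning uses the same objective, but each example is an
# explicit task description plus its answer:
instruction_record = {
    "instruction": "Answer the question.",
    "input": "Where is the Eiffel Tower?",
    "output": "The Eiffel Tower is in Paris.",
}

def to_training_text(rec: dict) -> str:
    """Flatten an instruction record into the single string the LM is trained on."""
    return f"{rec['instruction']}\n{rec['input']}\n{rec['output']}"
```

Training on many such flattened records is what teaches the model to follow unseen instructions at inference time.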
9 votes
3 answers
18k views

Fine-Tuning GPT2 - attention mask and pad token id errors

I have been trying to fine-tune GPT2 on the wikitext-2 dataset (just to help myself learn the process) and I am running into a warning message that I have not seen before: "The attention mask and ...
Toakley • 352
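That warning appears because GPT-2 ships no pad token, so when eos is reused as padding the model cannot tell real tokens from filler unless you pass an explicit attention mask. A pure-Python sketch of what that mask encodes (50256 is GPT-2's eos id; the helper name is mine):

```python
GPT2_EOS_ID = 50256  # GPT-2 has no pad token; eos is commonly reused as pad

def build_attention_mask(input_ids: list[int], pad_id: int = GPT2_EOS_ID) -> list[int]:
    """Return 1 for positions to attend to and 0 for positions to ignore.
    Only the *trailing* run of pad ids is masked; note that if a sequence
    legitimately ends in eos, that eos is indistinguishable from padding,
    which is exactly the ambiguity the warning is about."""
    mask = [1] * len(input_ids)
    i = len(input_ids) - 1
    while i >= 0 and input_ids[i] == pad_id:
        mask[i] = 0
        i -= 1
    return mask
```

In practice you let the tokenizer build this (e.g. set `tokenizer.pad_token = tokenizer.eos_token` and pass `attention_mask` from its output) rather than computing it by hand; the sketch just shows what the model needs to receive.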
8 votes
1 answer
3k views

OpenAI GPT-3 API: Fine tune a fine tuned model? [closed]

The OpenAI documentation for the model attribute in the fine-tune API states, a bit confusingly: model The name of the base model to fine-tune. You can select one of "ada", "babbage" ...
ImitationGamer
6 votes
1 answer
4k views

Fine-tuning BERT sentence transformer model

I am using a pre-trained BERT sentence transformer model, as described here https://www.sbert.net/docs/training/overview.html , to get embeddings for sentences. I want to fine-tune these pre-trained ...
Fiori • 301
5 votes
3 answers
8k views

What are the differences between fine tuning and few shot learning?

I am trying to understand the concept of fine-tuning and few-shot learning. I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task. However, ...
Exploring • 3,041
5 votes
2 answers
11k views

Can I clear up GPU VRAM in Colab?

I'm trying to use aitextgen to fine-tune the 774M GPT-2 on a dataset. Unfortunately, no matter what I do, training fails because there are only 80 MB of VRAM available. How can I clear the VRAM without ...
Blazeolmo 343
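A frequent cause of numbers like "80 MB free" is a previous model still being referenced in the notebook. A hedged sketch, assuming PyTorch (which aitextgen builds on): drop your own references first, then run garbage collection and ask PyTorch to release its cache, since `empty_cache()` cannot free memory that live objects still hold.

```python
import gc

def release_cuda_cache() -> None:
    """Collect unreachable Python objects, then return PyTorch's cached
    CUDA blocks to the driver. Call this *after* `del model` / `del trainer`
    in your own code; references you still hold cannot be freed here."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # no PyTorch available: nothing is cached on the GPU

# Usage in a Colab cell:
#   del model            # drop the reference you hold
#   release_cuda_cache()
```

If even this does not help, restarting the Colab runtime is the reliable fallback, since it tears down the whole CUDA context.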
5 votes
2 answers
8k views

Fine-tuning TheBloke/Llama-2-13B-chat-GPTQ model with Hugging Face Transformers library throws Exllama error

I am trying to fine-tune the TheBloke/Llama-2-13B-chat-GPTQ model using the Hugging Face Transformers library. I am using a JSON file for the training and validation datasets. However, I am ...
Patryk Wawryniuk
4 votes
1 answer
4k views

How to train a BERT model from scratch with Hugging Face?

I found an answer about training a model from scratch in this question: How to train BERT from scratch on a new domain for both MLM and NSP? One answer uses Trainer and TrainingArguments like this: from ...
Jack.Sparrow
4 votes
2 answers
2k views

What are the differences between adapter tuning and prefix tuning? [closed]

I am trying to understand the concept of adapter-tuning, prompt-tuning, and prefix-tuning in the context of few-shot learning. It appears to me that I can apply prompt tuning to a black box language ...
Exploring • 3,041
4 votes
0 answers
299 views

Huggingface: Fine-tuning (not enough values to unpack (expected 2, got 1))

I'm trying to fine-tune the erfan226/persian-t5-paraphraser paraphrase-generation model for Persian sentences. I used the Persian subset of the TaPaCo dataset and reformatted it to match the GLUE (MRPC) dataset, which ...
Ali Ghasemi
4 votes
3 answers
3k views

Can I create a fine-tuned model for OpenAI API Codex models?

I'd like to translate user requests into tickets in some sort of structured data format, e.g. JSON. For example: User: I want to order two chairs and a desk with three drawers on the left side. ...
xaxa • 1,099
3 votes
1 answer
7k views

Fine-tuning a pre-trained LLM for question-answering

Objective: My goal is to fine-tune a pre-trained LLM on a dataset about Manchester United's (MU's) 2021/22 season (they had a poor season). I want to be able to prompt the fine-tuned model with ...
Tom Bomer • 103
3 votes
2 answers
6k views

Expected file to have JSONL format, where every line is a JSON dictionary. openai createFile for fine tune

I created a file named mydata.jsonl and put these lines in it: { "prompt": "aa", "completion": "bb" } { "prompt"...
Fatma Mahmoud
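The error in the question above typically comes from pretty-printed, multi-line JSON objects: JSONL requires each record to be one compact JSON dictionary on its own line. A minimal writer-plus-validator sketch (the file name is from the question; the field values are placeholders):

```python
import json

records = [
    {"prompt": "aa", "completion": "bb"},
    {"prompt": "cc", "completion": "dd"},
]

# Write strict JSONL: one compact JSON dictionary per line, no blank lines,
# no indentation spanning multiple lines.
with open("mydata.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Validate the same way the upload endpoint reads it: every line must
# parse as a complete JSON dictionary on its own.
with open("mydata.jsonl", encoding="utf-8") as f:
    for lineno, line in enumerate(f, 1):
        obj = json.loads(line)  # raises if a record spans multiple lines
        assert isinstance(obj, dict), f"line {lineno} is not a JSON dictionary"
```

Running the validation loop over a hand-edited file catches the "expected JSONL" rejection before the file ever leaves your machine.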
3 votes
1 answer
7k views

EasyOCR - Table extraction

I use EasyOCR to extract tables from a photo or scanned PDF, but I have a problem fine-tuning the extracted data as a table. I try to make a searchable PDF according to the extracted coordinates, but when I ...
mahya • 31
