Typically, we should be able to save a merged base + PEFT model like this:
import torch
from transformers import AutoTokenizer, AutoModel, AutoConfig
from peft import PeftModel

# Load the base MNTP model, along with the custom code that enables
# bidirectional connections in decoder-only LLMs.
tokenizer = AutoTokenizer.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp"
)
config = AutoConfig.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)

# Load the PEFT adapter weights on top of the base model, then merge them
# into the base weights so the model can be saved as a standalone checkpoint.
model = PeftModel.from_pretrained(
    model,
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
)
model = model.merge_and_unload()  # This can take several minutes on CPU.

model.save_pretrained("LLM2Vec-Mistral-7B-Instruct-v2-mnt-merged")
but it throws an error that looks similar to https://github.com/huggingface/transformers/issues/26972:
[out]:
/usr/local/lib/python3.10/dist-packages/transformers/integrations/peft.py:391: FutureWarning: The `active_adapter` method is deprecated and will be removed in a future version.
  warnings.warn(
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-3-db27a2801af8> in <cell line: 1>()
----> 1 model.save_pretrained("LLM2Vec-Mistral-7B-Instruct-v2-mnt-merged")

3 frames
/usr/local/lib/python3.10/dist-packages/transformers/integrations/peft.py in active_adapters(self)
    383
    384     # For previous PEFT versions
--> 385     if isinstance(active_adapters, str):
    386         active_adapters = [active_adapters]
    387

UnboundLocalError: local variable 'active_adapters' referenced before assignment
Tested on:
transformers==4.38.2
peft==0.10.0
accelerate==0.29.2
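
For what it's worth, the traceback points at transformers' active_adapters() helper: after merge_and_unload() the model no longer contains any PEFT tuner layers, so the loop that is supposed to assign active_adapters never runs, yet save_pretrained() still takes the PEFT code path, presumably because the internal flag marking the model as having adapters loaded is still set on the merged model. Based on that reading, here is a minimal, untested sketch of a possible workaround; it clears the private _hf_peft_config_loaded attribute, which is an internal transformers flag, so this may not hold across versions:

model = model.merge_and_unload()  # as above

# If transformers still thinks PEFT adapters are loaded, clear the flag so
# save_pretrained() treats the merged model as a plain base model again.
if getattr(model, "_hf_peft_config_loaded", False):
    model._hf_peft_config_loaded = False

model.save_pretrained("LLM2Vec-Mistral-7B-Instruct-v2-mnt-merged")

If the linked issue has since been fixed upstream, upgrading transformers past 4.38.2 might also make the error go away.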