Questions tagged [pre-trained-model]

A machine learning model created by someone else. Questions about practical use and implementation details (using a pretrained model as a starting point or as a benchmark) are allowed; however, questions about the theory behind these models are off-topic.

514 questions

0 votes

0 answers

23 views

How can I fine-tune VGG16 with gray images?

I am trying to use a pre-trained VGG16 with SRGAN on grayscale images. I use VGG16 in this way: def build_vgg(): input_shape = (256, 256, 3) # Load a pre-trained VGG19 model trained on '...

stella

asked 2 days ago
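
For the grayscale VGG16 question above, one commonly used workaround is to repeat the single gray channel three times so the image matches the (256, 256, 3) input shape declared in the excerpt. This is an assumption about what the asker needs, since the excerpt is truncated and does not show the data pipeline. A minimal sketch in plain Python, where gray_to_rgb and shape are hypothetical helper names and nested lists stand in for image arrays:

```python
def gray_to_rgb(gray):
    """Repeat one gray channel three times: (H, W) -> (H, W, 3)."""
    return [[[pixel, pixel, pixel] for pixel in row] for row in gray]


def shape(nested):
    """Return the dimensions of a non-empty rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical 256x256 grayscale image filled with a mid-gray value.
gray_image = [[128] * 256 for _ in range(256)]
rgb_image = gray_to_rgb(gray_image)
print(shape(gray_image), "->", shape(rgb_image))
```

With an array library the same idea is usually a single repeat or stack along the last axis. Whether replicated channels are suitable for an SRGAN feature loss is something to check on the actual data.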

2 votes

1 answer

138 views

Saving a fine-tuned Falcon HuggingFace LLM model

I'm trying to save my model so it won't need to re-download the base model every time I want to use it, but nothing seems to work for me. I would love your help with it. The following parameters are ...

Lidor Eliyahu Shelef

1,334

asked Jul 15 at 14:20

0 votes

0 answers

79 views

How to fix this error: KeyError: 'model.embed_tokens.weight'

This is the detailed error: Traceback (most recent call last): File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module> train() File "/home/cyq/zxc/...

hshsh

asked Jul 6 at 19:41

-1 votes

0 answers

32 views

MobileNetV2 transfer learning

Goal: use transfer learning with MobileNetV2 (input size 224x224 and its own preprocessing: resize + central_crop + normalization) as the encoder for a U-Net with input size 512x512, using PyTorch. What I've done: ...

Egor

asked Jun 30 at 13:21

2 votes

0 answers

30 views

Can I remove all special tokens from text if I want to use it for LLM continuous pretraining?

I want to use text data for LLM continuous pretraining. I have a bunch of instruction tuning datasets for that purpose. Some of them have text with special tokens like: [ { "from": "...

Anton Kostin

asked Jun 16 at 19:33

2 votes

0 answers

107 views

EOF occurred in violation of protocol (_ssl.c:2426)

I am trying to run inference against a deployed pretrained model from a SageMaker notebook environment. While executing the line of code below, response = predictor.predict(serialized_data), I am receiving an ...

Dhruv Shah

asked Jun 11 at 14:30

0 votes

0 answers

26 views

Channel-wise multiplication of metadata with intermediate layers

I am writing code for channel-wise multiplication of metadata with intermediate layers, as described in this paper: https://link.springer.com/chapter/10.1007/978-3-030-59713-9_24. Here is my code: ...

jk12

asked Jun 11 at 2:10

1 vote

1 answer

39 views

Which weights change when fine-tuning a pre-trained model? (Hugging Face)

I am using the AutoModelForSequenceClassification class to fine-tune a pre-trained model (originally based on the GPT-2 architecture): model = AutoModelForSequenceClassification.from_pretrained("...

Laura Fuentes

asked Jun 10 at 14:14

0 votes

0 answers

31 views

Embedding dimension and tokenizer max length mismatch while using a pretrained GPT model: RuntimeError target size mismatch

I want to evaluate a pretrained GPT model. The model's embedding layer is (tokens_embed): Embedding(40478, 768). If I set the tokenizer's max_length to 512, I get RuntimeError: Expected target size [2, 40478], got ...

hayo

asked Jun 9 at 12:04
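
The error quoted in the question above is consistent with how PyTorch's cross-entropy loss reads shapes: for an input shaped (N, C, d1, ...) it expects class-index targets shaped (N, d1, ...), so dimension 1 is taken as the class axis. The sketch below only does that shape arithmetic in plain Python. The (2, 512) target shape and both helper names are assumptions, since the excerpt cuts off before the "got" value.

```python
from math import prod


def expected_target_shape(logits_shape):
    """For logits shaped (N, C, d1, ...), class-index targets must be (N, d1, ...)."""
    batch, _num_classes, *extra_dims = logits_shape
    return (batch, *extra_dims)


def flatten_for_loss(logits_shape, target_shape):
    """Collapse every leading dimension so the vocabulary axis is the class axis."""
    *leading, vocab = logits_shape
    rows = prod(leading)
    assert prod(target_shape) == rows, "one target id is needed per position"
    return (rows, vocab), (rows,)


# Assumed shapes: batch of 2, max_length 512, vocabulary of 40478.
logits_shape = (2, 512, 40478)
target_shape = (2, 512)

# Dimension 1 (512) is read as the class axis, which explains the message.
print(expected_target_shape(logits_shape))

# After flattening, 40478 is the class axis and the shapes agree.
flat_logits, flat_targets = flatten_for_loss(logits_shape, target_shape)
print(flat_logits, flat_targets)
print(expected_target_shape(flat_logits) == flat_targets)
```

Under these assumptions, reshaping the logits to (batch * seq_len, vocab) and the targets to (batch * seq_len,) before computing the loss is the usual remedy. For language-model evaluation the logits and labels are typically also shifted by one position, which this sketch leaves out.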

0 votes

0 answers

21 views

What are the key quality metrics for large language model releases?

I am a first-year PhD student working on improving the release practices of machine learning models, especially pre-trained large language models. I want to understand the above concept for a ...

Eyinlojuoluwa

asked Jun 6 at 14:27

0 votes

0 answers

25 views

Developing a pretrained model in TensorFlow for image classification

I have an issue regarding how to modify a pretrained model to classify 3 classes instead of 1000. These are the 2 methods I have come up with so far. I'm not sure which one is best. ...

Redlightning

asked May 27 at 16:22

0 votes

0 answers

70 views

Input image size (256*256) doesn't match model (224*224)

I fine-tuned the ViT model from Hugging Face and wanted to run some predictions: # Get test image from the web test_image_url = 'something.jpg' response = requests.get(test_image_url) test_image = Image....

inter galactic

asked May 27 at 4:28

0 votes

1 answer

144 views

RoBERTa for sentence similarity

I have a pre-trained RoBERTa model and a dataset of sentence pairs, each with a label that indicates whether the pair is similar or not. I want to use that RoBERTa model for this task. I ...

pycoder

asked May 5 at 12:08

0 votes

2 answers

296 views

Result of YOLO model is much worse than the pretrained model it was trained on

I trained a model on top of the pre-trained model yolov8n-seg.pt in YOLO, but the result is worse than the pre-trained model's on the same image. I annotated around 150 images for person detection, using the ...

Hastar

asked Apr 25 at 13:27

0 votes

0 answers

40 views

Failed to execute (TrainDeepLearningModel) ArcGIS Pro

I was using a pre-trained model for human settlements (Landsat-8) to collect residential areas in ArcGIS Pro. Because the result was not good, I tried to follow this article (https://doc.arcgis.com/en/...

Aden Muflih

asked Apr 24 at 3:56
