
Questions tagged [pre-trained-model]

A machine learning model created by someone else. Questions about the practical use and implementation details (using a pretrained model as a starting point, or benchmark) are allowed; however, questions about the theory behind these models are off-topic.

pre-trained-model
0 votes
0 answers
23 views

How can I fine-tune VGG16 with gray images?

I am trying to use a pre-trained VGG16 with SRGAN and grayscale images. When I use VGG16 in this way: def build_vgg(): input_shape = (256, 256, 3) # Load a pre-trained VGG19 model trained on '...
stella's user avatar
  • 39
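
One common way to feed grayscale images to a network that expects three input channels, such as the `(256, 256, 3)` input shape in the excerpt above, is to repeat the single gray channel three times. The sketch below shows only that idea, using plain nested lists rather than any particular framework. The helper name `gray_to_three_channels` and the sample pixel values are hypothetical and do not come from the question.

```python
def gray_to_three_channels(image):
    """Repeat each pixel's gray value across three channels.

    `image` is a height x width nested list of gray values; the result is a
    height x width x 3 nested list. Hypothetical helper, for illustration only.
    """
    return [[[pixel, pixel, pixel] for pixel in row] for row in image]


# A tiny 2 x 2 grayscale image with made-up values.
gray = [[0, 128], [255, 64]]
rgb_like = gray_to_three_channels(gray)

# Shape is now height x width x 3.
print(len(rgb_like), len(rgb_like[0]), len(rgb_like[0][0]))
```

With array libraries the same effect is usually achieved by repeating along the channel axis, for example `np.repeat(x[..., None], 3, axis=-1)` in NumPy. Whether this suits the asker's SRGAN setup cannot be determined from the truncated excerpt.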
2 votes
1 answer
138 views

Saving a fine-tuned Falcon Hugging Face LLM model

I'm trying to save my model so that it doesn't need to re-download the base model every time I want to use it, but nothing seems to work. I would love your help with it. The following parameters are ...
Lidor Eliyahu Shelef's user avatar
0 votes
0 answers
79 views

How to fix this error: KeyError: 'model.embed_tokens.weight'

This is the detailed error: Traceback (most recent call last): File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module> train() File "/home/cyq/zxc/...
hshsh's user avatar
  • 11
-1 votes
0 answers
32 views

Mobilenetv2 transfer learning

Goal: transfer learning with MobileNetV2 (input size 224x224 and its own preprocessing (resize + central_crop + normalization)) as the encoder for a U-Net with input size 512x512, using PyTorch. What I've done: ...
Egor's user avatar
  • 145
2 votes
0 answers
30 views

Can I remove all special tokens from text if I want to use it for LLM continuous pretraining?

I want to use text data for LLM continuous pretraining. I have a bunch of instruction tuning datasets for that purpose. Some of them have text with special tokens like: [ { "from": "...
Anton Kostin's user avatar
2 votes
0 answers
107 views

EOF occurred in violation of protocol (_ssl.c:2426)

I am trying to get inference from a deployed pretrained model in a SageMaker notebook environment. While executing the line of code below, response = predictor.predict(serialized_data) I am receiving an ...
Dhruv Shah's user avatar
0 votes
0 answers
26 views

Channel-wise multiplication of metadata with intermediate layers

I am writing code for channel-wise multiplication of metadata with intermediate layers, as described in this paper https://link.springer.com/chapter/10.1007/978-3-030-59713-9_24. Here is my code: ...
jk12's user avatar
  • 1
1 vote
1 answer
39 views

Which weights change when fine-tuning a pre-trained model? (Hugging Face)

I am using the AutoModelForSequenceClassification class to fine-tune a pre-trained model (which is originally based on the GPT2 architecture) model = AutoModelForSequenceClassification.from_pretrained(...
Laura Fuentes's user avatar
0 votes
0 answers
31 views

Embedding dimension and tokenizer max length mismatch while using a pretrained GPT model. RuntimeError: target size mismatch

I want to evaluate a pretrained GPT model. The GPT model's embedding layer is (tokens_embed): Embedding(40478, 768). If I set the tokenizer's max_length to 512, RuntimeError: Expected target size [2, 40478], got ...
hayo's user avatar
  • 1
0 votes
0 answers
21 views

What are the key quality metrics for large language model releases?

I am a first-year PhD student working on improving the release practices of machine learning models, especially pre-trained large language models. I want to understand the above concept for a ...
Eyinlojuoluwa's user avatar
0 votes
0 answers
25 views

Developing a pretrained model in TensorFlow for image classification

I have an issue regarding how to modify a pretrained model to classify 3 classes instead of 1000. These are the 2 methods I came up with so far. I'm not sure which one is best. ...
Redlightning's user avatar
0 votes
0 answers
70 views

Input image size (256*256) doesn't match model (224*224)

I fine-tuned the ViT model from Hugging Face and wanted to do some prediction: # Get test image from the web test_image_url = 'something.jpg' response = requests.get(test_image_url) test_image = Image....
inter galactic's user avatar
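
When an input image is larger than the size a model expects, as in the 256 vs 224 mismatch named in the title above, one option is to center-crop it to the expected size (resizing is another). The sketch below covers only the crop-box arithmetic. The function name `center_crop_box` is hypothetical, and nothing here is taken from the asker's truncated code.

```python
def center_crop_box(width, height, target_width, target_height):
    """Return (left, upper, right, lower) for a centered crop.

    Hypothetical helper, for illustration only. Raises ValueError if the
    target is larger than the image in either dimension.
    """
    if target_width > width or target_height > height:
        raise ValueError("target size must not exceed the image size")
    left = (width - target_width) // 2
    upper = (height - target_height) // 2
    return left, upper, left + target_width, upper + target_height


# Sizes from the question title: a 256 x 256 image, a 224 x 224 model input.
box = center_crop_box(256, 256, 224, 224)
print(box)
```

The `(left, upper, right, lower)` ordering matches the box tuple that Pillow's `Image.crop` accepts, so the result could be passed to it directly, assuming the image is a Pillow image.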
0 votes
1 answer
144 views

RoBERTa for sentence similarity

I have a pre-trained RoBERTa model and a dataset of sentence pairs, each with a label that indicates whether the pair is similar or not. I want to use the RoBERTa model for this task. I ...
pycoder's user avatar
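
A common way to score a sentence pair, once each sentence has been turned into a fixed-length embedding vector, is cosine similarity. The sketch below assumes the embeddings are already available as plain lists of floats. How they would be extracted from the RoBERTa model is outside this sketch, and the three vectors are made-up values chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up embeddings: emb_a and emb_b point the same way, emb_c is orthogonal.
emb_a = [1.0, 0.0, 1.0]
emb_b = [2.0, 0.0, 2.0]
emb_c = [0.0, 1.0, 0.0]

print(cosine_similarity(emb_a, emb_b))  # close to 1: same direction
print(cosine_similarity(emb_a, emb_c))  # 0: orthogonal vectors
```

A score near 1 suggests similar sentences and a score near 0 suggests unrelated ones. A threshold for the binary similar/not-similar label would have to be tuned on the labeled pairs.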
0 votes
2 answers
296 views

Result of YOLO model is much worse than the pretrained model it was trained on

I trained a model on top of the pre-trained model yolov8n-seg.pt in YOLO, but the result is worse than the pre-trained model's on the same image. I annotated around 150 images for person detection, using the ...
Hastar's user avatar
  • 11
0 votes
0 answers
40 views

Failed to execute (TrainDeepLearningModel) ArcGIS Pro

I was using the pre-trained model for human settlement (Landsat-8) to collect residential areas in ArcGIS Pro. Because the result was not good, I tried to follow this article (https://doc.arcgis.com/en/...
Aden Muflih's user avatar
