Questions tagged [pre-trained-model]
A machine learning model created by someone else. Questions about the practical use and implementation details (using a pretrained model as a starting point, or benchmark) are allowed; however, questions about the theory behind these models are off-topic.
514 questions
0
votes
0
answers
23
views
How can I fine-tune VGG16 with grayscale images?
I am trying to use a pre-trained VGG16 with SRGAN and grayscale images. When I use VGG16 in this way:
def build_vgg():
    input_shape = (256, 256, 3)
    # Load a pre-trained VGG19 model trained on '...
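One common workaround for a grayscale/RGB channel mismatch like the one described here is to copy the single grayscale channel into three identical channels, so the data fits an input shape such as `(256, 256, 3)`. The following is only a minimal sketch in plain Python, assuming the image is a nested list of rows; the helper name `gray_to_three_channels` is hypothetical and does not come from the question.

```python
def gray_to_three_channels(image):
    """Copy each grayscale pixel value into three identical channels."""
    return [[[pixel, pixel, pixel] for pixel in row] for row in image]


# Tiny 2x2 stand-in for a 256x256 grayscale image (values are made up).
gray = [[0, 128], [255, 64]]
rgb_like = gray_to_three_channels(gray)

# The result has shape (height, width, 3).
height, width, channels = len(rgb_like), len(rgb_like[0]), len(rgb_like[0][0])
```

With array libraries the same idea is usually a single repeat or stack along the channel axis; whether replicated channels suit the asker's SRGAN setup is not something the excerpt confirms.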
2
votes
1
answer
138
views
Saving a fine-tuned Falcon Hugging Face LLM
I'm trying to save my model so that it doesn't need to re-download the base model every time I use it, but nothing seems to work. I would appreciate your help.
The following parameters are ...
0
votes
0
answers
79
views
How to fix this error: KeyError: 'model.embed_tokens.weight'
This is the detailed error:
Traceback (most recent call last):
File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module>
train()
File "/home/cyq/zxc/...
-1
votes
0
answers
32
views
MobileNetV2 transfer learning
Goal:
Transfer learning with MobileNetV2 (input size 224x224, with its own preprocessing: resize + central_crop + normalization) as the encoder for a U-Net with input size 512x512, using PyTorch.
What I've done:
...
2
votes
0
answers
30
views
Can I remove all special tokens from text if I want to use it for LLM continuous pretraining?
I want to use text data for LLM continuous pretraining. I have a bunch of instruction-tuning datasets for that purpose. Some of them contain text with special tokens like:
[ { "from": "...
2
votes
0
answers
107
views
EOF occurred in violation of protocol (_ssl.c:2426)
I am trying to run inference against a deployed pretrained model from a SageMaker notebook environment. While executing the line of code below,
response = predictor.predict(serialized_data)
I am receiving an ...
0
votes
0
answers
26
views
Channel-wise multiplication of metadata with intermediate layers
I am writing code for channel-wise multiplication of metadata with intermediate layers, as described in this paper: https://link.springer.com/chapter/10.1007/978-3-030-59713-9_24.
Here is my code:
...
1
vote
1
answer
39
views
Which weights change when fine-tuning a pre-trained model? (Hugging Face)
I am using the AutoModelForSequenceClassification class to fine-tune a pre-trained model (originally based on the GPT-2 architecture):
model = AutoModelForSequenceClassification.from_pretrained(&...
0
votes
0
answers
31
views
Embedding dimension and tokenizer max length mismatch when using a pretrained GPT model: RuntimeError, target size mismatch
I want to evaluate a pretrained GPT model.
The GPT model's embedding layer is (tokens_embed): Embedding(40478, 768).
If I set the tokenizer's max_length to 512,
RuntimeError: Expected target size [2, 40478], got ...
0
votes
0
answers
21
views
What are the key quality metrics for large language model releases?
I am a first-year PhD student working on improving the release practices of machine learning models, especially pre-trained large language models. I want to understand the above concept for a ...
0
votes
0
answers
25
views
Developing a pretrained model in TensorFlow for image classification
I have an issue with how to modify a pretrained model to classify 3 classes instead of 1000. These are the two methods I have come up with so far; I'm not sure which one is best.
...
0
votes
0
answers
70
views
Input image size (256*256) doesn't match model (224*224)
I fine-tuned the ViT model from Hugging Face and wanted to run some predictions:
# Get test image from the web
test_image_url = 'something.jpg'
response = requests.get(test_image_url)
test_image = Image....
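For a size mismatch like the one in the title (a 256x256 image against a 224x224 model input), one option is to take a centered 224x224 crop; resizing the whole image is the other common choice. The sketch below only computes the crop box, in the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts. The helper name `center_crop_box` is hypothetical, and the excerpt does not say which approach suits the asker's model.

```python
def center_crop_box(width, height, target=224):
    """Return (left, upper, right, lower) for a centered target x target crop."""
    if width < target or height < target:
        raise ValueError("image is smaller than the target size")
    left = (width - target) // 2
    upper = (height - target) // 2
    return (left, upper, left + target, upper + target)


# Crop box for a 256x256 image and a 224x224 model input.
box = center_crop_box(256, 256)
```

Cropping discards the border pixels, while resizing keeps the full field of view at a lower resolution, so the better choice depends on how the model was fine-tuned.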
0
votes
1
answer
144
views
RoBERTa for sentence similarity
I have a pre-trained RoBERTa model and a dataset of sentence pairs, each with a label indicating whether the two sentences are similar. I want to use the RoBERTa model for this task.
I ...
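A common way to score a sentence pair, though not necessarily what this question ends up using, is the cosine similarity between the two sentence embeddings. The sketch below assumes the embeddings have already been produced by some encoder; the vectors, the 0.8 threshold, and the function name `cosine_similarity` are made-up illustrations, not values from the question.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional stand-ins for two sentence embeddings.
emb_a = [0.2, 0.7, 0.1]
emb_b = [0.25, 0.6, 0.15]
score = cosine_similarity(emb_a, emb_b)

# Hypothetical decision threshold; in practice it would be tuned on labeled pairs.
is_similar = score >= 0.8
```

With a labeled dataset of pairs, the alternative is to fine-tune the model as a pair classifier instead of thresholding a similarity score.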
0
votes
2
answers
296
views
Result of YOLO model is much worse than the pretrained model it was trained on
I trained a model on top of the pre-trained yolov8n-seg.pt model in YOLO, but the result is worse than the pre-trained model's on the same image. I annotated around 150 images for person detection, using the ...
0
votes
0
answers
40
views
Failed to execute (TrainDeepLearningModel) ArcGIS Pro
I was using the pre-trained human settlement (Landsat-8) model to collect residential areas in ArcGIS Pro. Because the result was not good, I tried to follow this article (https://doc.arcgis.com/en/...