
Questions tagged [quantization]

Use this tag for questions related to quantization of any kind, such as vector quantization.

0 votes
0 answers
4 views

Why does Elecard add +12 to the QP (Quantization Parameter) when analyzing VTM-encoded bitstreams?

I have encoded a video using the VTM encoder. However, when I use Elecard StreamEye 2023 to analyze the bitstream, I notice that it adds +12 to the QP (Quantization Parameter). I'm trying to ...
asked by Bintou Dieng
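
A possible explanation for the question above, hedged: HEVC and VVC apply a bit-depth QP offset, QpBdOffset = 6 × (internal bit depth − 8), and VTM is commonly configured with a 10-bit internal bit depth, which would account for an analyzer reporting QP + 12. The 10-bit internal bit depth is an assumption, not something stated in the question.

    # Sketch of the bit-depth QP offset used in HEVC/VVC.
    # Assumes the stream was encoded with a 10-bit internal bit depth.
    def qp_bd_offset(internal_bit_depth: int) -> int:
        return 6 * (internal_bit_depth - 8)

    print(qp_bd_offset(10))  # 12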
1 vote
0 answers
34 views

ValueError: ('Expected `model` argument to be a `Model` instance, got ', <keras.engine.sequential.Sequential object at 0x7f234263dfd0>)

I want to do Quantization Aware Training. Here's my model architecture: Model: "sequential_4" _________________________________________________________________ Layer (type) ...
asked by Vina • 11
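
For the QAT question above, a minimal sketch assuming the model is built with tf.keras; a common cause of this ValueError is mixing the standalone keras package with tf.keras, which tfmot does not accept. The toy layers below are placeholders for the real architecture.

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Placeholder Sequential model built with tf.keras, not standalone keras.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Wrap the model for Quantization Aware Training.
    q_aware_model = tfmot.quantization.keras.quantize_model(model)
    q_aware_model.compile(optimizer="adam",
                          loss="sparse_categorical_crossentropy",
                          metrics=["accuracy"])
    q_aware_model.summary()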
0 votes
1 answer
18 views

Unable to build interpreter for TFLITE ViT-based image classifiers on Dart / Flutter: Didn't find op for builtin opcode 'CONV_2D' version '6'

We are trying to deploy vision transformer models (EfficientViT_B0, MobileViT_V2_175, and RepViT_M11) on our Flutter application using the tflite_flutter_plus and tflite_flutter_plus_helper ...
asked by D.Varam
1 vote
0 answers
17 views

Converting a quantized model to ONNX

I am new to this and want to try converting models to ONNX format, and I have the following issue. I have a model that has been quantized to 4-bit, and then I converted this model to ONNX. My quantized model ...
asked by Toàn Nguyễn Phúc
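
On the ONNX conversion question above: 4-bit weight formats such as the bitsandbytes one generally cannot be carried into an ONNX graph directly, so one hedged workaround is to export the full-precision model first and quantize the resulting ONNX file with onnxruntime. The file names below are placeholders.

    from onnxruntime.quantization import quantize_dynamic, QuantType

    # Quantize the weights of an already-exported FP32 ONNX model to int8.
    quantize_dynamic(
        model_input="model_fp32.onnx",   # placeholder: full-precision export
        model_output="model_int8.onnx",  # placeholder: quantized output
        weight_type=QuantType.QInt8,
    )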
0 votes
0 answers
31 views

Compress a YOLOv8 object detection model (.pt file)

I've tried compressing my .pt file using pruning, quantization, and various other methods, but these attempts have doubled the file size: a 20 MB file becomes 40 MB. If anyone has any ideas on how to ...
asked by adarsh khopkar
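
One plausible reason for the size doubling in the question above, sketched under the assumption that torch.nn.utils.prune was used: pruning reparameterizes each tensor into weight_orig plus a weight_mask, and both get saved unless prune.remove() is called. Note that unstructured pruning only zeroes weights, so the dense tensor stays the same size on disk. The toy layers below stand in for the detector.

    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # Stand-in for the real detection backbone.
    model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))

    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.l1_unstructured(module, name="weight", amount=0.3)
            prune.remove(module, "weight")  # fold the mask back into a single tensor

    torch.save(model.state_dict(), "pruned.pt")  # no weight_orig/weight_mask pairs kept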
0 votes
0 answers
26 views

What is the difference, if any, between model.half() and model.to(dtype=torch.float16) in huggingface-transformers?

Example: # pip install transformers from transformers import AutoModelForTokenClassification, AutoTokenizer # Load model model_path = 'huawei-noah/TinyBERT_General_4L_312D' model = ...
asked by Franck Dernoncourt
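
A small check for the question above: the PyTorch docs describe Module.half() as casting floating-point parameters and buffers to float16, so for the weights it should behave the same as Module.to(dtype=torch.float16). The sketch below only compares parameter dtypes; the model id is the one from the question.

    import torch
    from transformers import AutoModelForTokenClassification

    model_path = "huawei-noah/TinyBERT_General_4L_312D"
    m1 = AutoModelForTokenClassification.from_pretrained(model_path).half()
    m2 = AutoModelForTokenClassification.from_pretrained(model_path).to(dtype=torch.float16)

    # Both paths should leave every floating-point parameter in float16.
    assert all(p.dtype == torch.float16 for p in m1.parameters())
    assert all(p.dtype == torch.float16 for p in m2.parameters())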
1 vote
1 answer
76 views

ONNX-Python: Can someone explain the CalibrationDataReader requested by the quantize_static function?

I am using the ONNX Python library. I am trying to quantize AI models statically using the quantize_static() function imported from onnxruntime.quantization. This function takes a ...
asked by Zylon • 11
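
A hedged sketch of the calibration data reader asked about above: quantize_static repeatedly calls get_next() and runs each returned {input name: array} dict through the model to collect activation ranges, and the reader signals the end of the data by returning None. The input name, shape, and random data here are placeholders; real calibration should use representative samples.

    import numpy as np
    from onnxruntime.quantization import CalibrationDataReader, quantize_static

    class RandomCalibrationReader(CalibrationDataReader):
        def __init__(self, input_name="input", n_samples=16):
            # Placeholder data; replace with real preprocessed samples.
            self.samples = iter(
                [{input_name: np.random.rand(1, 3, 224, 224).astype(np.float32)}
                 for _ in range(n_samples)]
            )

        def get_next(self):
            return next(self.samples, None)  # None tells quantize_static to stop

    quantize_static("model_fp32.onnx", "model_int8.onnx",
                    calibration_data_reader=RandomCalibrationReader())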
1 vote
0 answers
106 views

Cannot Export HuggingFace Model to ONNX with Optimum-CLI

Summary I am trying to export the CIDAS/clipseg-rd16 model to ONNX using optimum-cli as given in the HuggingFace documentation. However, I get an error saying ValueError: Unrecognized configuration ...
asked by Sattwik Kumar Sahu
3 votes
2 answers
164 views

Speeding up load time of LLMs

I am currently only able to play around with a V100 on GCP. I understand that I can load an LLM in 4-bit quantization as shown below. However (I assume due to the quantization), it is taking up to 10 ...
asked by sachinruk • 9,739
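
A minimal 4-bit loading sketch for the question above, assuming bitsandbytes via transformers; much of the load time typically goes into reading the fp16 shards and quantizing them on the fly. The model id and the NF4 settings are assumptions, not taken from the question.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # fp16 compute for a V100-class GPU
    )

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1",   # placeholder model id
        quantization_config=bnb_config,
        device_map="auto",
    )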
0 votes
0 answers
233 views

How to resolve an ImportError when using quantization with bitsandbytes

I am trying to make a Gradio chatbot in Hugging Face Spaces using the Mistral-7B-v0.1 model. As this is a large model, I have to quantize it, or else the free 50 GB of storage fills up. I am using bitsandbytes to ...
asked by Anish • 1
0 votes
0 answers
60 views

How to Quantize the ViT Model in timm to FP16 Precision

I am a hardware developer, and I want to map the ViT model from timm to some custom accelerators which only support FP16 precision. But I have learned that the model cannot be quantized to FP16 by torch....
asked by xlgforever
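
For the timm FP16 question above, a hedged sketch: a ViT from timm is a regular nn.Module, so casting its state_dict to float16 is often enough when the target accelerator only needs FP16 weights. The model name is an assumption; set pretrained=True to fetch the actual weights.

    import torch
    import timm

    model = timm.create_model("vit_base_patch16_224", pretrained=False)

    # Cast floating-point tensors to FP16; leave integer buffers untouched.
    fp16_state = {k: (v.half() if v.is_floating_point() else v)
                  for k, v in model.state_dict().items()}
    torch.save(fp16_state, "vit_fp16.pt")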
0 votes
0 answers
15 views

Connect discontinuous amplitudes

Has anyone had this problem where they sampled and discretized a sine wave and the returned wave had huge gaps? I have tried interpolating, but this removes the step-like wave which I intend to ...
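
The question above is truncated, but if the goal is to keep the staircase shape of a quantized sine wave when plotting, a zero-order-hold (step) plot avoids both the gaps and the smoothing that interpolation introduces. All parameters below are arbitrary choices.

    import numpy as np
    import matplotlib.pyplot as plt

    fs, f, n_bits = 50, 1.0, 3                 # sample rate, frequency, quantizer bits
    t = np.arange(0, 2, 1 / fs)
    x = np.sin(2 * np.pi * f * t)

    levels = 2 ** n_bits
    xq = np.round((x + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1  # uniform quantizer

    plt.step(t, xq, where="post", label="quantized (zero-order hold)")
    plt.plot(t, x, alpha=0.5, label="original")
    plt.legend()
    plt.show()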
0 votes
0 answers
18 views

Quantization effects on Gradient Descent algorithm

I'm writing a research paper about the effects of quantization on the gradient descent algorithm when we reduce the precision from floating point to fixed-point arithmetic, for instance 16 bits. If anyone could ...
asked by user3662181
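
A toy experiment for the question above, sketched under the assumption of a Q-format fixed-point representation: round every intermediate value to a grid of 2^-frac_bits and compare the converged result against plain floating point. The objective and the 8-fractional-bit split are arbitrary choices.

    import numpy as np

    def to_fixed(x, frac_bits=8):
        # Simulate fixed-point storage by snapping to multiples of 2**-frac_bits.
        scale = 2.0 ** frac_bits
        return np.round(x * scale) / scale

    def gradient_descent(quantize, lr=0.1, steps=50):
        w = 5.0                         # minimize f(w) = (w - 2)**2
        for _ in range(steps):
            grad = 2.0 * (w - 2.0)
            if quantize:
                grad = to_fixed(grad)
                w = to_fixed(w - lr * grad)
            else:
                w = w - lr * grad
        return w

    print(gradient_descent(quantize=False), gradient_descent(quantize=True))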
0 votes
0 answers
34 views

Fixed point vs Floating point

I have a task to analyze the effects of quantization in the Madgwick filter and the Mahony filter, which are two orientation estimation algorithms. Madgwick uses a gradient descent optimisation technique which ...
asked by user3662181
0 votes
0 answers
51 views

Fixed point vs Floating point numbers

I have a project that is basically to analyze the effects of quantization on orientation estimation algorithms. I have sensor data from a gyroscope that looks like this when using the float datatype: gx=-0....
asked by user3662181
