Issues: NVIDIA/TensorRT-Model-Optimizer

Explicit quantization in PyTorch before ONNX leads to slower TRT engine than ONNX PTQ
bug · #207 · opened Jun 8, 2025 by liukang1811
Apart from FP8, are there any plans to directly support vLLM inference with AWQ in the future?
feature request · #204 · opened May 28, 2025 by dingjingzhen
Llama 4 Scout FP8 does not work on SGLang
bug, roadmap · #203 · opened May 27, 2025 by zhyncs
get_modelike_from_algo_cfg doesn't accept QuantizeAlgorithmConfig, but its typing says it should
bug · #201 · opened May 21, 2025 by ORippler
How can I use the HistogramCalibrator?
feature request · #200 · opened May 20, 2025 by AnnaTrainingG
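For context on this question, here is a minimal, self-contained sketch of how a histogram calibrator is typically driven, assuming the calibrator API that ModelOpt inherits from pytorch-quantization (the module path and signatures may differ across versions):

```python
import torch
# Assumption: ModelOpt exposes the pytorch-quantization-style calibrators here.
from modelopt.torch.quantization import calib

# Build a histogram-based calibrator for 8-bit, per-tensor (axis=None) ranges.
calibrator = calib.HistogramCalibrator(num_bits=8, axis=None, unsigned=False)

# Feed representative activations; collect() accumulates them into a histogram.
for _ in range(8):
    calibrator.collect(torch.randn(32, 256))

# Reduce the histogram to a clip range (amax) with a chosen method,
# e.g. "entropy", "mse", or "percentile".
amax = calibrator.compute_amax("percentile", percentile=99.9)
print(amax)
```

In practice a calibrator usually runs inside a TensorQuantizer during calibration rather than standalone, but the collect/compute_amax flow is the same.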
[BUG] FP8 real_quantization doesn't work with block_sizes
bug · #193 · opened May 9, 2025 by ishan-modi
Is FP8 quantization with block-wise/per-token/per-channel supported?
feature request · #192 · opened May 9, 2025 by YSF-A
What is the difference between the config in mtq.quantize() and the config in TensorQuantizer?
#190 · opened May 6, 2025 by YSF-A
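As background for this question: the config passed to mtq.quantize() is a model-wide dict of wildcard patterns mapped to quantizer attributes, and those attributes are what end up configuring the individual TensorQuantizer modules inserted into the model. A minimal sketch following the documented ModelOpt PTQ flow (the toy model and random calibration batches are placeholders):

```python
import torch
import torch.nn as nn
import modelopt.torch.quantization as mtq
from modelopt.torch.quantization.nn import TensorQuantizer

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Model-wide config: wildcard patterns -> quantizer attributes
# (num_bits, axis, enable, ...). INT8_DEFAULT_CFG is a built-in preset.
config = mtq.INT8_DEFAULT_CFG

def forward_loop(m):
    # A few random batches stand in for real calibration data.
    for _ in range(8):
        m(torch.randn(4, 16))

model = mtq.quantize(model, config, forward_loop)

# After quantize(), quantized layers hold TensorQuantizer submodules whose
# attributes were set from whichever config entry matched them.
for name, module in model.named_modules():
    if isinstance(module, TensorQuantizer):
        print(name, module.num_bits)
```

So the mtq.quantize() config describes quantization for the whole model, while a TensorQuantizer's attributes are the per-module realization of the matching config entry.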
Support for W4A16 and W4A8 Quantization in TensorRT Model Optimizer
feature request · #189 · opened Apr 30, 2025 by david-PHR
Cannot serve ModelOpt-quantized NVFP4 model on TensorRT-LLM
bug · #187 · opened Apr 27, 2025 by enisaras
[BUG] Restoring ModelOpt-quantized models via 'AutoModelForCausalLM.from_pretrained' doesn't work for Mixtral-8x7B
bug · #186 · opened Apr 27, 2025 by wanzhenchn
Support more quantization methods for "onnx_ptq"?
feature request · #184 · opened Apr 24, 2025 by s101010tw
[BUG] Issue processing NF4 double quantization
bug · #183 · opened Apr 22, 2025 by ishan-modi
Qwen2_MoE AWQ (W4A16/W4A8) quantization fails with NaN AssertionError
#182 · opened Apr 22, 2025 by wanzhenchn
Torch Quantization: Allow restoring a quantized model and re-running calibration on new data (PTQ)
feature request · #179 · opened Apr 16, 2025 by david-PHR
Explicit INT8 Quantization Fails to Fuse Concat-Conv Block Compared to Implicit Mode
#174 · opened Apr 9, 2025 by patrickgrommelt
Getting "Real quantization not supported for this format" error when using mtq.compress(model)
#171 · opened Apr 5, 2025 by RivenSama
PyTorch Quantization Failed to Quantize Scaled Dot Product
#149 · opened Mar 7, 2025 by YixuanSeanZhou