Issues: triton-inference-server/server
#7450: Triton considers max_batch_size as a number of channels for a given input image (opened Jul 17, 2024 by 12sf12)
#7442: [New] Discord channel for triton-inference-server, tensorrt (opened Jul 13, 2024 by geraldstanje)
#7436: Fluctuating results when perf_analyzer is run for a stress test (opened Jul 11, 2024 by LinGeLin)
#7431: Issue while setting up the ONNX Runtime backend natively on Windows 10 (opened Jul 9, 2024 by saugatapaul1010)
#7429: Understanding and customizing the vLLM backend [question] (opened Jul 9, 2024 by CoolFish88)
#7428: Is there a way to make the output buffer use the existing space? (opened Jul 9, 2024 by wanghuihhh)
#7426: Add environment variable that allows you to append a prefix to all HTTP requests (opened Jul 8, 2024 by HeeebsInc)
#7422: Get the underlying request_id associated with the corresponding InferenceResponse (opened Jul 8, 2024 by mhendrey)
#7419: Benchmarking VQA Model with Large Base64-Encoded Input Using perf_analyzer (opened Jul 5, 2024 by pigeonsoup)