Scaling LLM/GenAI deployment with NVIDIA Triton on Amazon EKS
Webinar Series Australia and New Zealand
Triton is open-source inference serving software that simplifies model deployment and delivers high inference performance. This session explores the synergy of NVIDIA Triton and Amazon EKS for efficient, large-scale machine learning model deployment. We discuss how Triton Inference Server standardizes machine learning inference across deep learning frameworks such as PyTorch, ONNX, and TensorRT to streamline GenAI/LLM model deployment.
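To give a flavor of how Triton standardizes serving across frameworks, each model in a Triton model repository carries a small configuration file; the sketch below shows a minimal `config.pbtxt` (the model name, backend, and tensor shapes are illustrative assumptions, not taken from the session):

```
# Illustrative config.pbtxt for one entry in a Triton model repository.
# Model name, backend, and tensor shapes here are hypothetical examples.
name: "resnet50_onnx"
platform: "onnxruntime_onnx"   # e.g. "pytorch_libtorch" or "tensorrt_plan" for other backends
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Because every backend is described through this same configuration schema, clients call one uniform HTTP/gRPC API regardless of the framework that trained the model.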
Level: L400
Speaker: Keita Watanabe, Senior Solutions Architect, Frameworks, AWS