Wednesday February 5, 2025 2:20pm - 2:45pm GMT · Grand Hall 2
Many organizations are evaluating running open source LLMs on their own infrastructure, and Kubernetes is a natural platform choice. However, running open source LLMs in production on Kubernetes is, honestly, a bit of an undocumented mess.

This technical presentation shares both speakers' experience deploying production-grade LLM infrastructure on Kubernetes. Through practical demonstrations, we'll explore the complete deployment lifecycle, from GPU setup to optimization techniques such as Flash Attention, quantization trade-offs, and GPU sharing.
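
To make the quantization and context-length trade-offs concrete, here is a minimal sketch assuming vLLM's offline Python API; the model name and parameter values are illustrative choices, not the configuration demonstrated in the session:

```python
# Minimal vLLM serving sketch: context length and quantization are the two
# knobs that most affect GPU memory footprint. Model name and values below
# are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # assumed AWQ-quantized checkpoint
    quantization="awq",              # trade a little accuracy for a much smaller footprint
    max_model_len=8192,              # cap context length to bound KV-cache memory
    gpu_memory_utilization=0.90,     # leave headroom when the GPU is shared
)

outputs = llm.generate(
    ["Summarise the trade-offs of quantizing a 7B model."],
    SamplingParams(temperature=0.7, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```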

You'll learn:

* Architectural patterns for efficient LLM deployment using Ollama and vLLM
* Solutions for model weight management and context length optimization
* Techniques for GPU sharing and improving resource utilization
* Production approaches to fine-tuning with Axolotl and serving multiple models with LoRAX

You'll leave with a complete blueprint for building reliable, scalable LLM infrastructure on Kubernetes.
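
As a taste of that blueprint, here is a minimal sketch assuming the vLLM OpenAI-compatible server image and the official Kubernetes Python client; the namespace, model, and flags are illustrative rather than the exact setup shown in the talk:

```python
# Sketch of the deployment pattern on Kubernetes: one vLLM OpenAI-compatible
# server pod requesting a single GPU, created with the Kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

container = client.V1Container(
    name="vllm",
    image="vllm/vllm-openai:latest",
    args=["--model", "TheBloke/Mistral-7B-Instruct-v0.2-AWQ", "--max-model-len", "8192"],
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"},  # schedule onto a node with a free GPU
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="vllm-server"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "vllm"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "vllm"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="llm", body=deployment)
```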
Speakers

Luke Marsden

Founder, MLOps Consulting
Technical leader and startup founder who participated in the early development of Docker and Kubernetes. Former lead of SIG-cluster-lifecycle.

Priya Samuel

Full-Stack Engineer and Software Architect, Elsevier
Priya Samuel is a seasoned technology leader with a passion for transforming complex challenges into actionable solutions. With extensive expertise in DevOps, cloud-native technologies, and Identity and Access Management (IAM), Priya has helped organizations scale their data and...
