

In this episode, we talk with Abdel Sghiouar and Mofi Rahman, Developer Advocates at Google and (guest) hosts of the Kubernetes Podcast from Google. Together, we dive into one central question: can you truly run LLMs reliably and at scale on Kubernetes?
It quickly becomes clear that LLM workloads behave nothing like traditional web applications.
And then there’s the occasional request from customers who want deterministic LLM output, to which Mofi dryly responds:
“You don’t need a model — you need a database.”
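The joke lands because even greedy, temperature-zero decoding isn’t guaranteed to be bit-identical across hardware, batching, and numerics. Taken literally, Mofi’s advice looks something like the sketch below: if the answer must be identical every time, store it and look it up. This is purely illustrative, not something from the episode; `call_llm` is a hypothetical stand-in for whatever model client you actually use.

```python
import hashlib
import sqlite3

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for a real model call,
    # which is non-deterministic in general.
    return f"model answer for: {prompt}"

conn = sqlite3.connect("answers.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS answers (key TEXT PRIMARY KEY, response TEXT)"
)

def deterministic_answer(prompt: str) -> str:
    # Key the cache on a hash of the exact prompt text.
    key = hashlib.sha256(prompt.encode()).hexdigest()
    row = conn.execute(
        "SELECT response FROM answers WHERE key = ?", (key,)
    ).fetchone()
    if row:
        return row[0]  # same prompt -> same answer, every time
    response = call_llm(prompt)  # first occurrence only: ask the model
    conn.execute("INSERT INTO answers VALUES (?, ?)", (key, response))
    conn.commit()
    return response
```

In other words: the determinism comes from the database, not the model, which is exactly the point of the quip.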