MLOps: model lifecycle for infra engineers

Sixth post in the series. In the previous one, we automated GPU cluster provisioning. Now let’s talk about what happens after the hardware is ready: how a model goes from “works on my notebook” to “running in production with an SLA.”

The model with no birth certificate

A data scientist drops a message in the team channel with a link to a shared drive: “Here’s the model. It’s a 15 GB PyTorch checkpoint. We need it in production by Friday.” ...

May 30, 2026 · 6 min · Ricardo Martins