OpenTelemetry

Seventh post in the series. In the previous one, we put models into production with CI/CD pipelines. Now: how do you know they’re actually healthy?

tl;dr

Infra health is not model health. Track GPU, token, application, and answer-quality signals together or you will miss regressions while every dashboard stays green.

The silent failure

Your Azure OpenAI endpoint returns 200 OK on every request. Latency is normal, P95 under 800ms. CPU and memory within thresholds. Kubernetes shows healthy pods, no restarts. By every infra metric you trust, the system is perfect. ...
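
To make the tl;dr concrete, here is a minimal sketch of evaluating the four signal families side by side. It uses only the Python standard library. The field names, the thresholds other than the 800 ms P95 bound, and the sample numbers are all hypothetical, invented for illustration rather than taken from a real deployment.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One observation window. Field names are hypothetical."""
    http_error_rate: float     # fraction of non-2xx responses
    p95_latency_ms: float      # application latency
    gpu_utilization: float     # 0.0 to 1.0
    tokens_per_request: float  # mean completion tokens per request
    answer_quality: float      # 0.0 to 1.0, e.g. from an offline evaluator


def health_report(current: Snapshot, baseline: Snapshot) -> dict:
    """Return one pass/fail flag per signal family.

    Only the 800 ms P95 bound comes from the scenario above;
    every other threshold here is an assumption.
    """
    token_drift = abs(current.tokens_per_request - baseline.tokens_per_request)
    return {
        "application": current.http_error_rate == 0.0
        and current.p95_latency_ms < 800,
        "gpu": current.gpu_utilization < 0.95,
        "tokens": token_drift <= 0.25 * baseline.tokens_per_request,
        "quality": baseline.answer_quality - current.answer_quality <= 0.05,
    }


# Made-up sample values: infra looks fine, answer quality has dropped.
baseline = Snapshot(0.0, 640.0, 0.71, 212.0, 0.91)
current = Snapshot(0.0, 655.0, 0.70, 208.0, 0.78)

report = health_report(current, baseline)
for family, ok in report.items():
    print(f"{family}: {'ok' if ok else 'REGRESSION'}")
```

With these sample values, the application, GPU, and token checks pass while the quality check fails, which is the kind of regression an infra-only dashboard would not surface.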