Deep observability & thread-aware rightsizing
The node agent watches scheduling at the same depth it enforces it: per-pod, per-thread placement and runqueue telemetry, always on, at a measured 0.13% of one core.
infera); the commands below are what works today. A rename migration is planned.
/metrics — Prometheus
Every agent serves a Prometheus endpoint on the node (port 9100 by default) with QoS and scheduler metrics: per-tier assignment counts, config generation, safe-mode state, PSI-derived contention signals, and placement-linter verdicts. A ServiceMonitor is helm-gated for Prometheus Operator setups:
helm upgrade infera deploy/helm/infera --reuse-values \
--set serviceMonitor.enabled=true
Generated Grafana dashboards ship with the chart, so the fleet view lands in your existing metrics stack without hand-building panels.
/observe — the thread-level snapshot
Alongside metrics, the agent publishes a JSON observation snapshot at
GET /observe on the same port: the machine shape, per-layer scheduler statistics,
the top busy threads with their placement, PSI readings, linter verdicts, and the observation
layer’s own overhead. It is the raw material for profile training and the quickest way to
answer “what is the scheduler actually doing on this node?”
kubectl -n infera port-forward ds/infera-agent 9100:9100
curl -s localhost:9100/observe | jq '.layers[] | {name, cpus, util}'
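The same snapshot can be read from a script. The sketch below is a minimal example, assuming the port-forward above is running and that each entry in `layers` carries the `name`, `cpus`, and `util` fields used in the jq filter. The scale of `util` (a 0 to 1 fraction here), the sample layer names, and the helper names `fetch_snapshot` and `busy_layers` are illustrative assumptions, not part of the agent's documented schema.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(url="http://localhost:9100/observe", timeout=5.0):
    """Fetch the /observe JSON snapshot (assumes the port-forward is active)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def busy_layers(snapshot, threshold):
    """Return (name, util) for layers at or above threshold, busiest first."""
    rows = []
    for layer in snapshot.get("layers", []):
        util = layer.get("util")
        # Skip layers with a missing or non-numeric util rather than guessing.
        if isinstance(util, (int, float)) and util >= threshold:
            rows.append((layer.get("name"), util))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical snapshot fragment; the values and layer names are made up.
SAMPLE = {
    "layers": [
        {"name": "protected", "cpus": 8, "util": 0.42},
        {"name": "open", "cpus": 24, "util": 0.91},
    ]
}
```

Calling `busy_layers(fetch_snapshot(), 0.8)` against a live node would list the layers running hot; adjust the threshold if `util` turns out to be reported as a percentage.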
The placement linter
An always-on invariant checker audits actual thread placement against what the configuration
promises, and exports violations as metrics
(infera_lint_violation{invariant=...}) with rate-limited log warnings:
| Invariant | Catches |
|---|---|
| smt_collision | Protected threads sharing a physical core with noisy siblings |
| protected_fallback | Protected-tier threads running outside their fenced layer |
| open_reserve | Open-layer work violating the reserved headroom |
| layer_mismatch | Threads attributed to a different layer than their pod’s tier implies |
Zero violations is the steady state; a persistent violation is an alertable signal that configuration and reality have drifted.
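For a quick check outside an alerting pipeline, the violation series can be pulled straight out of the `/metrics` text. This is a rough sketch that assumes the standard Prometheus text exposition format and the `infera_lint_violation{invariant=...}` series named above. Whether the series is a gauge or a counter is not stated here, so the sketch only reports invariants with a non-zero value; the function name `lint_violations` and the sample values are hypothetical.

```python
import re

# Matches lines such as: infera_lint_violation{invariant="smt_collision"} 0
_SERIES = re.compile(r'^infera_lint_violation\{([^}]*)\}\s+(\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def lint_violations(metrics_text):
    """Sum infera_lint_violation by invariant and keep the non-zero ones."""
    totals = {}
    for line in metrics_text.splitlines():
        match = _SERIES.match(line.strip())
        if not match:
            continue  # comments, blank lines, and unrelated metrics
        labels = dict(_LABEL.findall(match.group(1)))
        invariant = labels.get("invariant", "unknown")
        totals[invariant] = totals.get(invariant, 0.0) + float(match.group(2))
    return {name: value for name, value in totals.items() if value > 0}


# Made-up scrape output for illustration only.
SAMPLE_METRICS = """\
# sample scrape, values invented
infera_lint_violation{invariant="smt_collision"} 0
infera_lint_violation{invariant="protected_fallback"} 2
infera_lint_violation{invariant="open_reserve"} 0
"""
```

An empty result corresponds to the steady state described above. If the series turns out to be a counter, alert on its rate of increase rather than on the raw value.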
Kernel trace capture
For the questions metrics cannot answer, the agent captures bounded perfetto kernel traces
on demand — scheduling events straight from the kernel, downloadable and openable in the
perfetto UI. Captures are duration- and size-capped, one at a time per node (concurrent
requests are refused, not queued), and triggered via the CLI, the agent RPC
(CaptureTrace), or the dashboard. Nothing extra is
installed on nodes — the tracer ships in the agent image.
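The "refused, not queued" behavior can be pictured with a non-blocking lock. The sketch below illustrates that semantics only; it is not the agent's implementation, and `TraceGate` and `CaptureBusy` are hypothetical names.

```python
import threading


class CaptureBusy(Exception):
    """Raised when a capture is requested while another one is running."""


class TraceGate:
    """Allows one capture at a time; a second request fails immediately."""

    def __init__(self):
        self._lock = threading.Lock()

    def run(self, capture):
        # acquire(blocking=False) returns at once instead of waiting,
        # which is what makes a concurrent request a refusal, not a queue entry.
        if not self._lock.acquire(blocking=False):
            raise CaptureBusy("a capture is already running on this node")
        try:
            return capture()
        finally:
            self._lock.release()
```

A caller that receives the refusal decides for itself whether to retry later; nothing accumulates on the node in the meantime.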
Thread-aware rightsizing
Container-average rightsizers recommend requests from a pod-level CPU mean — which is blind to structure. A pod averaging 1.2 cores might be four lazy threads (fine at 1.5 cores, shared) or one hot thread pinned at 100% plus overhead (needs an exclusive core; throttling it is a latency incident). Temper’s rightsizer reads the thread-level usage the observation layer already collects, so its recommendations distinguish those cases. The identified-savings number in the dashboard is computed from the same data (declared requests vs. 6-hour measured usage).
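The two cases above can be made concrete with a toy calculation. This is an illustrative sketch, not the rightsizer's algorithm: the 0.9 hot-thread threshold, the per-thread numbers, and the `summarize` helper are all assumptions chosen to reproduce the 1.2-core example.

```python
def summarize(thread_utils, hot_threshold=0.9):
    """Compare a pod-level total with a per-thread view.

    thread_utils holds each thread's CPU use as a fraction of one core.
    """
    hot = [u for u in thread_utils if u >= hot_threshold]
    return {
        "pod_mean_cores": round(sum(thread_utils), 3),
        "hot_threads": len(hot),
        # A thread near 100% of a core cannot be squeezed without delaying it.
        "needs_exclusive_core": bool(hot),
    }


lazy = [0.3, 0.3, 0.3, 0.3]  # four lazy threads, 1.2 cores in total
hot = [1.0, 0.1, 0.1]        # one thread pinned at 100% plus overhead
```

Both pods report the same `pod_mean_cores`, so a container-average rightsizer sees them as identical; only the per-thread view separates the pod that can share cores from the one that needs a core to itself.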
Overhead — measured, not promised
The entire always-on observation layer — placement sampling, schedstat deltas, scheduler stats, the linter — costs under 1% CPU, measured at 0.13% of one core in production configuration. Perfetto trace bursts are bounded and only run when you ask for them or during training cycles.