Kernel-level QoS enforcement

Every pod on a Temper node belongs to one of five QoS tiers, derived from standard Kubernetes PriorityClasses and resource requests. No CRDs, custom labels, or application changes are required. Each tier becomes a kernel scheduler layer, so the tier is enforced by the CPU scheduler itself rather than treated as a hint.

Naming note. Binaries, the helm chart, and annotation keys currently ship under the project’s former name (infera); the commands below work as shown today. A rename migration is planned.

The five tiers

Tier	PriorityClass	Priority value	Layer kind	Protected	Preempt	Exclusive cores	Timeslice
Critical	`infera-critical`	≥ 1,000,000	Confined	yes	yes	yes	5 ms
High	`infera-high`	≥ 100,000	Grouped	yes	yes	no	10 ms
Normal	`infera-normal`	≥ 0	Grouped	no	no	no	20 ms
Low	`infera-low`	≥ −100	Open	no	no	no	30 ms
Background	`infera-background`	< −100	Open	no	no	no	5 ms

Confined layers get their own CPU allocation, fenced. Grouped layers get a preferred CPU set they share. Open layers run on whatever is idle.
Protected + preempt means the tier’s threads kick lower tiers off a CPU the moment they wake, instead of waiting in a runqueue.
Background’s short 5 ms slice is a deliberate latency bound: outside the protected fence, the longest a higher tier can wait behind a background thread is one short timeslice.

The thresholds are configurable in the agent’s [priority_mapping] config section; the values above are the shipped defaults, matching the PriorityClasses the helm chart installs.
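The threshold lookup in the table above can be sketched in a few lines of Python. This is an illustration of the shipped defaults only, not the agent's code: the names `Tier`, `DEFAULT_THRESHOLDS`, and `tier_for_priority` are hypothetical, and the sketch covers pods that carry a PriorityClass.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str
    layer_kind: str
    timeslice_ms: int


# Shipped defaults from the tier table, highest threshold first.
# Each entry is (minimum priority value, tier). Names are illustrative.
DEFAULT_THRESHOLDS = [
    (1_000_000, Tier("Critical", "Confined", 5)),
    (100_000, Tier("High", "Grouped", 10)),
    (0, Tier("Normal", "Grouped", 20)),
    (-100, Tier("Low", "Open", 30)),
]
# Anything below the lowest threshold falls through to Background.
BACKGROUND = Tier("Background", "Open", 5)


def tier_for_priority(priority: int, thresholds=DEFAULT_THRESHOLDS) -> Tier:
    """Return the first tier whose minimum priority the pod meets."""
    for minimum, tier in thresholds:
        if priority >= minimum:
            return tier
    return BACKGROUND


print(tier_for_priority(250_000).name)  # High
print(tier_for_priority(-101).name)     # Background
```

Ordering the thresholds from highest to lowest keeps the lookup a single pass, and changing a threshold only means editing one entry.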

Assigning a tier

Assignment is one field in the pod spec:

spec:
  priorityClassName: infera-critical

Because the input is the resolved pod.spec.priority value, PriorityClasses you already use for eviction ordering participate automatically: the same priority signal now also drives CPU scheduling.

Defaults for unlabeled pods

Pods with no PriorityClass (priority 0) are tiered by their Kubernetes QoS class:

Kubernetes QoS class	Default tier
Guaranteed / Burstable	Normal
BestEffort	Background

So an untouched cluster behaves sensibly on day one: nothing is fenced, best-effort work yields, and enforcement sharpens only as you hand out PriorityClasses.
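As a rough sketch of the default rule, the following Python reads a pod object as returned by `kubectl get pod -o json` and applies the table above. It relies on the standard `spec.priorityClassName` and `status.qosClass` fields. Treating an absent `priorityClassName` as "no PriorityClass" is an assumption made for this example, and the function name is hypothetical.

```python
from typing import Optional

# Default tier per Kubernetes QoS class, from the table above.
DEFAULT_TIER_BY_QOS = {
    "Guaranteed": "Normal",
    "Burstable": "Normal",
    "BestEffort": "Background",
}


def default_tier_for_pod(pod: dict) -> Optional[str]:
    """Return the default tier for a pod with no PriorityClass.

    Returns None when the pod names a PriorityClass (it is then tiered
    by the priority thresholds) or when its QoS class is not reported.
    """
    if pod.get("spec", {}).get("priorityClassName"):
        return None
    qos_class = pod.get("status", {}).get("qosClass")
    return DEFAULT_TIER_BY_QOS.get(qos_class)


best_effort = {"spec": {}, "status": {"qosClass": "BestEffort"}}
print(default_tier_for_pod(best_effort))  # Background
```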

Resource-aware layer parameters

Tier membership decides the kind of layer; your pods’ actual resource specs decide its size. On every assignment change the agent recomputes, per tier:

Weight — from the aggregate CPU requests of the tier’s pods, so a tier holding 6 requested cores outweighs one holding 2. Never hardcoded.
CPU range — for Confined and Grouped layers, derived from aggregate requests and limits: how many CPUs the layer may occupy.
Utilization band — from the mix of Kubernetes QoS classes in the tier: Guaranteed pods produce a tight band, Burstable a medium one, BestEffort none.
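To make the weight input concrete, here is a minimal Python sketch that sums CPU requests per tier in millicores. It shows only the aggregation step; how the agent turns the aggregate into a layer weight is not specified here, and the helper names are hypothetical. The parser handles the two common CPU quantity forms, plain cores ("1.5") and millicores ("500m").

```python
from collections import defaultdict
from decimal import Decimal


def cpu_to_millis(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "500m" or "1.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Decimal avoids float rounding on values like "0.1".
    return int(Decimal(quantity) * 1000)


def aggregate_requests(pods):
    """Sum CPU requests per tier.

    `pods` is an iterable of (tier_name, [cpu_request, ...]) pairs,
    one pair per pod, one request string per container.
    """
    totals = defaultdict(int)
    for tier, requests in pods:
        totals[tier] += sum(cpu_to_millis(r) for r in requests)
    return dict(totals)


pods = [
    ("High", ["2", "4000m"]),   # 6 requested cores
    ("Normal", ["500m", "1.5"]),  # 2 requested cores
]
print(aggregate_requests(pods))  # {'High': 6000, 'Normal': 2000}
```

With these inputs the High tier holds three times the requested CPU of Normal, which is the kind of ratio the weight is described as reflecting.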

Two system layers are always appended: a small always-on layer that keeps the scheduler and agent threads serviced even under total saturation, and a catch-all Open layer for everything else (kubelet, node daemons). Empty layers are omitted. If the Critical tier requests more whole cores than the node can actually fence, Temper demotes that layer to Grouped instead of silently confining it to fewer cores than it asked for: graceful degradation over false confinement. For sizing guidance, see the operations documentation.
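The demotion rule can be pictured with a small Python sketch. It assumes the Critical tier's need is the aggregate request rounded up to whole cores, and that the number of cores the node can fence is known as an input; both the rounding and the `fenceable_cores` parameter are assumptions made for this example, not the agent's documented behavior.

```python
def critical_layer_kind(requested_millis: int, fenceable_cores: int) -> str:
    """Pick the layer kind for the Critical tier.

    requested_millis: aggregate CPU requests of Critical pods, in millicores.
    fenceable_cores:  whole cores the node can dedicate (hypothetical input).
    """
    # Round up to whole cores using integer arithmetic.
    cores_needed = (requested_millis + 999) // 1000
    if cores_needed <= fenceable_cores:
        return "Confined"
    # Not enough cores to fence: share a preferred CPU set instead.
    return "Grouped"


print(critical_layer_kind(4000, 6))  # Confined
print(critical_layer_kind(6500, 6))  # Grouped
```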

Inspecting the result

The generated scheduler configuration is never a black box:

kubectl get node <node> -o jsonpath='{.metadata.annotations.infera\.io/qos-distribution}'
# per-tier pod counts and CPU millis, as published for the scheduler plugin

The full generated layer config is available through the agent’s gRPC API (GetLayeredConfig), the CLI, or the dashboard, which also shows a diff between config generations.

Division of labor with profiles

Tiers arbitrate between workloads. To schedule the threads inside one workload differently — hot loops vs. I/O chains vs. housekeeping — use workload profiles on top of the tier system.