Tuesday, February 24, 2026

Full Dynticks Is Not "No Ticks Ever"

On a low-latency Linux host, it is easy to assume that nohz_full (full dynticks / adaptive ticks) means "this CPU will not get ticks." In practice, tick behavior depends on both idle policy and runqueue state.

This post gives a brief summary of how this works on a Rocky Linux 9.x / 5.14.x kernel, with easy-to-reproduce examples.

TL;DR

  • nohz_full + CPU isolation does not mean zero ticks in all states.
  • With idle=poll, an idle isolated CPU can still take the periodic tick (~HZ, often 1000 Hz).
  • On CFS, if an isolated CPU has exactly one runnable task, the scheduler tick can be stopped.
  • Once that CPU has 2+ runnable CFS tasks, tick dependency is re-enabled and periodic tick resumes.


Test Setup

... isolcpus=nohz,domain,managed_irq,1-63 nohz_full=1-63 rcu_nocbs=1-63 idle=poll processor.max_cstate=0 ...

Key points:
  • CPUs 1-63 are isolated/full-dynticks candidates.
  • idle=poll is enabled.
  • HZ=1000.
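Before trusting these settings, it helps to confirm what the kernel actually accepted at boot. A minimal sketch using standard Linux interfaces (the `(none)`/`unavailable` fallback strings are my own, for hosts where a file is missing or empty):

```shell
# Effective kernel command line:
CMDLINE=$(cat /proc/cmdline)
printf 'cmdline: %s\n' "$CMDLINE"

# CPUs the kernel accepted as full-dynticks (empty if CONFIG_NO_HZ_FULL is off
# or nohz_full= was not given):
NOHZ_MASK=$(cat /sys/devices/system/cpu/nohz_full 2>/dev/null)
[ -n "$NOHZ_MASK" ] || NOHZ_MASK="(none)"
printf 'nohz_full mask: %s\n' "$NOHZ_MASK"

# cpuidle driver in use ("none" is what you would expect when idle=poll
# keeps cpuidle out of the picture):
IDLE_DRV=$(cat /sys/devices/system/cpu/cpuidle/current_driver 2>/dev/null)
[ -n "$IDLE_DRV" ] || IDLE_DRV="unavailable"
printf 'cpuidle driver: %s\n' "$IDLE_DRV"
```

On the test host above, the mask should read `1-63` and the command line should contain `idle=poll`.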


Observation 1: Idle Isolated CPUs Still Tick with idle=poll

When CPUs 2 and 5 were idle, this probe showed about 5000 events per CPU in 5 seconds (~1000 Hz):


bpftrace -e 'kprobe:update_process_times /cpu == 5 || cpu == 2/ { @ticks[cpu]++; }
             interval:s:5 { print(@ticks); clear(@ticks); }'

@ticks[5]: 4999
@ticks[2]: 4999
...

Why: idle=poll forces the polling idle path, which restarts/keeps the tick while idle instead of entering deeper idle states.

Observation 2: One Spinning Task on CPU 5 Stops Scheduler Tick

A single userspace busy loop pinned to CPU 5:

taskset -c 5 sh -c 'while true; do :; done'

CPU was 100% user, and /proc/<pid>/schedstat showed:
  • field 1 (CPU runtime) increasing,
  • field 2 (runqueue wait time) not increasing,
  • field 3 (timeslice count for this task) not increasing.
(Documentation/scheduler/sched-stats.rst defines field 3 as "# of timeslices run on this cpu". In practice this is a useful proxy for "how often the task gets sliced/re-scheduled".)
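The three fields can be watched directly. A minimal sketch that samples a task's schedstat twice and prints the deltas; it uses /proc/self as a self-contained stand-in (point PID at the spinner's PID in practice) and assumes a kernel built with CONFIG_SCHED_INFO:

```shell
# /proc/<PID>/schedstat is three counters: run_ns wait_ns nr_timeslices
PID=self
if [ -r "/proc/$PID/schedstat" ]; then
    read -r R1 W1 S1 < "/proc/$PID/schedstat"
    sleep 1
    read -r R2 W2 S2 < "/proc/$PID/schedstat"
    DRUN=$((R2 - R1)); DWAIT=$((W2 - W1)); DSLICE=$((S2 - S1))
else
    # Kernel built without schedstat support; nothing to report.
    DRUN=0; DWAIT=0; DSLICE=0
fi
printf 'runtime +%sns  wait +%sns  timeslices +%s\n' "$DRUN" "$DWAIT" "$DSLICE"
```

For the lone spinner on CPU 5, only the runtime delta should be nonzero.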

The key evidence that the periodic tick stopped on CPU 5 with a single runnable task: update_process_times drops from ~1000 Hz to zero.

bpftrace -e 'kprobe:update_process_times /cpu == 5/ { @ticks++; }
             interval:s:5 { printf("CPU5 ticks in last 5s: %lld\n", (int64)@ticks); clear(@ticks); }'

CPU5 ticks in last 5s: 4999
CPU5 ticks in last 5s: 5000
CPU5 ticks in last 5s: 3581
CPU5 ticks in last 5s: 0
CPU5 ticks in last 5s: 0
...

Observation 3: Add a Second Spinning Task on CPU 5, Tick Comes Back

After starting a second pinned spinner on CPU 5, /proc/<pid>/schedstat for the first task started showing:
  • runqueue wait time increasing,
  • timeslice count increasing.

With the same probe still running, update_process_times on CPU 5 went back to ~1000 Hz while both were runnable.

Mechanically (CFS path): when CFS runqueue depth grows above 1 (rq->cfs.h_nr_running > 1 on this Rocky 5.14 host), the scheduler updates tick dependency and sets TICK_DEP_BIT_SCHED, which prevents full-dynticks tick suppression and kicks the CPU back into periodic tick mode. In some newer kernels/trees, you may see equivalent logic expressed with h_nr_queued instead.

This matches scheduler behavior for CFS: with more than one runnable task, periodic preemption/accounting is needed.
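The decision rule can be boiled down to a toy sketch. This is illustrative shell, not kernel code: the real check lives in sched_can_stop_tick() in kernel/sched/core.c, and `nr_running` here stands in for the CFS runqueue depth discussed above:

```shell
# Toy model of the CFS side of sched_can_stop_tick():
# more than one runnable fair task => tick dependency, tick must run.
can_stop_tick() {
    nr_running=$1
    if [ "$nr_running" -gt 1 ]; then
        echo no    # TICK_DEP_BIT_SCHED set: periodic tick required
    else
        echo yes   # 0 or 1 runnable CFS task: tick may be stopped
    fi
}
can_stop_tick 1   # prints: yes
can_stop_tick 2   # prints: no
```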

Why Low-Latency Systems Use idle=poll

idle=poll is used for determinism, not efficiency:
  • avoids deep C-state entry/exit latency,
  • reduces wakeup jitter variance,
  • keeps cores immediately responsive for bursty workloads.


Scheduling Class Caveats

The "0/1 runnable task => stop tick, 2+ => restart" rule is mainly CFS behavior.
  • SCHED_OTHER/CFS: 2+ runnable tasks generally require tick.
  • SCHED_FIFO: does not time-slice equal-priority tasks; a FIFO task can run indefinitely until it blocks, yields, or is preempted by higher-priority work. The tick may still be forced by other dependencies.
  • SCHED_RR: with more than one runnable RR task at the same priority, the tick is needed to enforce round-robin timeslices.
  • SCHED_DEADLINE: tick is generally required.
Also, independent tick dependencies (perf events, POSIX CPU timers, RCU, etc.) can force tick even with one runnable task.

Practical Takeaway

If you observe periodic ticks on an "isolated nohz_full" CPU, first ask:
  1. Is idle=poll enabled?
  2. Is the CPU idle right now?
  3. Does it currently have 1 runnable task or 2+?
  4. Are there extra tick dependencies active?
In other words, full dynticks is conditional behavior, not a blanket guarantee.
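Parts of that checklist can be approximated from userspace. A rough sketch for one CPU (the CPU number is an example; `ps` output-field support varies by procps version, and question 4 still needs tracing to answer):

```shell
CPU=5

# 1. Is idle=poll on the effective command line?
if grep -q 'idle=poll' /proc/cmdline; then POLL=yes; else POLL=no; fi

# 2/3. How many runnable (R-state) threads are on that CPU right now?
# ps -eLo stat=,psr= prints one "STAT PSR" row per thread, headerless;
# PSR is the CPU the thread last ran on.
RUNNABLE=$(ps -eLo stat=,psr= 2>/dev/null | awk -v c="$CPU" '$2 == c && $1 ~ /^R/' | wc -l)

echo "idle=poll: $POLL, runnable tasks on CPU $CPU: $RUNNABLE"
```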
