defaultCpuLoad reports incorrect load inside Docker containers #1082

@dremonkey

Description

Bug

defaultCpuLoad in worker.ts uses os.cpus() to measure CPU utilization, but inside a Docker container os.cpus() returns host CPU counters, not the container's cgroup allocation. This causes the worker to report inflated load values to the LiveKit server, which intermittently refuses to dispatch jobs with:

failed to send job request  {"error": "no servers available (received 1 responses)", "jobType": "JT_ROOM", "agentName": ""}

The worker registers successfully and the Node.js side never marks itself as WS_FULL (since loadThreshold is Infinity in dev mode), but the raw load value sent to the Go server appears to be interpreted independently, causing the server to consider the worker unavailable.

Reproduction

  1. Run @livekit/agents v1.0.43 inside a Docker container (oven/bun:1 base image)
  2. Host has many CPUs (tested with 24)
  3. Worker registers with LiveKit server v1.9.11
  4. Create a room — dispatch intermittently fails with "no servers available"
  5. Override loadFunc: async () => 0 in ServerOptions — dispatch works reliably

Root cause

The times counters on each entry returned by os.cpus() are not cgroup-aware in Node.js or Bun. Inside a container they cover every host CPU, so the derived utilization reflects host-wide activity rather than the container's own usage. This is a well-known Node.js limitation.
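To make the problem concrete, here is a minimal sketch of the general technique of deriving utilization from two os.cpus() snapshots. It is not the actual defaultCpuLoad code from worker.ts; the helper names (totals, utilization, sampleHostLoad) are made up for illustration. Because the counters are summed over every core the host exposes, activity from other containers or host processes ends up in the result.

```typescript
import * as os from 'node:os';

type CpuTimes = { user: number; nice: number; sys: number; idle: number; irq: number };
type CpuSnapshot = { times: CpuTimes }[];

// Sum the per-core counters into a single idle/total pair.
function totals(cpus: CpuSnapshot): { idle: number; total: number } {
  let idle = 0;
  let total = 0;
  for (const { times } of cpus) {
    idle += times.idle;
    total += times.user + times.nice + times.sys + times.idle + times.irq;
  }
  return { idle, total };
}

// Fraction of non-idle time between two snapshots, in the range 0..1.
function utilization(before: CpuSnapshot, after: CpuSnapshot): number {
  const a = totals(before);
  const b = totals(after);
  const deltaTotal = b.total - a.total;
  if (deltaTotal <= 0) return 0;
  return 1 - (b.idle - a.idle) / deltaTotal;
}

// Hypothetical sampler: inside a container, os.cpus() still lists all host
// cores, so this measures the whole host rather than the cgroup.
async function sampleHostLoad(intervalMs = 1000): Promise<number> {
  const before = os.cpus();
  await new Promise((resolve) => setTimeout(resolve, intervalMs));
  return utilization(before, os.cpus());
}
```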

Suggested fix

Replace os.cpus() sampling with cgroup-aware CPU measurement when running inside a container:

  • cgroup v2: Read usage_usec from /sys/fs/cgroup/cpu.stat, compute delta against wall time and CPU quota from /sys/fs/cgroup/cpu.max
  • cgroup v1: Read /sys/fs/cgroup/cpu/cpuacct.usage for usage, and derive the CPU limit from /sys/fs/cgroup/cpu/cpu.cfs_quota_us divided by /sys/fs/cgroup/cpu/cpu.cfs_period_us
  • Detection: Check for /.dockerenv or parse /proc/1/cgroup
  • Fallback: Use current os.cpus() approach when not in a container
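The cgroup v2 branch of the list above could look roughly like the following. This is a sketch under stated assumptions, not a patch against worker.ts: all function names are hypothetical, the cgroup is assumed to be mounted at /sys/fs/cgroup, and when cpu.max reports no quota ("max") the sketch falls back to the visible CPU count. Date.now() is used for brevity; a monotonic clock would be preferable in real code.

```typescript
import { readFile } from 'node:fs/promises';
import * as os from 'node:os';

// Extract the cumulative usage_usec counter from the contents of cpu.stat.
function parseUsageUsec(cpuStat: string): number | null {
  const match = /^usage_usec\s+(\d+)$/m.exec(cpuStat);
  return match ? Number(match[1]) : null;
}

// cpu.max holds "<quota> <period>" or "max <period>".
// Returns the number of CPUs the cgroup may use, or null when unlimited.
function parseCpuMax(cpuMax: string): number | null {
  const [quota, period] = cpuMax.trim().split(/\s+/);
  if (quota === 'max') return null;
  const q = Number(quota);
  const p = Number(period);
  if (!(q > 0) || !(p > 0)) return null;
  return q / p;
}

// CPU time consumed divided by CPU time available, clamped to 0..1.
function cgroupLoad(usageDeltaUsec: number, wallDeltaUsec: number, cpuLimit: number): number {
  if (wallDeltaUsec <= 0 || cpuLimit <= 0) return 0;
  const load = usageDeltaUsec / (wallDeltaUsec * cpuLimit);
  return Math.min(1, Math.max(0, load));
}

// Hypothetical sampler for cgroup v2; throws if the files are missing,
// which a caller could treat as the signal to use the os.cpus() fallback.
async function sampleCgroupV2Load(intervalMs = 1000): Promise<number> {
  const read = (file: string) => readFile(`/sys/fs/cgroup/${file}`, 'utf8');
  // Assumption: with no quota set, treat all visible CPUs as available.
  const limit = parseCpuMax(await read('cpu.max')) ?? os.cpus().length;
  const usage0 = parseUsageUsec(await read('cpu.stat'));
  const t0 = Date.now();
  await new Promise((resolve) => setTimeout(resolve, intervalMs));
  const usage1 = parseUsageUsec(await read('cpu.stat'));
  const t1 = Date.now();
  if (usage0 === null || usage1 === null) {
    throw new Error('usage_usec not found in cpu.stat');
  }
  // Date.now() is in milliseconds; usage_usec is in microseconds.
  return cgroupLoad(usage1 - usage0, (t1 - t0) * 1000, limit);
}
```

The parsing and arithmetic are kept in pure functions so they can be unit-tested without a container; only sampleCgroupV2Load touches the filesystem.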

Workaround

Override loadFunc in ServerOptions to bypass the default measurement:

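import { cli, ServerOptions } from '@livekit/agents';

cli.runApp(new ServerOptions({
  agent: import.meta.filename,
  // Always report zero load, bypassing the os.cpus()-based default.
  loadFunc: async () => 0,
}));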

Environment

  • @livekit/agents: 1.0.43
  • LiveKit server: 1.9.11
  • Runtime: Bun 1.3.8 (inside oven/bun:1 Docker image)
  • Host: Linux 6.12.70, 24 CPUs
