cgroup v2-aware CPU & memory metering (tracking) #1

Description

@Oblynx

cgroup v2-aware CPU & memory metering

Tracking issue for this project. Project management + design docs live here. Upstream goal: land in htop-dev/htop (relates to upstream #1538 and #1020).

Problem

htop running inside a container (Kubernetes pod, Docker, LXC) misreports resources. The Memory and CPU meters source host-wide values from procfs — /proc/meminfo, /proc/stat, and host CPU topology under /sys/devices/system/cpu/ — which the kernel does not namespace per-container. So in a pod limited to e.g. 16 vCPU / 32 GB on a 224-thread / 2 TB host, htop shows the host's 224 threads and 2 TB as if available. The meters lie.

What htop does today (research findings)

  • cgroup support is display/labeling only: CGROUP/CCGROUP/CONTAINER columns in linux/CGroupUtils.c pretty-print the cgroup path. No limit file is ever read.
  • Meters source host procfs: LinuxMachine_scanMemoryInfo() (/proc/meminfo), LinuxMachine_scanCPUTime() (/proc/stat), LinuxMachine_updateCPUcount() (host /sys/devices/system/cpu/*/online).
  • A Running_containerized flag exists but is currently used only for ZFS memory accounting in linux/Platform.c — not for the meters.
  • Zero cgroup-limit awareness anywhere in metering.

Prior art

No tool in this class has shipped cgroup v2 memory-limit-aware metering. This work is non-duplicative.

Scope & plan

Memory + CPU, delivered as 2 separate PRs:

  1. PR 1 — Memory (first, simpler). Resolve the process's own cgroup, read the effective memory limit (v1 memory.limit_in_bytes, v2 memory.max), use it as the Memory meter denominator with usage from memory.current. Walk ancestors and take the min for the effective limit. Fall back to host total when unlimited (max).
  2. PR 2 — CPU. Handle both independent limits: quota/period (cpu.max → fractional effective vCPUs) and cpuset (cpuset.cpus.effective → which cores).
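A minimal sketch of the PR 1 limit resolution, assuming the caller has already read the `memory.max` text of the process's cgroup and of each ancestor. The names `CG_UNLIMITED`, `cg_parseMemoryMax` and `cg_effectiveMemoryLimit` are hypothetical, not existing htop identifiers. Clamping to the host total is also meant to cover cgroup v1, where an unlimited `memory.limit_in_bytes` reads as a very large number rather than `max`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel for "no limit"; not an htop identifier. */
#define CG_UNLIMITED UINT64_MAX

/* Parse the text of one limit file: "max" or a decimal byte count.
 * Returns false if the text is neither. */
static bool cg_parseMemoryMax(const char* text, uint64_t* out) {
   if (strncmp(text, "max", 3) == 0) {
      *out = CG_UNLIMITED;
      return true;
   }
   if (text[0] < '0' || text[0] > '9')
      return false;
   errno = 0;
   char* end;
   unsigned long long v = strtoull(text, &end, 10);
   if (errno != 0 || end == text)
      return false;
   *out = (uint64_t)v;
   return true;
}

/* levels[] holds the limit text of the process's cgroup and of each
 * ancestor, in any order. Returns the smallest limit found, or hostTotal
 * when every level is unlimited, unparsable, or above the host total. */
static uint64_t cg_effectiveMemoryLimit(const char* const* levels, size_t n, uint64_t hostTotal) {
   uint64_t limit = CG_UNLIMITED;
   for (size_t i = 0; i < n; i++) {
      uint64_t v;
      if (cg_parseMemoryMax(levels[i], &v) && v < limit)
         limit = v;
   }
   if (limit == CG_UNLIMITED || limit > hostTotal)
      return hostTotal;
   return limit;
}
```

For a leaf reading `max` under an ancestor capped at 32 GiB on a 2 TiB host, the function returns the 32 GiB cap. With only `max` entries it returns the host total, so the meter denominator is never infinite.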

Goal: upstream

Target htop-dev/htop. Follow docs/styleguide.md, keep footprint small (htop is a system tool that runs under stress), acknowledge AI assistance via Co-authored-by: per CONTRIBUTING.

UX: automatic

The meters must show the effective limit without any user action when started in a limited environment. Auto-detect; no toggle required to get correct numbers.

Design footguns to handle

  • cgroup v1 vs v2 both in the wild (modern k8s/distros are v2); support both.
  • Nested cgroups: a leaf memory.max may be max while an ancestor (kubepods.slice) holds the real cap — walk ancestors, take the min.
  • max = unlimited → fall back to host total, never divide by infinity.
  • CPU = two independent limits (quota/period vs cpuset); a container may have either, both, or neither.
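  • Memory "used" should come from memory.current, not host /proc/meminfo. Still to decide: whether the meter covers RAM only or RAM plus swap (memory.swap.max).
  • Defer to LXCFS when it is present: it already overlays container-scoped /proc files, so applying the cgroup limit again would double-correct.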
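The two independent CPU limits could be parsed along these lines. This is a sketch under assumptions: the helper names `cg_cpusFromCpuMax` and `cg_countCpuset` are hypothetical, and both take the already-read text of `cpu.max` and `cpuset.cpus.effective` rather than opening files. A return of 0 means "no limit from this source", so a container with either, both, or neither limit can be handled by the caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse cpu.max text: "max <period>" or "<quota> <period>" (microseconds).
 * Returns quota/period as fractional vCPUs, or 0.0 when there is no quota
 * or the text cannot be parsed. */
static double cg_cpusFromCpuMax(const char* text) {
   long long quota, period;
   if (strncmp(text, "max", 3) == 0)
      return 0.0;
   if (sscanf(text, "%lld %lld", &quota, &period) != 2)
      return 0.0;
   if (quota <= 0 || period <= 0)
      return 0.0;
   return (double)quota / (double)period;
}

/* Count the CPUs in a cpuset list such as "0-3,8,10-11".
 * Returns 0 for an empty or malformed list. */
static unsigned int cg_countCpuset(const char* text) {
   unsigned int count = 0;
   const char* p = text;
   char* end;
   for (;;) {
      if (*p < '0' || *p > '9')
         return 0;
      unsigned long lo = strtoul(p, &end, 10);
      unsigned long hi = lo;
      if (*end == '-') {
         p = end + 1;
         if (*p < '0' || *p > '9')
            return 0;
         hi = strtoul(p, &end, 10);
      }
      if (hi < lo)
         return 0;
      count += (unsigned int)(hi - lo + 1);
      if (*end != ',')
         break;
      p = end + 1;
   }
   /* Only a newline or end of string may follow the last entry. */
   return (*end == '\0' || *end == '\n') ? count : 0;
}
```

A `cpu.max` of `1600000 100000` yields 16 vCPUs, matching the 16 vCPU pod from the Problem section, while `max 100000` yields 0 (no quota). How the two results combine in the meter (for example, taking the smaller of the two when both are present) is a design decision this sketch leaves open.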

Code orientation

  • linux/LinuxMachine.c — meter data sources; injection point for limit clamp.
  • linux/Platform.c — Platform_setMemoryValues()/setCPUValues(); holds Running_containerized.
  • linux/CGroupUtils.c — home for new helpers to resolve /proc/self/cgroup and parse limit files.
  • MemoryMeter.c / CPUMeter.c — rendering.
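As a starting point for the /proc/self/cgroup helper, here is a rough sketch that extracts the cgroup v2 path from the file's text. It relies on the v2 entry having the form `0::<path>`. The function name `cg_v2PathFromProcCgroup` is hypothetical, and v1 lines (`<id>:<controllers>:<path>`) are simply skipped here rather than handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Find the cgroup v2 entry ("0::<path>") in the text of /proc/self/cgroup
 * and copy <path> into out. Returns false if no such entry exists or if
 * out is too small to hold the path plus its terminator. */
static bool cg_v2PathFromProcCgroup(const char* text, char* out, size_t outSize) {
   const char* line = text;
   while (line && *line) {
      const char* eol = strchr(line, '\n');
      size_t len = eol ? (size_t)(eol - line) : strlen(line);
      if (len >= 3 && strncmp(line, "0::", 3) == 0) {
         size_t pathLen = len - 3;
         if (pathLen == 0 || pathLen >= outSize)
            return false;
         memcpy(out, line + 3, pathLen);
         out[pathLen] = '\0';
         return true;
      }
      line = eol ? eol + 1 : NULL;
   }
   return false;
}
```

The returned path is relative to the cgroup mount point (commonly /sys/fs/cgroup, which is an assumption to verify at runtime). Inside a container with its own cgroup namespace the path is often just `/`, in which case ancestors above the namespace root may not be visible to the ancestor walk.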

Scoping research conducted with AI assistance (Claude).
