Architecture Overview
kpod-metrics runs as a Kubernetes DaemonSet. Each pod collects metrics from the node it runs on.
Components
Node (DaemonSet pod)
┌─────────────────────────────────────────────────┐
│ Spring Boot (JDK 21 + Virtual Threads)          │
│                                                 │
│ MetricsCollectorService (every 30s default)     │
│ ├── eBPF Collectors ──► JNI ──► BPF Maps        │
│ │   ├── CpuSchedulingCollector                  │
│ │   ├── NetworkCollector                        │
│ │   ├── MemoryCollector                         │
│ │   ├── SyscallCollector                        │
│ │   ├── HttpCollector / DnsCollector            │
│ │   ├── KafkaCollector / MongoCollector         │
│ │   ├── BiolatencyCollector                     │
│ │   ├── CachestatCollector                      │
│ │   ├── TcpdropCollector                        │
│ │   ├── HardirqsCollector / SoftirqsCollector   │
│ │   └── ExecsnoopCollector                      │
│ └── Cgroup Collectors ──► /sys/fs/cgroup        │
│     ├── DiskIOCollector                         │
│     ├── InterfaceNetworkCollector               │
│     ├── FilesystemCollector                     │
│     └── MemoryCgroupCollector                   │
│                                                 │
│ PodWatcher (K8s informer, node-scoped)          │
│ CgroupResolver (cgroup ID → pod metadata)       │
│ Prometheus exporter (:9090/actuator/prometheus) │
└─────────────────────────────────────────────────┘
     │ JNI (libkpod_bpf.so)
┌────▼────────────────────────┐
│ Linux Kernel                │
│ ├── cpu_sched.bpf.o         │
│ ├── net.bpf.o / dns.bpf.o   │
│ ├── mem.bpf.o               │
│ └── syscall.bpf.o           │
│                             │
│ Tracepoints: sched_switch,  │
│   tcp_sendmsg, oom_kill,    │
│   sys_enter/exit, ...       │
└─────────────────────────────┘
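The Cgroup Collectors branch of the diagram reads plain text files under /sys/fs/cgroup. As a rough illustration, cgroup v2 files such as memory.stat use a flat "key value" layout, one pair per line, which could be parsed as sketched below. The function names parseFlatKeyed and readMemoryStat are hypothetical and are not taken from the kpod-metrics source.

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper: parses the flat "key value" format used by cgroup v2
// files such as memory.stat. Lines that do not match are skipped, not fatal.
fun parseFlatKeyed(text: String): Map<String, Long> =
    text.lineSequence()
        .map { it.trim().split(Regex("\\s+")) }
        .filter { it.size == 2 }
        .mapNotNull { (key, value) -> value.toLongOrNull()?.let { key to it } }
        .toMap()

// Hypothetical helper: reads memory.stat from a container's cgroup directory.
// Returns an empty map when the file is missing or unreadable.
fun readMemoryStat(cgroupDir: Path): Map<String, Long> {
    val file = cgroupDir.resolve("memory.stat")
    return if (Files.isReadable(file)) parseFlatKeyed(Files.readString(file)) else emptyMap()
}
```

Returning an empty map for an unreadable file keeps a single missing cgroup from failing the whole collection cycle, which is an assumption about the desired behaviour rather than a documented one.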
Data Flow
- Kernel — eBPF programs are attached to tracepoints at startup. They populate BPF hash maps keyed by cgroup ID.
- JNI Bridge — libkpod_bpf.so wraps libbpf and exposes map read operations to the JVM via JNI. Single syscall per map read (batch API).
- Collectors — Kotlin collector classes read BPF maps (via generated MapReader classes) and cgroup files every collection cycle.
- CgroupResolver — Maps cgroup IDs to pod metadata using the K8s informer cache and /proc filesystem.
- Prometheus — Metrics are registered in a Micrometer PrometheusMeterRegistry and scraped via /actuator/prometheus.
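The middle of this flow (map snapshot, cgroup ID resolution, labelled sample) can be sketched in a few lines of Kotlin. Every type and function name here (PodMeta, Sample, CounterMapReader, Resolver, collect) and the label names are hypothetical stand-ins chosen for illustration; the real MapReader classes and CgroupResolver may look quite different.

```kotlin
// All names below are illustrative assumptions, not the project's real classes.
data class PodMeta(val namespace: String, val pod: String, val container: String)

data class Sample(val metric: String, val labels: Map<String, String>, val value: Double)

// Stand-in for a generated MapReader: one snapshot of cgroup ID -> counter.
fun interface CounterMapReader {
    fun readAll(): Map<Long, Long>
}

// Stand-in for CgroupResolver: null means the cgroup is not a known pod container.
fun interface Resolver {
    fun resolve(cgroupId: Long): PodMeta?
}

// Joins one map snapshot with pod metadata; unresolved cgroup IDs are dropped.
fun collect(metric: String, reader: CounterMapReader, resolver: Resolver): List<Sample> =
    reader.readAll().mapNotNull { (cgroupId, count) ->
        resolver.resolve(cgroupId)?.let { meta ->
            Sample(
                metric,
                mapOf("namespace" to meta.namespace, "pod" to meta.pod, "container" to meta.container),
                count.toDouble(),
            )
        }
    }
```

Dropping unresolved IDs is one possible policy for cgroups that belong to host processes rather than pods; the project may instead label or aggregate them.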
Key Design Decisions
- DaemonSet, not sidecar — One pod per node. Minimal resource overhead. No application changes needed.
- Cgroup-based attribution — eBPF programs key data by cgroup ID, which maps 1:1 to containers. No PID tracking needed.
- Graceful degradation — If a BPF program fails to load (unsupported kernel, missing capability), other collectors continue. No hard failure.
- Virtual threads — JDK 21 virtual threads handle concurrent collector execution without thread pool sizing.
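The last two decisions combine naturally: each collector can run on its own virtual thread, and a failure in one is recorded without stopping the rest. The following is a minimal sketch under that assumption, requiring JDK 21 or later. MetricCollector, CycleResult, and runCycle are hypothetical names, not the project's actual interfaces.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutionException
import java.util.concurrent.Executors

// Hypothetical collector abstraction; the project's real interface may differ.
fun interface MetricCollector {
    fun collect(): Map<String, Double>
}

// Per-cycle outcome: metrics from collectors that succeeded, errors from those that failed.
data class CycleResult(
    val metrics: Map<String, Map<String, Double>>,
    val failures: Map<String, Throwable>,
)

// Runs every collector on its own virtual thread, so no pool size has to be chosen.
// A collector that throws is recorded in failures and the others still complete.
fun runCycle(collectors: Map<String, MetricCollector>): CycleResult {
    val metrics = mutableMapOf<String, Map<String, Double>>()
    val failures = mutableMapOf<String, Throwable>()
    val executor = Executors.newVirtualThreadPerTaskExecutor()
    try {
        // Submit everything first so the collectors run concurrently.
        val futures = collectors.mapValues { (_, c) -> executor.submit(Callable { c.collect() }) }
        for ((name, future) in futures) {
            try {
                metrics[name] = future.get()
            } catch (e: ExecutionException) {
                failures[name] = e.cause ?: e
            }
        }
    } finally {
        executor.shutdown()
    }
    return CycleResult(metrics, failures)
}
```

Calling runCycle with one healthy collector and one that throws yields the healthy collector's metrics plus a single entry in failures, which a caller could log or expose as a health metric.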
Tech Stack
- Runtime: Kotlin 2.1.10, Spring Boot 3.4.3, JDK 21 (virtual threads)
- eBPF: CO-RE programs generated by kotlin-ebpf-dsl, compiled with clang, loaded via libbpf + JNI
- Metrics: Micrometer + Prometheus registry
- K8s: Fabric8 Kubernetes Client 7.1.0
- Build: Gradle 8.12 (composite build), multi-stage Docker
- CI/CD: GitHub Actions — unit tests on PRs, image publish on merge to main