This repository presents flametui, an experimental Linux system profiler that attempts to visualize stack traces as flamegraphs in the terminal.
The fun part of this project is that it draws a flamegraph, in the TUI!
This tool uses eBPF to hook into the Linux kernel's perf_event subsystem, sampling stack traces across CPUs.
It tries to aggregate these traces in userspace and resolve them to human-readable symbols for visualization.
The origin story is that I always found it a bit annoying to have to install a bunch of different tools to get a flamegraph. Thus, this tool aims to provide an all-in-one solution: being able to measure and visualize in one go!
As a general disclaimer, this project is / was a huge learning experience for me, and was written largely for my own enjoyment. It has not been scrutinized enough for its outputs to be taken as serious and correct. It looks cool though!
I hope to keep the motivation to keep pushing it forward, trying to be somewhat neat in the versioning. Currently, it is in a dodgy state, so we are still in ALPHA.
Please check out a tag if you want a stable version; main regularly breaks.
As vaxis pins Zig 0.15.1, this is the required Zig version. Note that we supply a flake.nix so that you can use my
exact Zig version if you care to!
# Build the project
zig build -Doptimize=ReleaseFast
# Run the profiler (requires root/CAP_BPF privileges)
# Sample at 49Hz for 1 second
sudo zig-out/bin/flametui fixed --attach perf=49 --ms 1000
# Aggregate indefinitely — streams results to TUI, never evicts
sudo zig-out/bin/flametui aggregate --attach tracepoint=kmem:kmalloc
# Sliding window — keeps the last N time slots, evicts oldest
sudo zig-out/bin/flametui ring --attach kprobe=alloc_fd --ms 50 --n 10
# Profile specific process IDs (space-separated string). These PIDs are filtered inside the eBPF program!
sudo zig-out/bin/flametui fixed --pid "$(pidof YOUR_PROC_NAME)" --attach uprobe=/lib64/libc.so.6:malloc
If you don't want to use my profiler (I must admit it is janky), you can also try doing something like:
# Record + collapse stack traces
sudo perf record -F 99 -a -g -- sleep 5
sudo perf script | zig-out/bin/flametui
(Note: CURRENTLY BROKEN, thus disabled)
Note that you can click on the nodes to expand the view! You can also navigate with keyboard: hjkl / WASD / arrow keys to move, Enter to zoom in, Escape to unzoom, q to quit.
All profiling commands also accept --verbose for debug logging and --enable-idle to include idle (pid 0) samples.
I've added some support for making official flamegraphs by supporting the .collapsed format.
This is done by hitting e while in TUI mode (probably should support something more headless).
The exported file can then be passed to the flamegraph.pl program to make the SVG as usual.
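For reference, the collapsed format that flamegraph.pl consumes is one line per unique stack: the frames joined by semicolons, then a space and a sample count. The sketch below is a rough way to sanity-check an export before rendering it. The helper names collapsed_merge and collapsed_total are made up for this example, and flametui.collapsed is only a placeholder filename, not necessarily what the export is actually called.

```shell
# Merge duplicate stacks in collapsed input read from stdin.
# Each input line is "frame;frame;...;frame COUNT". Frames may contain spaces,
# so the count is taken as the last whitespace-separated field.
collapsed_merge() {
    awk '{
        n = $NF                   # sample count (last field)
        sub(/ +[0-9]+$/, "")      # strip the count, leaving the stack in $0
        total[$0] += n
    }
    END { for (stack in total) print stack, total[stack] }' | LC_ALL=C sort
}

# Sum all sample counts in collapsed input read from stdin.
collapsed_total() {
    awk '{ sum += $NF } END { print sum + 0 }'
}

# Usage (placeholder filename):
#   collapsed_merge < flametui.collapsed | collapsed_total
#   flamegraph.pl flametui.collapsed > flamegraph.svg
```

The total should roughly match the number of samples taken during the run, which makes for a cheap plausibility check.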
Before 0.0.3-alpha:
- Review files ensuring tests in place and happy, and refactor the ugly code:
- src/app.zig
- src/bpf.zig
- src/cimport.zig
- src/kmap.zig
- src/lock.zig
- src/main.zig
- src/profile.zig
- src/root.zig
- src/sharedobject.zig
- src/stacktrie.zig
- src/symboltrie.zig
- src/tui.zig
- src/umap.zig
- Fix the navigation functionality (post sorting)
- Lifecycle improvements, ensure memory usage more limited
There are several areas where this project could be improved:
- Help Menu: To see what the keybindings are. Not important currently, because I don't have keybindings.
- Lifecycle Improvements: PIDs die, PIDs are born. We don't track that, so a PID can die, and come back and our current caching mechanism just gets it wrong... Fix this, potentially by tracking some more things in bpf.
- Off-CPU:
I want to make off-cpu flamegraphs too, this seems kind of useful.
- mode=offcpu, mode=cpu, etc. CLI arguments
- More Flamegraphs!!: I'm sure I can cook harder now that I have the basics down
- Vendoring as Library:
Plug into build.zig to have e.g. a zig build profile step.
- Fix Bug in Navigation: Navigation is a hack, doesn't work well. Review and fix.
- Improve rendering: Currently, the drawing works. However, it fully redraws recursively each time. That's OK, but probably not needed, and it makes rendering take a boatload of time.
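On the lifecycle item above: one common userspace way to tell a recycled PID apart from the original process is to key any per-process cache on the pair (PID, start time) instead of the PID alone. This is only a sketch of that idea, not what flametui does today, and it is a different route from tracking more state in BPF. The function name proc_identity is hypothetical. The start time is field 22 of /proc/PID/stat, measured in clock ticks since boot.

```shell
# Print "PID:STARTTIME" for a live process, or return 1 if it is gone.
# A cache keyed on this string misses (instead of returning stale data)
# when the kernel hands the same PID to a new process.
proc_identity() {
    pid=$1
    stat=$(cat "/proc/$pid/stat" 2>/dev/null) || return 1
    # The comm field (field 2) may contain spaces and parentheses, so drop
    # everything up to the last ") " before splitting on whitespace.
    rest=${stat##*") "}
    # Intentional word splitting: field 3 (state) becomes $1.
    set -- $rest
    # With pid and comm removed, field 22 (starttime) is now ${20}.
    printf '%s:%s\n' "$pid" "${20}"
}

# Usage: the identity of the current shell stays stable across calls.
#   proc_identity "$$"
```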
I have only tested on recent kernels. I know there are some issues, for example on RT Linux. We use dynamic allocations
to populate the eBPF ringbuffers. Dynamic allocations use spinlocks. There are some issues with the corresponding locks
on older kernel versions. One patch
fixes this by switching to raw_spinlock_t, in 6.12. Before that, a spinlock_t is used.
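If you want a quick guard before running on an RT box, a version check along these lines is one option. It is a heuristic sketch only: the function name kernel_at_least is made up, and distro kernels may backport the fix to older versions (or not), so the version number alone does not prove anything.

```shell
# Return 0 if the kernel release is at least MAJOR.MINOR.
# Usage: kernel_at_least MAJOR MINOR [RELEASE]
# RELEASE defaults to `uname -r` and is expected to look like "6.12.0-rt1".
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Usage:
#   kernel_at_least 6 12 || echo "kernel older than 6.12, ringbuf locking may misbehave on RT"
```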
Code is mostly hand-written (roughly a 90% / 10% split); the generated part is mainly the tests. I don't really endorse this practice, but this project was coded for my own enjoyment and I enjoy the dopamine of new features too much to resist the temptation :)...
AI was however used heavily in researching eBPF, how to create flamegraphs, and other systems programming details. Personally, I enjoy generative AI the most for doing research and learning. I think this applies especially to software, as software allows for rapid hypothesis testing: even if the LLM barfs some nonsense, you can easily fact-check it in many cases. This is less true in other disciplines.
I also commit my AI flow. It's extremely brutalist: I have a context.sh script that just bangs everything in
an easy to copy format. Currently, the whole project fits in a context window. This makes for easy prompting.
Gemini was found to be way better for research, and Claude was found to be great for finding memory leaks when I was
feeling lazy.
