skills/linux-perf/SKILL.md
Linux perf profiler skill for CPU performance analysis. Use when collecting sampling profiles with perf record, generating perf report, measuring hardware counters (cache misses, branch mispredicts, IPC), identifying hot functions, or feeding perf data into flamegraph tools. Activates on queries about perf, Linux performance counters, PMU events, off-CPU profiling, perf stat, perf annotate, or sampling-based profiling on Linux.
Guide agents through perf for CPU profiling: sampling, hardware counter measurement, hotspot identification, and integration with flamegraph generation.
perf to find hotspots?"
"[unknown] or [kernel] frames"
# Install
sudo apt install linux-perf # Debian
sudo apt install linux-tools-common linux-tools-$(uname -r) # Ubuntu (version-matched to the running kernel)
sudo dnf install perf # Fedora/RHEL
# Check permissions
# Without root, profiling kernel code needs paranoid level ≤ 1
cat /proc/sys/kernel/perf_event_paranoid
# 2 = user-space profiling only (no kernel), 1 = user + kernel for your own processes, 0 = also system-wide CPU events, -1 = no restrictions
# Temporarily lower (session only)
sudo sysctl -w kernel.perf_event_paranoid=1
# Persistent
echo 'kernel.perf_event_paranoid=1' | sudo tee /etc/sysctl.d/99-perf.conf
sudo sysctl -p /etc/sysctl.d/99-perf.conf
Compile the target with debug symbols for useful frame data:
gcc -g -O2 -fno-omit-frame-pointer -o prog main.c
# -fno-omit-frame-pointer: essential for frame-pointer-based unwinding
# Alternative: compile with DWARF CFI and use --call-graph=dwarf
# Basic hardware counters
perf stat ./prog
# With specific events
perf stat -e cache-misses,cache-references,instructions,cycles,branch-misses ./prog
# Repeat the run 5 times and report mean and standard deviation
perf stat -r 5 ./prog
# Attach to existing process
perf stat -p 12345 sleep 10
Interpret perf stat output:
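As a rough aid for reading the counters, the usual derived ratios can be computed from the CSV output of `perf stat -x,`. The sketch below is an illustration, not part of the original skill: the function name `perf_ratios` is hypothetical, and it assumes the plain CSV layout (no `-I` or `-A`), where the first field is the count and the third is the event name. As a rule of thumb only, an IPC well below 1 often points to memory or branch stalls, and a high miss rate says which of the two to look at first.

```shell
# Sketch: derive IPC and miss rates from `perf stat -x,` CSV lines on stdin.
# Assumed layout per line: value,unit,event,run-time,percent-running,...
perf_ratios() {
  awk -F, '
    {
      name = $3
      sub(/:.*/, "", name)      # drop event modifiers such as ":u"
      count[name] = $1 + 0      # non-numeric values like "<not counted>" become 0
    }
    END {
      if (count["cycles"] > 0)
        printf "IPC %.2f\n", count["instructions"] / count["cycles"]
      if (count["cache-references"] > 0)
        printf "cache-miss-rate %.2f%%\n", 100 * count["cache-misses"] / count["cache-references"]
      if (count["branch-instructions"] > 0)
        printf "branch-miss-rate %.2f%%\n", 100 * count["branch-misses"] / count["branch-instructions"]
    }'
}
```

One way to feed it: write the counters to a file with `perf stat -x, -o stat.csv -e cycles,instructions,cache-references,cache-misses,branch-instructions,branch-misses ./prog`, then run `perf_ratios < stat.csv`. Comment and blank lines in the file are ignored because they carry no event name in the third field.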
# Default: sample the cycles event at 4000 Hz
perf record -g ./prog
# Specify frequency (999 rather than 1000 avoids sampling in lockstep with periodic timers)
perf record -F 999 -g ./prog
# Specific event
perf record -e cache-misses -g ./prog
# Attach to running process
perf record -F 999 -g -p 12345 sleep 30
# Off-CPU profiling: record scheduler switches system-wide with stacks (tracepoints need root)
sudo perf record -e sched:sched_switch -ag sleep 10
# DWARF call graphs (better for binaries without frame pointers)
perf record -F 999 --call-graph=dwarf ./prog
# Save to named file
perf record -o myapp.perf.data -g ./prog
perf report # reads perf.data
perf report -i myapp.perf.data
perf report --no-children # self time only (not cumulative)
perf report --sort comm,dso,sym # sort by fields
perf report --stdio # non-interactive text output
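For scripted hotspot identification, the text output can be filtered down to the top entries. The sketch below is a minimal illustration under stated assumptions: `top_symbols` is a hypothetical helper name, and it expects input from `perf report --no-children --stdio`, where each entry row starts with a single overhead percentage and the symbol follows a marker such as `[.]` (user) or `[k]` (kernel). Call-graph rows and `#` header lines are skipped because their first field is not a bare percentage.

```shell
# Sketch: print the first N entries of `perf report --no-children --stdio`
# as "<overhead> <symbol>". The report is already sorted by overhead.
top_symbols() {
  local limit="${1:-5}"
  awk -v limit="$limit" '
    $1 ~ /^[0-9.]+%$/ {                  # entry rows begin with e.g. "45.23%"
      sym = $0
      sub(/^.*\[[k.ugH]\] /, "", sym)    # keep the text after the "[.]" or "[k]" marker
      sub(/[ \t]+$/, "", sym)            # trim trailing padding
      print $1, sym
      if (++shown >= limit) exit
    }'
}
```

Typical use would be `perf report --no-children --stdio | top_symbols 10`.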
Navigation in TUI:
- Enter: expand a symbol
- a: annotate (show assembly with hit counts)
- s: show source (needs debug info)
- d: filter by DSO (library)
- t: filter by thread
- ?: help
# Show assembly with hit percentages
perf annotate sym_name
# From report: press 'a' on a symbol
# Or directly:
perf annotate -i perf.data --symbol=hot_function --stdio
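A high hit count on a load such as mov or vmovdqa often indicates a cache miss at that load. Because of sampling skid, the hits can also land on the instruction just after the slow one, so check the preceding instruction as well.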
# Live top, like 'top' but for functions
sudo perf top -g
# Filter by process
sudo perf top -p 12345
# Generate perf script output
perf script > out.perf
# Use Brendan Gregg's FlameGraph tools
git clone https://github.com/brendangregg/FlameGraph
./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
./FlameGraph/flamegraph.pl out.folded > flamegraph.svg
# Open flamegraph.svg in browser
See skills/profilers/flamegraphs for reading flamegraphs and interpreting results.
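The folded file produced by stackcollapse-perf.pl is plain text: each line is a semicolon-separated stack followed by a space and a sample count. That makes it easy to query without rendering an SVG. The sketch below, with the hypothetical helper name `leaf_totals`, sums samples by leaf frame, which approximates self time per function.

```shell
# Sketch: total samples per leaf frame from folded stacks on stdin.
# Input lines look like "main;parse;memcpy 42".
leaf_totals() {
  awk '{
         count = $NF                      # last field is the sample count
         stack = $0
         sub(/ [0-9]+$/, "", stack)       # strip the trailing count
         n = split(stack, frames, ";")
         self[frames[n]] += count         # the last frame is the leaf
       }
       END { for (f in self) print self[f], f }' | sort -rn
}
```

For example, `leaf_totals < out.folded | head` lists the functions that were on-CPU most often when sampled.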
| Problem | Cause | Fix |
|---------|-------|-----|
| Permission denied | perf_event_paranoid too high | Lower paranoid level or run with sudo |
| [unknown] frames | Missing frame pointers or debug info | Recompile with -fno-omit-frame-pointer or use --call-graph=dwarf |
| [kernel] everywhere | Kernel symbols not visible | Use sudo perf record; install linux-image-$(uname -r)-dbgsym |
| No kallsyms | Kernel symbols unavailable | sudo sysctl -w kernel.kptr_restrict=0 |
| Empty report for short program | Program exits too fast | Raise the sampling rate (e.g. -F 9999) or run a longer workload |
| DWARF unwinding slow | Large per-sample stack copy (8192 bytes by default) | Limit with --call-graph dwarf,512 (deep stacks may be truncated) |
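The first rows of the table can be checked before recording. The sketch below is an assumption-laden illustration: `perf_preflight` is a hypothetical name, the optional directory argument exists only so the check can be pointed at test files, and the thresholds mirror the table rather than any official policy.

```shell
# Sketch: report whether the two sysctls from the table are likely to get in the way.
perf_preflight() {
  local dir="${1:-/proc/sys/kernel}"    # overridable for testing
  local paranoid kptr
  paranoid=$(cat "$dir/perf_event_paranoid" 2>/dev/null) || paranoid=unknown
  kptr=$(cat "$dir/kptr_restrict" 2>/dev/null) || kptr=unknown

  case "$paranoid" in
    unknown) echo "perf_event_paranoid: unreadable" ;;
    -1|0|1)  echo "perf_event_paranoid=$paranoid: ok" ;;
    *)       echo "perf_event_paranoid=$paranoid: lower it to 1 or run perf with sudo" ;;
  esac
  case "$kptr" in
    unknown) echo "kptr_restrict: unreadable" ;;
    0)       echo "kptr_restrict=0: ok" ;;
    *)       echo "kptr_restrict=$kptr: kernel addresses may be hidden; set it to 0 for kernel symbols" ;;
  esac
}
```

Calling `perf_preflight` with no argument reads the live values under /proc/sys/kernel.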
# List all available events
perf list
# Common hardware events
cycles
instructions
cache-references
cache-misses
branch-instructions
branch-misses
stalled-cycles-frontend
stalled-cycles-backend
# Software events
context-switches
cpu-migrations
page-faults
# Tracepoints (requires root)
sched:sched_switch
syscalls:sys_enter_read
For a counter reference and interpretation guide, see references/events.md.
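Not every named event exists on every CPU or virtual machine. When an event cannot be measured, `perf stat` prints `<not supported>` or `<not counted>` in place of the count. A small filter over the CSV output can surface those before a long run; `unavailable_events` is a hypothetical helper name, and the field positions assume plain `-x,` output without `-I` or `-A`.

```shell
# Sketch: list events that perf stat could not measure, from `-x,` CSV on stdin.
unavailable_events() {
  awk -F, '$1 == "<not supported>" || $1 == "<not counted>" { print $3, $1 }'
}
```

A quick probe might be `perf stat -x, -o probe.csv -e stalled-cycles-frontend,stalled-cycles-backend true` followed by `unavailable_events < probe.csv`.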
- skills/profilers/flamegraphs for SVG flamegraph generation and reading
- skills/profilers/valgrind for cache simulation and memory profiling
- skills/compilers/gcc or skills/compilers/clang for PGO from perf data (AutoFDO)