pprof Quick Guide for HNSW Profiling
You're in pprof interactive mode. Here's what to do:
Step 1: Find Top CPU Consumers
Run `top10` to show the top 10 functions consuming CPU time. Look for:
- HNSW search functions
- GC-related functions (`runtime.gcBgMarkWorker`)
- Vector operations
- Lock contention
Step 2: Focus on HNSW Functions
Sort by cumulative time, which includes time spent in called functions (for example, `top20 -cum`). This helps you trace the call chain into the HNSW code.
Step 3: Check for GC Overhead
Look for:
- `runtime.gcBgMarkWorker`: GC background work
- `runtime.mallocgc`: memory allocation
- A high percentage (>10%), which indicates GC pressure
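To cross-check the GC share that pprof reports, the runtime's own counters can be read directly. The following is a minimal sketch, not part of the original guide: `gcSnapshot` and `readGC` are illustrative names, and the allocation loop is a hypothetical stand-in for a batch of HNSW searches. `runtime.ReadMemStats` and the `MemStats` fields it reads are standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcSnapshot holds the few runtime counters relevant to GC pressure.
type gcSnapshot struct {
	NumGC         uint32  // completed GC cycles
	GCCPUFraction float64 // fraction of available CPU time used by GC since program start
	Mallocs       uint64  // cumulative count of heap objects allocated
}

// readGC reads the current GC counters from the Go runtime.
func readGC() gcSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return gcSnapshot{NumGC: m.NumGC, GCCPUFraction: m.GCCPUFraction, Mallocs: m.Mallocs}
}

// sink is package-level so the workload's allocations are not optimized away.
var sink [][]float32

func main() {
	before := readGC()

	// Hypothetical workload: replace with the code path being profiled.
	sink = make([][]float32, 0, 1024)
	for i := 0; i < 1024; i++ {
		sink = append(sink, make([]float32, 128))
	}

	after := readGC()
	fmt.Printf("GC cycles: %d, mallocs: %d, GC CPU fraction: %.4f\n",
		after.NumGC-before.NumGC, after.Mallocs-before.Mallocs, after.GCCPUFraction)
}
```

A large malloc delta for a small amount of work points in the same direction as `runtime.mallocgc` showing up high in the profile.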
Step 4: Focus on HNSW Search
Use `list <function>` to show line-by-line CPU time in the search function. This identifies hot spots.
Step 5: Generate Visual Report
The `web` command opens a visual call graph in your browser. It requires Graphviz to be installed.
Or generate an SVG with `svg > file.svg`.
Step 6: Check Specific Functions
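Step 7: Exit and Generate HTML Report
Leave the interactive session with `exit` or `quit`, then start the web UI by running `go tool pprof -http=:8080` followed by the path to your profile file.
This opens an interactive web UI at http://localhost:8080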
Quick Commands Reference
| Command | Purpose |
|---|---|
| `top10` | Top 10 CPU consumers |
| `top20 -cum` | Top 20 with cumulative time |
| `list <function>` | Line-by-line breakdown |
| `web` | Visual call graph (requires Graphviz) |
| `svg > file.svg` | Generate SVG graph |
| `png > file.png` | Generate PNG graph |
| `help` | Show all commands |
| `exit` or `quit` | Exit pprof |
What to Look For
GC Problems
- `runtime.gcBgMarkWorker` > 10% of total time
- `runtime.mallocgc` in top functions
- Frequent GC pauses in trace
Allocation Hotspots
- Functions with high `flat` time that allocate
- `make()` calls in hot paths
- `append()` on slices without pre-allocation
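The cost of `append()` without pre-allocation can be measured directly. This is a small sketch with illustrative function names (`collectGrow`, `collectPrealloc`) that are not from the profiled codebase; it uses `testing.AllocsPerRun` from the standard library to count allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// collectGrow appends into a nil slice, so the backing array is
// reallocated and copied several times as it grows.
func collectGrow(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i)
	}
	return out
}

// collectPrealloc sizes the backing array once, up front.
func collectPrealloc(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i)
	}
	return out
}

// sink is package-level so results escape to the heap in both variants.
var sink []int

func main() {
	const n = 1000
	grow := testing.AllocsPerRun(100, func() { sink = collectGrow(n) })
	pre := testing.AllocsPerRun(100, func() { sink = collectPrealloc(n) })
	fmt.Printf("append without capacity: %.0f allocs/run\n", grow)
	fmt.Printf("pre-allocated:           %.0f allocs/run\n", pre)
}
```

When the final size is known or can be bounded, as with a fixed candidate-list size, the pre-allocated form needs a single allocation instead of one per growth step.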
Lock Contention
- `sync.(*RWMutex).RLock` in top functions
- High time in lock acquisition
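If `RLock` dominates the read path, one option for read-mostly data is to publish immutable snapshots behind an atomic pointer, so readers take no lock at all. The sketch below assumes Go 1.19 or later (for `atomic.Pointer`); the `graph` and `index` types are hypothetical and do not reflect the actual HNSW implementation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// graph is a hypothetical immutable snapshot of adjacency lists.
type graph struct {
	neighbors map[int][]int
}

// index publishes snapshots through an atomic pointer.
type index struct {
	mu  sync.Mutex            // serializes writers only
	cur atomic.Pointer[graph] // readers load this without locking
}

func newIndex() *index {
	ix := &index{}
	ix.cur.Store(&graph{neighbors: map[int][]int{}})
	return ix
}

// Neighbors is the read path: a single atomic load, no lock.
// Callers must treat the returned slice as read-only.
func (ix *index) Neighbors(id int) []int {
	return ix.cur.Load().neighbors[id]
}

// Link is the write path: copy the snapshot, modify the copy, publish it.
func (ix *index) Link(from, to int) {
	ix.mu.Lock()
	defer ix.mu.Unlock()

	old := ix.cur.Load()
	next := &graph{neighbors: make(map[int][]int, len(old.neighbors)+1)}
	for k, v := range old.neighbors {
		next.neighbors[k] = v
	}
	// Build a fresh slice so readers of the old snapshot never see it change.
	updated := make([]int, 0, len(old.neighbors[from])+1)
	updated = append(updated, old.neighbors[from]...)
	next.neighbors[from] = append(updated, to)

	ix.cur.Store(next)
}

func main() {
	ix := newIndex()
	ix.Link(1, 2)
	ix.Link(1, 3)
	fmt.Println(ix.Neighbors(1)) // prints [2 3]
}
```

Each write copies the whole map, so this trade only pays off when reads vastly outnumber writes. For write-heavy phases, narrowing the lock scope is usually the simpler fix.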
Vector Operations
- `vector.DotProductSIMD` should be fast
- If slow, may need SIMD optimization
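To judge whether `vector.DotProductSIMD` is "fast", it helps to have a scalar baseline to compare against. The sketch below is an assumption-laden illustration: `dotScalar` is a plain-Go reference written for this guide, the 128-dimension vectors are arbitrary, and the signature of `vector.DotProductSIMD` is not known here, so it is not called. `testing.Benchmark` is the standard-library way to run a benchmark outside `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// dotScalar is a plain-Go reference dot product.
// It panics if the two vectors differ in length.
func dotScalar(a, b []float32) float32 {
	if len(a) != len(b) {
		panic("dotScalar: length mismatch")
	}
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// result is package-level so the compiler cannot discard the benchmarked call.
var result float32

func main() {
	const dim = 128 // arbitrary dimension chosen for illustration
	a := make([]float32, dim)
	b := make([]float32, dim)
	for i := range a {
		a[i] = float32(i)
		b[i] = 1
	}

	r := testing.Benchmark(func(bm *testing.B) {
		for i := 0; i < bm.N; i++ {
			result = dotScalar(a, b)
		}
	})
	fmt.Printf("scalar baseline, dim=%d: %d ns/op\n", dim, r.NsPerOp())
}
```

Benchmarking the SIMD version with the same vectors and dimension gives a direct ratio. If it is not clearly ahead of the scalar loop, the SIMD path deserves a closer look.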
Next Steps After Profiling
- If GC is the problem:
  - Check the memory profile: `go tool pprof -alloc_space mem.prof`
  - Look for allocation hotspots
  - Implement `sync.Pool` optimizations
- If allocations are the problem:
  - Use `list <function>` to find exact lines
  - Add pools or pre-allocate buffers
  - Re-profile to verify improvements
- If locks are the problem:
  - Consider lock-free reads (advanced)
  - Reduce lock scope
  - Use read-only operations where possible
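The `sync.Pool` suggestion above can be sketched as follows. The `scratch` type and the `search` function are hypothetical stand-ins for whatever per-query buffers the real HNSW search allocates; only `sync.Pool` itself is standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// scratch is a hypothetical per-search working set.
type scratch struct {
	candidates []int
	visited    map[int]struct{}
}

// scratchPool hands out reusable scratch buffers instead of allocating per query.
var scratchPool = sync.Pool{
	New: func() any {
		return &scratch{
			candidates: make([]int, 0, 256),
			visited:    make(map[int]struct{}, 256),
		}
	},
}

// getScratch returns a scratch buffer with any previous contents cleared.
func getScratch() *scratch {
	s := scratchPool.Get().(*scratch)
	s.candidates = s.candidates[:0] // keep capacity, drop contents
	for k := range s.visited {
		delete(s.visited, k)
	}
	return s
}

// putScratch returns the buffer to the pool for reuse.
func putScratch(s *scratch) { scratchPool.Put(s) }

// search is a stand-in for a real query: it counts distinct ids
// using pooled buffers rather than fresh allocations.
func search(ids []int) int {
	s := getScratch()
	defer putScratch(s)
	for _, id := range ids {
		if _, seen := s.visited[id]; seen {
			continue
		}
		s.visited[id] = struct{}{}
		s.candidates = append(s.candidates, id)
	}
	return len(s.candidates)
}

func main() {
	fmt.Println(search([]int{1, 2, 2, 3})) // prints 3
}
```

Resetting on `Get` rather than on `Put` keeps a forgotten reset from leaking state between queries. After a change like this, re-run the `-alloc_space` profile to confirm the hot path's allocations actually dropped.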