LLM-Ready Profiling and Analysis Toolkit for AMD CPUs and GPUs
IntelliKit is a collection of intelligent tools designed to make CPU and GPU code development, profiling, and validation accessible to LLMs and human developers alike. Built for AMD ROCm, these tools provide clean abstractions over complex GPU internals.
Traditional CPU and GPU profiling and analysis tools expose raw hardware counters and assembly. IntelliKit tools are designed to:
- Decode complexity: Turn hardware metrics into human-readable insights
- Enable LLM integration: Provide clean APIs suitable for LLM-driven workflows (MCP-ready)
Accordo - Automated Kernel Validation
Automated correctness validation for GPU kernel optimizations.
Use cases:
- Verify optimized kernels match reference implementation
- Compare performance while ensuring correctness
- Test multiple optimization candidates efficiently
Quick example:
from accordo import Accordo
# Create validator (auto-extracts kernel signature)
validator = Accordo(binary="./ref", kernel_name="reduce_sum")
# Capture snapshots from reference and optimized binaries
ref = validator.capture_snapshot(binary="./ref")
opt = validator.capture_snapshot(binary="./opt")
# Compare for correctness
result = validator.compare_snapshots(ref, opt, tolerance=1e-6)
if result.is_valid:
    print(f"✓ PASS: {result.num_arrays_validated} arrays matched")
else:
    print(result.summary())

Linex - Source-Level GPU Performance Profiling
Maps GPU performance metrics to your source code lines.
Use cases:
- Identify performance hotspots at source code granularity
- Understand cycle-level timing for each line of code
- Analyze stall patterns and execution bottlenecks
Quick example:
from linex import Linex
profiler = Linex()
profiler.profile("./my_app", kernel_filter="my_kernel")
# Show hotspots
for line in profiler.source_lines[:5]:
    print(f"{line.file}:{line.line_number}")
    print(f" {line.total_cycles:,} cycles ({line.stall_percent:.1f}% stalled)")

Metrix - Human-Readable GPU Metrics
Decodes hardware counters into actionable performance insights.
Use cases:
- Profile GPU kernels with clean, understandable metrics
- Identify memory bandwidth bottlenecks
- Analyze compute utilization patterns
Quick example:
from metrix import Metrix
profiler = Metrix()
results = profiler.profile("./my_app", metrics=["memory.hbm_bandwidth_utilization"])
for kernel in results.kernels:
    print(f"{kernel.name}: {kernel.duration_us.avg:.2f} μs")
    print(f"Memory BW: {kernel.metrics['memory.hbm_bandwidth_utilization'].avg:.1f}%")

Nexus - HSA Packet Source Code Extractor
Intercepts GPU kernel launches and extracts source code + assembly from HSA packets.
Use cases:
- Understand what code actually runs on the GPU
- Debug kernel compilation and optimization
- Trace HIP, Triton, and other GPU frameworks
Quick example:
from nexus import Nexus
nexus = Nexus(log_level=1)
trace = nexus.run(["python", "gpu_app.py"])
for kernel in trace:
    print(f"{kernel.name}: {len(kernel.assembly)} instructions")
    print(kernel.hip)  # Source code

ROCm-MCP - Model Context Protocol Servers for ROCm Tools
Enables LLMs to interact with ROCm tools via MCP.
Use cases:
- Compile HIP code.
- Access the HIP reference guide.
- Query device capabilities.
Quick example:
Add to your JSON MCP config:
{
  "mcpServers": {
    "hip-compiler-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/rocm_mcp", "hip-compiler-mcp"]
    }
  }
}

uprof-MCP - Model Context Protocol Server for uProf
Enables LLMs to interact with AMD uProf via MCP.
Use cases:
- Profile applications using uProf.
Quick example:
Add to your JSON MCP config:
{
  "mcpServers": {
    "uprof-profiler-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/uprof_mcp", "uprof-profiler-mcp"]
    }
  }
}

Install all tools:
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[all]"
This installs: accordo, linex, metrix, nexus, rocm_mcp, and uprof_mcp.
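A quick way to sanity-check the install is to try importing each package; this is a minimal sketch that assumes the importable module names match the package names listed above (they may differ per tool):
# Optional sanity check after installation.
# NOTE: module names below are assumed to match the package names listed above.
import importlib

for name in ["accordo", "linex", "metrix", "nexus", "rocm_mcp", "uprof_mcp"]:
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ImportError as exc:
        print(f"{name}: not available ({exc})")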
Install only what you need using extras:
# Accordo only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[accordo]"
# Linex only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[linex]"
# Metrix only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[metrix]"
# Nexus only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[nexus]"
# ROCm-MCP only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[rocm_mcp]"
# uprof-MCP only
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[uprof_mcp]"
# Multiple tools
pip install "git+https://github.com/AMDResearch/intellikit.git#egg=intellikit[nexus,metrix]"git clone https://github.com/AMDResearch/intellikit.git
cd intellikit
# Install all tools in editable mode
pip install -e ".[all]"
# Or install specific tools only
pip install -e ".[accordo]"
pip install -e ".[linex]"
pip install -e ".[metrix]"
pip install -e ".[nexus]"
pip install -e ".[rocm_mcp]"
pip install -e ".[uprof_mcp]"- Python: >= 3.10
- ROCm: >= 6.0 (7.0+ for linex)
- Hardware: MI300+ GPUs
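The sketch below is one way to check these prerequisites from Python; it assumes rocminfo is on PATH and treats the gfx942 ISA target as a stand-in for MI300-series GPUs (both are assumptions, not part of IntelliKit):
# Minimal environment check (sketch): Python version, ROCm tooling, GPU architecture.
import shutil
import subprocess
import sys

assert sys.version_info >= (3, 10), "IntelliKit requires Python >= 3.10"

# ROCm's rocminfo utility lists GPU agents and their gfx targets.
if shutil.which("rocminfo") is None:
    print("rocminfo not found - is ROCm >= 6.0 installed?")
else:
    out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
    # ASSUMPTION: MI300-series GPUs report the gfx942 ISA target.
    print("MI300-class GPU detected" if "gfx942" in out else "No MI300-class GPU found")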
Each tool has its own detailed documentation:
- Accordo Documentation + Examples
- Linex Documentation + Examples
- Metrix Documentation + Examples
- Nexus Documentation + Examples
- ROCm-MCP Documentation + Examples
- uprof-MCP Documentation + Examples
Example workflow combining the tools:
# 1. Profile baseline kernel with Metrix
from metrix import Metrix
profiler = Metrix()
baseline_results = profiler.profile("./app_baseline")
baseline_bw = baseline_results.kernels[0].metrics['memory.hbm_bandwidth_utilization'].avg
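# NOTE (assumption): 'memory.hbm_bandwidth_utilization' is read here as a percentage
# of peak HBM bandwidth, and .avg aggregates over the kernel's dispatches; see the
# Metrix documentation for the exact metric definitions.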
# 2. Extract kernel source with Nexus
from nexus import Nexus
nexus = Nexus()
trace = nexus.run(["./app_baseline"])
for kernel in trace:
    print(kernel.hip)  # Source code
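# The trace also exposes kernel.assembly (see the Nexus quick example above),
# which can help pinpoint what to change in step 3.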
# 3. Apply optimization (external step)
# ... modify kernel ...
# 4. Validate with Accordo
from accordo import Accordo
validator = Accordo(binary="./app_baseline", kernel_name="my_kernel")
ref_snap = validator.capture_snapshot(binary="./app_baseline")
opt_snap = validator.capture_snapshot(binary="./app_opt")
result = validator.compare_snapshots(ref_snap, opt_snap, tolerance=1e-6)
if result.is_valid:
    opt_results = profiler.profile("./app_opt")
    opt_bw = opt_results.kernels[0].metrics['memory.hbm_bandwidth_utilization'].avg
    print(f"✓ PASS: {result.num_arrays_validated} arrays matched")
    print(f"BW utilization improvement: {opt_bw - baseline_bw:.1f} percentage points")

We welcome contributions and feedback! Open an issue or create a PR.
MIT License - Copyright (c) 2025-2026 Advanced Micro Devices, Inc.
See LICENSE for full details.
Need help? Here's how to reach us:
- Issues: Found a bug or have a feature request? Open an issue on GitHub
Made with 🧠 for the future of LLM-assisted GPU development