Skip to content

🚀 Complete Browser Tools Feature Roadmap: Extension-Based Tools vs Google's Puppeteer MCP #70

@ahelme

Description

@ahelme

🚀 Complete Browser Tools Feature Roadmap

📋 Executive Summary

This issue documents our complete tool strategy combining:

  • Our Extension-Based Tools (unique collaborative debugging value)
  • Google's chrome-devtools-mcp (25 Puppeteer automation tools)
  • New Killer Features (visual precision + pair programming capabilities)

🎯 Core Value Proposition

"The Only Browser Tools Built for Human-AI Pair Programming"

While Puppeteer controls a separate browser, Browser Tools UI works in YOUR actual browser for true real-time collaboration with pixel-perfect precision.


📊 TOOL INVENTORY

🟢 OUR CURRENT TOOLS (Extension-Based - 2/9 Working)

Tool Status Description Notes
browser_navigate WORKING Navigate to URLs Issue #64 related
browser_screenshot ⚠️ PARTIAL Capture screenshots Works but response flow needs fix (Issue #64)
browser_click 📝 Planned Click elements CSP blocked on example.com
browser_type 📝 Planned Type into fields -
browser_wait 📝 Planned Wait for elements -
browser_evaluate 📝 Planned Execute JavaScript -
browser_get_content 📝 Planned Get page HTML/text -
browser_audit 📝 Planned Lighthouse audits UNIQUE - Puppeteer doesn't have!
browser_get_console WORKING Get console logs Tool exists, need UI sync

Current Progress: 2-3/9 tools working


🔵 GOOGLE'S CHROME-DEVTOOLS-MCP TOOLS (Puppeteer - 25 Tools)

Users can install alongside our tools via:

{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    },
    "browser-tools-ui": {
      "command": "node",
      "args": ["path/to/our/server.mjs"]
    }
  }
}

Input Automation (7 tools)

  1. click - Click elements
  2. drag - Drag and drop
  3. fill - Fill input fields
  4. fill_form - Fill multi-field forms
  5. handle_dialog - Handle alerts/confirms
  6. hover - Mouse hover
  7. upload_file - File uploads

Navigation Automation (7 tools)

  1. close_page - Close tabs
  2. list_pages - List all tabs
  3. navigate_page - Navigate URLs
  4. navigate_page_history - Back/forward
  5. new_page - Open new tab
  6. select_page - Switch tabs
  7. wait_for - Wait for conditions

Emulation (3 tools)

  1. emulate_cpu - CPU throttling
  2. emulate_network - Network conditions
  3. resize_page - Viewport sizing

Performance (3 tools)

  1. performance_start_trace - Start performance trace
  2. performance_stop_trace - Stop trace
  3. performance_analyze_insight - Analyze metrics (LCP, CLS, TBT)

Note: Covers Core Web Vitals but NOT full Lighthouse (no accessibility/SEO/best-practices)

Network (2 tools)

  1. get_network_request - Get specific request
  2. list_network_requests - List all requests

Debugging (3 tools)

  1. evaluate_script - Execute JavaScript
  2. list_console_messages - Get console logs snapshot
  3. take_screenshot - Capture screenshot
  4. take_snapshot - Get page HTML

⭐ NEW KILLER FEATURES (Proposed - High Value!)

Priority 1: Visual Precision Tools 🎯

Feature Tool Name Difficulty Value Description
Visual Diff Engine browser_visual_diff 🟢 Easy ⭐⭐⭐⭐⭐ Pixel-level before/after comparison using pixelmatch library
Layout Measurement browser_measure_layout 🟢 Easy ⭐⭐⭐⭐⭐ Precise alignment/spacing data via getBoundingClientRect()
Computed Styles browser_analyze_styles 🟡 Medium ⭐⭐⭐⭐ Actual applied CSS + conflict detection
Accessibility Overlay browser_accessibility_overlay 🟡 Medium ⭐⭐⭐⭐ Visual indicators for a11y issues on screenshots

Priority 2: Collaborative Debugging 🤝

Feature Tool Name Difficulty Value Description
Collaborative Debug browser_collaborative_debug 🟡 Medium ⭐⭐⭐⭐⭐ Real-time pair programming via chrome.debugger API
Service Worker Debug browser_debug_service_worker 🟡 Medium ⭐⭐⭐ Debug SW in user's actual browser context
Live State Inspection browser_get_application_state 🟢 Easy ⭐⭐⭐⭐ localStorage, sessionStorage, cookies, etc.

🆚 COMPARISON: Our Tools vs Google's MCP

Capability Google's MCP Our Tools
Total Tools 25 9 core + 7 new = 16
Architecture Puppeteer (separate browser) Extension (user's actual browser)
Use Case Headless automation, CI/CD Interactive pair programming
Visibility Headless OR intrusive windows Non-intrusive background tabs
Collaboration ❌ No (isolated automation) ✅ Yes (real-time in same browser)
Performance Analysis Core Web Vitals only Full Lighthouse (perf + a11y + SEO + best-practices)
Visual Precision ❌ None ✅ Pixel-diff, measurements, overlay
CSS Debugging ❌ None ✅ Computed styles + conflicts
Real Context Access ❌ Isolated ✅ User's actual SW, storage, network
Pair Programming ❌ No ✅ Breakpoints, shared console, live state

💡 WHY THESE NEW FEATURES ARE KILLER

Problem Statement:

"Claude always thinks 'it's fixed!' when it isn't" - @user

AI agents can't accurately judge pixel-perfect visual results, leading to wasted iteration loops.

How We Solve It:

1. browser_visual_diff (Pixelmatch Integration)

What Claude Gets:

{
  "pixelsChanged": 1247,
  "percentChanged": 3.2,
  "diffImageBase64": "...",
  "verdict": "CHANGED - verify if intentional",
  "analysis": {
    "majorChanges": ["Header moved 15px down"],
    "minorChanges": ["Button color shifted #ff0000 → #ff0033"]
  }
}

Instead of: "Hmm, looks centered now!" ❌
Claude says: "Modified 1,247 pixels (3.2%). Here's the visual diff showing the changes. Header moved 15px down - is this intentional?" ✅

Integration:

  • Library: pixelmatch (150 lines, zero dependencies, ISC license)
  • Before/after automatic snapshots
  • Runs in browser (no server needed)

2. browser_measure_layout (Precise Measurements)

What Claude Gets:

{
  "selector": "#hero-button",
  "measurements": {
    "centerX": 447,
    "centerY": 144,
    "width": 200,
    "height": 48
  },
  "viewport": {
    "centerX": 700,
    "centerY": 450
  },
  "analysis": {
    "horizontalAlignment": "13px left of center ❌",
    "verticalAlignment": "centered ✅",
    "recommendations": ["Add margin-left: 13px to center horizontally"]
  }
}

Instead of: "Looks centered!" ❌
Claude says: "Button is 13px left of horizontal center. Adding margin-left: 13px..." ✅

Integration:

  • Native getBoundingClientRect() API
  • Simple math calculations
  • No dependencies

3. browser_analyze_styles (Computed CSS + Conflicts)

What Claude Gets:

{
  "selector": ".button",
  "computedStyles": {
    "display": "flex",
    "justifyContent": "flex-start", // AH! Not "center"!
    "backgroundColor": "#ff0000"
  },
  "styleOrigins": {
    "justifyContent": "styles.css:145 (.button)",
    "backgroundColor": "inline style (overriding)"
  },
  "conflicts": [
    {
      "property": "backgroundColor",
      "declared": "#00ff00",
      "actual": "#ff0000",
      "reason": "Overridden by inline style (higher specificity)"
    }
  ]
}

Instead of: "I set justify-content: center, should be centered!" ❌
Claude says: "Found it! Another rule is setting justify-content: flex-start at styles.css:145. The inline style is overriding your background color. Fixing specificity..." ✅

Integration:

  • Native getComputedStyle() API
  • CSS cascade analysis
  • Specificity calculations

4. browser_collaborative_debug (Real-Time Pair Programming)

Puppeteer CAN'T do this!

What This Enables:

// Claude sets breakpoint in USER'S browser
await browser_set_breakpoint({
  file: "app.js",
  line: 42,
  condition: "user.role === 'admin'"
});

// Code runs → hits breakpoint → user's DevTools pauses
// User: "Claude, look at this variable value!"
// Claude sees it through console/screenshot
// We debug TOGETHER in the same browser!

Key Capabilities:

  • ✅ Access user's real service workers
  • ✅ Debug with real localStorage/sessionStorage/cookies
  • ✅ See actual network requests (with auth tokens, CORS issues)
  • ✅ Set breakpoints collaboratively
  • ✅ True pair programming (same browser, same context)

Integration:

  • chrome.debugger API
  • Chrome DevTools Protocol
  • Careful permission handling

🎨 UI IMPROVEMENTS

Rename Extension Panel Console

Issue: Confusing to have two "Console" tabs (extension panel + DevTools)

Proposal: Rename extension panel console to:

  • Option A: "Activity Log" (shows what Claude is doing)
  • Option B: "AI Console" (makes purpose clear)
  • Option C: "Session Log" (broader than just console)
  • Option D: "Pair Log" (emphasizes collaboration)

User preference?

Console Sync for Pair Programming

Current:

  • Claude gets console via browser_get_console tool
  • User sees console in DevTools
  • Not synchronized!

Proposal:

  • Extension panel shows SAME console errors/logs
  • Real-time updates as they happen
  • Color-coded by level (error=red, warn=yellow, etc.)
  • Click to jump to DevTools console for details
  • Claude and user literally "looking at same console"

Benefits:

  • ✅ True pair programming experience
  • ✅ User sees what Claude is seeing
  • ✅ Faster debugging (no "did you check the console?" confusion)

📅 IMPLEMENTATION ROADMAP

Phase 1: Foundation (✅ Done)

  • Chrome extension + HTTP bridge + MCP server
  • browser_navigate working
  • browser_screenshot working (needs response flow fix)
  • browser_get_console working

Phase 2: Complete Core Tools (2 weeks)

Phase 3: Visual Precision Tools (2 weeks) ⭐ KILLER FEATURES

  • browser_visual_diff (pixelmatch integration)
  • browser_measure_layout (precise measurements)
  • browser_analyze_styles (computed CSS)
  • browser_accessibility_overlay (visual a11y indicators)

Phase 4: Collaborative Features (2 weeks) 🤝

  • Rename extension panel console (decide naming)
  • Console sync (real-time updates in panel)
  • browser_collaborative_debug (chrome.debugger API)
  • browser_get_application_state (live state inspection)

Phase 5: Polish & Documentation (1 week)

  • Integration testing
  • Documentation updates
  • Example workflows
  • Beta testing

🎯 SUCCESS METRICS

For Users:

  • ✅ Pixel-perfect results first try (no more "Claude thinks it's fixed" loops)
  • ✅ True pair programming experience (same browser, real-time)
  • ✅ Full Lighthouse + visual precision (unique value vs Google's MCP)

For Claude:

  • ✅ Quantitative data instead of "looks good to me!"
  • ✅ Precise measurements (13px off center vs "centered")
  • ✅ CSS debugging clarity (see actual applied styles + conflicts)
  • ✅ Visual diffs (before/after pixel comparison)

🚀 COMPETITIVE POSITIONING

Google's chrome-devtools-mcp:

  • ✅ 25 automation tools
  • ✅ Puppeteer-based
  • ✅ Great for CI/CD, headless testing
  • ❌ Separate browser (isolated)
  • ❌ No real-time collaboration
  • ❌ No visual precision tools
  • ❌ Limited performance analysis (Core Web Vitals only)

Our browser-tools-ui:

  • ✅ 16 tools (9 core + 7 unique)
  • ✅ Extension-based (user's real browser)
  • Real-time pair programming
  • Pixel-perfect visual precision
  • Full Lighthouse (perf + a11y + SEO + best-practices)
  • True collaborative debugging
  • Access to real context (SW, storage, network)

Tagline:

"Google's MCP is for automation. Ours is for collaboration."


📚 TECHNICAL REFERENCES

Libraries & APIs:

Research:


🤝 DECISION POINTS

1. Extension Panel Console Naming

What should we rename the extension panel console to avoid confusion with DevTools console?

  • Activity Log
  • AI Console
  • Session Log
  • Pair Log
  • Other: ___________

2. Priority Order for Phase 3 Features

Which visual precision tool should we build first?

  • Visual Diff (before/after comparison)
  • Layout Measurement (precise alignment data)
  • Computed Styles (CSS debugging)
  • Accessibility Overlay

3. Google MCP Integration Strategy

How should we position relative to Google's chrome-devtools-mcp?

  • Complementary (recommend using both)
  • Alternative (our extension-based approach vs their Puppeteer)
  • Hybrid (our unique tools + optional Google MCP for automation)

📊 NEXT STEPS

  1. User feedback on naming (extension panel console)
  2. Prioritize Phase 3 features (visual precision tools)
  3. Fix Issue 📸 Enhance Screenshot Directory UI and Fix File Saving #64 (screenshot response flow)
  4. Implement visual diff MVP (prove killer feature value)
  5. Document pair programming workflows

💬 DISCUSSION

Open for team discussion on priorities, naming, and implementation strategy!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions