🚀 Complete Browser Tools Feature Roadmap
📋 Executive Summary
This issue documents our complete tool strategy combining:
- Our Extension-Based Tools (unique collaborative debugging value)
- Google's chrome-devtools-mcp (25 Puppeteer automation tools)
- New Killer Features (visual precision + pair programming capabilities)
🎯 Core Value Proposition
"The Only Browser Tools Built for Human-AI Pair Programming"
While Puppeteer controls a separate browser, Browser Tools UI works in YOUR actual browser for true real-time collaboration with pixel-perfect precision.
📊 TOOL INVENTORY
🟢 OUR CURRENT TOOLS (Extension-Based - 2/9 Working)
| Tool | Status | Description | Notes |
|---|---|---|---|
| `browser_navigate` | ✅ WORKING | Navigate to URLs | Issue #64 related |
| `browser_screenshot` | | Capture screenshots | Works but response flow needs fix (Issue #64) |
| `browser_click` | 📝 Planned | Click elements | CSP blocked on example.com |
| `browser_type` | 📝 Planned | Type into fields | - |
| `browser_wait` | 📝 Planned | Wait for elements | - |
| `browser_evaluate` | 📝 Planned | Execute JavaScript | - |
| `browser_get_content` | 📝 Planned | Get page HTML/text | - |
| `browser_audit` | 📝 Planned | Lighthouse audits | UNIQUE - Puppeteer doesn't have! |
| `browser_get_console` | ✅ WORKING | Get console logs | Tool exists, need UI sync |
Current Progress: 2/9 tools fully working (browser_screenshot also works, but still needs the response flow fix from Issue #64)
🔵 GOOGLE'S CHROME-DEVTOOLS-MCP TOOLS (Puppeteer - 25 Tools)
Users can install alongside our tools via:
```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    },
    "browser-tools-ui": {
      "command": "node",
      "args": ["path/to/our/server.mjs"]
    }
  }
}
```
Input Automation (7 tools)
- `click` - Click elements
- `drag` - Drag and drop
- `fill` - Fill input fields
- `fill_form` - Fill multi-field forms
- `handle_dialog` - Handle alerts/confirms
- `hover` - Mouse hover
- `upload_file` - File uploads
Navigation Automation (7 tools)
- `close_page` - Close tabs
- `list_pages` - List all tabs
- `navigate_page` - Navigate URLs
- `navigate_page_history` - Back/forward
- `new_page` - Open new tab
- `select_page` - Switch tabs
- `wait_for` - Wait for conditions
Emulation (3 tools)
- `emulate_cpu` - CPU throttling
- `emulate_network` - Network conditions
- `resize_page` - Viewport sizing
Performance (3 tools)
- `performance_start_trace` - Start performance trace
- `performance_stop_trace` - Stop trace
- `performance_analyze_insight` - Analyze metrics (LCP, CLS, TBT)
Note: Covers Core Web Vitals but NOT full Lighthouse (no accessibility/SEO/best-practices)
Network (2 tools)
- `get_network_request` - Get specific request
- `list_network_requests` - List all requests
Debugging (4 tools)
- `evaluate_script` - Execute JavaScript
- `list_console_messages` - Get console logs snapshot
- `take_screenshot` - Capture screenshot
- `take_snapshot` - Get page HTML
⭐ NEW KILLER FEATURES (Proposed - High Value!)
Priority 1: Visual Precision Tools 🎯
| Feature | Tool Name | Difficulty | Value | Description |
|---|---|---|---|---|
| Visual Diff Engine | `browser_visual_diff` | 🟢 Easy | ⭐⭐⭐⭐⭐ | Pixel-level before/after comparison using pixelmatch library |
| Layout Measurement | `browser_measure_layout` | 🟢 Easy | ⭐⭐⭐⭐⭐ | Precise alignment/spacing data via getBoundingClientRect() |
| Computed Styles | `browser_analyze_styles` | 🟡 Medium | ⭐⭐⭐⭐ | Actual applied CSS + conflict detection |
| Accessibility Overlay | `browser_accessibility_overlay` | 🟡 Medium | ⭐⭐⭐⭐ | Visual indicators for a11y issues on screenshots |
Priority 2: Collaborative Debugging 🤝
| Feature | Tool Name | Difficulty | Value | Description |
|---|---|---|---|---|
| Collaborative Debug | `browser_collaborative_debug` | 🟡 Medium | ⭐⭐⭐⭐⭐ | Real-time pair programming via chrome.debugger API |
| Service Worker Debug | `browser_debug_service_worker` | 🟡 Medium | ⭐⭐⭐ | Debug SW in user's actual browser context |
| Live State Inspection | `browser_get_application_state` | 🟢 Easy | ⭐⭐⭐⭐ | localStorage, sessionStorage, cookies, etc. |
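To make the "Live State Inspection" row more concrete, here is a minimal sketch of how storage and cookie data might be collected. The function names (`snapshotStorage`, `parseCookies`, `getApplicationState`) and the return shape are assumptions for illustration, not the tool's actual implementation. It relies only on the standard `Storage` interface (`length`, `key()`, `getItem()`) and the `document.cookie` string format.

```javascript
// Copies every key/value pair out of a Storage-like object.
// Works with window.localStorage / window.sessionStorage, which expose
// `length`, `key(index)` and `getItem(name)`.
function snapshotStorage(storage) {
  const entries = {};
  for (let i = 0; i < storage.length; i += 1) {
    const name = storage.key(i);
    entries[name] = storage.getItem(name);
  }
  return entries;
}

// Splits a document.cookie style string ("a=1; b=2") into an object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const eq = trimmed.indexOf("=");
    const name = eq === -1 ? trimmed : trimmed.slice(0, eq);
    cookies[name] = eq === -1 ? "" : trimmed.slice(eq + 1);
  }
  return cookies;
}

// Hypothetical entry point: the sources are passed in so the function can
// run in a content script or against fakes in a test.
function getApplicationState({ localStorage, sessionStorage, cookie }) {
  return {
    localStorage: snapshotStorage(localStorage),
    sessionStorage: snapshotStorage(sessionStorage),
    cookies: parseCookies(cookie),
  };
}

// In a content script this might be called as:
// getApplicationState({
//   localStorage: window.localStorage,
//   sessionStorage: window.sessionStorage,
//   cookie: document.cookie,
// });
```

Note that `document.cookie` does not include HttpOnly cookies, so a real tool that needs them would have to go through an extension API such as `chrome.cookies` instead.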
🆚 COMPARISON: Our Tools vs Google's MCP
| Capability | Google's MCP | Our Tools |
|---|---|---|
| Total Tools | 25 | 9 core + 7 new = 16 |
| Architecture | Puppeteer (separate browser) | Extension (user's actual browser) |
| Use Case | Headless automation, CI/CD | Interactive pair programming |
| Visibility | Headless OR intrusive windows | Non-intrusive background tabs |
| Collaboration | ❌ No (isolated automation) | ✅ Yes (real-time in same browser) |
| Performance Analysis | Core Web Vitals only | Full Lighthouse (perf + a11y + SEO + best-practices) |
| Visual Precision | ❌ None | ✅ Pixel-diff, measurements, overlay |
| CSS Debugging | ❌ None | ✅ Computed styles + conflicts |
| Real Context Access | ❌ Isolated | ✅ User's actual SW, storage, network |
| Pair Programming | ❌ No | ✅ Breakpoints, shared console, live state |
💡 WHY THESE NEW FEATURES ARE KILLER
Problem Statement:
"Claude always thinks 'it's fixed!' when it isn't" - @user
AI agents can't accurately judge pixel-perfect visual results, leading to wasted iteration loops.
How We Solve It:
1. browser_visual_diff (Pixelmatch Integration)
What Claude Gets:
```json
{
  "pixelsChanged": 1247,
  "percentChanged": 3.2,
  "diffImageBase64": "...",
  "verdict": "CHANGED - verify if intentional",
  "analysis": {
    "majorChanges": ["Header moved 15px down"],
    "minorChanges": ["Button color shifted #ff0000 → #ff0033"]
  }
}
```
Instead of: "Hmm, looks centered now!" ❌
Claude says: "Modified 1,247 pixels (3.2%). Here's the visual diff showing the changes. Header moved 15px down - is this intentional?" ✅
Integration:
- Library: `pixelmatch` (150 lines, zero dependencies, ISC license)
- Before/after automatic snapshots
- Runs in browser (no server needed)
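The core of a before/after comparison can be sketched without any library. The snippet below is a deliberately naive stand-in, not pixelmatch: it counts pixels whose RGBA channels differ beyond a tolerance and fills in the `pixelsChanged`, `percentChanged`, and `verdict` fields from the response above. The function name `diffPixels`, the `tolerance` parameter, and the `"UNCHANGED"` verdict string are assumptions for illustration. pixelmatch itself adds perceptual color distance and anti-aliasing detection, which this sketch does not attempt.

```javascript
// Naive stand-in for a pixel-diff library: counts pixels whose RGBA
// channels differ by more than `tolerance` between two equally sized images.
// `before` and `after` are flat RGBA arrays (4 bytes per pixel), the same
// layout as ImageData.data from a canvas.
function diffPixels(before, after, width, height, tolerance = 0) {
  const expectedLength = width * height * 4;
  if (before.length !== expectedLength || after.length !== expectedLength) {
    throw new Error("Both images must be width * height * 4 bytes long");
  }

  let pixelsChanged = 0;
  for (let i = 0; i < expectedLength; i += 4) {
    // A pixel counts as changed if any channel moves past the tolerance.
    const changed =
      Math.abs(before[i] - after[i]) > tolerance ||
      Math.abs(before[i + 1] - after[i + 1]) > tolerance ||
      Math.abs(before[i + 2] - after[i + 2]) > tolerance ||
      Math.abs(before[i + 3] - after[i + 3]) > tolerance;
    if (changed) pixelsChanged += 1;
  }

  const percentChanged = Number(
    ((pixelsChanged / (width * height)) * 100).toFixed(1)
  );
  return {
    pixelsChanged,
    percentChanged,
    verdict: pixelsChanged === 0 ? "UNCHANGED" : "CHANGED - verify if intentional",
  };
}

// Usage: a 2x2 red image where the first pixel shifts from #ff0000 to #ff0033.
const before = new Uint8ClampedArray([
  255, 0, 0, 255, 255, 0, 0, 255,
  255, 0, 0, 255, 255, 0, 0, 255,
]);
const after = Uint8ClampedArray.from(before);
after[2] = 0x33; // blue channel of the first pixel
console.log(diffPixels(before, after, 2, 2));
```

In the extension, the two buffers would likely come from canvas `ImageData` of the before and after screenshots, and the diff image would be produced by the real library.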
2. browser_measure_layout (Precise Measurements)
What Claude Gets:
```json
{
  "selector": "#hero-button",
  "measurements": {
    "centerX": 687,
    "centerY": 450,
    "width": 200,
    "height": 48
  },
  "viewport": {
    "centerX": 700,
    "centerY": 450
  },
  "analysis": {
    "horizontalAlignment": "13px left of center ❌",
    "verticalAlignment": "centered ✅",
    "recommendations": ["Add margin-left: 13px to center horizontally"]
  }
}
```
Instead of: "Looks centered!" ❌
Claude says: "Button is 13px left of horizontal center. Adding margin-left: 13px..." ✅
Integration:
- Native `getBoundingClientRect()` API
- Simple math calculations
- No dependencies
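The "simple math" can be illustrated with a small pure function. This is a sketch under assumptions: `analyzeAlignment`, its `tolerancePx` parameter, and the wording of the result strings are illustrative, not the tool's actual code. It takes a rect with the same `left`, `top`, `width`, and `height` fields that `getBoundingClientRect()` returns, plus the viewport size.

```javascript
// Compares an element's box with the viewport centre.
// `rect` needs { left, top, width, height }, which is what
// element.getBoundingClientRect() returns in the browser.
// `viewport` is { width, height } (e.g. window.innerWidth / innerHeight).
function analyzeAlignment(rect, viewport, tolerancePx = 0.5) {
  const centerX = rect.left + rect.width / 2;
  const centerY = rect.top + rect.height / 2;
  const viewportCenterX = viewport.width / 2;
  const viewportCenterY = viewport.height / 2;

  // Negative offsets mean the element sits left of / above the centre.
  const offsetX = centerX - viewportCenterX;
  const offsetY = centerY - viewportCenterY;

  const describe = (offset, negativeLabel, positiveLabel) => {
    if (Math.abs(offset) <= tolerancePx) return "centered";
    const side = offset < 0 ? negativeLabel : positiveLabel;
    return `${Math.abs(offset)}px ${side} center`;
  };

  return {
    measurements: { centerX, centerY, width: rect.width, height: rect.height },
    viewport: { centerX: viewportCenterX, centerY: viewportCenterY },
    analysis: {
      horizontalAlignment: describe(offsetX, "left of", "right of"),
      verticalAlignment: describe(offsetY, "above", "below"),
    },
  };
}

// Usage with plain numbers standing in for a real DOMRect:
const result = analyzeAlignment(
  { left: 587, top: 426, width: 200, height: 48 },
  { width: 1400, height: 900 }
);
console.log(result.analysis);
// In a page: analyzeAlignment(el.getBoundingClientRect(),
//   { width: window.innerWidth, height: window.innerHeight })
```

Keeping the math separate from the DOM call means the same function can be unit tested without a browser.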
3. browser_analyze_styles (Computed CSS + Conflicts)
What Claude Gets:
```jsonc
{
  "selector": ".button",
  "computedStyles": {
    "display": "flex",
    "justifyContent": "flex-start", // AH! Not "center"!
    "backgroundColor": "#ff0000"
  },
  "styleOrigins": {
    "justifyContent": "styles.css:145 (.button)",
    "backgroundColor": "inline style (overriding)"
  },
  "conflicts": [
    {
      "property": "backgroundColor",
      "declared": "#00ff00",
      "actual": "#ff0000",
      "reason": "Overridden by inline style (higher specificity)"
    }
  ]
}
```
Instead of: "I set justify-content: center, should be centered!" ❌
Claude says: "Found it! Another rule is setting justify-content: flex-start at styles.css:145. The inline style is overriding your background color. Fixing specificity..." ✅
Integration:
- Native `getComputedStyle()` API
- CSS cascade analysis
- Specificity calculations
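The conflict-detection step might look roughly like the sketch below, which compares declared values with what the browser actually applied. The names `findStyleConflicts` and `normalize` are hypothetical, and the sketch deliberately leaves out cascade and specificity analysis, which is the harder part of this tool.

```javascript
// Compares what the author declared with what the browser actually applied.
// `declared` and `computed` are plain { property: value } objects. In the
// browser, `computed` could be filled from getComputedStyle(element).
function findStyleConflicts(declared, computed) {
  const conflicts = [];
  for (const [property, declaredValue] of Object.entries(declared)) {
    const actual = computed[property];
    // Skip properties we have no computed value for.
    if (actual === undefined) continue;
    if (normalize(actual) !== normalize(declaredValue)) {
      conflicts.push({ property, declared: declaredValue, actual });
    }
  }
  return conflicts;
}

// Very light normalisation: trims whitespace and lower-cases the value.
function normalize(value) {
  return String(value).trim().toLowerCase();
}

// Usage with the values from the example response above:
const conflicts = findStyleConflicts(
  { display: "flex", justifyContent: "center", backgroundColor: "#00ff00" },
  { display: "flex", justifyContent: "flex-start", backgroundColor: "#ff0000" }
);
console.log(conflicts);
```

One caveat: `getComputedStyle()` reports resolved values (for example `rgb(255, 0, 0)` rather than `#ff0000`), so a real implementation would need to normalize colors and units before comparing, or it would report false conflicts.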
4. browser_collaborative_debug (Real-Time Pair Programming)
Puppeteer CAN'T do this!
What This Enables:
```javascript
// Claude sets breakpoint in USER'S browser
await browser_set_breakpoint({
  file: "app.js",
  line: 42,
  condition: "user.role === 'admin'"
});
// Code runs → hits breakpoint → user's DevTools pauses
// User: "Claude, look at this variable value!"
// Claude sees it through console/screenshot
// We debug TOGETHER in the same browser!
```
Key Capabilities:
- ✅ Access user's real service workers
- ✅ Debug with real localStorage/sessionStorage/cookies
- ✅ See actual network requests (with auth tokens, CORS issues)
- ✅ Set breakpoints collaboratively
- ✅ True pair programming (same browser, same context)
Integration:
- `chrome.debugger` API
- Chrome DevTools Protocol
- Careful permission handling
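As a rough sketch of how the breakpoint call could map onto `chrome.debugger` and the DevTools Protocol, the snippet below attaches to a tab, enables the `Debugger` domain, and sends `Debugger.setBreakpointByUrl`. The helper names (`buildBreakpointParams`, `setBreakpoint`) and the choice to match the file by URL suffix are assumptions for illustration. The debugger API is passed in as a parameter so the flow can be exercised with a stub outside the browser.

```javascript
// Builds the parameters for the CDP command Debugger.setBreakpointByUrl.
// CDP line numbers are 0-based, while editors show 1-based lines.
function buildBreakpointParams({ file, line, condition }) {
  const params = {
    // Match any script URL that ends with the given file name.
    urlRegex: `${escapeRegExp(file)}$`,
    lineNumber: line - 1,
  };
  if (condition) params.condition = condition;
  return params;
}

// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// `debuggerApi` is expected to look like chrome.debugger, with attach() and
// sendCommand() returning promises (as they do in Manifest V3). It is
// injected so the sketch can run against a stub.
async function setBreakpoint(debuggerApi, tabId, breakpoint) {
  const target = { tabId };
  await debuggerApi.attach(target, "1.3"); // required CDP protocol version
  await debuggerApi.sendCommand(target, "Debugger.enable");
  return debuggerApi.sendCommand(
    target,
    "Debugger.setBreakpointByUrl",
    buildBreakpointParams(breakpoint)
  );
}

// In an extension service worker this might be called as:
// await setBreakpoint(chrome.debugger, tab.id, {
//   file: "app.js",
//   line: 42,
//   condition: "user.role === 'admin'",
// });
```

The extension needs the `"debugger"` permission in its manifest, and Chrome shows the user a visible notice while a debugger is attached, which fits the "careful permission handling" point above. A fuller version would also handle the case where a debugger is already attached to the tab and would detach when the session ends.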
🎨 UI IMPROVEMENTS
Rename Extension Panel Console
Issue: Confusing to have two "Console" tabs (extension panel + DevTools)
Proposal: Rename extension panel console to:
- Option A: "Activity Log" (shows what Claude is doing)
- Option B: "AI Console" (makes purpose clear)
- Option C: "Session Log" (broader than just console)
- Option D: "Pair Log" (emphasizes collaboration)
User preference?
Console Sync for Pair Programming
Current:
- Claude gets console via `browser_get_console` tool
- User sees console in DevTools
- Not synchronized!
Proposal:
- Extension panel shows SAME console errors/logs
- Real-time updates as they happen
- Color-coded by level (error=red, warn=yellow, etc.)
- Click to jump to DevTools console for details
- Claude and user literally "looking at same console"
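One way to get both sides "looking at the same console" is a single shared log that the panel subscribes to and that `browser_get_console` reads from. The sketch below is illustrative only: `createSharedConsoleLog`, `toPanelEntry`, the message shape (`level`, `text`, `timestamp`), the `maxEntries` cap, and the colors for `info` and `log` are all assumptions. Only error=red and warn=yellow come from the proposal above.

```javascript
// Colour per console level. error/warn follow the proposal; the others
// are placeholder choices.
const LEVEL_COLORS = { error: "red", warn: "yellow", info: "blue", log: "gray" };

// Converts a raw console message into the entry the panel would render.
function toPanelEntry(message) {
  return {
    level: message.level,
    text: message.text,
    timestamp: message.timestamp,
    color: LEVEL_COLORS[message.level] ?? LEVEL_COLORS.log,
  };
}

// Keeps one ordered log that both the panel (via subscribe) and the
// browser_get_console tool (via snapshot) read from.
function createSharedConsoleLog(maxEntries = 500) {
  const entries = [];
  const listeners = new Set();
  return {
    push(message) {
      const entry = toPanelEntry(message);
      entries.push(entry);
      // Drop the oldest entry once the cap is exceeded.
      if (entries.length > maxEntries) entries.shift();
      for (const listener of listeners) listener(entry);
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener); // call to unsubscribe
    },
    snapshot() {
      return entries.slice();
    },
  };
}

// Usage: the panel subscribes for live updates, the tool takes a snapshot.
const sharedLog = createSharedConsoleLog();
sharedLog.subscribe((entry) => console.log(`[${entry.color}] ${entry.text}`));
sharedLog.push({ level: "error", text: "Uncaught TypeError", timestamp: 1 });
console.log(sharedLog.snapshot().length);
```

Because the panel and the tool read from the same array, there is no separate state to drift out of sync.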
Benefits:
- ✅ True pair programming experience
- ✅ User sees what Claude is seeing
- ✅ Faster debugging (no "did you check the console?" confusion)
📅 IMPLEMENTATION ROADMAP
Phase 1: Foundation (✅ Done)
- Chrome extension + HTTP bridge + MCP server
- `browser_navigate` working
- `browser_screenshot` working (needs response flow fix)
- `browser_get_console` working
Phase 2: Complete Core Tools (2 weeks)
- Fix `browser_screenshot` WebSocket response (Issue #64: Enhance Screenshot Directory UI and Fix File Saving)
- Implement `browser_click`
- Implement `browser_type`
- Implement `browser_wait`
- Implement `browser_evaluate`
- Implement `browser_get_content`
- Implement `browser_audit` (Lighthouse)
Phase 3: Visual Precision Tools (2 weeks) ⭐ KILLER FEATURES
- `browser_visual_diff` (pixelmatch integration)
- `browser_measure_layout` (precise measurements)
- `browser_analyze_styles` (computed CSS)
- `browser_accessibility_overlay` (visual a11y indicators)
Phase 4: Collaborative Features (2 weeks) 🤝
- Rename extension panel console (decide naming)
- Console sync (real-time updates in panel)
- `browser_collaborative_debug` (chrome.debugger API)
- `browser_get_application_state` (live state inspection)
Phase 5: Polish & Documentation (1 week)
- Integration testing
- Documentation updates
- Example workflows
- Beta testing
🎯 SUCCESS METRICS
For Users:
- ✅ Pixel-perfect results first try (no more "Claude thinks it's fixed" loops)
- ✅ True pair programming experience (same browser, real-time)
- ✅ Full Lighthouse + visual precision (unique value vs Google's MCP)
For Claude:
- ✅ Quantitative data instead of "looks good to me!"
- ✅ Precise measurements (13px off center vs "centered")
- ✅ CSS debugging clarity (see actual applied styles + conflicts)
- ✅ Visual diffs (before/after pixel comparison)
🚀 COMPETITIVE POSITIONING
Google's chrome-devtools-mcp:
- ✅ 25 automation tools
- ✅ Puppeteer-based
- ✅ Great for CI/CD, headless testing
- ❌ Separate browser (isolated)
- ❌ No real-time collaboration
- ❌ No visual precision tools
- ❌ Limited performance analysis (Core Web Vitals only)
Our browser-tools-ui:
- ✅ 16 tools (9 core + 7 unique)
- ✅ Extension-based (user's real browser)
- ✅ Real-time pair programming
- ✅ Pixel-perfect visual precision
- ✅ Full Lighthouse (perf + a11y + SEO + best-practices)
- ✅ True collaborative debugging
- ✅ Access to real context (SW, storage, network)
Tagline:
"Google's MCP is for automation. Ours is for collaboration."
📚 TECHNICAL REFERENCES
Libraries & APIs:
- pixelmatch: https://github.com/mapbox/pixelmatch (ISC license, 150 lines)
- getBoundingClientRect: https://developer.mozilla.org/en-US/docs/Web/API/Element/getBoundingClientRect
- getComputedStyle: https://developer.mozilla.org/en-US/docs/Web/API/Window/getComputedStyle
- chrome.debugger API: https://developer.chrome.com/docs/extensions/reference/api/debugger
- Lighthouse: https://github.com/GoogleChrome/lighthouse (Apache-2.0)
Research:
- Google's MCP: https://github.com/ChromeDevTools/chrome-devtools-mcp
- Visual regression tools: Percy, Applitools, VisBug
- CSS debugging: Pesticide, PerfectPixel, CSS Peeper
🤝 DECISION POINTS
1. Extension Panel Console Naming
What should we rename the extension panel console to avoid confusion with DevTools console?
- Activity Log
- AI Console
- Session Log
- Pair Log
- Other: ___________
2. Priority Order for Phase 3 Features
Which visual precision tool should we build first?
- Visual Diff (before/after comparison)
- Layout Measurement (precise alignment data)
- Computed Styles (CSS debugging)
- Accessibility Overlay
3. Google MCP Integration Strategy
How should we position relative to Google's chrome-devtools-mcp?
- Complementary (recommend using both)
- Alternative (our extension-based approach vs their Puppeteer)
- Hybrid (our unique tools + optional Google MCP for automation)
📊 NEXT STEPS
- User feedback on naming (extension panel console)
- Prioritize Phase 3 features (visual precision tools)
- Fix Issue #64, "Enhance Screenshot Directory UI and Fix File Saving" (screenshot response flow)
- Implement visual diff MVP (prove killer feature value)
- Document pair programming workflows
💬 DISCUSSION
Open for team discussion on priorities, naming, and implementation strategy!