Feat/fix issue 1028 by ilonae · Pull Request #1349 · FoundationAgents/OpenManus

ilonae · 2026-04-11T17:10:46Z

Issue

Agent gets stuck in an infinite loop when attempting job searches due to Playwright browser initialization failure.

Fixes #1028

Root Cause Analysis

The issue had five interconnected causes:

1. Generic Error Handling - BrowserUseTool could not distinguish fatal initialization errors from recoverable operation errors
2. Incomplete Stuck Detection - is_stuck() only detected exact duplicate messages and missed repeated error patterns
3. No Tool Failure Tracking - The agent had no mechanism to count consecutive failures per tool
4. Vague System Prompt - Guidance to use "web_search" led to a dead end, because web_search is a sub-action of browser_use and fails whenever the browser does
5. CRITICAL - Prompt Override Bug - Manus.think() switched the LLM response format from tool selection to JSON, which broke recovery

Solutions Implemented

1. Error Classification in BrowserUseTool

Added _classify_error() method to categorize errors:

INIT_FAILED: Playwright initialization errors (fatal, non-recoverable)
OPERATION_FAILED: Browser operation errors (may be recoverable)
ELEMENT_NOT_FOUND: Element interaction errors

Detects Playwright-specific patterns:

"BrowserType.launch"
"Executable doesn't exist"
"playwright install"
"No such file or directory"

Impact: Browser failures are now clearly identified as fatal, triggering recovery mechanisms.
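The classification described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the enum name `BrowserErrorKind`, the free function `classify_error`, and the `ELEMENT_PATTERNS` tuple are assumptions made for the example. Only the three category names and the four Playwright substrings come from the description.

```python
from enum import Enum


class BrowserErrorKind(Enum):
    INIT_FAILED = "INIT_FAILED"              # fatal: Playwright could not start
    OPERATION_FAILED = "OPERATION_FAILED"    # browser operation error, may be recoverable
    ELEMENT_NOT_FOUND = "ELEMENT_NOT_FOUND"  # element interaction error


# Substrings listed in the PR description as signs of a Playwright init failure.
INIT_FAILURE_PATTERNS = (
    "BrowserType.launch",
    "Executable doesn't exist",
    "playwright install",
    "No such file or directory",
)

# Hypothetical pattern: the description does not say how element errors are matched.
ELEMENT_PATTERNS = ("element not found",)


def classify_error(message: str) -> BrowserErrorKind:
    """Map a raw error message to one of the three categories."""
    if any(pattern in message for pattern in INIT_FAILURE_PATTERNS):
        return BrowserErrorKind.INIT_FAILED
    lowered = message.lower()
    if any(pattern in lowered for pattern in ELEMENT_PATTERNS):
        return BrowserErrorKind.ELEMENT_NOT_FOUND
    # Anything else is treated as an ordinary operation failure.
    return BrowserErrorKind.OPERATION_FAILED
```

Checking the initialization patterns first means a message that matches both an init pattern and an element pattern is still reported as fatal.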

2. Multi-Criteria Stuck Detection (BaseAgent.is_stuck)

Detection was extended from a single criterion to three independent criteria:

Criterion 1: Exact duplicate messages (original)
Criterion 2: 3+ error messages in last 5 messages (new)
Criterion 3: Repeated error patterns in tool observations (new)

Impact: Catches stuck states caused by error loops, not just duplicates.
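A rough sketch of how the three criteria might combine is shown below. It is an illustration under stated assumptions, not the implementation in app/agent/base.py: the `Message` dataclass, the `looks_like_error` heuristic (a case-insensitive search for "error"), the 40-character prefix used to compare observations, and the `duplicate_threshold` default are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # e.g. "user", "assistant" or "tool"
    content: str


def looks_like_error(text: str) -> bool:
    # Assumption: an error is any message containing the word "error".
    return "error" in text.lower()


def is_stuck(messages: list[Message], duplicate_threshold: int = 2) -> bool:
    if not messages:
        return False

    # Criterion 1: the latest assistant message repeats earlier ones verbatim.
    last = messages[-1]
    if last.role == "assistant" and last.content:
        duplicates = sum(
            1
            for m in messages[:-1]
            if m.role == "assistant" and m.content == last.content
        )
        if duplicates >= duplicate_threshold:
            return True

    # Criterion 2: three or more error messages among the last five.
    recent = messages[-5:]
    if sum(looks_like_error(m.content) for m in recent) >= 3:
        return True

    # Criterion 3: the same error prefix recurs in recent tool observations.
    prefixes = Counter(
        m.content[:40]
        for m in recent
        if m.role == "tool" and looks_like_error(m.content)
    )
    return any(count >= 2 for count in prefixes.values())
```

Because the criteria are independent, any one of them is enough to flag a stuck state. In this sketch the third criterion catches two identical tool errors even when the second criterion's count of three has not been reached.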

3. Tool Failure Tracking (ToolCallAgent)

Added _tool_failures dict with:

_increment_tool_failure() - Track consecutive failures per tool
_reset_tool_failure() - Reset counter on success
Max threshold: 3 consecutive failures

_tool_failures is declared with PrivateAttr so that it works as a private attribute under Pydantic v2.

Impact: Prevents infinite retries of broken tools; guides toward alternatives.
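The counter logic can be sketched in isolation as below. This is a standalone illustration using only the standard library: the class `ToolFailureTracker` and the methods `count` and `is_broken` are hypothetical names, whereas in the PR the dict lives on ToolCallAgent as a PrivateAttr and the helpers are `_increment_tool_failure()` and `_reset_tool_failure()`.

```python
class ToolFailureTracker:
    """Counts consecutive failures per tool name (illustrative stand-in)."""

    MAX_CONSECUTIVE_FAILURES = 3  # threshold stated in the PR description

    def __init__(self) -> None:
        self._tool_failures: dict[str, int] = {}

    def increment(self, tool_name: str) -> int:
        """Record one more consecutive failure and return the new count."""
        self._tool_failures[tool_name] = self._tool_failures.get(tool_name, 0) + 1
        return self._tool_failures[tool_name]

    def reset(self, tool_name: str) -> None:
        """Clear the counter after a successful call."""
        self._tool_failures.pop(tool_name, None)

    def count(self, tool_name: str) -> int:
        return self._tool_failures.get(tool_name, 0)

    def is_broken(self, tool_name: str) -> bool:
        """True once the tool has failed the maximum number of times in a row."""
        return self.count(tool_name) >= self.MAX_CONSECUTIVE_FAILURES
```

Resetting on success is what makes the count "consecutive": a tool that fails twice, succeeds once, and fails again is back at one failure, not three.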

4. Enhanced System Prompt (NEXT_STEP_PROMPT)

Updated guidance with:

- Clear Browser Error Handling: "STOP using browser_use immediately. Do NOT retry it."
- Concrete Alternatives:
  - Use python_execute with requests/urllib
  - Use ask_human for user assistance
  - Check MCP tools (search_jobs, job_api, etc.)
- Tool Priority When One Fails:
  - Browser fails → Try python_execute or ask_human
  - Tool fails 3+ times → Different tool entirely
  - Stuck → ask_human for help
- Terminal Error Guidance: Playwright errors are terminal and cannot be fixed by agent

Impact: Prevents dead-ends; guides LLM toward viable recovery paths.
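In the PR these priorities are expressed as prompt text for the LLM, not as code. Purely to make the intended decision order concrete, the rules can be written as a small function. The name `choose_next_tool`, its parameters, and the `FALLBACK_ORDER` tuple are assumptions for this sketch.

```python
MAX_CONSECUTIVE_FAILURES = 3

# Assumed fallback order, following the alternatives named in the prompt guidance.
FALLBACK_ORDER = ("python_execute", "ask_human")


def choose_next_tool(failed_tool: str, fatal: bool, failures: dict[str, int]) -> str:
    """Return the tool to try after `failed_tool` reported an error.

    `fatal` marks terminal errors such as a Playwright initialization failure;
    `failures` maps tool names to their consecutive failure counts.
    """
    count = failures.get(failed_tool, 0)
    if not fatal and count < MAX_CONSECUTIVE_FAILURES:
        # Recoverable error and still under the threshold: a retry is allowed.
        return failed_tool
    for candidate in FALLBACK_ORDER:
        if candidate != failed_tool and failures.get(candidate, 0) < MAX_CONSECUTIVE_FAILURES:
            return candidate
    # Every automated alternative is exhausted, so fall back to the user.
    return "ask_human"
```

A fatal error skips the retry budget entirely, which mirrors the "Do NOT retry it" instruction for browser initialization failures.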

5. CRITICAL - Removed Prompt Override Bug (Manus.think)

The Bug: Manus.think() was switching to BrowserAgent's JSON response format, breaking the ToolCallAgent interface.

The Fix: Removed the entire prompt override block. Manus now uses standard ToolCallAgent prompts.

```python
# Before (BROKEN):
self.next_step_prompt = "{json_format_prompt}"  # Breaks tool selection

# After (FIXED):
# Note: We intentionally do NOT override next_step_prompt here.
# When browser fails, we want to switch tools using normal mechanism.
result = await super().think()
```

Impact: Agent can now properly switch between tools when one fails.

Files Modified

| File | Changes | Purpose |
| --- | --- | --- |
| `app/agent/base.py` | +46 lines | Multi-criteria stuck detection |
| `app/agent/toolcall.py` | +35 lines | Tool failure tracking with _tool_failures dict |
| `app/agent/manus.py` | -18 lines | Removed prompt override bug |
| `app/prompt/manus.py` | +21 lines | Enhanced recovery guidance |
| `app/tool/browser_use_tool.py` | +44 lines | Error classification logic |
| `requirements.txt` | 1 line | Fixed pillow version conflict with crawl4ai |

Total: 143 insertions, 28 deletions

Testing

The following tests cover the fix:

- test_issue_1028_standalone.py - Standalone logic validation (no dependencies)
  - Error classification
  - Multi-criteria stuck detection
  - Tool failure tracking
  - Recovery flow simulation
- test_stuck_detection.py - Unit test for stuck detection (fixed to use concrete ToolCallAgent class)
- test_live_recovery.py - Integration test with actual agent code

All tests demonstrate the fix prevents infinite loops and enables proper recovery.

Expected Behavior After Fix

When browser fails during job search:

1. Browser initialization fails with clear error message
2. Tool failure counter increments (1, 2, 3...)
3. After 3 consecutive failures, tool is marked as broken
4. Stuck detection identifies error pattern
5. System prompt guides agent to alternatives
6. Agent switches to python_execute or ask_human
7. Task completes without infinite loop

Checklist

All 5 fixes implemented and tested
Code follows project conventions
Error messages are clear and actionable
Recovery mechanisms are documented
No breaking changes to existing API
Dependencies updated (pillow conflict resolved)

…tuck loop during browser operations

This fix addresses issue FoundationAgents#1028 where agents get stuck in a loop when attempting web searches or browser operations with unavailable Playwright.

**1. Enhanced Error Classification in BrowserUseTool** (app/tool/browser_use_tool.py)
- Added _classify_error() method to distinguish between:
  * Playwright initialization errors (fatal - switch to web_search)
  * Operation failures (may be recoverable)
- Wrapped browser initialization in separate try-catch for better error handling
- Replaced generic exception handler with categorized error responses
- Result: Agent receives clear signal when browser is unavailable

**2. Enhanced Stuck-State Detection in BaseAgent** (app/agent/base.py)
- Expanded is_stuck() from simple duplicate detection to multi-criteria:
  * Criterion 1: Exact duplicate messages (existing)
  * Criterion 2: 3+ error messages in recent history
  * Criterion 3: Repeated error patterns in tool observations
- Updated handle_stuck_state() to guide agent away from retrying same tools
- Result: Agent detects stuck states earlier and attempts recovery strategies

**3. Tool Failure Tracking in ToolCallAgent** (app/agent/toolcall.py)
- Added _tool_failures dict (using PrivateAttr) to track consecutive failures per tool
- Added helper methods: _increment_tool_failure, _reset_tool_failure, _get_tool_failure_count
- Modified observe_tool_results() to:
  * Track failures when tool returns errors
  * Reset counter on success
  * Alert agent after 3 consecutive failures
- Result: Agent recognizes when a tool is broken and tries alternatives

**4. System Prompt Updates** (app/prompt/manus.py)
- Added explicit guidance for handling browser initialization errors
- Documented when to switch away from failing tools
- Clarified that repeated failures indicate unrecoverable errors
- Result: Agent behavior guided toward recovery strategies instead of retries

Previously, when Playwright browser binary was unavailable, the agent would:
1. Receive generic error from browser_use tool
2. Not recognize it as a terminal/fatal error
3. Attempt browser operations repeatedly
4. Fail to detect stuck state (errors weren't exact duplicates)
5. Loop until max steps exceeded

Now the agent:
1. Receives clear "Browser initialization failed" message
2. Detects stuck state via error pattern recognition
3. Recognizes tool has failed 3+ times consecutively
4. Switches to web_search or other alternative tools
5. Completes task without getting stuck

- Error classification: Correctly identifies Playwright initialization errors
- Stuck-state detection: Detects multiple errors and exact duplicates
- Tool failure tracking: Correctly tracks and resets failure counts per tool
- All modified files compile successfully with Python 3.12 and Pydantic v2

…and improve error handling guidance

Critical fixes for agent stuck loop when Playwright browser fails:

1. **Remove prompt override in Manus.think()** (app/agent/manus.py)
   - Manus was overriding the ToolCallAgent prompt with BrowserAgent's JSON response format
   - This caused LLM to output JSON instead of tool selections when browser was used
   - This prevented the agent from switching to alternative tools after browser failures
   - Solution: Use consistent ToolCallAgent prompt so tool selection works reliably

2. **Enhance system prompt with clear recovery strategies** (app/prompt/manus.py)
   - Previous prompt said "use web_search tool" but that's an action within browser_use
   - When browser_use fails (Playwright missing), web_search action also fails
   - New prompt clearly lists available alternatives: python_execute, ask_human, MCP tools
   - Explicit guidance on tool failure recovery and when to stop retrying

Technical details:
- BrowserAgent expects JSON with action/state format
- Manus/ToolCallAgent expects tool function calls
- Mixing these formats confuses the LLM response parsing
- Removing the override ensures consistent tool selection mechanism
- System prompt now gives concrete alternative tools, not dead ends

This combined fix enables the agent to:
- Detect browser initialization failures
- Switch to python_execute (requests/urllib) or ask_human
- Not get stuck in retry loops when tools fail
- Properly utilize tool failure tracking already in place


ilonae added 4 commits April 11, 2026 23:54

fix: resolve pillow version conflict with crawl4ai

73a1480

fix: reorder pre-commit hooks to resolve circular dependency between isort and black

024270c

ilonae force-pushed the feat/fix-issue-1028 branch from c53f183 to 024270c on April 20, 2026 06:35

ilonae wants to merge 4 commits into FoundationAgents:main from ilonae:feat/fix-issue-1028
