Add requesting-human-help skill for structured human-in-the-loop collaboration #673
Open: pratyush618 wants to merge 2 commits into obra:main from pratyush618:feat/requesting-human-help-skill
`@@ -0,0 +1,124 @@`

---
name: requesting-human-help
description: Use when blocked by capability limits (UI testing, local execution, VPN-only systems, MFA/captcha), or before irreversible/high-risk actions (deleting data, deploying to production, sending external messages, handling credentials) that require human judgment or approval
---

# Requesting Human Help

## Overview

Ad hoc help requests fail: they're inconsistent, lack context, and return unverifiable responses.

**Core principle:** Turn human collaboration into a structured, evidence-driven, auditable request with explicit acceptance criteria.

## When to Use

**Capability/access boundaries:**
- Testing UI on a real device or browser you cannot control
- Running commands on a local machine or VPN-only system
- Completing flows requiring MFA, CAPTCHA, or physical hardware
- Subjective visual checks ("does this look right?")

**High-risk / high-uncertainty steps:**
- Deleting data, dropping tables, wiping storage
- Deploying to production or staging environments
- Sending external emails, Slack messages, or notifications
- Handling or rotating sensitive credentials
- Any irreversible action where being wrong is costly

**Do NOT use for:**
- Questions you can answer by reading files, docs, or web search
- Low-risk, reversible local actions you can attempt yourself
- Anything recoverable you should just try first

## The Request Format

Present every help request as a structured block. Include ALL fields; missing fields are the top cause of execution errors.

```text
## Human Help Needed

**Goal:** [One sentence: what outcome is needed]

**Involvement level:** [clarification | execution | approval/takeover]
  - clarification: human answers a question so the agent can continue
  - execution: human performs steps the agent cannot
  - approval/takeover: human must approve or own the action before agent proceeds

**Why I can't do this:** [Specific blocker: capability limit or risk reason]

**Context:**
- [Relevant state: what has already been done, what the system looks like]
- [File paths, URLs, service names, environment]

**Prerequisites before starting:**
- [ ] [What must be true / set up before the human begins]

**Steps:**
1. [Explicit, numbered, unambiguous instruction]
2. [Each step should be doable without guessing]
3. ...

**Expected output / evidence needed:**
- [What to capture: screenshot, log output, command result, confirmation text]
- [Format: paste text output, attach screenshot, confirm yes/no]

**Acceptance criteria:**
- [ ] [Specific, verifiable condition that means "this worked"]
- [ ] [What distinguishes success from partial success]

**If something goes wrong:** [Who to contact or how to escalate]
```

## Validating the Human Response

When the human responds, verify before proceeding:

```text
FOR EACH acceptance criterion:
  - Is it addressed in the response?
  - Is evidence provided (log, screenshot, output)?
  - Does the evidence confirm the criterion?

IF any criterion unmet:
  → Request ONLY the missing piece (minimal follow-up)
  → Do NOT re-ask everything

IF all criteria met:
  → State: "Confirmed: [criterion 1], [criterion 2]. Proceeding."
  → Continue workflow
```

**Never accept "looks good" or "done" without artifacts.** A screenshot or pasted output is the minimum bar for irreversible actions.

## The Audit Chain

Every request creates a record:

```text
REQUEST → [structured block above]
HUMAN ACTION → [what they did]
EVIDENCE → [artifact they returned]
AGENT DECISION → [what you decided based on evidence]
```

Log this chain in your response so future debugging has a clear trail.

## Red Flags: STOP

- Attempting irreversible action without explicit human approval
- Proceeding because human said "go ahead" with no evidence
- Asking for help without prerequisites listed (human will get stuck)
- Accepting partial confirmation and assuming the rest is fine
- Re-asking the entire request when only one piece is missing

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| Vague goal ("deploy the thing") | One-sentence outcome with system + environment |
| Missing prerequisites | List what must be true before step 1 |
| Ambiguous steps ("configure it") | Exact commands, menu paths, field values |
| No evidence requested | Always specify what to capture and how |
| Accepting "done" without artifact | Ask for the specific log or screenshot |
| Over-escalating routine actions | Only escalate capability limits and irreversible risks |
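
The validation loop and audit chain in the skill above could be sketched in Python roughly as follows. This is an illustrative sketch only and is not part of the PR: `CriterionCheck`, `unmet`, `decide`, and `audit_record` are hypothetical names, and deciding whether a given artifact really confirms a criterion is assumed to be done by the caller before the flags are set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CriterionCheck:
    """One acceptance criterion plus what the human response showed (hypothetical type)."""
    criterion: str
    addressed: bool           # the response mentions this criterion
    evidence: Optional[str]   # artifact: pasted log, screenshot path, command output
    confirmed: bool           # the evidence actually supports the criterion


def unmet(checks):
    """Return criteria that are not addressed, lack evidence, or are not confirmed."""
    return [
        c.criterion
        for c in checks
        if not (c.addressed and c.evidence and c.confirmed)
    ]


def decide(checks):
    """Ask only for the missing pieces, or confirm and proceed when all are met."""
    if not checks:
        raise ValueError("at least one acceptance criterion is required")
    missing = unmet(checks)
    if missing:
        # Minimal follow-up: name only the unmet criteria, do not re-ask everything.
        return "Follow-up needed only for: " + "; ".join(missing)
    return "Confirmed: " + ", ".join(c.criterion for c in checks) + ". Proceeding."


def audit_record(request, human_action, checks):
    """Build the REQUEST / HUMAN ACTION / EVIDENCE / AGENT DECISION chain as a dict."""
    return {
        "REQUEST": request,
        "HUMAN ACTION": human_action,
        "EVIDENCE": [c.evidence for c in checks if c.evidence],
        "AGENT DECISION": decide(checks),
    }


if __name__ == "__main__":
    # Example values are made up for illustration.
    checks = [
        CriterionCheck("migration applied", True, "log: 42 rows migrated", True),
        CriterionCheck("service healthy", True, None, False),  # "done" with no artifact
    ]
    print(decide(checks))
```

In this example the second criterion has no artifact, so `decide` asks for that single piece rather than repeating the whole request, which mirrors the "minimal follow-up" rule in the skill.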