fix: improve PPT generation skill loading for agent #1471

Open
jjjojoj wants to merge 1 commit into bytedance:main from jjjojoj:fix/ppt-skill-loading
@jjjojoj jjjojoj commented Mar 27, 2026

Summary

Fixes issue #424: PPT generation fails because the agent enters a CLARIFY → PLAN → ACT loop instead of directly loading the ppt-generation skill.

As analyzed by @Lntanohuang in issue #424, the root cause is:

  1. Over-aggressive clarification prompt: The agent asks users for style/page count before acting, instead of loading the skill immediately
  2. Passive skill loading: The agent must self-identify and read SKILL.md — weaker models (e.g. doubao) often fail to do this reliably
  3. PR #1171 ("fix: support PPT generation without GEMINI_API_KEY by adding Path B fallback") only fixed the Path B fallback: it handled a missing GEMINI_API_KEY but did not address proactive skill loading

Changes

Modified backend/packages/harness/deerflow/agents/lead_agent/prompt.py:

1. get_skills_prompt_section() — added SKILL-FIRST PRIORITY block:

  • Explicit PPT keyword matching (Chinese + English)
  • Direct skill file path: /mnt/skills/public/ppt-generation/SKILL.md
  • Explicitly forbids asking for style/page count (the skill has sensible defaults)

2. <critical_reminders> — added Skill-First Priority reminder:

  • Reinforces that PPT tasks must load the skill immediately, without a clarification round-trip
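
For reviewers who want a concrete picture of the two prompt additions above, here is a minimal sketch of how such a block could be assembled. It is not the code in this PR: `build_skill_first_block`, `PPT_KEYWORDS`, and the exact wording of the directive are hypothetical, and only the skill path and the keywords themselves are taken from this description.

```python
# Hypothetical sketch; names and wording are assumptions, not the PR's code.
# The skill path and keywords come from the PR description.
PPT_SKILL_PATH = "/mnt/skills/public/ppt-generation/SKILL.md"
PPT_KEYWORDS = ("PPT", "幻灯片", "生成PPT", "制作PPT")


def build_skill_first_block(keywords, skill_path):
    """Return a prompt fragment telling the agent to load the skill first."""
    keyword_list = ", ".join(keywords)
    lines = [
        "SKILL-FIRST PRIORITY",
        f"- If the request mentions any of: {keyword_list},",
        f"  read {skill_path} immediately.",
        "- Do NOT ask about style or page count first; the skill has defaults.",
    ]
    return "\n".join(lines)


print(build_skill_first_block(PPT_KEYWORDS, PPT_SKILL_PATH))
```

Keeping the keywords and path in constants would let the same values be reused in both the skills section and the critical reminders, so the two places cannot drift apart.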

Result

  • User says "帮我做个PPT" ("help me make a PPT") / "make a PPT" → agent immediately loads the skill
  • Avoids the clarification → replanning → no-action loop
  • Makes PPT generation much more reliable, especially for weaker models
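
The expected behaviour can be illustrated with a small check. This PR relies on the model matching keywords from the prompt; the helper below is only an assumed, code-level equivalent for illustration (`mentions_ppt` and `PPT_KEYWORDS` are hypothetical names), using naive case-insensitive substring matching.

```python
# Hypothetical illustration of the keyword match the prompt asks the model
# to perform; this helper is not part of the PR.
PPT_KEYWORDS = ("PPT", "幻灯片", "生成PPT", "制作PPT")


def mentions_ppt(message: str, keywords=PPT_KEYWORDS) -> bool:
    """Return True if the message contains any PPT keyword, ignoring case."""
    lowered = message.casefold()
    return any(keyword.casefold() in lowered for keyword in keywords)


print(mentions_ppt("帮我做个PPT"))           # True
print(mentions_ppt("make a PPT"))            # True
print(mentions_ppt("summarize this paper"))  # False
```

Substring matching is deliberately crude here: it would also fire on any word that happens to contain "ppt", which is one reason a prompt-level rule may still need the model's judgement.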

Closes #424

Fix issue bytedance#424 - agent does not load ppt-generation skill proactively

Root cause (Lntanohuang's analysis):
- Agent's strong CLARIFY → PLAN → ACT system prompt causes it to ask
  clarifying questions before loading the ppt-generation skill
- Skill matching is fully passive (prompt-guided) — the agent must
  recognize and read_file SKILL.md on its own, which weaker models fail to do
- PR bytedance#1171 only addressed Path B fallback for missing GEMINI_API_KEY

Changes:
- Add SKILL-FIRST PRIORITY section in get_skills_prompt_section() with
  explicit PPT keyword table and immediate skill loading directive
- Add 'Skill-First Priority' reminder in critical_reminders section
- Specify exact skill path /mnt/skills/public/ppt-generation/SKILL.md
- Instruct agent to NOT ask clarification (style/slide count) — skill has
  reasonable defaults and handles parameters itself
- Bilingual keywords: English + Chinese (幻灯片 "slides", 生成PPT "generate PPT", 制作PPT "make PPT")

This makes PPT generation requests recognized and acted upon immediately,
bypassing the CLARIFY step that caused re-planning loops.

CLAassistant commented Mar 27, 2026

CLA assistant check
All committers have signed the CLA.

@WillemJiang
Collaborator

@jjjojoj thanks for your contribution, please click the CLA button to sign the CLA first.

Development

Successfully merging this pull request may close these issues.

PPT generation feature is unusable (生成PPT功能无法使用)

3 participants