feat(chatgpt): add --conv and --aspect to image command #1557

Open
ele-yufo wants to merge 1 commit into jackwener:main from ele-yufo:feat/chatgpt-image-conv-aspect

Summary

Two related additions to `opencli chatgpt image`:

| Flag | Behaviour |
| --- | --- |
| `--conv <id\|url>` | Continue an existing conversation instead of always opening `/new`. Accepts a bare id, `/c/<id>`, or a full `chatgpt.com/c/<id>` URL. |
| `--aspect <ratio>` | Drive ChatGPT's "Choose image aspect ratio" picker through the real UI. Accepts `auto`, `1:1`, `3:4`, `9:16`, `4:3`, `16:9`, plus the named aliases `square`, `portrait`, `story`, `landscape`, `widescreen`. |

The Image tool is now always toggled on (idempotent) so the aspect picker surfaces and the response is routed deterministically through the image generator — instead of relying on the model to interpret "Generate an image of:" prose.

When `--conv` is supplied the user prompt is sent verbatim (the conversation already provides context). Fresh chats keep the legacy `Generate an image of:` / `Edit the attached image:` wrappers for backward compatibility.
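
A minimal sketch of how the three accepted `--conv` forms and the prompt-wrapping rule could be handled. The helper names `parseConversationRef` and `buildPrompt`, the id character set, the `hasAttachment` switch, and the single space after each wrapper are assumptions for illustration; the PR does not show its actual implementation.

```typescript
// Hypothetical helpers; names and details are illustrative, not the PR's code.

/** Extract a conversation id from a bare id, "/c/<id>", or a chatgpt.com URL. */
function parseConversationRef(input: string): string {
  const trimmed = input.trim();
  // Path or URL form: take the segment that follows "/c/".
  const id = trimmed.match(/\/c\/([A-Za-z0-9-]+)/)?.[1];
  if (id) return id;
  // Bare id form: assumed here to be letters, digits, and hyphens only.
  if (/^[A-Za-z0-9-]+$/.test(trimmed)) return trimmed;
  throw new Error(`Unrecognised conversation reference: ${input}`);
}

/** Wrap the prompt for fresh chats; send it verbatim when continuing a thread. */
function buildPrompt(
  userPrompt: string,
  opts: { conv?: string; hasAttachment?: boolean } = {},
): string {
  if (opts.conv) return userPrompt; // the conversation already provides context
  // Assumption: the edit wrapper applies when an image is attached.
  const prefix = opts.hasAttachment
    ? 'Edit the attached image:'
    : 'Generate an image of:';
  return `${prefix} ${userPrompt}`;
}
```

Under these assumptions, `parseConversationRef('https://chatgpt.com/c/abc-123')` and `parseConversationRef('/c/abc-123')` both yield `abc-123`, so the caller can build one canonical `/c/<id>` navigation target.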

Why

The previous command had two limitations:

  1. Always opened a fresh /new — agents couldn't keep iterating on a generated image ("make it darker", "now in 9:16") in the same thread.
  2. Couldn't pick a sub-mode — Auto is the only ratio reachable through the model; the Choose image aspect ratio menu (Auto / Square 1:1 / Portrait 3:4 / Story 9:16 / Landscape 4:3 / Widescreen 16:9) was only operable manually.

Implementation notes

  • Radix menus on chatgpt.com swallow synthetic `button.click()`. Every menu interaction (plus-menu, aspect-ratio menu) dispatches a full pointer sequence (`pointerdown` → `mousedown` → `pointerup` → `mouseup` → `click`) via a shared `POINTER_FIRE_SNIPPET` inlined into `page.evaluate`.
  • Image tool detection: the post-condition for `activateChatGPTImageTool` is the Image pill (`aria-label="Image, click to remove"`). Activation is idempotent: if the pill is already present, the helper returns early without re-opening the menu.
  • Aspect resolution: each user alias maps to ChatGPT's exact `aria-label` on the `menuitemradio` (e.g. `9:16` and `story` both map to `"Story 9:16"`); unknown values throw a typed `ArgumentError` with the choice list.
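
The aspect-resolution rule could be sketched as a lookup table. Only the `"Story 9:16"` label is confirmed by the notes above; the other labels are assumed from the menu entries listed under "Why". The `ArgumentError` class here is a stand-in for the project's own error type, and `resolveAspectLabel` is a hypothetical name.

```typescript
// Stand-in for the project's typed error; the real class is not shown in the PR.
class ArgumentError extends Error {
  readonly code = 'ARGUMENT';
  constructor(message: string) {
    super(message);
    this.name = 'ArgumentError';
  }
}

// Ratio and alias -> aria-label on the menuitemradio.
// Labels other than "Story 9:16" are assumptions based on the menu listing.
const ASPECT_LABELS = new Map<string, string>([
  ['auto', 'Auto'],
  ['1:1', 'Square 1:1'],
  ['square', 'Square 1:1'],
  ['3:4', 'Portrait 3:4'],
  ['portrait', 'Portrait 3:4'],
  ['9:16', 'Story 9:16'],
  ['story', 'Story 9:16'],
  ['4:3', 'Landscape 4:3'],
  ['landscape', 'Landscape 4:3'],
  ['16:9', 'Widescreen 16:9'],
  ['widescreen', 'Widescreen 16:9'],
]);

/** Resolve a user-supplied --aspect value to the label to click, or throw. */
function resolveAspectLabel(input: string): string {
  const label = ASPECT_LABELS.get(input.trim().toLowerCase());
  if (label === undefined) {
    const choices = [...ASPECT_LABELS.keys()].join(', ');
    throw new ArgumentError(`Invalid --aspect "${input}". Choices: ${choices}`);
  }
  return label;
}
```

Using a `Map` rather than a plain object avoids accidental hits on inherited keys such as `constructor`, so any value outside the table fails before the browser is touched.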

Validation

  • `npx tsc --noEmit`: clean
  • `npx vitest run --project adapter clis/chatgpt/`: 34/34 pass (9 new cases: conv navigation, aspect aliases, invalid aspect, activation failure, aspect-apply failure)
  • `npm test`: 3528 passing across 352 files (1 pre-existing skip unrelated to this PR)
  • Live smoke on the browser bridge:
    • `opencli chatgpt image "a fluffy orange cat..." --aspect 9:16`: produces a 941×1672 PNG (941/1672 ≈ 0.5628 ≈ 9/16), proving the aspect menuitem is actually clicked rather than hinted in prose
    • `opencli chatgpt image "make it sleeping" --conv <id> --aspect 9:16`: appends to the existing thread, keeps the ratio
    • `opencli chatgpt image "a cute robot mascot"`: default behaviour unchanged (no flags), same outputs as before
    • `opencli chatgpt image "cat" --aspect panorama`: rejected fast with typed `code: ARGUMENT`

Backwards compatibility

All new args are optional with defaults that match prior behaviour. Existing callers see no change.

One internal behavioural change worth flagging for reviewers: the Image tool pill is now visibly toggled on for every `opencli chatgpt image` call, even when no `--aspect` is passed. Previously the command relied on the `Generate an image of:` prefix to trigger model-side routing; this is more deterministic and produced no regressions in live smoke, but I'm happy to gate the activation behind `--aspect` if reviewers prefer to keep the old default.

Related

Builds naturally on top of #1556 (which fixes a separate `getChatGPTImageAssets` bridge-arg bug that breaks the download path on opencli 1.7.x — without that fix, this PR's live smoke for downloading wouldn't have completed). They're independent and can merge in either order.

Two related additions to `opencli chatgpt image`:

1. `--conv <id|url>` — continue an existing conversation rather than
   always opening a fresh `/new`. Accepts a bare id, `/c/<id>`, or a
   full `chatgpt.com/c/<id>` URL.
2. `--aspect <ratio>` — drive ChatGPT's "Choose image aspect ratio"
   picker through the real UI (real pointer-event sequence on the
   `[role=menuitemradio]` items). Accepts the six ChatGPT presets:
   `auto`, `1:1`, `3:4`, `9:16`, `4:3`, `16:9`, plus the named aliases
   `square`, `portrait`, `story`, `landscape`, `widescreen`.

The Image tool is now always toggled on (idempotent) so the aspect
picker surfaces and the response is routed deterministically through
the image generator regardless of model auto-routing heuristics.

Prompt wrapping is preserved for fresh chats (`Generate an image of:`
/ `Edit the attached image:`) for backward compatibility; when
`--conv` is supplied the prompt is sent verbatim because the
conversation already provides context.

## Implementation notes

- Radix menus on chatgpt.com swallow synthetic `button.click()`. Every
  menu interaction dispatches a full pointer sequence
  (`pointerdown` → `mousedown` → `pointerup` → `mouseup` → `click`)
  via a shared `POINTER_FIRE_SNIPPET` inlined into `page.evaluate`.
- The Image tool pill (`aria-label="Image, click to remove"`) is used
  as the post-condition for `activateChatGPTImageTool`; aspect picker
  is keyed by exact `aria-label` (`Story 9:16`, etc.) on the
  menuitemradio nodes.
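
As an illustration of the pointer sequence described above, here is a hedged sketch of a builder for browser-side source text. The real `POINTER_FIRE_SNIPPET` is not shown in this PR, so `buildPointerFireSnippet`, the selector-based lookup, and the event init options are all assumptions.

```typescript
// Event order taken from the notes above; everything else is illustrative.
const POINTER_SEQUENCE = [
  'pointerdown',
  'mousedown',
  'pointerup',
  'mouseup',
  'click',
] as const;

/**
 * Build JavaScript source that, when evaluated in the page, fires the full
 * sequence on the first element matching `selector`. The source evaluates to
 * true if an element was found and false otherwise.
 */
function buildPointerFireSnippet(selector: string): string {
  return `(() => {
  const el = document.querySelector(${JSON.stringify(selector)});
  if (!el) return false;
  const init = { bubbles: true, cancelable: true, view: window, button: 0 };
  for (const type of ${JSON.stringify(POINTER_SEQUENCE)}) {
    // pointer* events use PointerEvent; the rest use MouseEvent.
    const Ctor = type.startsWith('pointer') ? PointerEvent : MouseEvent;
    el.dispatchEvent(new Ctor(type, init));
  }
  return true;
})()`;
}
```

Assuming the bridge's `page.evaluate` accepts a source string, a call such as `buildPointerFireSnippet('[role="menuitemradio"][aria-label="Story 9:16"]')` would produce the text to pass in. Returning a boolean lets the caller turn a missing menu item into an explicit failure.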

## Validation

- `npx tsc --noEmit` clean
- `npx vitest run --project adapter clis/chatgpt/` — 34/34 pass
  (9 new cases covering conv navigation, aspect aliases, invalid
  aspect, activation failure, aspect-apply failure)
- `npm test` — 3528 passing across 352 files
- Live smoke (logged-in browser bridge):
  - `opencli chatgpt image "a fluffy orange cat..." --aspect 9:16`
    produces a 941×1672 PNG (0.5628 ≈ 9/16) — confirms the aspect
    menuitem is being clicked, not just hinted in the prompt.
  - `opencli chatgpt image "make it sleeping" --conv <id> --aspect 9:16`
    appends to the conversation and keeps the requested ratio.
  - `opencli chatgpt image "a cute robot mascot"` — default behaviour
    unchanged, generates and downloads as before.
  - `opencli chatgpt image "cat" --aspect panorama` rejects fast with
    a typed `ArgumentError`.