Skip to content

Guard against empty/None/truncated provider responses in rto, self_consistency, reread, leap#311

Open
SuperMarioYL wants to merge 1 commit into
algorithmicsuperintelligence:mainfrom
SuperMarioYL:fix/response-validation-remaining-approaches
Open

Guard against empty/None/truncated provider responses in rto, self_consistency, reread, leap#311
SuperMarioYL wants to merge 1 commit into
algorithmicsuperintelligence:mainfrom
SuperMarioYL:fix/response-validation-remaining-approaches

Conversation

@SuperMarioYL
Copy link
Copy Markdown

Summary

Several optimization approaches access response.choices[0].message.content
directly without validating the provider response first. When an upstream
provider returns an empty choices list, a None content, or a
length-truncated completion, this raises an uncaught IndexError /
TypeError and aborts the request with an opaque error.

This extends the response-validation idiom already merged in moa.py,
bon.py, plansearch.py and mcts.py to the four remaining approach modules:

File Before After
rto.py 4 raw .choices[0] accesses in the sequential C1→Q2→C2→C3 pipeline; empty choices → IndexError, None content corrupts the next prompt each step validated, raises an informative error
self_consistency.py None content appended to the sample list then fed to SequenceMatcher in calculate_similarityTypeError empty/None/truncated samples are skipped; degrades to the existing "No consistent answer found." fallback
reread.py None.strip() / empty choices crash; multi-choice path keeps Nones validated; multi-choice path keeps only usable, non-truncated choices
leap.py 5 raw accesses, several feeding extract_output / .split()TypeError on None validated via a small _extract_content helper at all five call sites

The guard mirrors the one introduced in #266:

if (response is None or
        not response.choices or
        response.choices[0].message.content is None or
        response.choices[0].finish_reason == "length"):
    ...

Single-response steps (rto / reread n=1 / leap) raise an informative error
because they have no alternative to fall back on; the in-loop sampler
(self_consistency) skips the bad sample and continues, matching the
continue behavior in bon/moa.

Behavior note

For rto, reread and leap, a finish_reason == "length" response was
previously returned as a (partial) success and is now treated as a failure.
This is intentional: a truncated intermediate result silently corrupts every
downstream step of these pipelines, so failing loudly is preferable to
returning unreliable output. self_consistency simply drops the truncated
sample.

Scope

Tightly scoped to the four approach modules above plus tests
(~65 functional LoC). It deliberately does not touch the files in #310
(mars/agent, coc_plugin, deep_research, spl) — the two PRs are
complementary and cover disjoint files.

Testing

Added tests/test_approaches.py::test_approaches_handle_bad_responses, which
drives all four approaches with a MockBadClient for the empty-choices,
None-content and length-truncated cases. The test is red on main (uncaught
IndexError from round_trip_optimization on empty choices) and green
with this change. The existing approach tests continue to pass
(finish_reason="stop" was added to the shared MockClient so the happy path
exercises the new guard).

…nsistency, reread, leap

Several optimization approaches accessed response.choices[0].message.content
directly. When an upstream provider returns an empty choices list, a None
content, or a length-truncated completion, this raised an uncaught IndexError
or TypeError (e.g. a None content flowing into extract_output or
SequenceMatcher), aborting the request with an opaque error.

Apply the same response-validation idiom already used in moa/bon/plansearch:
- rto: raise an informative error for each of the four round-trip steps
- self_consistency: skip empty/None/truncated samples so they do not corrupt
  similarity clustering (falls back to 'No consistent answer found.')
- reread: validate before access; drop None contents in the multi-choice path
- leap: validate via a small _extract_content helper at all five call sites

Add tests/test_approaches.py::test_approaches_handle_bad_responses covering
empty choices, None content, and finish_reason=='length' across the four
approaches.
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants