Guard against empty/None/truncated provider responses in rto, self_consistency, reread, leap #311
Open
SuperMarioYL wants to merge 1 commit into
…nsistency, reread, leap

Several optimization approaches accessed response.choices[0].message.content directly. When an upstream provider returns an empty choices list, a None content, or a length-truncated completion, this raised an uncaught IndexError or TypeError (e.g. a None content flowing into extract_output or SequenceMatcher), aborting the request with an opaque error.

Apply the same response-validation idiom already used in moa/bon/plansearch:

- rto: raise an informative error for each of the four round-trip steps
- self_consistency: skip empty/None/truncated samples so they do not corrupt similarity clustering (falls back to 'No consistent answer found.')
- reread: validate before access; drop None contents in the multi-choice path
- leap: validate via a small _extract_content helper at all five call sites

Add tests/test_approaches.py::test_approaches_handle_bad_responses covering empty choices, None content, and finish_reason=='length' across the four approaches.
Summary
Several optimization approaches access `response.choices[0].message.content` directly without validating the provider response first. When an upstream provider returns an empty `choices` list, a `None` content, or a `length`-truncated completion, this raises an uncaught `IndexError`/`TypeError` and aborts the request with an opaque error.

This extends the response-validation idiom already merged in `moa.py`, `bon.py`, `plansearch.py` and `mcts.py` to the four remaining approach modules:

| Module | Problem |
| --- | --- |
| `rto.py` | `.choices[0]` accesses in the sequential C1→Q2→C2→C3 pipeline; empty choices → `IndexError`, `None` content corrupts the next prompt |
| `self_consistency.py` | `None` content appended to the sample list, then fed to `SequenceMatcher` in `calculate_similarity` → `TypeError` |
| `reread.py` | `None.strip()` / empty choices crash; multi-choice path keeps `None`s |
| `leap.py` | `extract_output` / `.split()` → `TypeError` on `None`; validated via an `_extract_content` helper at all five call sites |

The guard mirrors the one introduced in #266.
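As a rough illustration, a guard of this shape might look like the sketch below. It assumes an OpenAI-style response object (`choices`, `message.content`, `finish_reason`); the helper name `extract_content_or_raise`, the `fake_response` builder and the error messages are hypothetical and not taken from the repository.

```python
from types import SimpleNamespace


def extract_content_or_raise(response, step="completion"):
    """Return the first choice's text, or raise if the response is unusable.

    Hypothetical helper: the name and messages are illustrative only.
    """
    choices = getattr(response, "choices", None)
    if not choices:
        # Covers both a missing attribute and an empty list.
        raise ValueError(f"{step}: provider returned no choices")
    choice = choices[0]
    content = choice.message.content
    if content is None:
        raise ValueError(f"{step}: provider returned None content")
    if choice.finish_reason == "length":
        raise ValueError(f"{step}: completion was truncated (finish_reason='length')")
    return content


def fake_response(content, finish_reason="stop"):
    """Build a stand-in for an OpenAI-style response (illustration only)."""
    choice = SimpleNamespace(
        message=SimpleNamespace(content=content),
        finish_reason=finish_reason,
    )
    return SimpleNamespace(choices=[choice])
```

Passing a `step` label is one way to make the error say which pipeline stage failed, which is the point of replacing the opaque `IndexError`.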
Single-response steps (`rto` / `reread` n=1 / `leap`) raise an informative error because they have no alternative to fall back on; the in-loop sampler (`self_consistency`) skips the bad sample and continues, matching the `continue` behavior in `bon`/`moa`.

Behavior note

For `rto`, `reread` and `leap`, a `finish_reason == "length"` response was previously returned as a (partial) success and is now treated as a failure. This is intentional: a truncated intermediate result silently corrupts every downstream step of these pipelines, so failing loudly is preferable to returning unreliable output. `self_consistency` simply drops the truncated sample.
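The skip-and-continue policy for an in-loop sampler can be sketched as follows. This is an assumption-laden illustration, not the code in `self_consistency.py`: `make_response`, `usable_content`, `collect_samples` and `similarity` are hypothetical names, and responses are modeled as OpenAI-style objects.

```python
from difflib import SequenceMatcher
from types import SimpleNamespace


def make_response(content, finish_reason="stop"):
    """Build a stand-in for an OpenAI-style response (illustration only)."""
    choice = SimpleNamespace(
        message=SimpleNamespace(content=content),
        finish_reason=finish_reason,
    )
    return SimpleNamespace(choices=[choice])


def usable_content(choice):
    """Return the choice's text, or None if it is missing or truncated."""
    content = choice.message.content
    if content is None or choice.finish_reason == "length":
        return None
    return content


def collect_samples(responses):
    """Keep only usable samples; bad ones are skipped rather than raised."""
    samples = []
    for response in responses:
        if not response.choices:
            continue  # empty choices list: nothing to sample
        content = usable_content(response.choices[0])
        if content is None:
            continue  # None or truncated content would corrupt clustering
        samples.append(content)
    return samples


def similarity(a, b):
    """Similarity ratio in [0, 1]; both arguments must be sequences, not None."""
    return SequenceMatcher(None, a, b).ratio()
```

Filtering before the similarity step means the comparison only ever sees strings, so a bad sample reduces the sample count instead of aborting the whole request.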
Scope
Tightly scoped to the four approach modules above plus tests (~65 functional LoC). It deliberately does not touch the files in #310 (`mars/agent`, `coc_plugin`, `deep_research`, `spl`); the two PRs are complementary and cover disjoint files.
Testing
Added `tests/test_approaches.py::test_approaches_handle_bad_responses`, which drives all four approaches with a `MockBadClient` for the empty-choices, `None`-content and length-truncated cases. The test is red on `main` (uncaught `IndexError` from `round_trip_optimization` on empty choices) and green with this change. The existing approach tests continue to pass (`finish_reason="stop"` was added to the shared `MockClient` so the happy path exercises the new guard).
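For readers unfamiliar with this style of test double, a mock along these lines could produce the three malformed cases. It is only a sketch: the class body, the `mode` parameter and the `usage` field are assumptions, and the actual `MockBadClient` in `tests/test_approaches.py` may be structured differently.

```python
from types import SimpleNamespace


class MockBadClient:
    """Hypothetical stand-in client returning one kind of malformed response."""

    def __init__(self, mode):
        self.mode = mode  # "empty", "none" or "length"
        # Mimic the client.chat.completions.create(...) call path.
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    @staticmethod
    def _choice(content, finish_reason):
        return SimpleNamespace(
            message=SimpleNamespace(content=content),
            finish_reason=finish_reason,
        )

    def _create(self, **kwargs):
        if self.mode == "empty":
            choices = []
        elif self.mode == "none":
            choices = [self._choice(None, "stop")]
        elif self.mode == "length":
            choices = [self._choice("partial answ", "length")]
        else:
            raise ValueError(f"unknown mode: {self.mode}")
        # Assumed field: some callers read token usage from the response.
        usage = SimpleNamespace(completion_tokens=0)
        return SimpleNamespace(choices=choices, usage=usage)
```

A test can then loop over the three modes and assert that each approach either raises an informative error or returns its documented fallback.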