feat: Add structured output (constrained decoding) support for agentic memory fact extraction by ehavener · Pull Request #4824 · opensearch-project/ml-commons

ehavener · 2026-05-17T22:29:34Z

Description

Adds structured output (constrained decoding) support for agentic memory fact extraction. When a connector's predict action has supports_structured_output: true, MemoryContainerHelper resolves the provider from the connector URL and injects a provider-specific JSON schema into fact-extraction requests. HttpConnector reads _*_json / _*_additions_json parameters and injects them as top-level fields in the outgoing request body, so the provider enforces output structure at the token level rather than relying on prompt instructions.

Implemented providers: OpenAI, Azure OpenAI, DeepSeek, Ollama, Cohere v2, Google Gemini/Vertex AI, Amazon Bedrock Converse

How it works

ConnectorAction — adds a supportsStructuredOutput boolean field (default false); wire-serialized for clusters ≥ 3.7.0 only.
HttpConnector.injectStructuredOutputParams — reads _<X>_json / _<X>_additions_json parameters and injects them as top-level fields in the outgoing request body.
MemoryContainerHelper.getStructuredOutputParameters — async lookup that resolves the model's connector, checks the flag, and returns the provider-specific schema parameters. Returns empty map on any failure; callers fall back to prompt enforcement.
MemoryProcessingService — merges structured output params into the predict request when available, skipping the prompt enforcement sentence. Falls back gracefully on lookup failure.
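
The parameter naming convention above can be tried in isolation. This is a minimal sketch, not the PR's implementation: the class `StructuredOutputKeys` and the method `targetField` are hypothetical names, and the allowlist values are assumed from the analyzer finding on this PR (response_format, generationConfig, toolConfig).

```java
import java.util.Set;

/** Illustrative only: maps a parameter key such as "_response_format_json" to a request-body field. */
final class StructuredOutputKeys {
    // Assumed allowlist, taken from the analyzer report rather than from the merged code.
    static final Set<String> ALLOWED = Set.of("response_format", "generationConfig", "toolConfig");

    /** Returns the top-level field named by the key for the given suffix, or null if the key does not qualify. */
    static String targetField(String key, String suffix) {
        if (key == null || !key.startsWith("_") || !key.endsWith(suffix)) {
            return null;
        }
        int end = key.length() - suffix.length();
        if (end <= 1) {
            return null; // nothing between the leading "_" and the suffix
        }
        String field = key.substring(1, end);
        return ALLOWED.contains(field) ? field : null;
    }

    public static void main(String[] args) {
        // A replace key, a merge key, and a key whose field is not on the allowlist.
        System.out.println(targetField("_response_format_json", "_json"));
        System.out.println(targetField("_generationConfig_additions_json", "_additions_json"));
        System.out.println(targetField("_temperature_json", "_json"));
    }
}
```

Keys that resolve to null are simply ignored in this sketch, which matches the described behavior of leaving the request body untouched when nothing qualifies.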

Manual testing

Ollama: A smoke test via a direct _predict call with _response_format_json confirmed that HttpConnector.injectStructuredOutputParams correctly reads the _*_json naming convention and injects response_format into the Ollama request body. Ollama then enforces the schema at the token level, returning {"facts": ["The user prefers dark mode."]} rather than a free-form structure.
Bedrock Converse (claude-sonnet-4-6)

Related Issues

Resolves #4799

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2026-05-17T22:30:39Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 96c6ff8.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java	533	medium	getStructuredOutputParameters stashes the thread context via stashContext() before calling modelManager.getModel(), effectively bypassing tenant/user-level security checks enforced by the thread context. While this pattern is common in OpenSearch for background system operations, its use here grants the model and connector lookup elevated effective permissions. Warrants review to confirm this aligns with the project's access control model for this operation.
ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/remote/ConnectorUtils.java	274	medium	tryReadResponseFilter silently swallows PathNotFoundException and falls back to returning the full raw API response when the configured response_filter path does not match. Previously, a path miss would throw and halt processing. The new behavior may expose unfiltered provider responses (including metadata, headers, or other fields not intended for callers) in error or format-mismatch scenarios. Intent appears legitimate but the silent fallback widens the data surface returned to callers.
common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java	381	low	injectStructuredOutputParams iterates caller-supplied parameters and injects JSON objects into the outgoing provider request payload. The allowlist (STRUCTURED_OUTPUT_ALLOWED_FIELDS: response_format, generationConfig, toolConfig) restricts which top-level fields can be injected. However, the merge pass (Pass 2) adds arbitrary sub-keys from the parameter value into allowed top-level objects without further validation of the nested content. A caller with the ability to set connector parameters could inject arbitrary nested JSON under an allowed field name. Injection requires supports_structured_output=true on the action (admin-controlled), which limits the blast radius, but the nested content is not validated.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 2 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-05-17T22:31:36Z

PR Reviewer Guide 🔍

(Review updated until commit `96c6ff8`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue
The URL_PARTS regex captures host and path but does not handle URLs with credentials (e.g., https://user:pass@host/path). If a connector URL embeds credentials, group(1) will include "user:pass@host" instead of just "host", causing hostHasSegment to fail or mis-match. Credentials in URLs are rare but valid; if encountered, the provider detection logic silently returns an empty map, falling back to prompt enforcement without logging the parse failure.

    private static final Pattern URL_PARTS = Pattern.compile("^[^:]+://([^/?#]*)(/?[^?#]*)");

Possible Issue
injectStructuredOutputParams modifies the request body only when payload is a JSON object. If the payload is a JSON array or invalid JSON, the method returns the original payload unchanged. However, if parameters contain _response_format_json or similar keys, those parameters remain in the map and may be sent to the upstream provider as query parameters or in an unexpected location, potentially causing a 4xx error or silent parameter rejection. The method does not remove matched parameters from the map, so callers must ensure they are not inadvertently forwarded.

    // Note: matched parameters are read but not removed from the map.
    private String injectStructuredOutputParams(Map<String, String> parameters, String payload) {
        JsonElement parsed = JsonParser.parseString(payload);
        if (!parsed.isJsonObject()) return payload;
        JsonObject body = parsed.getAsJsonObject();
        boolean modified = false;

        // Pass 1: _<field>_json — inject or replace a top-level field
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            if (entry.getKey().endsWith("_additions_json")) continue;
            String fieldName = allowedFieldName(entry.getKey(), "_json");
            JsonObject value = asJsonObject(entry.getValue());
            if (fieldName == null || value == null) continue;
            body.add(fieldName, value);
            modified = true;
        }

        // Pass 2: _<field>_additions_json — merge into an existing top-level object
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            String fieldName = allowedFieldName(entry.getKey(), "_additions_json");
            JsonObject additions = asJsonObject(entry.getValue());
            if (fieldName == null || additions == null) continue;
            JsonObject target = body.has(fieldName) && body.get(fieldName).isJsonObject()
                ? body.getAsJsonObject(fieldName)
                : new JsonObject();
            additions.entrySet().forEach(e -> target.add(e.getKey(), e.getValue()));
            body.add(fieldName, target);
            modified = true;
        }

        return modified ? body.toString() : payload;
    }

Possible Issue
In sendFactExtractionRequest, if structuredOutputResultPath is non-null but the actual response does not contain that path (e.g., the model returns an error or a different response shape), parseFactsFromLLMResponse will attempt to read the path via JsonPath.read, which may throw PathNotFoundException. The catch block in ConnectorUtils.tryReadResponseFilter returns null on PathNotFoundException, but parseFactsFromLLMResponse does not handle a null filteredResult gracefully: it will pass null to JsonPath.read again, causing a NullPointerException or similar failure. This results in an onFailure callback with a generic "Internal server error" message, masking the actual issue.

    private void sendFactExtractionRequest(
        String tenantId,
        String llmModelId,
        Map<String, String> stringParameters,
        String structuredOutputResultPath,
        MemoryStrategy strategy,
        MemoryConfiguration memoryConfig,
        ActionListener<List<String>> listener
    ) {
        MLInput mlInput = MLInput
            .builder()
            .algorithm(FunctionName.REMOTE)
            .inputDataset(RemoteInferenceInputDataSet.builder().parameters(stringParameters).build())
            .build();
        MLPredictionTaskRequest predictionRequest = MLPredictionTaskRequest
            .builder()
            .modelId(llmModelId)
            .mlInput(mlInput)
            .tenantId(tenantId)
            .build();
        client.execute(MLPredictionTaskAction.INSTANCE, predictionRequest, ActionListener.wrap(response -> {
            try {
                log.debug("Received LLM response, parsing facts...");
                List<String> facts = parseFactsFromLLMResponse(strategy, memoryConfig, structuredOutputResultPath, response.getOutput());
                log.debug("Extracted {} facts from LLM response", facts.size());
                listener.onResponse(facts);
            } catch (Exception e) {
                if (e instanceof OpenSearchException) {
                    OpenSearchException osException = (OpenSearchException) e;
                    if (osException.status().getStatus() >= 400 && osException.status().getStatus() < 500) {
                        listener.onFailure(e);
                        return;
                    }
                }
                log.error("Failed to parse facts from LLM response", e);
                listener.onFailure(new OpenSearchStatusException("Internal server error", RestStatus.INTERNAL_SERVER_ERROR));
            }
        }, e -> {
            log.error("Failed to call LLM for fact extraction", e);
            listener.onFailure(new OpenSearchStatusException("Internal server error", RestStatus.INTERNAL_SERVER_ERROR));
        }));
    }
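
The credentials concern raised for URL_PARTS can be checked with a small standalone program. This is a sketch under stated assumptions: the class `UrlHostDemo` and its `host` helper are hypothetical, the sample URLs are illustrative, and the quantifiers in both patterns are written as `*`, which is an assumption about the intended regex rather than a confirmed copy of the PR's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: compares host capture with and without a userinfo-skipping group. */
final class UrlHostDemo {
    // Assumed form of the pattern under review: group 1 = authority, group 2 = path.
    static final Pattern CURRENT = Pattern.compile("^[^:]+://([^/?#]*)(/?[^?#]*)");

    // Assumed form of the suggested pattern: an optional non-capturing "userinfo@" prefix.
    static final Pattern WITH_USERINFO = Pattern.compile("^[^:]+://(?:[^@/?#]*@)?([^/?#]*)(/?[^?#]*)");

    /** Returns capture group 1 (the host portion) or null when the URL does not match. */
    static String host(Pattern pattern, String url) {
        Matcher m = pattern.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String withCreds = "https://user:pass@api.openai.com/v1/chat/completions";
        String plain = "http://localhost:11434/api/chat";
        System.out.println(host(CURRENT, withCreds));       // authority still includes the credentials
        System.out.println(host(WITH_USERINFO, withCreds)); // credentials skipped
        System.out.println(host(WITH_USERINFO, plain));     // unchanged when no credentials are present
    }
}
```

An alternative would be to parse the URL with `java.net.URI` and call `getHost()`, which avoids hand-written patterns at the cost of stricter syntax checks; whether that fits the connector URL templates used here is not something this page establishes.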

github-actions · 2026-05-17T22:32:15Z

PR Code Suggestions ✨

Latest suggestions up to 96c6ff8

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Catch broader exceptions in response filter The `tryReadResponseFilter` method only catches `PathNotFoundException` but does not handle other potential exceptions from `JsonPath.parse()` or `read()`, such as `InvalidJsonException` or `InvalidPathException`. If the response is malformed JSON or the filter path is syntactically invalid, these exceptions will propagate uncaught, potentially causing the entire fact extraction flow to fail. Consider catching a broader exception type (e.g., `Exception`) to ensure robustness and log the error appropriately. ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/remote/ConnectorUtils.java [491-499] private static Object tryReadResponseFilter(String response, String responseFilter) { try { return JsonPath.parse(response).read(responseFilter); } catch (PathNotFoundException e) { log.debug("response_filter path '{}' not found in response, using full response", responseFilter); return null; + } catch (Exception e) { + log.warn("Failed to apply response_filter '{}', using full response", responseFilter, e); + return null; } } Suggestion importance[1-10]: 7 __ Why: Good suggestion to catch broader exceptions beyond `PathNotFoundException`. The method could fail with `InvalidJsonException` or `InvalidPathException` which would propagate uncaught. Adding a catch-all `Exception` handler with appropriate logging would improve robustness and prevent the fact extraction flow from failing unexpectedly.	Medium
	Handle URLs with embedded credentials correctly The regex pattern `^[^:]+://([^/?#]*)(/?[^?#]*)` captures the host and path but does not handle URLs with credentials (e.g., `https://user:pass@host/path`). If a connector URL contains embedded credentials, the host capture group will include `user:pass@host`, causing incorrect host-based matching. Consider updating the pattern to exclude credentials from the host capture group, or document that connector URLs must not contain credentials. plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java [108] -private static final Pattern URL_PARTS = Pattern.compile("^[^:]+://([^/?#]*)(/?[^?#]*)"); +private static final Pattern URL_PARTS = Pattern.compile("^[^:]+://(?:[^@/?#]*@)?([^/?#]*)(/?[^?#]*)"); Suggestion importance[1-10]: 6 __ Why: Valid observation that the regex doesn't handle URLs with embedded credentials (e.g., `https://user:pass@host/path`). The suggested pattern improvement would correctly exclude credentials from the host capture group. However, connector URLs with embedded credentials are uncommon in practice, and the current implementation works for all documented use cases.	Low
	Validate non-empty field name explicitly The `allowedFieldName` method does not validate that the extracted field name is non-empty before checking the allowlist. If `key` is exactly `"_" + suffix` (e.g., `"_json"`), `end` equals 1, and `substring(1, 1)` returns an empty string. This empty string is then checked against `STRUCTURED_OUTPUT_ALLOWED_FIELDS`, which does not contain it, so `null` is returned. However, the logic should explicitly reject empty field names earlier to avoid unnecessary allowlist lookups and improve clarity. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [433-441] private static String allowedFieldName(String key, String suffix) { if (!key.startsWith("_") || !key.endsWith(suffix)) return null; int end = key.length() - suffix.length(); if (end <= 1) return null; String fieldName = key.substring(1, end); + if (fieldName.isEmpty()) + return null; return STRUCTURED_OUTPUT_ALLOWED_FIELDS.contains(fieldName) ? fieldName : null; } Suggestion importance[1-10]: 4 __ Why: The suggestion correctly identifies that an empty field name check could improve clarity, but the current logic already handles this case correctly by returning `null` when `end <= 1`. The additional check is redundant since `substring(1, 1)` returns an empty string which won't be in `STRUCTURED_OUTPUT_ALLOWED_FIELDS`. The improvement is minimal and primarily stylistic.	Low

Previous suggestions

Suggestions up to commit 28b697b

Category	Suggestion	Impact
General	Propagate errors instead of silent failure When `buildUserPrompt` or `buildUserPromptWithEnforcement` throws an exception, the code calls `listener.onResponse(new ArrayList<>())` and returns, but the outer `sendFactExtractionRequest` is never called. This silently swallows the error and returns an empty fact list, which may mask critical issues. Consider calling `listener.onFailure(e)` instead to propagate the error to the caller for proper handling. plugin/src/main/java/org/opensearch/ml/action/memorycontainer/memory/MemoryProcessingService.java [130-146] memoryContainerHelper.getStructuredOutputParameters(llmModelId, ActionListener.wrap(rawStructuredOutputParams -> { Map<String, String> structuredOutputParams = new HashMap<>(rawStructuredOutputParams); String structuredOutputResultPath = structuredOutputParams.remove("_structured_output_result_path"); try { if (!structuredOutputParams.isEmpty()) { stringParameters.putAll(structuredOutputParams); stringParameters.put("user_prompt", buildUserPrompt(serializeMessagesToJson(messages))); } else { stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); } } catch (Exception e) { log.error("Failed to build messages JSON", e); - listener.onResponse(new ArrayList<>()); + listener.onFailure(new OpenSearchStatusException("Failed to build fact extraction request", RestStatus.INTERNAL_SERVER_ERROR)); return; } sendFactExtractionRequest(tenantId, llmModelId, stringParameters, structuredOutputResultPath, strategy, memoryConfig, listener); }, e -> { ... })); Suggestion importance[1-10]: 8 __ Why: Important suggestion to call `listener.onFailure()` instead of `listener.onResponse(new ArrayList<>())` when message building fails. Silently returning empty facts masks critical errors and prevents proper error handling upstream.	Medium
General	Document immutability of allowlist set The `STRUCTURED_OUTPUT_ALLOWED_FIELDS` set is defined as a static constant but is not immutable at the collection level. While `Set.of()` returns an unmodifiable set, explicitly documenting this immutability or using a more defensive pattern (e.g., `Collections.unmodifiableSet`) can prevent future modifications if the initialization changes. Consider adding a comment or using a defensive copy pattern. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [69-74] +// Immutable allowlist of top-level JSON fields for structured output injection. static final Set<String> STRUCTURED_OUTPUT_ALLOWED_FIELDS = Set .of( "response_format", "generationConfig", "toolConfig" ); Suggestion importance[1-10]: 3 __ Why: Minor documentation improvement. `Set.of()` already returns an immutable set, so the suggestion only adds a clarifying comment without changing behavior.	Low
Possible issue	Handle JSON parse exceptions gracefully The `injectStructuredOutputParams` method does not handle `JsonSyntaxException` when parsing the payload. If the payload is malformed JSON, `JsonParser.parseString(payload)` will throw an unchecked exception, potentially causing the entire fact extraction request to fail. Wrap the parsing in a try-catch block and return the original payload on parse failure to ensure graceful degradation. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [396-430] private String injectStructuredOutputParams(Map<String, String> parameters, String payload) { - JsonElement parsed = JsonParser.parseString(payload); + JsonElement parsed; + try { + parsed = JsonParser.parseString(payload); + } catch (JsonSyntaxException e) { + log.warn("Failed to parse payload as JSON for structured output injection, returning unchanged", e); + return payload; + } if (!parsed.isJsonObject()) return payload; JsonObject body = parsed.getAsJsonObject(); boolean modified = false; - - // Pass 1: _<field>_json — inject or replace a top-level field - for (Map.Entry<String, String> entry : parameters.entrySet()) { - if (entry.getKey().endsWith("_additions_json")) - continue; - String fieldName = allowedFieldName(entry.getKey(), "_json"); - JsonObject value = asJsonObject(entry.getValue()); - if (fieldName == null || value == null) - continue; - body.add(fieldName, value); - modified = true; - } ... } Suggestion importance[1-10]: 7 __ Why: Valid suggestion to add exception handling for `JsonParser.parseString()`. While `isJson()` is called before this method, adding explicit try-catch for `JsonSyntaxException` improves robustness and prevents unexpected failures.	Medium
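
The difference between the two listener calls in the "Propagate errors instead of silent failure" suggestion can be shown without OpenSearch on the classpath. This is a minimal sketch: `Listener` is a hypothetical two-method stand-in for OpenSearch's `ActionListener`, and `extractFacts` only simulates a prompt-build failure; neither is code from this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: what a caller observes when a build failure is swallowed versus propagated. */
final class PromptBuildFailureDemo {
    /** Hypothetical stand-in for ActionListener, reduced to its two callbacks. */
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    /** Simulates the catch block: swallow=true answers with an empty list, swallow=false reports the error. */
    static void extractFacts(boolean swallow, Listener<List<String>> listener) {
        try {
            throw new IllegalStateException("cannot serialize messages"); // simulated prompt-build failure
        } catch (Exception e) {
            if (swallow) {
                listener.onResponse(new ArrayList<>());
            } else {
                listener.onFailure(e);
            }
        }
    }

    /** Records which callback fired, as a short string. */
    static String observe(boolean swallow) {
        StringBuilder seen = new StringBuilder();
        extractFacts(swallow, new Listener<List<String>>() {
            @Override
            public void onResponse(List<String> facts) {
                seen.append("response:").append(facts.size());
            }

            @Override
            public void onFailure(Exception e) {
                seen.append("failure:").append(e.getMessage());
            }
        });
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(observe(true));  // looks identical to "no facts found"
        System.out.println(observe(false)); // the caller can tell that extraction failed
    }
}
```

With the swallowed variant the caller cannot distinguish a failed extraction from a conversation that genuinely contains no facts, which is the core of the bot's argument.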

Suggestions up to commit f3621f5

Category	Suggestion	Impact
General	Report prompt build failures to caller When `buildUserPrompt` or `buildUserPromptWithEnforcement` throws an exception, the code calls `listener.onResponse(new ArrayList<>())` which silently returns an empty fact list. This masks the error from the caller. Consider calling `listener.onFailure(e)` instead so the caller is aware that fact extraction failed due to a message serialization error. plugin/src/main/java/org/opensearch/ml/action/memorycontainer/memory/MemoryProcessingService.java [136-146] memoryContainerHelper.getStructuredOutputParameters(llmModelId, ActionListener.wrap(rawStructuredOutputParams -> { Map<String, String> structuredOutputParams = new HashMap<>(rawStructuredOutputParams); String structuredOutputResultPath = structuredOutputParams.remove("_structured_output_result_path"); try { if (!structuredOutputParams.isEmpty()) { stringParameters.putAll(structuredOutputParams); stringParameters.put("user_prompt", buildUserPrompt(serializeMessagesToJson(messages))); } else { stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); } } catch (Exception e) { log.error("Failed to build messages JSON", e); - listener.onResponse(new ArrayList<>()); + listener.onFailure(new OpenSearchStatusException("Failed to build user prompt", RestStatus.INTERNAL_SERVER_ERROR)); return; } sendFactExtractionRequest(tenantId, llmModelId, stringParameters, structuredOutputResultPath, strategy, memoryConfig, listener); }, e -> { ... Suggestion importance[1-10]: 8 __ Why: This is a valid improvement. Returning an empty list on error (`listener.onResponse(new ArrayList<>())`) masks the failure from the caller, making debugging difficult. Calling `listener.onFailure(e)` properly propagates the error, allowing the caller to handle it appropriately. This improves error handling and observability.	Medium
General	Add null checks for regex groups Validate that `m.group(1)` and `m.group(2)` are not null before calling `toLowerCase()`. Although the regex pattern should always capture these groups when `find()` returns true, defensive null checks prevent potential `NullPointerException` if the regex behavior changes or if an edge case is encountered. plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java [587-596] private Map<String, String> schemaForUrl(String url) { if (url == null) { return Map.of(); } Matcher m = URL_PARTS.matcher(url); if (!m.find()) { return Map.of(); } - String host = m.group(1).toLowerCase(Locale.ROOT); - String path = m.group(2).toLowerCase(Locale.ROOT); + String host = m.group(1); + String path = m.group(2); + if (host == null || path == null) { + return Map.of(); + } + host = host.toLowerCase(Locale.ROOT); + path = path.toLowerCase(Locale.ROOT); ... Suggestion importance[1-10]: 3 __ Why: While defensive null checks are generally good practice, the regex pattern `URL_PARTS` is designed to always capture both groups when `find()` returns true. The suggestion adds unnecessary complexity for an edge case that should not occur with the current regex pattern. The impact is minimal.	Low
Possible issue	Handle JSON parsing exceptions gracefully Wrap the `JsonParser.parseString(payload)` call in a try-catch block to handle malformed JSON gracefully. If parsing fails, log a warning and return the original payload unchanged instead of propagating the exception, which could cause the entire fact extraction request to fail. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [381-383] private String injectStructuredOutputParams(Map<String, String> parameters, String payload) { - JsonElement parsed = JsonParser.parseString(payload); + JsonElement parsed; + try { + parsed = JsonParser.parseString(payload); + } catch (Exception e) { + log.warn("Failed to parse payload as JSON for structured output injection, returning unchanged", e); + return payload; + } if (!parsed.isJsonObject()) return payload; JsonObject body = parsed.getAsJsonObject(); boolean modified = false; - - // Pass 1: _<field>_json — inject or replace a top-level field - for (Map.Entry<String, String> entry : parameters.entrySet()) { - if (entry.getKey().endsWith("_additions_json")) - continue; - String fieldName = allowedFieldName(entry.getKey(), "_json"); - JsonObject value = asJsonObject(entry.getValue()); - if (fieldName == null || value == null) - continue; - body.add(fieldName, value); - modified = true; - } ... Suggestion importance[1-10]: 7 __ Why: Adding exception handling for `JsonParser.parseString(payload)` is a good defensive practice. However, the current code already handles non-JSON payloads by checking `isJsonObject()`, and the `isJson(payload)` check at line 381 should prevent most malformed JSON from reaching this point. The suggestion improves robustness but has moderate impact.	Medium

Suggestions up to commit e999b8d

Category	Suggestion	Impact
General	Propagate prompt build errors properly When `buildUserPrompt` or `buildUserPromptWithEnforcement` throws an exception, the listener is called with an empty list, but the error is only logged. Consider calling `listener.onFailure(e)` instead to properly propagate the error to the caller, allowing them to handle it appropriately. plugin/src/main/java/org/opensearch/ml/action/memorycontainer/memory/MemoryProcessingService.java [130-146] memoryContainerHelper.getStructuredOutputParameters(llmModelId, ActionListener.wrap(rawStructuredOutputParams -> { Map<String, String> structuredOutputParams = new HashMap<>(rawStructuredOutputParams); String structuredOutputResultPath = structuredOutputParams.remove("_structured_output_result_path"); try { if (!structuredOutputParams.isEmpty()) { stringParameters.putAll(structuredOutputParams); stringParameters.put("user_prompt", buildUserPrompt(serializeMessagesToJson(messages))); } else { stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); } } catch (Exception e) { log.error("Failed to build messages JSON", e); - listener.onResponse(new ArrayList<>()); + listener.onFailure(new OpenSearchStatusException("Failed to build user prompt", RestStatus.INTERNAL_SERVER_ERROR)); return; } sendFactExtractionRequest(tenantId, llmModelId, stringParameters, structuredOutputResultPath, strategy, memoryConfig, listener); }, e -> { ... })); Suggestion importance[1-10]: 7 __ Why: Good suggestion to call `listener.onFailure(e)` instead of `listener.onResponse(new ArrayList<>())` when prompt building fails. This properly propagates the error to the caller, allowing better error handling and visibility of the failure.	Medium
	Handle JSON parsing exceptions gracefully The `JsonParser.parseString` call can throw `JsonSyntaxException` if the JSON is malformed, even after `isJson` validation. Wrap the parsing in a try-catch block to handle malformed JSON gracefully and prevent exceptions from propagating. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [443-449] private static JsonObject asJsonObject(String json) { if (json == null || !isJson(json)) return null; - JsonElement el = JsonParser.parseString(json); - return el.isJsonObject() ? el.getAsJsonObject() : null; + try { + JsonElement el = JsonParser.parseString(json); + return el.isJsonObject() ? el.getAsJsonObject() : null; + } catch (JsonSyntaxException e) { + return null; + } } Suggestion importance[1-10]: 6 __ Why: Valid suggestion to add explicit exception handling for `JsonParser.parseString`, even though `isJson` should catch most malformed JSON. The try-catch adds defensive programming and prevents unexpected exceptions from propagating.	Low
	Optimize parameter map iteration The method iterates over the entire `parameters` map twice, which is inefficient for large maps. Consider collecting the relevant keys in a single pass and then processing them, or use a more efficient filtering approach to reduce redundant iterations. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [396-430] private String injectStructuredOutputParams(Map<String, String> parameters, String payload) { JsonElement parsed = JsonParser.parseString(payload); if (!parsed.isJsonObject()) return payload; JsonObject body = parsed.getAsJsonObject(); boolean modified = false; - // Pass 1: _<field>_json — inject or replace a top-level field + List<Map.Entry<String, String>> replaceEntries = new ArrayList<>(); + List<Map.Entry<String, String>> mergeEntries = new ArrayList<>(); + for (Map.Entry<String, String> entry : parameters.entrySet()) { - if (entry.getKey().endsWith("_additions_json")) - continue; + if (entry.getKey().endsWith("_additions_json")) { + mergeEntries.add(entry); + } else if (entry.getKey().endsWith("_json")) { + replaceEntries.add(entry); + } + } + + // Pass 1: _<field>_json — inject or replace + for (Map.Entry<String, String> entry : replaceEntries) { String fieldName = allowedFieldName(entry.getKey(), "_json"); JsonObject value = asJsonObject(entry.getValue()); if (fieldName == null || value == null) continue; body.add(fieldName, value); modified = true; } - ... + + // Pass 2: _<field>_additions_json — merge + for (Map.Entry<String, String> entry : mergeEntries) { + String fieldName = allowedFieldName(entry.getKey(), "_additions_json"); + JsonObject additions = asJsonObject(entry.getValue()); + if (fieldName == null || additions == null) + continue; + JsonObject target = body.has(fieldName) && body.get(fieldName).isJsonObject() + ? body.getAsJsonObject(fieldName) + : new JsonObject(); + additions.entrySet().forEach(e -> target.add(e.getKey(), e.getValue())); + body.add(fieldName, target); + modified = true; + } + + return modified ? body.toString() : payload; } Suggestion importance[1-10]: 4 __ Why: The suggestion correctly identifies that the method iterates over `parameters` twice, which could be optimized. However, the performance impact is likely minimal for typical parameter map sizes, and the current implementation is clearer and more maintainable.	Low
Possible issue	Ensure ThreadContext restoration on exceptions The `ThreadContext.StoredContext` is restored in the `runBefore` callback, but if an exception occurs before the async callbacks complete, the context may not be properly restored. Wrap the entire method body in a try-catch to ensure the context is always restored, even on unexpected exceptions. plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java [540-567] public void getStructuredOutputParameters(String modelId, ActionListener<Map<String, String>> listener) { - try (ThreadContext.StoredContext context = client.threadPool().getThreadContext().stashContext()) { + ThreadContext.StoredContext context = client.threadPool().getThreadContext().stashContext(); + try { modelManager.getModel(modelId, ActionListener.runBefore(ActionListener.wrap(mlModel -> { Connector connector = mlModel.getConnector(); if (connector != null) { listener.onResponse(schemaForConnector(connector)); } else if (mlModel.getConnectorId() != null) { modelManager .getConnector( mlModel.getConnectorId(), null, ActionListener.wrap(c -> listener.onResponse(schemaForConnector(c)), e -> { log.warn("Failed to fetch connector {} for structured output detection", mlModel.getConnectorId(), e); listener.onResponse(Map.of()); }) ); } else { listener.onResponse(Map.of()); } }, e -> { log.warn("Failed to fetch model {} for structured output detection, falling back to prompt enforcement", modelId, e); listener.onResponse(Map.of()); }), context::restore)); + } catch (Exception e) { + context.restore(); + log.error("Unexpected error in getStructuredOutputParameters", e); + listener.onResponse(Map.of()); } } Suggestion importance[1-10]: 3 __ Why: The suggestion raises a valid concern about exception handling, but the `try-with-resources` statement already ensures `context.restore()` is called via the `runBefore` callback. The additional try-catch may be redundant and could complicate the code without significant benefit.	Low

Suggestions up to commit 559c9ad

Category	Suggestion	Impact
General	Propagate serialization errors correctly Call `listener.onFailure(e)` instead of `listener.onResponse(new ArrayList<>())` when message JSON serialization fails. Returning an empty list silently hides the error from the caller, making debugging difficult. Propagating the exception ensures proper error handling upstream. plugin/src/main/java/org/opensearch/ml/action/memorycontainer/memory/MemoryProcessingService.java [130-146] memoryContainerHelper.getStructuredOutputParameters(llmModelId, ActionListener.wrap(rawStructuredOutputParams -> { Map<String, String> structuredOutputParams = new HashMap<>(rawStructuredOutputParams); String structuredOutputResultPath = structuredOutputParams.remove("_structured_output_result_path"); try { if (!structuredOutputParams.isEmpty()) { stringParameters.putAll(structuredOutputParams); stringParameters.put("user_prompt", buildUserPrompt(serializeMessagesToJson(messages))); } else { stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); } } catch (Exception e) { log.error("Failed to build messages JSON", e); - listener.onResponse(new ArrayList<>()); + listener.onFailure(new OpenSearchStatusException("Failed to build messages JSON", RestStatus.INTERNAL_SERVER_ERROR)); return; } sendFactExtractionRequest(tenantId, llmModelId, stringParameters, structuredOutputResultPath, strategy, memoryConfig, listener); }, e -> { ... })); Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies that returning an empty list (`listener.onResponse(new ArrayList<>())`) on serialization failure silently hides the error, making debugging difficult. Calling `listener.onFailure()` instead properly propagates the error to the caller, which is consistent with the error handling pattern used elsewhere in the same method (lines 149-156).	Medium
General	Add null checks for regex groups Validate that `m.group(1)` and `m.group(2)` are not null before calling `toLowerCase()`. Although the regex pattern should always capture these groups when `find()` returns true, defensive null checks prevent potential `NullPointerException` if the regex behavior changes or if the pattern is modified. plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java [583-596] private Map<String, String> schemaForUrl(String url) { if (url == null) { return Map.of(); } Matcher m = URL_PARTS.matcher(url); if (!m.find()) { return Map.of(); } - String host = m.group(1).toLowerCase(Locale.ROOT); - String path = m.group(2).toLowerCase(Locale.ROOT); + String host = m.group(1); + String path = m.group(2); + if (host == null || path == null) { + return Map.of(); + } + host = host.toLowerCase(Locale.ROOT); + path = path.toLowerCase(Locale.ROOT); ... } Suggestion importance[1-10]: 3 __ Why: While defensive null checks are generally good practice, the `URL_PARTS` regex pattern explicitly captures two groups, and when `find()` returns true, both groups are guaranteed to be non-null (though they may be empty strings). This suggestion adds unnecessary defensive code that doesn't address a realistic failure scenario given the regex pattern used.	Low
Possible issue	Handle malformed JSON gracefully Wrap the JSON parsing in a try-catch block to handle malformed JSON gracefully. If `JsonParser.parseString(payload)` throws a `JsonSyntaxException`, the method should return the original payload unchanged rather than propagating the exception, ensuring robustness when the payload is not valid JSON. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [396-426] private String injectStructuredOutputParams(Map<String, String> parameters, String payload) { - JsonElement parsed = JsonParser.parseString(payload); - if (!parsed.isJsonObject()) return payload; - JsonObject body = parsed.getAsJsonObject(); - boolean modified = false; + try { + JsonElement parsed = JsonParser.parseString(payload); + if (!parsed.isJsonObject()) return payload; + JsonObject body = parsed.getAsJsonObject(); + boolean modified = false; - // Pass 1: _<field>_json — inject or replace a top-level field - for (Map.Entry<String, String> entry : parameters.entrySet()) { - if (entry.getKey().endsWith("_additions_json")) continue; - String fieldName = allowedFieldName(entry.getKey(), "_json"); - JsonObject value = asJsonObject(entry.getValue()); - if (fieldName == null || value == null) continue; - body.add(fieldName, value); - modified = true; + // Pass 1: _<field>_json — inject or replace a top-level field + for (Map.Entry<String, String> entry : parameters.entrySet()) { + if (entry.getKey().endsWith("_additions_json")) continue; + String fieldName = allowedFieldName(entry.getKey(), "_json"); + JsonObject value = asJsonObject(entry.getValue()); + if (fieldName == null || value == null) continue; + body.add(fieldName, value); + modified = true; + } + + // Pass 2: _<field>_additions_json — merge into an existing top-level object + for (Map.Entry<String, String> entry : parameters.entrySet()) { + String fieldName = allowedFieldName(entry.getKey(), "_additions_json"); + JsonObject additions = asJsonObject(entry.getValue()); + if (fieldName == null || additions == null) continue; + JsonObject target = body.has(fieldName) && body.get(fieldName).isJsonObject() + ? body.getAsJsonObject(fieldName) + : new JsonObject(); + additions.entrySet().forEach(e -> target.add(e.getKey(), e.getValue())); + body.add(fieldName, target); + modified = true; + } + + return modified ? body.toString() : payload; + } catch (JsonSyntaxException e) { + return payload; } - - // Pass 2: _<field>_additions_json — merge into an existing top-level object - for (Map.Entry<String, String> entry : parameters.entrySet()) { - String fieldName = allowedFieldName(entry.getKey(), "_additions_json"); - JsonObject additions = asJsonObject(entry.getValue()); - if (fieldName == null || additions == null) continue; - JsonObject target = body.has(fieldName) && body.get(fieldName).isJsonObject() - ? body.getAsJsonObject(fieldName) - : new JsonObject(); - additions.entrySet().forEach(e -> target.add(e.getKey(), e.getValue())); - body.add(fieldName, target); - modified = true; - } - - return modified ? body.toString() : payload; } Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies that `JsonParser.parseString(payload)` can throw `JsonSyntaxException` when the payload is malformed. Wrapping this in a try-catch block and returning the original payload on error is a reasonable defensive approach. However, the `isJson(payload)` check on line 381 should already validate JSON syntax before this method is called, making this a secondary safety measure rather than a critical fix.	Medium
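
The reasoning in the regex null-check row above depends on the shape of the pattern, so a small sketch may help. The PR's actual `URL_PARTS` pattern is not shown on this page; both patterns below are hypothetical and exist only to contrast a required capture group, which yields an empty string when it matches nothing, with an optional one, which yields `null` from `Matcher.group`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexGroupNullability {
    // Hypothetical patterns for illustration; neither is the PR's URL_PARTS.
    // Group 2 is required here, so it participates in every match (possibly as "").
    static final Pattern REQUIRED_PATH = Pattern.compile("^https?://([^/]+)(.*)$");
    // Group 2 is optional here, so it may not participate in a match at all.
    static final Pattern OPTIONAL_PATH = Pattern.compile("^https?://([^/]+)(/.*)?$");

    /** Returns group 2 when the pattern matches, or the marker "<no match>" otherwise. */
    static String pathGroup(Pattern pattern, String url) {
        Matcher m = pattern.matcher(url);
        return m.find() ? m.group(2) : "<no match>";
    }
}
```

With a URL such as `https://api.openai.com` (no path), the required-group pattern returns an empty string and the optional-group pattern returns `null`. The reviewer's "guaranteed non-null" claim therefore holds only if both groups in `URL_PARTS` are non-optional, which this page does not confirm.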

Suggestions up to commit af9507c

Category	Suggestion	Impact
General	Eliminate duplicated error handling code The error handling logic for building the user prompt is duplicated in both the success and failure branches of the `ActionListener`. Extract the prompt-building and error-handling logic into a helper method to reduce code duplication and improve maintainability. plugin/src/main/java/org/opensearch/ml/action/memorycontainer/memory/MemoryProcessingService.java [130-154] memoryContainerHelper.getStructuredOutputParameters(llmModelId, ActionListener.wrap(structuredOutputParams -> { + if (buildAndSetUserPrompt(stringParameters, structuredOutputParams, messages, strategy, listener)) { + sendFactExtractionRequest(tenantId, llmModelId, stringParameters, strategy, memoryConfig, listener); + } +}, e -> { + log.warn("Unexpected error fetching structured output parameters, falling back to prompt enforcement", e); + if (buildAndSetUserPrompt(stringParameters, Map.of(), messages, strategy, listener)) { + sendFactExtractionRequest(tenantId, llmModelId, stringParameters, strategy, memoryConfig, listener); + } +})); + +private boolean buildAndSetUserPrompt( + Map<String, String> stringParameters, + Map<String, String> structuredOutputParams, + List<MessageInput> messages, + MemoryStrategy strategy, + ActionListener<List<String>> listener +) { try { if (!structuredOutputParams.isEmpty()) { stringParameters.putAll(structuredOutputParams); stringParameters.put("user_prompt", buildUserPrompt(serializeMessagesToJson(messages))); } else { stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); } + return true; } catch (Exception e) { log.error("Failed to build messages JSON", e); listener.onResponse(new ArrayList<>()); - return; + return false; } - sendFactExtractionRequest(tenantId, llmModelId, stringParameters, strategy, memoryConfig, listener); -}, e -> { - log.warn("Unexpected error fetching structured output parameters, falling back to prompt enforcement", e); - try { - stringParameters.put("user_prompt", buildUserPromptWithEnforcement(messages, strategy.getType())); - } catch (Exception buildEx) { - log.error("Failed to build messages JSON", buildEx); - listener.onResponse(new ArrayList<>()); - return; - } - sendFactExtractionRequest(tenantId, llmModelId, stringParameters, strategy, memoryConfig, listener); -})); +} Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies duplicated error handling logic in both success and failure branches. Extracting this into a helper method reduces duplication and improves maintainability. The boolean return pattern is a good approach for handling the early-exit case when prompt building fails.	Medium
General	Extract nested async callback logic The nested `ActionListener.wrap` inside the `getConnector` call can lead to listener callback chains that are hard to trace and debug. Consider extracting the connector lookup logic into a separate helper method to improve readability and maintainability of the async flow. plugin/src/main/java/org/opensearch/ml/helper/MemoryContainerHelper.java [536-563] public void getStructuredOutputParameters(String modelId, ActionListener<Map<String, String>> listener) { try (ThreadContext.StoredContext context = client.threadPool().getThreadContext().stashContext()) { modelManager.getModel(modelId, ActionListener.runBefore(ActionListener.wrap(mlModel -> { Connector connector = mlModel.getConnector(); if (connector != null) { listener.onResponse(schemaForConnector(connector)); } else if (mlModel.getConnectorId() != null) { - // getConnector stashes its own ThreadContext internally, so this secondary - // async call is safe even though the outer stashed context has already been - // restored by the runBefore above. - modelManager - .getConnector( - mlModel.getConnectorId(), - null, - ActionListener.wrap(c -> listener.onResponse(schemaForConnector(c)), e -> { - log.warn("Failed to fetch connector {} for structured output detection", mlModel.getConnectorId(), e); - listener.onResponse(Map.of()); - }) - ); + fetchConnectorAndRespond(mlModel.getConnectorId(), listener); } else { listener.onResponse(Map.of()); } }, e -> { log.warn("Failed to fetch model {} for structured output detection, falling back to prompt enforcement", modelId, e); listener.onResponse(Map.of()); }), context::restore)); } } +private void fetchConnectorAndRespond(String connectorId, ActionListener<Map<String, String>> listener) { + modelManager.getConnector(connectorId, null, ActionListener.wrap( + c -> listener.onResponse(schemaForConnector(c)), + e -> { + log.warn("Failed to fetch connector {} for structured output detection", connectorId, e); + listener.onResponse(Map.of()); + } + )); +} + Suggestion importance[1-10]: 6 __ Why: The suggestion improves code readability by extracting the nested `getConnector` callback into a separate method. This makes the async flow easier to follow and maintain. However, the improvement is moderate since the original code is still reasonably clear, and the nesting level is not excessive.	Low
General	Optimize parameter map iteration The method iterates over the entire `parameters` map twice, which can be inefficient for large parameter sets. Consider collecting the relevant keys in a single pass and then processing them, or use a more efficient filtering approach to avoid redundant iterations over entries that don't match the naming conventions. common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java [395-425] private String injectStructuredOutputParams(Map<String, String> parameters, String payload) { JsonElement parsed = JsonParser.parseString(payload); if (!parsed.isJsonObject()) return payload; JsonObject body = parsed.getAsJsonObject(); boolean modified = false; + Map<String, String> jsonParams = new HashMap<>(); + Map<String, String> additionsParams = new HashMap<>(); + + for (Map.Entry<String, String> entry : parameters.entrySet()) { + if (entry.getKey().endsWith("_additions_json")) { + additionsParams.put(entry.getKey(), entry.getValue()); + } else if (entry.getKey().endsWith("_json")) { + jsonParams.put(entry.getKey(), entry.getValue()); + } + } + // Pass 1: _<field>_json — inject or replace a top-level field - for (Map.Entry<String, String> entry : parameters.entrySet()) { - if (entry.getKey().endsWith("_additions_json")) continue; + for (Map.Entry<String, String> entry : jsonParams.entrySet()) { String fieldName = allowedFieldName(entry.getKey(), "_json"); JsonObject value = asJsonObject(entry.getValue()); if (fieldName == null || value == null) continue; body.add(fieldName, value); modified = true; } // Pass 2: _<field>_additions_json — merge into an existing top-level object - for (Map.Entry<String, String> entry : parameters.entrySet()) { + for (Map.Entry<String, String> entry : additionsParams.entrySet()) { String fieldName = allowedFieldName(entry.getKey(), "_additions_json"); JsonObject additions = asJsonObject(entry.getValue()); if (fieldName == null || additions == null) continue; JsonObject target = body.has(fieldName) && body.get(fieldName).isJsonObject() ? body.getAsJsonObject(fieldName) : new JsonObject(); additions.entrySet().forEach(e -> target.add(e.getKey(), e.getValue())); body.add(fieldName, target); modified = true; } return modified ? body.toString() : payload; } Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that the method iterates over `parameters` twice, which could be optimized. However, the performance impact is likely minimal for typical parameter map sizes, and the current implementation is clear and maintainable. The optimization adds complexity without significant benefit in most real-world scenarios.	Low
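
The key-partitioning step from the last suggestion can be sketched on its own, without Gson. This is a minimal illustration under stated assumptions, not the connector's implementation: the class and method names are hypothetical, and `_generationConfig_additions_json` is an invented example key (only `_response_format_json` appears in the PR description). The point it shows is the ordering of the suffix checks: every `_additions_json` key also ends in `_json`, so the longer suffix has to be tested first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParamPartition {
    static final String ADDITIONS_SUFFIX = "_additions_json";
    static final String JSON_SUFFIX = "_json";

    /** Hypothetical holder for the two groups of structured output parameters. */
    static final class Buckets {
        final Map<String, String> replacements = new LinkedHashMap<>();
        final Map<String, String> additions = new LinkedHashMap<>();
    }

    /** Splits parameters by naming convention in one pass; unrelated keys are ignored. */
    static Buckets partition(Map<String, String> parameters) {
        Buckets buckets = new Buckets();
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            String key = entry.getKey();
            // Check the longer suffix first: "_additions_json" also ends with "_json".
            if (key.endsWith(ADDITIONS_SUFFIX)) {
                buckets.additions.put(key, entry.getValue());
            } else if (key.endsWith(JSON_SUFFIX)) {
                buckets.replacements.put(key, entry.getValue());
            }
        }
        return buckets;
    }
}
```

As the reviewer notes, the gain over two plain loops is probably small for typical parameter maps, so this is more about making the naming convention explicit than about performance.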