Feat: LLM Generation Client Returns Json #570

Open
par4m wants to merge 2 commits into main

Conversation

par4m (Contributor) commented May 28, 2025

Implements #400

Note: Still testing the OpenAI and Anthropic clients; Gemini and Ollama work as expected.

Return Type: LlmGenerationResponse

All client files (anthropic.rs, gemini.rs, ollama.rs, openai.rs) implement the LlmGenerationClient trait and their generate() methods return LlmGenerationResponse, which supports both Json(serde_json::Value) and Text(String).
This is consistent with the enum definition in mod.rs.
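
For reference, a minimal sketch of what that enum looks like per the description above (the code excerpt later in the thread spells it LlmGenerateResponse; the actual definition lives in mod.rs and may differ in detail):

pub enum LlmGenerationResponse {
    // The model produced valid JSON matching the requested schema.
    Json(serde_json::Value),
    // Plain-text generation.
    Text(String),
}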

Strict JSON Prompt Handling

Anthropic:

  • Checks if output_format is JsonSchema and prepends STRICT_JSON_PROMPT to the system prompt (see the sketch below).
  • Uses payload["system"] = ... for the system prompt.
  • The json5 fallback has been removed, as it is no longer required.
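
A minimal sketch of that prepend logic, assuming a serde_json request payload and the constant from prompt_utils.rs (the module path and request fields are illustrative, not the exact diff):

use crate::llm::prompt_utils::STRICT_JSON_PROMPT;

let mut system_prompt = request.system_prompt.clone().unwrap_or_default();
if matches!(request.output_format, Some(super::OutputFormat::JsonSchema { .. })) {
    // Ask the model to emit schema-conforming JSON and nothing else.
    system_prompt = format!("{}\n{}", STRICT_JSON_PROMPT, system_prompt);
}
payload["system"] = serde_json::Value::String(system_prompt);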

Gemini:

  • Prepends STRICT_JSON_PROMPT to the system prompt when a JSON schema is requested (see the sketch below).
  • Uses payload["systemInstruction"] = ... for the system prompt.
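
For Gemini the same prompt is injected under the systemInstruction key; a sketch of roughly how the payload could be set (the nested parts layout follows the public generateContent API; variable names are assumptions):

payload["systemInstruction"] = serde_json::json!({
    "parts": [{ "text": format!("{}\n{}", STRICT_JSON_PROMPT, system_prompt) }]
});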

Ollama:

  • Prepends STRICT_JSON_PROMPT to the system prompt when a JSON schema is requested (see the sketch below).
  • Uses system: Some(system_prompt.as_str()) in the request.
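
A sketch of how the Ollama request could carry the prompt through its optional system field (the struct and field names are assumptions based on the bullet above, not the actual diff):

#[derive(serde::Serialize)]
struct OllamaGenerateRequest<'a> {
    model: &'a str,
    prompt: &'a str,
    system: Option<&'a str>,
}

let req = OllamaGenerateRequest {
    model: model_name,
    prompt: &user_prompt,
    system: Some(system_prompt.as_str()),
};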

OpenAI:

  • Does not prepend STRICT_JSON_PROMPT. Instead, uses OpenAI’s native strict JSON mode via ResponseFormat::JsonSchema { ... strict: Some(true), ... } (wire-format sketch below).
  • This matches the intended design: strict prompt only for Anthropic, Gemini, and Ollama.
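
For reference, a sketch of the equivalent wire format that strict mode produces in the request body (the PR itself goes through the client library's typed ResponseFormat, so treat this serde_json version purely as an illustration; schema_value stands for the caller-supplied schema):

payload["response_format"] = serde_json::json!({
    "type": "json_schema",
    "json_schema": {
        "name": "output",
        "schema": schema_value, // caller-supplied JSON schema
        "strict": true
    }
});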

Prompt Utility (prompt_utils.rs)

pub const STRICT_JSON_PROMPT: &str = "IMPORTANT: Output ONLY valid JSON that matches the schema. Do NOT say anything else. Do NOT explain. Do NOT preface. Do NOT add comments. If you cannot answer, output an empty JSON object: {}.";
  • This is imported and used where needed (Anthropic, Gemini, Ollama).

Follow-up: Requires unit tests exercising all LLM clients at once after this is merged.

  • Request/response format testing using the httpmock crate
  • End-to-end testing with schema validation using the jsonschema crate (rough sketch below)
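
A rough sketch of what such a test could look like, using httpmock to stand in for a provider endpoint and jsonschema for validation (the client constructor and request helper are placeholders, not existing APIs):

use httpmock::prelude::*;
use serde_json::json;

#[tokio::test]
async fn returns_schema_conforming_json() {
    // Mock server standing in for the real provider endpoint (Ollama-style response body).
    let server = MockServer::start_async().await;
    let mock = server
        .mock_async(|when, then| {
            when.method(POST).path("/api/generate");
            then.status(200)
                .json_body(json!({ "response": "{\"name\": \"Alice\", \"age\": 30}" }));
        })
        .await;

    // Placeholder client pointed at the mock server.
    let client = OllamaClient::new(server.base_url());
    let response = client.generate(request_with_json_schema()).await.unwrap();

    let schema = json!({
        "type": "object",
        "properties": { "name": { "type": "string" }, "age": { "type": "integer" } },
        "required": ["name", "age"]
    });
    match response {
        LlmGenerateResponse::Json(value) => assert!(jsonschema::is_valid(&schema, &value)),
        LlmGenerateResponse::Text(_) => panic!("expected a JSON response"),
    }
    mock.assert_async().await;
}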

if let Some(super::OutputFormat::JsonSchema { .. }) = request.output_format {
    match serde_json::from_str::<serde_json::Value>(&json.response) {
        Ok(val) => Ok(super::LlmGenerateResponse::Json(val)),
        Err(_) => Ok(super::LlmGenerateResponse::Text(json.response)),
    }
}

Member

I think we want to just pop up the error in this case. When the caller wants JSON, returning text puts more burden on the caller side: more cases need to be handled, and when they parse it the same way (using serde_json::from_str()) it will fail again.
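
A sketch of that suggestion, i.e. propagating the parse failure instead of degrading to Text (the error-handling style assumes anyhow, which may differ from the actual codebase):

use anyhow::Context as _;

if let Some(super::OutputFormat::JsonSchema { .. }) = request.output_format {
    // The caller asked for JSON, so a parse failure should surface as an error.
    let value = serde_json::from_str::<serde_json::Value>(&json.response)
        .context("LLM did not return valid JSON")?;
    Ok(super::LlmGenerateResponse::Json(value))
} else {
    Ok(super::LlmGenerateResponse::Text(json.response))
}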

@@ -64,8 +64,10 @@ impl LlmGenerationClient for Client {
            },
        ));

        // Save output_format before it is moved.
        let output_format = request.output_format.clone();

Member

I think we only need to save a boolean here to indicate if JSON is expected. No need to clone the entire thing.
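
A sketch of that simplification (variant and field names as they appear elsewhere in this PR):

// A single flag is enough to remember whether JSON output was requested,
// avoiding a clone of the whole output_format value.
let expects_json = matches!(
    request.output_format,
    Some(super::OutputFormat::JsonSchema { .. })
);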
