[BUG] model interface validation failed when there is integer within text #3758

Open
mingshl opened this issue Apr 23, 2025 · 4 comments · May be fixed by #3761
Labels: bug (Something isn't working), v3.1.0 (Issues targeting release v3.1.0)

Comments

@mingshl
Collaborator

mingshl commented Apr 23, 2025

What is the bug?

Model interface validation fails when the text field contains a number. The validator rejects valid input: it reports that a string was expected but a number was found, even though the value was sent as a string.

How can one reproduce the bug?

Here are the configs:

connector setting

{
  "name": "Amazon Bedrock - Titan",
  "version": "1",
  "description": "The connector to Bedrock Titan embedding model",
  "protocol": "aws_sigv4",
  "parameters": {
    "service_name": "bedrock",
    "model": "amazon.titan-embed-text-v1",
    "region": "us-west-2"
  },
  "actions": [
    {
      "action_type": "PREDICT",
      "method": "POST",
      "url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
      "headers": {
        "x-amz-content-sha256": "required",
        "content-type": "application/json"
      },
      "request_body": """{ "inputText": "${parameters.inputText}" }"""
    }
  ],
  "created_time": 1744071279238,
  "last_updated_time": 1744071279238
}

model interface

{
  "name": "Amazon Bedrock - Titan Text Embedding",
  "model_group_id": "XSy6EpYBEp3IjPWiWsSJ",
  "algorithm": "REMOTE",
  "model_version": "2",
  "description": "",
  "model_state": "DEPLOYED",
  "created_time": 1744071290744,
  "last_updated_time": 1744970510529,
  "last_deployed_time": 1744970510529,
  "auto_redeploy_retry_times": 0,
  "planning_worker_node_count": 3,
  "current_worker_node_count": 3,
  "planning_worker_nodes": [
    "D-LsE8ubRJCtA6oTC-1JdA",
    "dgCx-hB7T_WKo294LiUAOQ",
    "x5zinlCIT-yUr2wERv2N-Q"
  ],
  "deploy_to_all_nodes": true,
  "is_hidden": false,
  "connector_id": "Xie_EpYBXaeskhPiwjGI",
  "interface": {
    "output": """{"type":"object","properties":{"inference_results":{"type":"array","items":{"type":"object","properties":{"output":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"dataAsMap":{"type":"object","properties":{"embedding":{"type":"array"}},"required":["embedding"]}},"required":["name","dataAsMap"]}},"status_code":{"type":"integer"}},"required":["output","status_code"]}}},"required":["inference_results"]}""",
    "input": """{"type":"object","properties":{"parameters":{"additionalProperties":true,"type":"object","properties":{"inputText":{"type":"string"}},"required":["inputText"]}}}"""
  }
}

predict call

POST /_plugins/_ml/models/A9y_EpYBMfB3_phs7_V5/_predict
{
  "parameters": {
    "inputText" : "5.11"
  }
}

Here is the exception message:

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "Error validating input schema, if you think this is expected, please update your 'input' field in the 'interface' field for this model: Validation failed: [$.parameters.inputText: number found, string expected] for instance: {\"algorithm\":\"REMOTE\",\"parameters\":{\"inputText\":5.11},\"action_type\":\"PREDICT\"} with schema: {\"type\":\"object\",\"properties\":{\"parameters\":{\"additionalProperties\":true,\"type\":\"object\",\"properties\":{\"inputText\":{\"type\":\"string\"}},\"required\":[\"inputText\"]}}}"
      }
    ],
    "type": "status_exception",
    "reason": "Error validating input schema, if you think this is expected, please update your 'input' field in the 'interface' field for this model: Validation failed: [$.parameters.inputText: number found, string expected] for instance: {\"algorithm\":\"REMOTE\",\"parameters\":{\"inputText\":5.11},\"action_type\":\"PREDICT\"} with schema: {\"type\":\"object\",\"properties\":{\"parameters\":{\"additionalProperties\":true,\"type\":\"object\",\"properties\":{\"inputText\":{\"type\":\"string\"}},\"required\":[\"inputText\"]}}}"
  },
  "status": 400
}

Another failing predict payload:

POST /_plugins/_ml/models/A9y_EpYBMfB3_phs7_V5/_predict
{
  "parameters": {
    "inputText" : "5.11 Tactical #48095 Reversible High Vis Duty Jacket. Introducing the new 5.11 Reversible High Vis Duty Jacket. The law enforcement market today needs products that will perform consistently time and time again at and exceptional value and this new Reversible High Vis Duty Jacket from 5.11 meets that need. With its wind and water repellent outer shell above the belt duty length this new duty jacket is sure to have broad appeal for any law enforcement department. With two large duty style bellowed cargo pockets incorporating the 5.11 BBS system, and two large hand warmer pockets behind each this jacket will keep you warm in cold temperatures. On the reverse you will find the ANSI/ISEA 107-2010 Class II certified high vis jacket with all of the same great features that are on the solid color side of the jacket. Incorporating 3M Scotchlite tape and high vis yellow material this jacket will keep you safe while working."
  }
}

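It returns the same exception message. Note that the reported instance again contains only the number 5.11, not the full text: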

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "Error validating input schema, if you think this is expected, please update your 'input' field in the 'interface' field for this model: Validation failed: [$.parameters.inputText: number found, string expected] for instance: {\"algorithm\":\"REMOTE\",\"parameters\":{\"inputText\":5.11},\"action_type\":\"PREDICT\"} with schema: {\"type\":\"object\",\"properties\":{\"parameters\":{\"additionalProperties\":true,\"type\":\"object\",\"properties\":{\"inputText\":{\"type\":\"string\"}},\"required\":[\"inputText\"]}}}"
      }
    ],
    "type": "status_exception",
    "reason": "Error validating input schema, if you think this is expected, please update your 'input' field in the 'interface' field for this model: Validation failed: [$.parameters.inputText: number found, string expected] for instance: {\"algorithm\":\"REMOTE\",\"parameters\":{\"inputText\":5.11},\"action_type\":\"PREDICT\"} with schema: {\"type\":\"object\",\"properties\":{\"parameters\":{\"additionalProperties\":true,\"type\":\"object\",\"properties\":{\"inputText\":{\"type\":\"string\"}},\"required\":[\"inputText\"]}}}"
  },
  "status": 400
}

What is the expected behavior?
Input validation against the model interface should pass, because inputText is sent as a string.

@mingshl mingshl added bug Something isn't working untriaged labels Apr 23, 2025
@dbwiddis
Member

dbwiddis commented Apr 23, 2025

The error is thrown from this method:

public static void validateSchema(String schemaString, String instanceString) throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    // parse the schema JSON as string
    JsonNode schemaNode = mapper.readTree(schemaString);
    JsonSchema schema = JsonSchemaFactory.getInstance(VersionFlag.V202012).getSchema(schemaNode);
    // JSON data to validate
    JsonNode jsonNode = mapper.readTree(instanceString);
    // Validate JSON node against the schema
    Set<ValidationMessage> errors = schema.validate(jsonNode);
    if (!errors.isEmpty()) {
        throw new OpenSearchParseException(
            "Validation failed: "
                + Arrays.toString(errors.toArray(new ValidationMessage[0]))
                + " for instance: "
                + instanceString
                + " with schema: "
                + schemaString
        );
    }
}

This relies on the networknt json-schema-validator dependency (https://github.com/networknt/json-schema-validator), but that library seems to handle strings containing numbers properly. I suspect the Jackson ObjectMapper is stripping the quotes, which produces {"inputText":5.11}:

Validation failed: [$.parameters.inputText: number found, string expected] for instance: {"algorithm":"REMOTE","parameters":{"inputText":5.11},"action_type":"PREDICT"} with schema: {"type":"object","properties":{"parameters":{"additionalProperties":true,"type":"object","properties":{"inputText":{"type":"string"}},"required":["inputText"]}}}

This seems related, though in the opposite direction. Still, it points to a configuration setting that can be changed: FasterXML/jackson-databind#796

@dbwiddis
Member

dbwiddis commented Apr 23, 2025

Actually, here's where it's being converted to a number:

/**
 * This method processes the input JSON string and replaces the string values of the parameters with JSON objects if the string is a valid JSON.
 * @param inputJson The input JSON string
 * @return The processed JSON string
 */
public static String processRemoteInferenceInputDataSetParametersValue(String inputJson) throws IOException {
    ObjectMapper mapper = new ObjectMapper();

That method is called before the input is passed to validation, so the validator sees the converted number rather than the original string.
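
To illustrate the failure mode and one possible way around it, here is a minimal sketch. It is not the code from #3761, and it uses neither ml-commons nor Jackson: the class ParameterGuard and all of its methods are hypothetical names chosen for this example. It assumes a guard that treats a parameter value as embedded JSON only when the value is shaped like an object or an array, so a bare scalar such as 5.11 stays a quoted string in the instance handed to validation.

```java
public class ParameterGuard {

    // Hypothetical guard: true only for values shaped like a JSON object or array.
    static boolean looksLikeJsonContainer(String value) {
        if (value == null) {
            return false;
        }
        String trimmed = value.trim();
        if (trimmed.length() < 2) {
            return false;
        }
        char first = trimmed.charAt(0);
        char last = trimmed.charAt(trimmed.length() - 1);
        return (first == '{' && last == '}') || (first == '[' && last == ']');
    }

    // Minimal JSON string quoting for the sketch: escapes only backslashes and double quotes.
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Renders one parameter value for the instance JSON that is passed to validation.
    // Container-shaped values are embedded unchanged (real code would still parse them
    // to confirm they are valid JSON); everything else, including numeric-looking text,
    // remains a JSON string.
    static String toInstanceValue(String value) {
        if (value == null) {
            return "null";
        }
        return looksLikeJsonContainer(value) ? value.trim() : quote(value);
    }

    public static void main(String[] args) {
        // A numeric-looking string keeps its quotes, so a "type":"string" schema would accept it.
        System.out.println("{\"inputText\":" + toInstanceValue("5.11") + "}");
        // An object-shaped string is still embedded as JSON.
        System.out.println("{\"inputText\":" + toInstanceValue("{\"a\":1}") + "}");
    }
}
```

With a guard like this, the first payload in this issue would reach the validator as {"inputText":"5.11"} instead of {"inputText":5.11}. Whether the real fix should check the value's shape, the parsed node type, or something else is a design decision for the pull request.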

@dbwiddis dbwiddis linked a pull request Apr 23, 2025 that will close this issue
2 tasks
@dbwiddis
Member

Proposed fix: #3761

@Zhangxunmt Zhangxunmt added v3.1.0 Issues targeting release v3.1.0 and removed untriaged labels May 6, 2025
@Zhangxunmt Zhangxunmt moved this to In Progress in ml-commons projects May 6, 2025

4 participants