[FEATURE] Use include files in substitution templates to reduce repetition. #1071

Open
dbwiddis opened this issue Feb 26, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

dbwiddis commented Feb 26, 2025

Is your feature request related to a problem?

A core principle in software development is "Don't Repeat Yourself" (DRY).

Our substitution templates contain a lot of repetition. For example, this JSON is identical in 9 of the 17 substitution templates.

{
  "id": "register_model",
  "type": "register_remote_model",
  "previous_node_inputs": {
    "create_connector": "parameters"
  },
  "user_inputs": {
    "name": "${{register_remote_model.name}}",
    "function_name": "remote",
    "description": "${{register_remote_model.description}}"
  }
}
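A count like the one above could be gathered mechanically. The following is a minimal sketch, assuming each template has already been reduced to a flat list of node objects; the template names and the abbreviated nodes are hypothetical, and `find_repeated_nodes` is an illustrative helper, not part of the project.

```python
import json
from collections import defaultdict


def find_repeated_nodes(templates):
    """Group identical workflow nodes across templates.

    `templates` maps a template name to a list of node dicts.
    Returns {canonical JSON text: sorted template names} for every node
    that appears in more than one template.
    """
    seen = defaultdict(set)
    for name, nodes in templates.items():
        for node in nodes:
            # sort_keys makes the comparison independent of key order
            canonical = json.dumps(node, sort_keys=True)
            seen[canonical].add(name)
    return {text: sorted(names) for text, names in seen.items() if len(names) > 1}


register_model = {
    "id": "register_model",
    "type": "register_remote_model",
    "previous_node_inputs": {"create_connector": "parameters"},
    "user_inputs": {
        "name": "${{register_remote_model.name}}",
        "function_name": "remote",
        "description": "${{register_remote_model.description}}",
    },
}

# Hypothetical template names and abbreviated contents, for illustration only.
templates = {
    "template_a": [{"id": "create_connector", "type": "create_connector"}, register_model],
    "template_b": [dict(register_model)],
    "template_c": [{"id": "deploy_model", "type": "deploy_model"}],
}

repeated = find_repeated_nodes(templates)
```

Comparing canonical JSON text only finds exact matches; steps that differ only by name would need a normalization pass first.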

This JSON is common to the two bedrock templates:

{
  "id": "create_connector",
  "type": "create_connector",
  "user_inputs": {
    "name": "${{create_connector.name}}",
    "description": "${{create_connector.description}}",
    "version": "1",
    "protocol": "aws_sigv4",
    "parameters": {
      "region": "${{create_connector.region}}",
      "service_name": "bedrock",
      "input_docs_processed_step_size": "${{create_connector.input_docs_processed_step_size}}"
    },
    "credential": {
      "access_key": "${{create_connector.credential.access_key}}",
      "secret_key": "${{create_connector.credential.secret_key}}",
      "session_token": "${{create_connector.credential.session_token}}"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "${{create_connector.actions.url}}",
        "headers": {
          "content-type": "application/json",
          "x-amz-content-sha256": "required"
        },
        "request_body": "${{create_connector.actions.request_body}}",
        "pre_process_function": "${{create_connector.actions.pre_process_function}}",
        "post_process_function": "${{create_connector.actions.post_process_function}}"
      }
    ]
  }
}

This version is very similar, with just a few different fields, and is common to 3 other templates (a fourth contains only minor additional differences):

{
  "id": "create_connector",
  "type": "create_connector",
  "user_inputs": {
    "name": "${{create_connector.name}}",
    "description": "${{create_connector.description}}",
    "version": "1",
    "protocol": "${{create_connector.protocol}}",
    "parameters": {
      "endpoint": "${{create_connector.endpoint}}",
      "model": "${{create_connector.model}}",
      "input_type": "search_document",
      "truncate": "END"
    },
    "credential": {
      "key": "${{create_connector.credential.key}}"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "${{create_connector.actions.url}}",
        "headers": {
          "Authorization": "Bearer ${credential.key}",
          "Request-Source": "unspecified:opensearch"
        },
        "request_body": "${{create_connector.actions.request_body}}",
        "pre_process_function": "${{create_connector.actions.pre_process_function}}",
        "post_process_function": "${{create_connector.actions.post_process_function}}"
      }
    ]
  }
}

Every additional type of model (bedrock, cohere, etc.) or use case (chat, hybrid search, multimodal search, semantic search) adds to a combinatorial explosion of substitution templates with only minor differences between the steps, which makes it impractical to create every possible combination.
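As a rough illustration of that growth, the sketch below counts full templates against reusable pieces for the model types and use cases named above. It assumes, purely for the arithmetic, that every pairing needs its own template and that each model type or use case maps to exactly one component.

```python
from itertools import product

model_types = ["bedrock", "cohere"]
use_cases = ["chat", "hybrid search", "multimodal search", "semantic search"]

# One full template per (model type, use case) pair: grows multiplicatively.
full_templates = list(product(model_types, use_cases))

# One reusable piece per model type plus one per use case: grows additively.
component_count = len(model_types) + len(use_cases)
```

With these lists the pairing gives 8 templates against 6 components, and each new model type adds four more templates but only one more component.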

The original design of the front-end included modular components for each of these workflow steps; there's no reason they can't be equally modular on the back-end.

What solution would you like?

  1. Review the existing substitution templates as well as sample templates in the documentation. Identify overlapping workflow steps that are repeated across multiple templates in different combinations.
    • Consider not only exactly identical code but code where steps could be renamed to be identical
  2. Design a means of using an Include Directive to add these values to a template file. Consider multiple designs such as:
    • simple text-based inclusion when reading files prior to parsing the JSON
    • smaller JSON component files that can be referenced during JSON parsing
    • include entire templates
    • some other idea I haven't thought of
  3. Refactor the existing templates to use these components.
  4. Using lessons learned in the above steps, consider designing further modularity so that additional step(s) can be added simply by appending a suffix to an existing template name.
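One of the options listed in step 2, smaller JSON component files referenced during parsing, might look roughly like the sketch below. The `$include` key, the file layout, and the function names are assumptions made for illustration; nothing here is an existing feature of the project, and Python is used only for brevity.

```python
import json
import tempfile
from pathlib import Path

INCLUDE_KEY = "$include"  # hypothetical directive name


def resolve_includes(value, base_dir, _stack=()):
    """Recursively replace {"$include": "file.json"} objects with the file's JSON."""
    if isinstance(value, list):
        return [resolve_includes(item, base_dir, _stack) for item in value]
    if isinstance(value, dict):
        if set(value) == {INCLUDE_KEY}:
            path = (Path(base_dir) / value[INCLUDE_KEY]).resolve()
            if path in _stack:
                raise ValueError(f"circular include: {path}")
            included = json.loads(path.read_text(encoding="utf-8"))
            # Nested includes are resolved relative to the included file.
            return resolve_includes(included, path.parent, _stack + (path,))
        return {k: resolve_includes(v, base_dir, _stack) for k, v in value.items()}
    return value


def load_template(path):
    """Parse a template file and expand every include directive in it."""
    path = Path(path)
    raw = json.loads(path.read_text(encoding="utf-8"))
    return resolve_includes(raw, path.parent)


# Demo with throwaway files; names and the abbreviated node are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "components").mkdir()
    (root / "components" / "register_model.json").write_text(
        json.dumps({"id": "register_model", "type": "register_remote_model"}),
        encoding="utf-8",
    )
    (root / "example_template.json").write_text(
        json.dumps({"name": "example", "nodes": [{"$include": "components/register_model.json"}]}),
        encoding="utf-8",
    )
    template = load_template(root / "example_template.json")
```

Resolving includes after parsing keeps every component file valid JSON on its own, which simple text-based inclusion would not guarantee. A real implementation would also need to restrict include paths to a trusted directory.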

What alternatives have you considered?

Follow similar steps to create the modular pieces, but assemble them with automatic code generation. This would allow a one-time generation of the templates (possibly automated as part of CI), keeping the existing code the same and eliminating run-time parsing steps.
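A build-time variant of this idea could be sketched as follows. The component library, the manifest, and the template names are all hypothetical and heavily abbreviated; the sketch only shows the shape of a generator plus a CI check that the checked-in files match what the generator produces.

```python
import json

# Hypothetical component library: reusable workflow steps keyed by name.
COMPONENTS = {
    "create_connector": {"id": "create_connector", "type": "create_connector"},
    "register_model": {"id": "register_model", "type": "register_remote_model"},
    "deploy_model": {"id": "deploy_model", "type": "deploy_model"},
}

# Hypothetical manifest: which components make up each generated template.
MANIFEST = {
    "example_deploy": ["create_connector", "register_model", "deploy_model"],
    "example_register_only": ["create_connector", "register_model"],
}


def generate_templates(manifest, components):
    """Return {file name: expanded JSON text} for every manifest entry."""
    generated = {}
    for name, steps in manifest.items():
        template = {"name": name, "nodes": [components[step] for step in steps]}
        generated[f"{name}.json"] = json.dumps(template, indent=2) + "\n"
    return generated


def stale_files(generated, checked_in):
    """Names whose checked-in text differs from the freshly generated text."""
    return sorted(name for name, text in generated.items() if checked_in.get(name) != text)


generated = generate_templates(MANIFEST, COMPONENTS)

# Simulate a repository where one generated file was edited by hand.
checked_in = dict(generated)
checked_in["example_deploy.json"] = "{}\n"
out_of_date = stale_files(generated, checked_in)
```

In CI, a non-empty `out_of_date` list would fail the build, so the expanded templates stay in sync with the components without any change to how templates are read at run time.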

Do you have any additional context?

These issues essentially ask for the same thing more broadly; this feature attempts to build a structure to support them

This design should consider the possibility that these workflow steps might be assembled in a higher-level "scripting" or front-end UI sequencer.

This effort should focus primarily on entire workflow steps with identical textual content; if there are minor variations (e.g., many connectors are identical except for the added "input_type": "search_document" and "truncate": "END" fields), we should consider ways of including those without adding complexity.
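One low-complexity way to handle such minor variations would be to pair an included component with a small set of overrides. The sketch below assumes a recursive merge; `deep_merge` and the abbreviated connector are illustrative only, not an existing mechanism.

```python
import copy


def deep_merge(base, overrides):
    """Return a copy of `base` with `overrides` applied.

    Nested dicts are merged key by key; any other value (including a list)
    replaces the base value outright.
    """
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Abbreviated version of the shared connector step shown earlier.
base_connector = {
    "id": "create_connector",
    "type": "create_connector",
    "user_inputs": {
        "name": "${{create_connector.name}}",
        "parameters": {
            "endpoint": "${{create_connector.endpoint}}",
            "model": "${{create_connector.model}}",
        },
    },
}

# Only the fields that differ need to be written down.
search_document_overrides = {
    "user_inputs": {"parameters": {"input_type": "search_document", "truncate": "END"}}
}

connector = deep_merge(base_connector, search_document_overrides)
```

Because lists are replaced rather than merged, a variation inside `actions` would need the whole list restated, which may be an acceptable limit for keeping the mechanism simple.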

dbwiddis added the enhancement (New feature or request) and untriaged labels on Feb 26, 2025
@krisfreedain

[Catch All Triage - 1, 2]
