[FEATURE] created_by Provenance Tag Support in ML Commons #4752

@dbwiddis

Is your feature request related to a problem?

The ML Commons stats framework (MLStatsJobProcessor) publishes adoption metrics for models, agents, and connectors as OTel counters with rich tags describing what was created: service provider, model type, deployment mode, etc. However, there is no way to attribute a resource to the plugin or caller that provisioned it. As a result, the metrics cannot distinguish resources created by an automated plugin provisioning flow (e.g., the Flow Framework plugin) from resources created directly by users via the API or by other plugins.

The MachineLearningClient interface (used by all plugins integrating with ML Commons) provides no mechanism to pass caller provenance. The underlying input objects, MLCreateConnectorInput, MLRegisterModelInput, and MLAgent, have no provisioned_by field. The transport actions that persist these objects (TransportCreateConnectorAction, TransportRegisterModelAction, TransportRegisterAgentAction) never record provenance, so MLModel.getTags() and MLAgent.getTags() have no such dimension to emit.

As a concrete example: a plugin (like Flow Framework) that automates ML resource provisioning (connectors, models, agents) as a "one-and-done" setup step wants to measure how many users continue to actively use the resources it provisioned, as distinct from resources provisioned by other means. This is currently impossible with the existing stats framework.

What solution would you like?

Add an optional provisioned_by field as first-class metadata across the ML resource creation path, surfaced as a tag in the adoption metrics framework. The changes required span four areas:

  1. Domain objects and input classes (common module)

Add String provisionedBy to MLCreateConnectorInput, MLRegisterModelInput, MLAgent, and MLModel. (Because connectors are not currently used in stats and are tightly coupled to models, the connector input could be left out.) In each class, implement toXContent, parse, writeTo, and the StreamInput constructor, version-gated on a new VERSION_X_Y_Z constant following the existing pattern.
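
The version-gating idea can be sketched without any OpenSearch dependencies. The following is a minimal illustration only: ProvenanceInputSketch, the integer wire versions, and the use of java.io data streams are all assumptions standing in for the real input classes, the VERSION_X_Y_Z constant, and the OpenSearch stream classes. It shows the optional field being written and read only when the peer is new enough, so the byte layout seen by older peers is unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for an input class; NOT the real MLRegisterModelInput. */
final class ProvenanceInputSketch {
    /** Hypothetical wire version that introduces the field (placeholder for VERSION_X_Y_Z). */
    static final int VERSION_WITH_PROVISIONED_BY = 2;

    final String modelName;
    final String provisionedBy; // optional, may be null

    ProvenanceInputSketch(String modelName, String provisionedBy) {
        this.modelName = modelName;
        this.provisionedBy = provisionedBy;
    }

    /** Writes the object for a peer speaking the given wire version. */
    void writeTo(DataOutputStream out, int peerVersion) throws IOException {
        out.writeUTF(modelName);
        if (peerVersion >= VERSION_WITH_PROVISIONED_BY) {
            // Optional string: a presence flag, then the value if present.
            out.writeBoolean(provisionedBy != null);
            if (provisionedBy != null) {
                out.writeUTF(provisionedBy);
            }
        }
    }

    /** Reads the object as sent by a peer speaking the given wire version. */
    static ProvenanceInputSketch readFrom(DataInputStream in, int peerVersion) throws IOException {
        String name = in.readUTF();
        String by = null;
        if (peerVersion >= VERSION_WITH_PROVISIONED_BY && in.readBoolean()) {
            by = in.readUTF();
        }
        return new ProvenanceInputSketch(name, by);
    }

    /** Serializes and deserializes at one wire version, to show what survives the trip. */
    static ProvenanceInputSketch roundTrip(ProvenanceInputSketch input, int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                input.writeTo(out, peerVersion);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readFrom(in, peerVersion);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a round trip at version 2 preserves provisionedBy, while a round trip at version 1 drops it and leaves the other fields intact.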

  2. Transport actions (plugin module)

    • TransportRegisterModelAction: copy provisionedBy from MLRegisterModelInput onto MLModel before indexing
    • TransportRegisterAgentAction: MLAgent is indexed directly, so no additional propagation is needed beyond Step 1
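
A rough sketch of the model propagation step follows. RegisterRequestSketch, ModelDocSketch, and RegisterModelFlowSketch are hypothetical stand-ins, not the real MLRegisterModelInput, MLModel, or transport action, and the choice to omit the key from the indexed source when the value is unset is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for MLRegisterModelInput (only the fields needed here). */
final class RegisterRequestSketch {
    final String name;
    final String provisionedBy; // optional; null when the caller sets nothing

    RegisterRequestSketch(String name, String provisionedBy) {
        this.name = name;
        this.provisionedBy = provisionedBy;
    }
}

/** Hypothetical stand-in for the MLModel document that gets indexed. */
final class ModelDocSketch {
    final String name;
    final String provisionedBy;

    ModelDocSketch(String name, String provisionedBy) {
        this.name = name;
        this.provisionedBy = provisionedBy;
    }

    /** Source map for the index request; the optional field is omitted when unset (assumption). */
    Map<String, Object> toSource() {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", name);
        if (provisionedBy != null) {
            source.put("provisioned_by", provisionedBy);
        }
        return source;
    }
}

final class RegisterModelFlowSketch {
    /** Mirrors the proposed step: carry provenance from the request onto the stored model. */
    static ModelDocSketch buildModel(RegisterRequestSketch request) {
        return new ModelDocSketch(request.name, request.provisionedBy);
    }
}
```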

  3. Tag emission (common module)

    • MLModel.getTags() / getTags(Connector): add provisioned_by tag to all three tag-building paths (remote, pre-trained, custom)
    • MLAgent.getTags(): add provisioned_by tag
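
One way the tag emission could look is sketched below, using a plain Map as a stand-in for the telemetry tag container. ProvenanceTagsSketch and its methods are hypothetical; the tag names deployment, service_provider, and type are taken from the sample metrics in this issue. Omitting the tag when no value was recorded is an assumption, since the issue does not say what unset resources should emit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the real getTags() methods build a telemetry tag object instead of a Map. */
final class ProvenanceTagsSketch {
    static final String PROVISIONED_BY_TAG = "provisioned_by";

    /** Returns a copy of the base tags, adding the provenance tag only when a value was recorded. */
    static Map<String, String> withProvenance(Map<String, String> baseTags, String provisionedBy) {
        Map<String, String> tags = new LinkedHashMap<>(baseTags);
        if (provisionedBy != null && !provisionedBy.isEmpty()) {
            tags.put(PROVISIONED_BY_TAG, provisionedBy);
        }
        return tags;
    }

    /** Illustrative remote-model path; the other paths would call withProvenance the same way. */
    static Map<String, String> remoteModelTags(String serviceProvider, String type, String provisionedBy) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("deployment", "remote");
        base.put("service_provider", serviceProvider);
        base.put("type", type);
        return withProvenance(base, provisionedBy);
    }
}
```

Routing every tag-building path through one helper keeps the null handling in a single place, so the remote, pre-trained, and custom paths cannot drift apart.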

  4. Connector metrics in MLStatsJobProcessor (plugin module)

AdoptionMetric.CONNECTOR_COUNT is currently defined but never incremented. Completing coverage for all three resource types would mean adding connector collection to MLStatsJobProcessor, parallel to the existing model collection, reading provisioned_by from the stored connector document and emitting it as a tag.

Note: Connector-level metrics are intentionally omitted from this proposal. Connectors typically have a tight 1:1 relationship with models: a connector is not useful without a model that references it. The model's getTags(connector) path already emits provisioned_by from the model when counting MODEL_COUNT, so connector provenance is already captured through model-level metrics. Adding a separate CONNECTOR_COUNT emission would largely duplicate the same attribution data. If a use case for independent connector counting emerges (e.g., shared connectors used by multiple models), this can be added as a follow-up.

With these changes, a plugin provisioning ML resources via the ML Client simply sets the field on the input builder:

MLCreateConnectorInput.builder()
    // ... existing fields ...
    .provisionedBy("my-plugin")
    .build();

MLRegisterModelInput.builder()
    // ... existing fields ...
    .provisionedBy("my-plugin")
    .build();

MLAgent.builder()
    // ... existing fields ...
    .provisionedBy("my-plugin")
    .build();

The stats framework then emits metrics like:

ml.commons.MODEL_COUNT{provisioned_by="my-plugin", deployment="remote", service_provider="bedrock", type="llm", ...}
ml.commons.AGENT_COUNT{provisioned_by="my-plugin", type="conversational", ...}
# ml.commons.CONNECTOR_COUNT{provisioned_by="my-plugin", service_provider="bedrock", ...}  -- covered by MODEL_COUNT via getTags(connector)
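
To illustrate how the new tag would separate plugin-provisioned resources from the rest, the sketch below counts resources per distinct tag set and then sums the counts for one provisioner. ModelCountSketch is a hypothetical helper operating on plain maps; it is not how MLStatsJobProcessor or the OTel counters are implemented.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical aggregation helper; tag sets are plain maps that must not be mutated after use as keys. */
final class ModelCountSketch {
    /** Counts resources per distinct tag set, as if a tagged counter were incremented once per resource. */
    static Map<Map<String, String>, Long> countByTags(List<Map<String, String>> tagSets) {
        Map<Map<String, String>, Long> counts = new LinkedHashMap<>();
        for (Map<String, String> tags : tagSets) {
            counts.merge(tags, 1L, Long::sum);
        }
        return counts;
    }

    /** Sums the counts of every tag set whose provisioned_by tag equals the given value. */
    static long countProvisionedBy(Map<Map<String, String>, Long> counts, String provisioner) {
        long total = 0;
        for (Map.Entry<Map<String, String>, Long> entry : counts.entrySet()) {
            if (provisioner.equals(entry.getKey().get("provisioned_by"))) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

Resources with no provisioned_by tag simply fall outside every provisioner's total, which is what lets plugin-provisioned resources be told apart from those created directly through the API.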

What alternatives have you considered?

  1. Using the existing app_type field on MLAgent: MLAgent already has an appType field, but it is a user-facing classification of the agent's functional purpose (e.g. "chatbot"), not a record of which plugin provisioned it. Overloading it for provenance would conflate two distinct concepts and would not cover connectors or models, which have no equivalent field.

  2. Tagging via connector/model parameters: A plugin could embed a provisioned_by key in the parameters map of a connector or model. However, this is an undocumented convention with no guarantee of surviving updates, no first-class support in getTags(), and no way to filter it out of functional parameters passed to the remote endpoint.

  3. Tracking provenance outside ML Commons: The calling plugin could maintain its own index of resource IDs it provisioned and join that against ML Commons data at query time. This is fragile, requires the plugin to manage additional state, and produces metrics that are disconnected from the rich tag context (service provider, model type, etc.) that MLStatsJobProcessor already computes.

Do you have any additional context?

  • provisioned_by is purely informational metadata: a free-form string with no validation or enforcement by ML Commons. The framework does not need to know or care about the value.
  • The field follows the exact version-gating pattern already used, ensuring backward compatibility in mixed-version clusters where older nodes simply ignore the field.
  • provisioned_by will be visible in GET model/agent/connector API responses, which is desirable for operator visibility into resource provenance.
  • This is not a security boundary. Any caller can set any value. It is not intended to replace or interact with the existing owner/user access control fields.

Labels: 3.7 (Items marked for 3.7 release), enhancement (New feature or request)

Status: In Progress
