[stale] [feature] Add snippets (using legacy apps) #2904

jp-agenta · 2025-11-11T15:28:38Z

[feature] Add snippets (using legacy apps)

web/ee/src/services/evaluations/api/index.ts

-    const response = await axios.get(`/evaluations/${encodeURIComponent(id)}`, {
-        params: {project_id: projectId},
-    })
+    const response = await axios.get(`/evaluations/${evaluationId}?project_id=${projectId}`)


To prevent server-side request forgery (SSRF) and related vulnerabilities, input values used to build the path for the outgoing request must be validated or sanitized before use. In this case, the evaluationId parameter, derived ultimately from user input, should be validated to ensure it matches a known-safe format—ideally a UUID, as that appears to be the expected format.

The best way to fix the problem without altering existing functionality is:

Add a dedicated validator function that checks whether the provided evaluationId string is a valid UUID.

Reject, sanitize, or throw an error if an invalid value is detected, so requests are never constructed with unsafe path segments.

This validation should be performed in every exposed function in web/ee/src/services/evaluations/api/index.ts that takes evaluationId as a path parameter—including fetchEvaluation, fetchEvaluationStatus, and fetchAllEvaluationScenarios.

Add the reusable isValidUuid function to this file.

Validate all elements of arrays passed as evaluationIds, such as in fetchAllComparisonResults, to ensure all IDs are individually checked.

Implementation plan:

In web/ee/src/services/evaluations/api/index.ts, add a function isValidUuid(id: string): boolean (can use a simple regex for UUIDs).

Before building the API URL in each affected function, check that the given ID is valid, and throw if not.

For functions receiving arrays (fetchAllComparisonResults, fetchAllEvaluationScenarios), validate every element.

web/ee/src/services/evaluations/api/index.ts

-            params: {project_id: projectId},
-        }),
-        fetchEvaluation(id),
+        axios.get(`/evaluations/${evaluationId}/evaluation_scenarios?project_id=${projectId}`),


General Approach:
Restrict user-controlled input in outgoing request URLs, especially when inserted into path segments. Accept only values that match a known-safe format—here, a UUID string for evaluationId.

Best way to fix:

Validate each evaluationId against a strict regular expression for UUIDs or a well-defined server-side ID format before including it in a URL.

Optionally, filter out any IDs that fail this check, or even throw an error to prevent the potentially dangerous request.

In EvaluationCompare.tsx, before handing evaluationIds to API functions, filter and restrict them.

In fetchAllEvaluationScenarios (in api/index.ts), ensure only sanitized IDs are used to generate URLs.

Implementation points:

Implement a function (ideally in api/index.ts) to validate UUIDs.

In fetchAllComparisonResults, sanitize/validate every received id (since it is called with direct user data).

Alternatively, in the UI, filter evaluationIdsArray before using it (defense-in-depth).

Required changes:

Add a UUID validation function to web/ee/src/services/evaluations/api/index.ts.

In fetchAllComparisonResults, filter out IDs that do not match the UUID pattern.

Optionally log or throw if any values are filtered, for debugging/monitoring.

web/ee/src/services/human-evaluations/api/index.ts

        return await axios
-            .get(`${getAgentaApiUrl()}/human-evaluations/${encodeURIComponent(id)}`, {
-                params: {project_id: project},
-            })
+            .get(`${getAgentaApiUrl()}/human-evaluations/${evaluationId}?project_id=${projectId}`)


To fix this SSRF vulnerability, we should validate and sanitize the evaluationId parameter before using it as part of the outgoing API request path. Specifically, ensure that evaluationId conforms to an expected format (for example, a MongoDB ObjectId, which is a 24-character hexadecimal string). This validation should occur at the application boundary—immediately after the value is obtained from router.query in the React component where it's first sourced. If the ID fails validation, do not make the API call or surface an error to the user. This fix involves editing both the consumer file (index.tsx)—ensuring only safe values are passed to fetchLoadEvaluation—and potentially adding a reusable validator utility. The change should be localized: (1) add the validator function and (2) apply it to sanitize evaluationTableId in the main component, gating all downstream usage.

web/ee/src/services/human-evaluations/api/index.ts

            .then((responseData) => {
                return fromEvaluationResponseToEvaluation(responseData.data)
            })
    } catch (error) {
        if (axios.isCancel?.(error) || (error as any)?.code === "ERR_CANCELED") {
            return null
        }
-        console.error(`Error fetching evaluation ${id}:`, error)
+        console.error(`Error fetching evaluation ${evaluationId}:`, error)


To fix this issue, change the log statement to use a fixed format string and supply the potentially untrusted value, evaluationId, as a separate argument in the call to console.error. For example, instead of using string interpolation in the format string, format it as: console.error("Error fetching evaluation %s:", evaluationId, error);. This ensures that if evaluationId contains % characters or other format specifiers, they will not affect the formatting of the log message. Change this on line 53 of web/ee/src/services/human-evaluations/api/index.ts. No other code changes, imports, or supporting definitions are required.

Copilot

Pull Request Overview

This PR adds support for snippets as a new application type by introducing the SNIPPET enum value to replace SDK_CUSTOM. The changes refactor evaluation-related code from OSS to EE, update database models for consistency (e.g., TestsetDB → TestSetDB), and improve various services including workflows, evaluations, and testsets. The PR also removes deprecated evaluation models and converters, simplifies configuration handling, and updates several router endpoints.

Key Changes

Introduced SNIPPET as a new AppType, replacing SDK_CUSTOM
Renamed TestsetDB to TestSetDB throughout the codebase for consistency
Moved evaluation-related functionality from OSS to EE edition
Added new workflow service infrastructure with built-in evaluators registry
Simplified metadata handling by removing separate query flags classes

Reviewed Changes

Copilot reviewed 132 out of 1470 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`api/oss/src/services/db_manager.py`	Major refactoring: removed evaluation functions, updated testset references, renamed functions, added snippets filtering to `list_apps`
`api/oss/src/services/converters.py`	Removed entire file containing evaluation converter functions
`api/oss/src/models/shared_models.py`	Updated `AppType` enum: replaced `SDK_CUSTOM` with `SNIPPET`
`api/oss/src/models/db_models.py`	Renamed `TestsetDB` to `TestSetDB`, removed evaluation-related database models
`api/oss/src/routers/app_router.py`	Added snippets parameter to `list_apps` endpoint, simplified app output models
`api/oss/src/routers/variants_router.py`	Added resolve parameter to variant endpoints, refactored configs endpoints
`api/oss/src/routers/testset_router.py`	Updated testset model references, simplified function signatures
`api/oss/src/core/workflows/service.py`	Implemented `invoke_workflow` with handler registry, replaced SDK dependencies
`api/oss/src/core/services/v0.py`	New file with built-in evaluator implementations
`api/oss/src/core/shared/dtos.py`	Moved shared DTOs from SDK, added `Status`, `Schema`, and `Mappings` types
`api/oss/databases/postgres/migrations/core/versions/f286e830f0bc_extend_app_type_bis.py`	New migration to add `SNIPPET` to app_type enum

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-11T15:30:53Z

api/oss/src/services/db_manager.py

@@ -2541,7 +2496,7 @@ async def list_app_variant_revisions_by_variant(
                variant_id=uuid.UUID(variant_id), project_id=uuid.UUID(project_id)
            )
            .options(
-                joinedload(AppVariantRevisionsDB.variant_revision.of_type(AppVariantDB))
+                joinedload(AppVariantRevisionsDB.variant.of_type(AppVariantDB))


Changed relationship name from variant_revision to variant, but this may not match the actual relationship defined in the model. Verify that AppVariantRevisionsDB has a variant relationship, not variant_revision.

Suggested change

joinedload(AppVariantRevisionsDB.variant.of_type(AppVariantDB))

joinedload(AppVariantRevisionsDB.variant_revision.of_type(AppVariantDB))

Copilot · 2025-11-11T15:30:54Z

api/oss/src/routers/app_router.py

-        created_at=str(app_db.created_at),
-        updated_at=str(app_db.updated_at),
-    )
+    return CreateAppOutput(app_id=str(app_db.id), app_name=str(app_db.app_name))


Removed app_type, created_at, and updated_at fields from the response model, but these fields are still defined in CreateAppOutput. This will cause the response to omit these fields even though the model expects them to be optional. Consider updating the model definition to match the actual response, or include these fields in the response.

Suggested change

return CreateAppOutput(app_id=str(app_db.id), app_name=str(app_db.app_name))

return CreateAppOutput(

app_id=str(app_db.id),

app_name=str(app_db.app_name),

app_type=None,

created_at=None,

updated_at=None,

)

Copilot · 2025-11-11T15:30:54Z

api/oss/src/dbs/postgres/tracing/utils.py

-                if (pcts[v[1]] + pcts[v[0]]) != 0
-                else 0.0
-            )
+            pscs[k] = (pcts[v[1]] - pcts[v[0]]) / (pcts[v[1]] + pcts[v[0]])


Removed division-by-zero check. This will raise ZeroDivisionError when pcts[v[1]] + pcts[v[0]] equals zero. The original code had a safeguard that returned 0.0 in this case.

Suggested change

pscs[k] = (pcts[v[1]] - pcts[v[0]]) / (pcts[v[1]] + pcts[v[0]])

denom = pcts[v[1]] + pcts[v[0]]

if denom == 0:

pscs[k] = 0.0

else:

pscs[k] = (pcts[v[1]] - pcts[v[0]]) / denom

Copilot · 2025-11-11T15:30:54Z

api/oss/src/dbs/postgres/tracing/utils.py

-            pscs[k] = (
-                (pcts[v[1]] - pcts[v[0]]) / pcts["p50"] if pcts["p50"] != 0 else 0.0
-            )
+            pscs[k] = (pcts[v[1]] - pcts[v[0]]) / pcts["p50"]


Removed division-by-zero check. This will raise ZeroDivisionError when pcts['p50'] equals zero. The original safeguard returned 0.0 in this case.

Suggested change

pscs[k] = (pcts[v[1]] - pcts[v[0]]) / pcts["p50"]

if pcts["p50"] != 0:

pscs[k] = (pcts[v[1]] - pcts[v[0]]) / pcts["p50"]

else:

pscs[k] = 0.0

Copilot · 2025-11-11T15:30:55Z

api/oss/src/apis/fastapi/observability/utils/serialization.py

            return value
        try:
-            value = value
+            value = loads(value)


This line appears to be redundant - it attempts to parse value as JSON twice. The first attempt at line 58-59 already assigns the decoded value, so this second loads() call will likely fail or produce unexpected results. The original code just returned value directly when the first parse succeeded.

Copilot · 2025-11-11T15:30:55Z

api/oss/src/core/workflows/errors.py

            code=self.code,
            type=self.type,
-            message=f"Invalid parameters:\nExpected '{expected}'\nGot ('{type(got).__name__}') '{got}'.",
+            message=f"Invalid parameters: type must be {expected}, got {got}.",


Error message uses type names directly without using repr() or type name extraction. When expected or got are type objects, the message will show unclear representations like <class 'dict'> instead of dict. Consider using expected.__name__ or repr(expected) for clearer error messages.

Copilot · 2025-11-11T15:30:55Z

api/oss/src/apis/fastapi/workflows/router.py

+        else:
+            workflow_revision = WorkflowRevision(
+                data=workflow_revision_data,


Creating a WorkflowRevision with only data field may be incomplete. The WorkflowRevision model likely expects additional required fields like id, version, or other metadata. Verify that all required fields are properly initialized or that the model allows partial initialization.

Suggested change

else:

workflow_revision = WorkflowRevision(

data=workflow_revision_data,

else:

# Ensure all required fields for WorkflowRevision are provided

workflow_revision = WorkflowRevision(

data=workflow_revision_data,

id=getattr(workflow_revision_data, "id", None),

version=getattr(workflow_revision_data, "version", None),

# Add other required fields here as needed, with sensible defaults or from workflow_revision_data

Copilot · 2025-11-11T15:30:56Z

api/oss/src/core/evaluations/service.py

+                if is_ee():
+                    celery_dispatch.send_task(  # type: ignore


The task dispatch is now conditional on is_ee() without an else clause. If the condition is false in OSS installations, the evaluation will be silently skipped. Consider adding logging or an alternative handling mechanism for OSS environments to make the behavior explicit.

CLAassistant · 2025-11-11T15:38:39Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

GitHub CI seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

poc/snippets-using-legacy-apps

4faafe6

Copilot AI review requested due to automatic review settings November 11, 2025 15:28

github-advanced-security bot found potential problems Nov 11, 2025

View reviewed changes

Copilot AI reviewed Nov 11, 2025

View reviewed changes

jp-agenta changed the title ~~[feature] Add snippets (using legacy apps)~~ [STALE] [feature] Add snippets (using legacy apps) Nov 11, 2025

junaway changed the title ~~[STALE] [feature] Add snippets (using legacy apps)~~ [stale] [feature] Add snippets (using legacy apps) Nov 14, 2025

@@ -1,6 +1,7 @@
             import uniqBy from "lodash/uniqBy"
             import {v4 as uuidv4} from "uuid"
             import axios from "@/oss/lib/api/assets/axiosConfig"
             import {getTagColors} from "@/oss/lib/helpers/colors"
             import {calcEvalDuration} from "@/oss/lib/helpers/evaluate"
@@ -161,6 +162,9 @@
             export const fetchEvaluation = async (evaluationId: string) => {
                 const {projectId} = getProjectValues()
+                if (!isValidUuid(evaluationId)) {
+                    throw new Error("Invalid evaluationId parameter")
+                }
                 const response = await axios.get(`/evaluations/${evaluationId}?project_id=${projectId}`)
                 return evaluationTransformer(response.data) as _Evaluation
             }
@@ -168,6 +172,9 @@
             export const fetchEvaluationStatus = async (evaluationId: string) => {
                 const {projectId} = getProjectValues()
+                if (!isValidUuid(evaluationId)) {
+                    throw new Error("Invalid evaluationId parameter")
+                }
                 const response = await axios.get(`/evaluations/${evaluationId}/status?project_id=${projectId}`)
                 return response.data as {status: _Evaluation["status"]}
             }
@@ -213,6 +220,9 @@
             export const fetchAllEvaluationScenarios = async (evaluationId: string) => {
                 const {projectId} = getProjectValues()
+                if (!isValidUuid(evaluationId)) {
+                    throw new Error("Invalid evaluationId parameter")
+                }
                 const [{data: evaluationScenarios}, evaluation] = await Promise.all([
                     axios.get(`/evaluations/${evaluationId}/evaluation_scenarios?project_id=${projectId}`),
                     fetchEvaluation(evaluationId),
@@ -239,6 +249,9 @@
             // Comparison
             export const fetchAllComparisonResults = async (evaluationIds: string[]) => {
+                if (!Array.isArray(evaluationIds) || evaluationIds.length === 0 || !evaluationIds.every(isValidUuid)) {
+                    throw new Error("Invalid evaluationIds array")
+                }
                 const scenarioGroups = await Promise.all(evaluationIds.map(fetchAllEvaluationScenarios))
                 const testset: TestSet = await fetchTestset(scenarioGroups[0][0].evaluation?.testset?.id)

@@ -26,6 +26,12 @@
             import {fetchTestset} from "@/oss/services/testsets/api"
             import {getProjectValues} from "@/oss/state/project"
+            // Strict UUIDv4 validation (defensive against SSRF)
+            const UUID_V4_REGEX = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;
+            function isValidUUID (id: string): boolean {
+                return typeof id === "string" && UUID_V4_REGEX.test(id);
+            }
             //Prefix convention:
             //  - fetch: GET single entity from server
             //  - fetchAll: GET all entities from server
@@ -239,7 +245,14 @@
             // Comparison
             export const fetchAllComparisonResults = async (evaluationIds: string[]) => {
-                const scenarioGroups = await Promise.all(evaluationIds.map(fetchAllEvaluationScenarios))
+                // Only use valid UUIDs to prevent SSRF via URL injection
+                const sanitizedEvaluationIds = evaluationIds.filter(isValidUUID)
+                if (sanitizedEvaluationIds.length !== evaluationIds.length) {
+                    // Optionally, you could throw an error here or log this event
+                    // throw new Error("Invalid evaluation IDs detected in request.");
+                    console.warn("Some provided evaluation IDs were invalid and have been ignored for security reasons.");
+                }
+                const scenarioGroups = await Promise.all(sanitizedEvaluationIds.map(fetchAllEvaluationScenarios))
                 const testset: TestSet = await fetchTestset(scenarioGroups[0][0].evaluation?.testset?.id)
                 const inputsNameSet = new Set<string>()

@@ -25,9 +25,14 @@
             export default function Evaluation() {
                 const router = useRouter()
                 const projectId = useAtomValue(projectIdAtom)
-                const evaluationTableId = router.query.evaluation_id
+                // Validator: check if string is a valid MongoDB ObjectId (24 hex chars)
+                function isValidObjectId(id) {
+                    return typeof id === "string" && /^[a-fA-F0-9]{24}$/.test(id);
+                }
+                const rawEvaluationId = router.query.evaluation_id
                     ? router.query.evaluation_id.toString()
-                    : ""
+                    : "";
+                const evaluationTableId = isValidObjectId(rawEvaluationId) ? rawEvaluationId : "";
                 const [evaluationScenarios, setEvaluationScenarios] = useAtom(evaluationScenariosAtom)
                 const [evaluation, setEvaluation] = useAtom(evaluationAtom)
                 const [isLoading, setIsLoading] = useState(true)

	joinedload(AppVariantRevisionsDB.variant.of_type(AppVariantDB))
	joinedload(AppVariantRevisionsDB.variant_revision.of_type(AppVariantDB))

-    return CreateAppOutput(app_id=str(app_db.id), app_name=str(app_db.app_name))
+    return CreateAppOutput(
+        app_id=str(app_db.id),
+        app_name=str(app_db.app_name),
+        app_type=None,
+        created_at=None,
+        updated_at=None,
+    )

-            pscs[k] = (pcts[v[1]] - pcts[v[0]]) / (pcts[v[1]] + pcts[v[0]])
+            denom = pcts[v[1]] + pcts[v[0]]
+            if denom == 0:
+                pscs[k] = 0.0
+            else:
+                pscs[k] = (pcts[v[1]] - pcts[v[0]]) / denom

[stale] [feature] Add snippets (using legacy apps) #2904

Are you sure you want to change the base?

[stale] [feature] Add snippets (using legacy apps) #2904

Uh oh!

Conversation

jp-agenta commented Nov 11, 2025

Uh oh!

Check failure

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Key Changes

Reviewed Changes

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

Uh oh!

CLAassistant commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants