testrunner/runners/system: fall back to packageRoot in CreatePackagePolicy when built tree is unavailable#3573
testrunner/runners/system: fall back to packageRoot in CreatePackagePolicy when built tree is unavailable#3573kcreddy wants to merge 3 commits into
Conversation
…able ReadBuiltPackageManifest assumed the built package directory always exists and is reachable from a Git repository. This fails when packageRoot is an EPR-extracted package (add_package_policy -version) because the built tree contains a different version, or when running outside a Git repository where no build/ ancestor exists. When BuildPackagesDirectory fails or the resolved path has no manifest, fall back to reading directly from packageRoot. This restores correct stream population in CreatePackagePolicy: without the fallback, ReadAllDataStreamManifests silently returned nil for the non-existent built path, producing an empty Streams map that Fleet rejected because required variables had no defaults. Fixes elastic#3552
| // Built tree unavailable or incomplete; fall back to packageRoot. | ||
| pkg, err := packages.ReadPackageManifestFromPackageRoot(packageRoot) |
There was a problem hiding this comment.
This is error prone, a caller could silently fallback to the source manifest when they are interested on the built one.
The caller should take the responsibility of falling back to the source manifest if it thinks it is accurate enough.
If the caller is interested only in fields that appear on both kinds of manifests (such as the name and version), and is not concerned about other fields that could change in build time they they could directly use the package root.
But in general a caller that is interested on the built files, such as most testers, should ensure first that they have a fully built package.
There was a problem hiding this comment.
Good point — the silent fallback in ReadBuiltPackageManifest is too broad and would hide real build problems for callers that genuinely need the built tree.
I've moved the fallback into CreatePackagePolicy, which is the only caller that receives EPR-extracted roots (via addPackagePolicy -version). ReadBuiltPackageManifest is now strict again — it fails if the built tree doesn't exist or contains a different version.
The reasoning: an EPR package downloaded from the registry is already fully resolved (no package: references), so reading manifests directly from packageRoot is accurate enough to build a correct policy. Callers that need composable resolution (the normal system test path, fleetpolicy.go, etc.) still get a hard error if the build is missing.
The fallback is logged at debug level so it's visible in -v output but doesn't mask problems silently.
Commit: 0901250
There was a problem hiding this comment.
CreatePackagePolicy must use a built manifest, we neither can silently fall back to the source manifest there.
Actually after re-reading #3552 I wonder if the problem is really related to use the built or the source manifest.
The failures are about required variables:
Package policy is invalid:
inputs.cel.streams.o365.audit.vars.azure_tenant_id: Directory (tenant) ID is required
inputs.cel.streams.o365.audit.vars.client_id: Application (client) ID is required
inputs.cel.streams.o365.audit.vars.client_secret: Client Secret is required
In #3307 we introduced the use of the newer "simplified" format of the Fleet policy API. This API is stricter regarding validation of required fields.
If in this case these fields are really required and not set, the test should be fixed to set the required variables.
If not possible, fall back to use the legacy API. This can be done in system and policy sets setting the policy_api_format: legacy parameter. Not sure what would be the best way to do this in script tests.
There was a problem hiding this comment.
The required variables are set in the test config (test_config.yaml lines 65-74 has: client_id, client_secret, azure_tenant_id set). I believe the problem isn't missing vars or the simplified API being stricter — it's that the vars never reach Fleet at all.
The chain:
ReadBuiltPackageManifest(eprRoot)resolves tobuild/packages/o365/3.8.1/— which doesn't exist (only3.8.2is built).ReadAllDataStreamManifests(builtRoot)returnsnilbecause that directory has nodata_stream/.buildStreamsForInputiterates over emptydatastreams, producing an emptyStreamsmap.json:"streams,omitempty"dropsStreamsfrom the request.- Fleet receives
enabled: truewith no streams, auto-creates them with no user vars, and rejects because the required fields have no defaults.
The legacy API would have the same issue — no streams means no vars get sent regardless of format.
Regarding "must use a built manifest": this seems true for composable integrations where package: references need resolution. But add_package_policy -version passes an EPR-extracted package downloaded from the registry. It's already fully resolved — there are no package: references. The EPR manifest and the built manifest are identical for non-composable packages. The built tree simply doesn't exist for the EPR version because only the dev version is built locally.
The alternative would be to also build (or extract) the EPR version into the build/ tree, but that's a much larger change for a path that doesn't need it.
Let me know what you think.
There was a problem hiding this comment.
…t in CreatePackagePolicy when built tree is unavailable ReadBuiltPackageManifest assumed the built package directory always exists and is reachable from a Git repository. This fails when packageRoot is an EPR-extracted package (add_package_policy -version) because the built tree contains a different version, or when running outside a Git repository where no build/ ancestor exists. Move the fallback into CreatePackagePolicy rather than ReadBuiltPackageManifest so the shared function stays strict — callers that genuinely need the built tree (composable integrations) still get a hard error if the build is missing. CreatePackagePolicy is the only caller that receives EPR-extracted roots; an EPR package is already fully resolved (no package: references), so reading manifests directly from packageRoot is accurate enough to build a correct policy. The fallback is logged at debug level so it appears in verbose output without masking real build failures. Fixes elastic#3552
💚 Build Succeeded
History
|
|
@kcreddy i am still figuring out what is the error and why this changes. perhaps is obvious but i am not understanding the following: why the error is reported at the script test runner when using certain flag, and the changes are about system test runner? ref #3552 is the flag also used for system? thanks |
@teresaromero So the bug surfaces via the script test command |
Uh oh!
There was an error while loading. Please reload this page.