Skip to content

chore: add scripts for metadata and shasum validation#6869

Merged
robertolopezlopez merged 1 commit into
mainfrom
feat/CLI-1550
Jun 4, 2026
Merged

chore: add scripts for metadata and shasum validation#6869
robertolopezlopez merged 1 commit into
mainfrom
feat/CLI-1550

Conversation

@robertolopezlopez
Copy link
Copy Markdown
Contributor

@robertolopezlopez robertolopezlopez commented Jun 1, 2026

User description

This PR

  • extracts script logic out of deployment-monitor.yml to separate scripts
  • extracts previous embedded python scripts to separate Go programs

Both changes aim for readability and sustainability.

In detail:

  • .github/scripts/deployment-monitor/capture-linux-metadata.sh
    • Captures version and sha256 metadata for Linux-based distribution channels.
    • Logic shared by monitor_npm and monitor_snyk_images jobs.
  • .github/scripts/deployment-monitor/capture-homebrew-metadata.sh
    • Captures version and sha256 metadata for the Homebrew distribution channel (macOS).
    • Used by monitor_homebrew job.
  • .github/scripts/deployment-monitor/compare-snyk-versions.sh
    • Compares collected Snyk version artifacts for stable/preview consistency.
    • Used by compare_versions job.
  • .github/scripts/deployment-monitor/helpers/*
    • Utility scripts extracted from those listed above.
  • .github/scripts/deployment-monitor/cmd/*
    • Small Go programs which replace the previous embedded Python code.
  • .github/workflows/deployment-monitor.yml
    • Skimmed of previous testing logic.

Pull Request Submission Checklist

  • Follows CONTRIBUTING guidelines
  • Commit messages
    are release-note ready, emphasizing
    what was changed, not how.
  • Includes detailed description of changes
  • Contains risk assessment (Low | Medium | High)
  • Highlights breaking API changes (if applicable)
  • Links to automated tests covering new functionality
  • Includes manual testing instructions (if necessary)
  • Updates relevant GitBook documentation (PR link: ___)
  • Includes product update to be announced in the next stable release notes

What does this PR do?

Where should the reviewer start?

How should this be manually tested?

What's the product update that needs to be communicated to CLI users?


PR Type

Enhancement


Description

  • Introduce new scripts for Snyk CLI deployment monitoring.
    • Automate validation of release artifact SHA256 checksums.
    • Ensure version consistency across different distribution channels.
    • Refactor CI workflow to leverage new modular scripts.

Diagram Walkthrough

    flowchart LR
      subgraph "New Scripts"
        A[compare-cdn-shasums.go]
        B[extract-release-json-hash.go]
        C[is-elf-binary.go]
        D[capture-homebrew-metadata.sh]
        E[capture-linux-metadata.sh]
        F[compare-snyk-versions.sh]
        G[helpers]
      end
      subgraph "Existing Workflow"
        H[deployment-monitor.yml]
      end
      H -- "Orchestrates and calls" --> A
      H -- "Orchestrates and calls" --> B
      H -- "Orchestrates and calls" --> C
      H -- "Orchestrates and calls" --> D
      H -- "Orchestrates and calls" --> E
      H -- "Orchestrates and calls" --> F
      H -- "Orchestrates and calls" --> G
      D -- "Uses helper" --> G
      E -- "Uses helper" --> G
      F -- "Uses helper" --> G
      A -- "Uses tool" --> C
      E -- "Uses tool" --> C
      D -- "Uses tool" --> B
Loading

File Walkthrough

Relevant files

Deployment Tests > Run workflow > this branch

image

Successful test

@snyk-io
Copy link
Copy Markdown

snyk-io Bot commented Jun 1, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

/describe

@snyk-pr-review-bot
Copy link
Copy Markdown

PR Description updated to latest commit (c6c758c)

@robertolopezlopez robertolopezlopez force-pushed the feat/CLI-1550 branch 2 times, most recently from ca659fb to b69c16f Compare June 1, 2026 15:52
@robertolopezlopez robertolopezlopez marked this pull request as ready for review June 1, 2026 15:53
@robertolopezlopez robertolopezlopez requested a review from a team as a code owner June 1, 2026 15:53
@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@robertolopezlopez robertolopezlopez changed the title chore: add deployment monitoring scripts for metadata and shasum vali… chore: add deployment monitoring scripts for metadata and shasum validation Jun 2, 2026
Comment on lines +1 to +9
#!/usr/bin/env bash
# Prints the Linux binary asset name (snyk-linux or snyk-linux-arm64) to stdout.

ARCH="$(uname -m)"
if [ "$ARCH" = "arm64" ] || [ "$ARCH" = "aarch64" ]; then
echo "snyk-linux-arm64"
else
echo "snyk-linux"
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think that these small helpers could be together and we just call them as functions somehow, so we avoid scattering in multiple files. But in general it is ok. Maybe we could see opinion of other people about this refactoring

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I prefer to have them as separate files, because

  1. there is no need to source the whole file along with all its functions
  2. separation of concerns: one bash script -> one single function
  3. we avoid "utils.sh" jumbles

Thanks for your opinion though. Maybe others think in a different way than me.

Comment thread .github/scripts/deployment-monitor/cmd/is-elf-binary/main.go
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactoring looks better when looking at the high level file, nice improvement!

Copy link
Copy Markdown
Contributor

@danskmt danskmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if the comments from the bot are relevant:

Image

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

Please check if the comments from the bot are relevant:

Yes, I was looking at this when I got dragged by other task. I am working on it

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

robertolopezlopez commented Jun 3, 2026

Please check if the comments from the bot are relevant:

Image
  1. logic error: addressed in last commit.
  2. fragile globbing: will do.
  3. unsafe JSON generation: idem.

and from the lastest pr bot review:

  1. logic error: addressed with latest commit
  2. operational risk (go): go run is preceded by actions/setup-go@v5
  3. concurrency conflict: not relevant as github actions assigns its own runner + clean workspace to each monitor_* job.

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Brittle Hash Lookup 🟠 [major]
The awk command uses a regex ^[a-f0-9]{64}$ to identify lines containing hashes. If the sha256sums.txt.asc file contains any metadata lines or comments that don't match this exact pattern at the start of the line, they are skipped. Furthermore, it assumes the binary name is always the second field. While consistent with sha256sum output, the dependency on the * prefix being present (via gsub(/^\*/, "", $2)) will cause the lookup to fail if the manifest was generated without the binary flag (e.g., via shasum on some systems).

awk -v name="$BINARY_NAME" '$1 ~ /^[a-f0-9]{64}$/ {gsub(/^\*/, "", $2); if ($2==name) {print $1; exit}}' "$MANIFEST_FILE"

Version Prefix Mismatch 🟡 [minor]
The script retrieves the version using snyk --version (e.g., '1.123.0') and then constructs the URL using v${VERSION}. If snyk --version already includes a 'v' prefix (which is common in some release builds or environment-specific wrappers), the resulting URL will contain vv1.123.0, causing a 404 and failing the monitor. scripts/install-snyk.py handles this with explicit prefix checking (line 72), which this script lacks.

RELEASE_URL="https://static.snyk.io/cli/v${VERSION}/release.json"
HTTP_STATUS="$(
  curl --retry 2 -sSL "$RELEASE_URL" -w '%{http_code}' -o "$RELEASE_JSON"
)"
if [ "$HTTP_STATUS" != "200" ]; then

Silent Metadata Omission 🟡 [minor]
In main.go, the loop over metadata files continues silently if data.SHA256 is empty. This prevents the comparison logic from running, but also fails to alert if one of the monitored channels failed to produce a hash. This could lead to a 'Pass' state even if half the distribution channels are broken and reporting empty hashes.

if data.SHA256 == "" {
	continue
}

📚 Repository Context Analyzed

This review considered 30 relevant code sections from 14 files (average relevance: 1.00)

🤖 Repository instructions applied (from AGENTS.md)

I consider all those fixed - this is never ending

https://github.com/snyk/cli/actions/runs/26886238795

@robertolopezlopez robertolopezlopez changed the title chore: add deployment monitoring scripts for metadata and shasum validation chore: add scripts for metadata and shasum validation Jun 3, 2026
@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

False Positive Risk 🟠 [major]
The script groups metadata entries for comparison using a key that only includes channel and binary, but omits the version. If the SNYK_VERSION_DIR directory contains stale metadata files from previous workflow runs with different versions—a common scenario on persistent CI runners—the script will detect multiple distinct SHAs for the same binary/channel and exit with a failure. The script should either purge the directory before use or include the version field in the cdnKey to ensure it only compares hashes within the same release version.

	if key.channel == "" {
		key.channel = "unknown"
	}
	if key.binary == "" {
		key.binary = "unknown"
	}

	cdnEntries[key] = append(cdnEntries[key], cdnEntry{
		source:   entrySource(data),
		sha256:   data.SHA256,
		filename: filename,
	})
}

if len(cdnEntries) == 0 {
	fmt.Println("No metadata entries with shasums found; skipping shasum comparison.")
	return
}

keys := make([]cdnKey, 0, len(cdnEntries))
for key := range cdnEntries {
	keys = append(keys, key)
}
sort.Slice(keys, func(i, j int) bool {
	if keys[i].channel != keys[j].channel {
		return keys[i].channel < keys[j].channel
	}
	return keys[i].binary < keys[j].binary
})

failed := false
for _, key := range keys {
	entriesForKey := cdnEntries[key]
	uniqueSHAs := make(map[string]struct{})
	for _, entry := range entriesForKey {
		uniqueSHAs[entry.sha256] = struct{}{}
	}

	if len(entriesForKey) < 2 {
		continue
	}

	if len(uniqueSHAs) > 1 {
		failed = true
		fmt.Printf("❌ Shasum mismatch for %s (%s)\n", key.binary, key.channel)
		for _, entry := range entriesForKey {
			fmt.Printf("  %s: %s (%s)\n", entry.source, entry.sha256, entry.filename)
		}
	}
}

Fragile Globbing 🟡 [minor]
The script uses a broad *.txt glob to identify version files. This will cause the script to attempt to parse any text file present in the SNYK_VERSION_DIR (such as build logs or documentation) as a Snyk version. If an unrelated file is empty, the script exits with code 3; if it contains non-version text, it will trigger false version consistency failures. The glob should be narrowed to snyk-version-*.txt to match the output pattern defined in write-metadata.sh.

for file in *.txt; do
  job_name=$(basename "$file" .txt)
  version=$(cat "$file" | tr -d '\r\n')

Portability Issue 🟡 [minor]
The script relies on the shasum command to calculate the binary's hash. While available on GitHub-hosted runners, shasum (part of Perl) is often absent from minimal or hardened Linux container images (like Alpine or Debian slim) frequently used for CI runners, whereas sha256sum (coreutils) is more universally present. Using a Go-based hasher or adding a fallback to sha256sum would improve operational reliability across different runner environments.

SHA256="$(shasum -a 256 "$SNYK_PATH" | awk '{print $1}')"

📚 Repository Context Analyzed

This review considered 30 relevant code sections from 14 files (average relevance: 1.00)

🤖 Repository instructions applied (from AGENTS.md)

False Positive Risk 🟠 [major] the environment (SNYK_VERSION_DIR + github.run_id) + version comparison step already cover the CI case. Passing on.
Fragile globbing 🟡 [minor] each monitor_* job just creates one txt file (snyk-version-<suffix>.txt) + a json. Passing on.
Portability Issue 🟡 [minor] not an issue here: CI runners execute ubuntu-latest

@robertolopezlopez robertolopezlopez deleted the feat/CLI-1550 branch June 4, 2026 07:09
@robertolopezlopez robertolopezlopez restored the feat/CLI-1550 branch June 4, 2026 08:11
@snyk-pr-review-bot
Copy link
Copy Markdown

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Brittle Version Comparison 🟠 [major]

The script uses ls *.txt to check for files, but the subsequent loop for file in *.txt does not check if any files were actually matched. If no .txt files exist, the loop will literally execute once with the filename string *.txt, causing cat to fail. While line 21 tries to guard this, the standard way to handle empty globs in bash is shopt -s nullglob.

txt_files=$(ls *.txt 2>/dev/null || true)
if [ -z "$txt_files" ]; then
  echo "❌ No .txt files found in ${SNYK_VERSION_DIR}. Version comparison cannot proceed."
  exit 2
Hardcoded External Dependency 🟡 [minor]

The script hardcodes a URL to raw.githubusercontent.com to fetch the GPG key if it's not found locally. This introduces a dependency on the external availability of the GitHub Raw service and the main branch state during the monitoring job, which could lead to transient failures in the CI pipeline if GitHub is rate-limiting or the file is moved.

curl --retry 2 -sSL \
  "https://raw.githubusercontent.com/snyk/cli/main/help/_about-this-project/snyk-code-signing-public.pgp" \
  -o "$TEMP_KEY"
Incomplete Error Reporting 🟡 [minor]

In main.go, when a metadata file is empty or contains invalid JSON, the program prints an error and exits with 1 (lines 97-110). However, it does not specify which 'channel' or 'binary' it was processing in the error message, making it harder to debug failures in a CI log that contains multiple distribution sources.

if trimmed == "" {
	_, _ = fmt.Printf("Metadata file %s is empty.\n", filename)
	os.Exit(1)
}

var data metadata
if err := json.Unmarshal([]byte(trimmed), &data); err != nil {
	snippet := strings.ReplaceAll(trimmed, "\n", "\\n")
	if len(snippet) > 200 {
		snippet = snippet[:200]
	}
	fmt.Printf("Invalid JSON in %s: %v\n", filename, err)
	fmt.Printf("Content snippet: %s\n", snippet)
	os.Exit(1)
}
📚 Repository Context Analyzed

This review considered 30 relevant code sections from 13 files (average relevance: 1.00)

🤖 Repository instructions applied (from AGENTS.md)

@robertolopezlopez
Copy link
Copy Markdown
Contributor Author

robertolopezlopez commented Jun 4, 2026

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Brittle Version Comparison 🟠 [major]
The script uses ls *.txt to check for files, but the subsequent loop for file in *.txt does not check if any files were actually matched. If no .txt files exist, the loop will literally execute once with the filename string *.txt, causing cat to fail. While line 21 tries to guard this, the standard way to handle empty globs in bash is shopt -s nullglob.

txt_files=$(ls *.txt 2>/dev/null || true)
if [ -z "$txt_files" ]; then
  echo "❌ No .txt files found in ${SNYK_VERSION_DIR}. Version comparison cannot proceed."
  exit 2

Hardcoded External Dependency 🟡 [minor]
The script hardcodes a URL to raw.githubusercontent.com to fetch the GPG key if it's not found locally. This introduces a dependency on the external availability of the GitHub Raw service and the main branch state during the monitoring job, which could lead to transient failures in the CI pipeline if GitHub is rate-limiting or the file is moved.

curl --retry 2 -sSL \
  "https://raw.githubusercontent.com/snyk/cli/main/help/_about-this-project/snyk-code-signing-public.pgp" \
  -o "$TEMP_KEY"

Incomplete Error Reporting 🟡 [minor]
In main.go, when a metadata file is empty or contains invalid JSON, the program prints an error and exits with 1 (lines 97-110). However, it does not specify which 'channel' or 'binary' it was processing in the error message, making it harder to debug failures in a CI log that contains multiple distribution sources.

if trimmed == "" {
	_, _ = fmt.Printf("Metadata file %s is empty.\n", filename)
	os.Exit(1)
}

var data metadata
if err := json.Unmarshal([]byte(trimmed), &data); err != nil {
	snippet := strings.ReplaceAll(trimmed, "\n", "\\n")
	if len(snippet) > 200 {
		snippet = snippet[:200]
	}
	fmt.Printf("Invalid JSON in %s: %v\n", filename, err)
	fmt.Printf("Content snippet: %s\n", snippet)
	os.Exit(1)
}

📚 Repository Context Analyzed

This review considered 30 relevant code sections from 13 files (average relevance: 1.00)

🤖 Repository instructions applied (from AGENTS.md)

Brittle Version Comparison:

  • ls *.txt + if [ -z "$txt_files" ] really works. 2>/dev/null || true protects from any ls error
  • line 22 is the safeguard for the for file in *.txt alleged "brittle version comparison". If there are no *.txt to iterate -> exit 2
  • shopt -s nullglob does not add any safeguard; on the contrary, we still need if [ -z "$txt_files" ] if we do not want to skip the for loop and continue the script
  • thus I DO NOT think that shopt brings any value here

Hardcoded External Dependency: What if github is down? The pipeline would not run on the first place

Incomplete Error Reporting:

  • According to deployment-monitor.yaml:65-74 (monitor_cdn), the json already contains the OS, base url and channel. I can say the same about the other monitor jobs in deployment-monitor.yaml and the script write-metadata.sh which is used through capture-linux-metadata.sh in monitor_npm and monitor_snyk_images.
image image image
  • main.go:108: we log the content fmt.Printf("Content snippet: %s\n", snippet)

Copy link
Copy Markdown
Contributor

@danskmt danskmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a high-level point of view (the scripts for capture) they look good. For the other scripts that are in different files, I would think of merging them to a file with separate functions that can be called separately. But this is just personal preference.

Maybe you want to take someone else's opinion, but I believe is good to go

@robertolopezlopez robertolopezlopez merged commit 85b496c into main Jun 4, 2026
25 checks passed
@robertolopezlopez robertolopezlopez deleted the feat/CLI-1550 branch June 4, 2026 11:54
j-luong pushed a commit that referenced this pull request Jun 4, 2026
chore: add scripts for metadata and shasum validation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants