Skip to content

Conversation

@MrTolerant
Copy link

@MrTolerant MrTolerant commented Oct 21, 2025

Summary

Fix a critical issue where RedisProxy fails to handle classic ArgoCD applications (token-based) in mixed environments with agent-managed applications. Classic app Redis keys without agent prefix now route correctly to principal Redis instead of causing connection errors.
It allows to migrate to agents step by step, cluster by cluster and work in mixed environments: classic ArgoCD clusters + Agents.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description

Problem

The Redis proxy strictly required all keys to have an agent name prefix (agentName_). This broke classic ArgoCD applications which generate keys without any prefix, causing:

  • ❌ "unexpected lack of '_' namespace/name separator" errors
  • ❌ PUB/SUB connections dropping
  • ❌ Resource requests failing for classic applications
  • ❌ Mixed environments becoming unusable

Root Cause

extractAgentNameFromRedisCommandKey() function in principal/redisproxy/redisproxy.go threw an error when unable to find underscore separator, assuming all keys would be agent-prefixed.

Solution

Modified the function to gracefully handle keys without prefix by:

  1. Detecting keys without underscore in namespace field
  2. Returning empty string (instead of error) for non-agent-prefixed keys
  3. Existing routing logic already supports this via if agentName != "" check

Example

Before (broken):

Classic key: app|managed-resources|my-app|1.8.3
     ↓
extractAgentNameFromRedisCommandKey()
     ↓
Error: "unexpected lack of '_'"
     ↓
Connection drops ❌

After (fixed):

Classic key: app|managed-resources|my-app|1.8.3
     ↓
extractAgentNameFromRedisCommandKey()
     ↓
Returns: "" (empty)
     ↓
Routes to principal Redis ✅

Testing

  • Tests pass locally (21 tests in redisproxy package)
  • Code compiles without errors
  • No linting issues
  • Multi-platform Docker builds successfully (amd64 + arm64)

Test Results Summary

✅ Test_extractAgentNameFromRedisCommandKey: 8 sub-tests pass
✅ handleInternalNotify: 4 sub-tests pass
✅ handleAgentGet: 4 sub-tests pass
✅ handleAgentSubscribe: 1 sub-test pass

Files Changed

  1. principal/redisproxy/redisproxy.go (lines 817-821)

    • Changed error handling to return empty string for keys without prefix
    • Added debug log for routing clarity
  2. principal/redisproxy/redisproxy_test.go (lines 54-63)

    • Updated test cases for classic app behavior
    • Changed errorExpected from true to false for non-prefixed keys

Routing Behavior After Fix

Application Type Redis Key Example Route
Agent-managed app|managed-resources|agent-name_app|1.8.3 → Remote Agent (gRPC)
Classic app|managed-resources|app|1.8.3 → Principal Redis ✅ NEW
Non-managed mfst|... or cluster|... → Principal Redis

Impact

✅ What Works Now

  • Classic ArgoCD applications with token authentication
  • Agent-managed applications (unchanged)
  • Mixed environments with both types
  • Resource viewing for classic apps
  • PUB/SUB for both application types

✅ Backward Compatibility

  • No breaking changes
  • All existing tests pass
  • Safe to deploy to production

Verification Checklist

  • Code follows style guidelines
  • Self-reviewed the changes
  • Comments added for complex logic
  • Tests pass locally
  • No new linting errors
  • No external dependencies added
  • Changes are backward compatible

Related Issues

  • Fixes: Mixed environment incompatibility
  • Fixes: Classic app connection drops
  • Enables: Gradual migration path from classic to agent-managed

Additional Notes for Reviewers

Why This Approach?

  • Minimal - Only touches error case
  • Safe - Leverages existing routing logic
  • Efficient - No new processing overhead
  • Clear - Debug logs show routing decisions

Edge Cases

  • ✅ Keys with agent prefix → Route to agent (unchanged)
  • ✅ Keys without prefix → Route to principal (fixed)
  • ✅ Special keys (mfst, cluster) → Route to principal (unchanged)

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of non-agent-managed applications: keys without agent prefixes now route directly to Redis instead of triggering errors, enabling better support for classic application configurations.
  • Chores

    • Updated build configuration to ignore local shell script files matching the local*.sh pattern.

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 45.91%. Comparing base (74dd148) to head (7011c20).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #623   +/-   ##
=======================================
  Coverage   45.90%   45.91%           
=======================================
  Files          90       90           
  Lines       12103    12104    +1     
=======================================
+ Hits         5556     5557    +1     
  Misses       6101     6101           
  Partials      446      446           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…efix

## Problem
The redisproxy implementation strictly required all Redis keys to start with
an agent name prefix in the format 'agentName_'. This worked for agent-managed
applications but broke for classic ArgoCD applications (using token authentication)
which generate Redis keys without any prefix.

When a key without the prefix arrived at redisproxy, the extractAgentNameFromRedisCommandKey
function would fail with 'unexpected lack of "_" namespace/name separator' error,
causing PUB/SUB connections to drop and resource requests for classic applications
to fail completely.

## Solution
Modified extractAgentNameFromRedisCommandKey function to gracefully handle Redis keys
without the agent prefix by:

1. Detecting keys without underscore separator in the namespace field
2. Returning empty string (instead of error) for non-agent-prefixed keys
3. This allows these keys to be routed to the principal Redis instead of failing

The routing logic in handleConnectionMessageLoop already supports this via the
'if agentName != ""' check, which will forward local keys to principal Redis.

## Changes Made

### principal/redisproxy/redisproxy.go
- Modified extractAgentNameFromRedisCommandKey() function
- Changed error case for missing underscore to return empty string with debug log
- Updated comments to clarify behavior for classic vs agent-managed applications

### principal/redisproxy/redisproxy_test.go
- Updated test cases to reflect new behavior
- Changed 'errorExpected: true' to 'errorExpected: false' for classic app keys
- Updated test descriptions to clarify that keys without prefix are classic apps

## Testing
✅ All existing tests pass (21 tests in redisproxy package)
✅ Code compiles without errors
✅ Multi-platform Docker image builds successfully (linux/amd64 and linux/arm64)

## Impact
- ✅ Agent-managed applications continue to work as before
- ✅ Classic ArgoCD applications now work correctly
- ✅ Mixed environments with both types of applications are now supported
- ✅ No breaking changes to existing functionality

## Example Routing
- Agent-managed key: 'app|managed-resources|agent-managed_my-app|1.8.3'
  → Routed to agent via gRPC
- Classic app key: 'app|managed-resources|my-app|1.8.3'
  → Routed to principal Redis (control plane)

Signed-off-by: Petr Lebedev <[email protected]>
@MrTolerant MrTolerant force-pushed the fix/redisproxy-classic-app-support branch from 7011c20 to b1a112f Compare October 22, 2025 09:40
@coderabbitai
Copy link

coderabbitai bot commented Nov 10, 2025

Walkthrough

The pull request modifies Redis command key parsing to treat namespace-only keys as classic (non-agent-managed) applications, returning empty agent name instead of an error. Additionally, a .gitignore rule is added to exclude local shell scripts.

Changes

Cohort / File(s) Summary
Git configuration
.gitignore
Adds ignore pattern local*.sh to exclude local shell scripts from version control.
Redis proxy key parsing
principal/redisproxy/redisproxy.go, principal/redisproxy/redisproxy_test.go
Changes extractAgentNameFromRedisCommandKey to treat keys lacking an underscore as classic applications, returning ("", nil) with a debug log instead of an error. Corresponding test cases updated to expect no error for namespace-only keys.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Parser as extractAgentNameFromRedisCommandKey
    participant Logger
    participant Router as Redis Router
    
    Client->>Parser: Redis command key (namespace only)
    
    rect rgb(220, 240, 255)
    Note over Parser: OLD: Return error
    end
    
    rect rgb(240, 255, 240)
    Note over Parser: NEW: Log & route to classic
    Parser->>Logger: Log debug message
    Logger-->>Parser: ✓
    Parser->>Router: Return ("", nil)
    Router->>Router: Route as classic app
    end
    
    Router-->>Client: Forward to principal Redis
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Key attention areas:
    • Verify that treating namespace-only keys as classic applications is the intended behavior and doesn't break existing routing logic
    • Ensure the debug logging provides sufficient visibility for troubleshooting agent vs. classic app routing decisions
    • Confirm all related test cases adequately cover the new code path

Suggested reviewers

  • jgwest
  • mikeshng
  • jannfis

Poem

🐰 Hop into the Redis pool with grace,
Old errors fade without a trace,
Classic apps now find their way,
Through debug logs, bright as day—
Local shells stay out of sight,
The parsing path is smooth and light! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: enabling support for classic ArgoCD applications without agent prefix in the RedisProxy module, which is the core fix addressed in the PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8979a32 and 0d25065.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • principal/redisproxy/redisproxy.go (1 hunks)
  • principal/redisproxy/redisproxy_test.go (1 hunks)
🔇 Additional comments (3)
.gitignore (1)

9-9: LGTM: Standard gitignore pattern for local scripts.

This addition appropriately excludes local development shell scripts from version control.

principal/redisproxy/redisproxy_test.go (1)

54-64: LGTM: Test expectations correctly updated.

The test cases now properly reflect the new behavior where namespace-only keys (classic apps without agent prefix) return an empty string without error, rather than raising an error. The descriptive test names clearly communicate the scenario being tested.

principal/redisproxy/redisproxy.go (1)

817-822: I need to verify the official Kubernetes naming constraints to confirm whether the underscore assumption is valid.

The underscore delimiter assumption is correct and aligns with Kubernetes design.

Kubernetes deliberately chose underscore as a delimiter because it is not allowed in DNS subdomains, and namespace names must contain only lowercase letters, numbers, and hyphens. The code's use of _ as a delimiter to separate namespace/name from agent prefixes is mathematically sound and follows established Kubernetes precedent.

The debug-level logging at line 820 is appropriate for this routing decision—it provides optional visibility without spamming logs in production environments where this check succeeds silently for agent-managed applications (the common case).


Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants