Skip to content

Conversation

@codefromthecrypt
Copy link
Contributor

@codefromthecrypt codefromthecrypt commented Oct 29, 2025

What type of PR is this?
feat

What this PR does / why we need it:
Auto-generates TLS certificates for host mode when they don't exist, eliminating the manual envoy-gateway certgen --local step.

This also cleans up tech debt in other areas:

Which issue(s) this PR fixes:
Race found in envoyproxy/ai-gateway#1314

Release Notes: Yes

@codefromthecrypt
Copy link
Contributor Author

coverage again... I will see if I can refactor the code again

@codefromthecrypt codefromthecrypt marked this pull request as draft October 29, 2025 06:57
@codefromthecrypt codefromthecrypt force-pushed the auto-getn branch 2 times, most recently from d4ca389 to f3e0b9d Compare October 29, 2025 07:57
@codefromthecrypt
Copy link
Contributor Author

ok I reduced significantly the actual amount of envoy starts by using func-e's RunFunc for tests that don't need to really start one.

@codecov
Copy link

codecov bot commented Oct 29, 2025

Codecov Report

❌ Patch coverage is 85.45455% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.35%. Comparing base (f17b6df) to head (ab04695).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
internal/infrastructure/host/infra.go 72.41% 4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7362      +/-   ##
==========================================
+ Coverage   72.14%   72.35%   +0.20%     
==========================================
  Files         231      231              
  Lines       33859    33896      +37     
==========================================
+ Hits        24427    24524      +97     
+ Misses       7672     7612      -60     
  Partials     1760     1760              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@codefromthecrypt codefromthecrypt marked this pull request as ready for review October 29, 2025 08:55
Auto-generates TLS certificates for host mode when they don't exist, eliminating the manual 'envoy-gateway certgen --local' step. Also fixes a data race in proxy context map by using atomic LoadAndDelete operation.

Signed-off-by: Adrian Cole <[email protected]>
arkodg
arkodg previously approved these changes Oct 30, 2025
Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks

@arkodg arkodg requested review from a team October 30, 2025 00:30
Signed-off-by: Adrian Cole <[email protected]>
arkodg
arkodg previously approved these changes Oct 30, 2025
@arkodg arkodg requested review from a team October 30, 2025 01:07
@arkodg arkodg added this to the v1.6.0-rc.1 Release milestone Oct 30, 2025
@arkodg arkodg merged commit 689d266 into envoyproxy:main Oct 30, 2025
12 checks passed
@codefromthecrypt codefromthecrypt deleted the auto-getn branch October 30, 2025 02:33
@codefromthecrypt
Copy link
Contributor Author

thanks a lot for landing this, this is the last of the big cleanups for standalone I had in progress. folks can feel free to do more, but these were the glaring ones

Comment on lines 87 to +88
certPath := paths.CertDir("envoy")
if _, err := os.Lstat(certPath); err != nil {
return nil, fmt.Errorf("failed to stat cert dir: %w", err)
if err := maybeGenerateCertificates(cfg, certPath); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be wrong -- paths.CertDir("") instead of paths.CertDir("envoy"). let me confirm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure on how to fix but envoyproxy/ai-gateway#1470 tests are failing without something

@mathetake
Copy link
Member

mathetake commented Nov 1, 2025

@codefromthecrypt so after a bit of debugging I realized that this PR is not working as intended - since the certs gen happens in infra runner asynchronously against other systems like xds Service runner, so certs gen doesn't happen before xds tries to read the certs, which fails, hence the standalone mode always fails to start as I see in envoyproxy/ai-gateway#1470.

We need a change like this to make this work properly together with the dir name fix

diff --git a/internal/cmd/server.go b/internal/cmd/server.go
index 5498ddaa3..1efac234c 100644
--- a/internal/cmd/server.go
+++ b/internal/cmd/server.go
@@ -153,6 +153,15 @@ func startRunners(ctx context.Context, cfg *config.Server) (err error) {
                return err
        }
 
+       if err = startRunner(ctx, cfg, infrarunner.New(&infrarunner.Config{
+               Server:  *cfg,
+               InfraIR: channels.infraIR,
+       })); err != nil {
+               return err
+       }
+
+       // TODO: we need to wait for the infrastructure runner to be ready after it creates the required certs.
+
        runners := []struct {
                runner Runner
        }{
@@ -191,15 +200,6 @@ func startRunners(ctx context.Context, cfg *config.Server) (err error) {
                                ProviderResources: channels.pResources,
                        }),
                },
-               {
-                       // Start the Infra Manager Runner
-                       // It subscribes to the infraIR, translates it into Envoy Proxy infrastructure
-                       // resources such as K8s deployment and services.
-                       runner: infrarunner.New(&infrarunner.Config{
-                               Server:  *cfg,
-                               InfraIR: channels.infraIR,
-                       }),
-               },
                {
                        // Start the Admin Server
                        // It provides admin endpoints including pprof for debugging.
diff --git a/internal/infrastructure/host/infra.go b/internal/infrastructure/host/infra.go
index 039964744..eeeb5c997 100644
--- a/internal/infrastructure/host/infra.go
+++ b/internal/infrastructure/host/infra.go
@@ -84,13 +84,14 @@ func NewInfra(runnerCtx context.Context, cfg *config.Server, logger logging.Logg
        }
 
        // Check if certificates exist, generate them if not
-       certPath := paths.CertDir("envoy")
-       if err := maybeGenerateCertificates(cfg, certPath); err != nil {
+       certsDirRoot := paths.CertDir("envoy-gateway")
+       if err := maybeGenerateCertificates(cfg, certsDirRoot); err != nil {
                return nil, err
        }
 
        // Ensure the sds config exist
-       if err := createSdsConfig(certPath); err != nil {
+       sdsConfigPath := paths.CertDir("envoy")
+       if err := createSdsConfig(sdsConfigPath); err != nil {
                return nil, fmt.Errorf("failed to create sds config: %w", err)
        }
 
@@ -98,7 +99,7 @@ func NewInfra(runnerCtx context.Context, cfg *config.Server, logger logging.Logg
                Paths:             paths,
                Logger:            logger,
                EnvoyGateway:      cfg.EnvoyGateway,
-               sdsConfigPath:     certPath,
+               sdsConfigPath:     sdsConfigPath,
                defaultEnvoyImage: egv1a1.DefaultEnvoyProxyImage,
                Stdout:            cfg.Stdout,
                Stderr:            cfg.Stderr,
@@ -109,6 +110,9 @@ func NewInfra(runnerCtx context.Context, cfg *config.Server, logger logging.Logg
 
 // createSdsConfig creates the needing SDS config under certain directory.
 func createSdsConfig(dir string) error {
+       if err := os.MkdirAll(dir, 0o750); err != nil {
+               return fmt.Errorf("failed to create sds config directory: %w", err)
+       }
        if err := file.Write(common.GetSdsCAConfigMapData(
                filepath.Join(dir, XdsTLSCaFilename)),
                filepath.Join(dir, common.SdsCAFilename)); err != nil {

ekline-ai pushed a commit to ekline-ai/envoyproxy-gateway that referenced this pull request Nov 2, 2025
* feat(host): auto-generate certificates for host mode

Auto-generates TLS certificates for host mode when they don't exist, eliminating the manual 'envoy-gateway certgen --local' step. Also fixes a data race in proxy context map by using atomic LoadAndDelete operation.

Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: EkLine AI <[email protected]>
ADITYATIWARI342005 pushed a commit to ADITYATIWARI342005/envoyproxy_gateway that referenced this pull request Nov 3, 2025
* feat(host): auto-generate certificates for host mode

Auto-generates TLS certificates for host mode when they don't exist, eliminating the manual 'envoy-gateway certgen --local' step. Also fixes a data race in proxy context map by using atomic LoadAndDelete operation.

Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: ADITYA TIWARI <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants