Add Kubernetes manifest generator #120

Draft · wants to merge 1 commit into main

Conversation

fnerdman

Builder Playground Kubernetes Integration

Design Analysis and Implementation Report

This report summarizes our discussions, design analysis, and implementation approach for enabling Kubernetes support in builder-playground. It serves as a comprehensive guide to continue development of the Kubernetes operator.

1. Context and Background

Builder Playground is a tool to deploy end-to-end environments for blockchain builder testing. Currently, it uses Docker Compose for local deployments, but there's a need to support Kubernetes for more scalable and production-ready deployments.

Current Architecture Overview

Builder Playground follows a clear architecture:

  • Recipes: Define environments to deploy (L1, OpStack, BuilderNet)
  • Artifacts: Files and configurations needed by services
  • Manifest: Intermediate representation of services and their relationships
  • Runner: Responsible for executing the manifest (currently Docker Compose)

Integration Goals

  1. Enable Kubernetes deployment option for builder-playground
  2. Maintain the simple developer experience
  3. Support both local development and testing/staging environments
  4. Allow for customization before deployment
  5. Provide a path toward a Kubernetes operator

2. Design Approaches Considered

We analyzed two primary approaches for Kubernetes integration:

Approach 1: K8s Runner Implementation

This approach would add a new runner to the existing architecture:

// Register the kubernetes runner
factory.Register("kubernetes", func(out *output, manifest *Manifest, overrides map[string]string, interactive bool) (Runner, error) {
    namespace := "default"
    if ns, ok := overrides["namespace"]; ok {
        namespace = ns
        delete(overrides, "namespace")
    }
    return NewK8sRunner(out, manifest, namespace)
})

Pros:

  • Tightly integrated with builder-playground
  • Extends existing architecture
  • Full control over resource lifecycle
  • Easier debugging (single process)

Cons:

  • Less Kubernetes-native
  • More complex logic to track resource status
  • Places burden of k8s resource management on CLI
  • More complex error handling

Approach 2: Kubernetes Operator

This approach creates a dedicated Kubernetes operator that watches for custom resources describing a Builder Playground deployment:

apiVersion: playground.flashbots.io/v1alpha1
kind: BuilderPlayground
metadata:
  name: demo-builder
spec:
  latestFork: true
  genesisDelay: 30
  dryRun: false
  storageMethod: local-path
  # other configuration...

Pros:

  • More Kubernetes-native (follows best practices)
  • Better scalability for future features
  • Cleaner separation of concerns
  • Better for production deployments

Cons:

  • More complex implementation (requires operator)
  • Requires managing two codebases
  • Debugging may be more difficult across system boundaries
  • Higher learning curve for contributors

Hybrid Approach (Selected)

After careful analysis, we chose a hybrid approach, focusing on:

  1. Creating a manifest generator that outputs Kubernetes CRDs
  2. Designing a CRD structure that preserves the builder-playground service model
  3. Setting the stage for a future operator implementation

This approach enables users to:

  • Generate Kubernetes manifests with --k8s flag
  • Optionally inspect and modify the manifests
  • Apply them to Kubernetes with standard tools
  • Eventually use a purpose-built operator for enhanced functionality

3. CRD Design

We designed a comprehensive CRD structure that captures all aspects of the builder-playground manifest:

apiVersion: playground.flashbots.io/v1alpha1
kind: BuilderPlaygroundDeployment
metadata:
  name: l1-environment
spec:
  # Recipe information
  recipe: l1
  
  # Storage configuration
  storage:
    type: local-path
    path: /data/builder-playground
  
  # Services configuration
  services:
    - name: beacon
      image: sigp/lighthouse
      tag: v7.0.0-beta.0
      entrypoint: ["lighthouse"]
      args:
        - "bn"
        - "--datadir"
        - "{{.Dir}}/data_beacon_node"
        # ...more args
      ports:
        - name: http
          containerPort: 3500
      dependencies:
        - name: el
          condition: healthy

Key design decisions:

  1. Self-Contained Definition: All services are contained within a single CRD, similar to how docker-compose works, making the resource easier to understand and manage.

  2. Template Preservation: The manifest preserves templating expressions like {{.Dir}} and {{Service "el" "authrpc"}} to be resolved by the operator.

  3. Service Relationships: Dependencies between services are explicitly defined, allowing the operator to manage deployment order.

  4. Storage Abstraction: Two storage options (local-path for development, PVC for production) provide flexibility for different environments.

  5. No Status Field: The status field is omitted from the generated CRD as it should be managed by the Kubernetes controller.

4. Implementation Details

Integration Approach

The implementation adds a --k8s flag to the existing cook command that triggers generation of a Kubernetes manifest alongside the regular artifacts:

$ playground cook l1 --dry-run --k8s --output ./output --storage-type local-path

Key Components

  1. K8sGenerator struct:
type K8sGenerator struct {
    Manifest     *Manifest
    RecipeName   string
    StorageType  string
    StoragePath  string
    StorageClass string
    StorageSize  string
    NetworkName  string
    OutputDir    string
}
  2. Strongly Typed CRD Structures (a sketch of the nested types follows this list):
type BuilderPlaygroundDeployment struct {
    APIVersion string                    `yaml:"apiVersion"`
    Kind       string                    `yaml:"kind"`
    Metadata   BuilderPlaygroundMetadata `yaml:"metadata"`
    Spec       BuilderPlaygroundSpec     `yaml:"spec"`
}
  3. Service Conversion Logic:
func convertServiceToK8s(svc *service) (BuilderPlaygroundService, error) {
    // Validate required fields
    if svc.image == "" {
        return BuilderPlaygroundService{}, fmt.Errorf("service %s missing required image", svc.Name)
    }
    
    k8sService := BuilderPlaygroundService{
        Name:  svc.Name,
        Image: svc.image,
        Tag:   svc.tag,
    }
    
    // Convert other fields...
    
    return k8sService, nil
}
  4. Label Filtering with Map:
var internalLabels = map[string]bool{
    "service":            true,
    "playground":         true,
    "playground.session": true,
}

// Copy only user-facing labels; internal playground labels are dropped.
serviceLabels := map[string]string{}
for k, v := range labels {
    if !internalLabels[k] {
        serviceLabels[k] = v
    }
}
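
As referenced in item 2, here is a hedged sketch of the nested types behind BuilderPlaygroundSpec and BuilderPlaygroundService, with field names inferred from the example CRD in section 3; the exact shapes may differ in the final implementation:

type BuilderPlaygroundMetadata struct {
    Name string `yaml:"name"`
}

type BuilderPlaygroundSpec struct {
    Recipe   string                     `yaml:"recipe"`
    Storage  BuilderPlaygroundStorage   `yaml:"storage"`
    Services []BuilderPlaygroundService `yaml:"services"`
}

type BuilderPlaygroundStorage struct {
    Type string `yaml:"type"`           // "local-path" or "pvc"
    Path string `yaml:"path,omitempty"` // used with local-path
}

type BuilderPlaygroundService struct {
    Name         string                        `yaml:"name"`
    Image        string                        `yaml:"image"`
    Tag          string                        `yaml:"tag,omitempty"`
    Entrypoint   []string                      `yaml:"entrypoint,omitempty"`
    Args         []string                      `yaml:"args,omitempty"`
    Ports        []BuilderPlaygroundPort       `yaml:"ports,omitempty"`
    Dependencies []BuilderPlaygroundDependency `yaml:"dependencies,omitempty"`
}

type BuilderPlaygroundPort struct {
    Name          string `yaml:"name"`
    ContainerPort int    `yaml:"containerPort"`
}

type BuilderPlaygroundDependency struct {
    Name      string `yaml:"name"`
    Condition string `yaml:"condition,omitempty"` // e.g. "healthy"
}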

Error Handling

The implementation includes proper validation and error handling:

// Validate required fields
if svc.image == "" {
    return BuilderPlaygroundService{}, fmt.Errorf("service %s missing required image", svc.Name)
}

// Propagate errors with context
k8sService, err := convertServiceToK8s(svc)
if err != nil {
    return crd, fmt.Errorf("failed to convert service %s: %w", svc.Name, err)
}

5. Current Limitations and Future Development

Current Limitations

Our current CRD implementation doesn't yet address several important aspects:

  1. Host Execution: There's currently no concrete solution for running services on the host machine when deploying to Kubernetes. The CRD includes a useHostExecution flag, but implementing this in Kubernetes requires careful consideration. Possible approaches include:

    • Using DaemonSets with privileged containers
    • Running services outside of Kubernetes with network exposure
    • Using tools like Kubevirt for virtualization
  2. Complex Networking: The current design doesn't fully address complex networking scenarios like exposing specific ports to external clients or handling cross-service communication with specific protocols.

  3. Resource Limits: The CRD doesn't yet include configurations for CPU/memory limits and requests.

  4. Security Considerations: JWT tokens and other sensitive information aren't properly handled through Kubernetes secrets yet.

  5. Node Affinity and Placement: For testing/staging deployments, node selection and affinity rules would be needed.

Decision to Prioritize Core Functionality

We've deliberately chosen to focus first on covering the core functionality:

  • Service definitions and relationships
  • Storage configuration
  • Basic networking
  • Service dependencies

This approach allows us to get a working implementation more quickly, with a plan to address the more complex aspects in future iterations. The current implementation provides a solid foundation that captures the essential structure of builder-playground environments, while leaving room for enhancements.

6. Operator Implementation Guidelines

For developing the Kubernetes operator that will consume these CRDs, consider the following key aspects in the context of our single-pod approach:

1. Template Resolution in Single-Pod Context

Template resolution becomes more straightforward in a single-pod architecture:

  • {{.Dir}} → the common volume mount path (e.g., /artifacts)
  • {{Service "el" "authrpc"}} → localhost:8551 (direct container-to-container communication)
  • {{Port "http" 8545}} → the actual port number as defined in the container

This simplifies the implementation as services can use localhost networking rather than requiring Kubernetes DNS for service discovery.
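
As an illustration, resolution could be built on Go's text/template package, since the expressions follow its syntax. This is a minimal sketch; the port lookup table and the /artifacts mount path are assumptions, not part of the current implementation:

import (
	"fmt"
	"strings"
	"text/template"
)

// resolveTemplate renders a single builder-playground template expression for
// the single-pod case: {{.Dir}} becomes the shared mount path, {{Service ...}}
// becomes localhost:<port>, and {{Port ...}} keeps the declared port number.
func resolveTemplate(arg string, ports map[string]int) (string, error) {
	funcs := template.FuncMap{
		"Service": func(name, portName string) string {
			// e.g. Service "el" "authrpc" -> "localhost:8551" (assumed lookup table)
			return fmt.Sprintf("localhost:%d", ports[name+"/"+portName])
		},
		"Port": func(name string, def int) int {
			// e.g. Port "http" 8545 -> 8545
			return def
		},
	}
	tmpl, err := template.New("arg").Funcs(funcs).Parse(arg)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	data := struct{ Dir string }{Dir: "/artifacts"} // assumed shared mount path
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}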

2. Resource Creation Strategy

Given our single-pod approach, the operator should create:

  • One Pod with Multiple Containers: Each service becomes a container within the same pod
  • Services: For network access to exposed ports
  • ConfigMap/Secret: For shared configuration files
  • Single PVC: For persistent storage (based on storage configuration)

This simplifies resource management compared to a multi-pod approach while maintaining the relationship structure between services.
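
A rough sketch of how the operator might assemble that pod from the CRD, using the upstream k8s.io/api/core/v1 types; the shared mount path is an assumption, and buildVolumeSource is sketched in the storage section below:

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildPod turns every BuilderPlaygroundService into a container of a single
// pod, all sharing one artifacts volume mounted at the same path.
func buildPod(crd BuilderPlaygroundDeployment) corev1.Pod {
	containers := make([]corev1.Container, 0, len(crd.Spec.Services))
	for _, svc := range crd.Spec.Services {
		image := svc.Image
		if svc.Tag != "" {
			image += ":" + svc.Tag
		}
		containers = append(containers, corev1.Container{
			Name:    svc.Name,
			Image:   image,
			Command: svc.Entrypoint,
			Args:    svc.Args, // template expressions resolved beforehand
			VolumeMounts: []corev1.VolumeMount{
				{Name: "artifacts", MountPath: "/artifacts"}, // assumed shared path
			},
		})
	}
	return corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: crd.Metadata.Name},
		Spec: corev1.PodSpec{
			Containers: containers,
			Volumes: []corev1.Volume{
				// buildVolumeSource: see the storage sketch below
				{Name: "artifacts", VolumeSource: buildVolumeSource(crd.Spec.Storage)},
			},
		},
	}
}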

3. Dependency Management

With the single-pod approach, dependency management becomes primarily an initialization concern:

  • Init Containers: Can be used to ensure services start in the correct order
  • Readiness Probes: Services can wait for dependencies to be ready
  • Shared Volume: All containers have access to the same storage, simplifying file-based dependencies

This is simpler than managing dependencies across multiple independent deployments, as all containers are guaranteed to run on the same node and can communicate directly.
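
One way to get in-pod ordering is Kubernetes' sidecar-container mechanism (init containers with restartPolicy: Always, available since Kubernetes 1.28): sidecars start in declaration order and later containers only start once their startup probes pass. A hedged sketch, assuming that mechanism and illustrative probe values:

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// asSidecar converts a dependency service into a sidecar-style init container:
// it keeps running for the pod's lifetime, and containers declared after it
// only start once its startup probe succeeds.
func asSidecar(c corev1.Container, probePort int, probePath string) corev1.Container {
	always := corev1.ContainerRestartPolicyAlways
	c.RestartPolicy = &always // requires Kubernetes 1.28+ (SidecarContainers)
	c.StartupProbe = &corev1.Probe{
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Path: probePath,
				Port: intstr.FromInt(probePort),
			},
		},
		PeriodSeconds:    1,
		FailureThreshold: 60, // give the dependency up to ~60s to come up
	}
	return c
}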

4. Readiness Checks

Transform the readyCheck definitions into Kubernetes readiness probes:

# From CRD
readyCheck:
  queryURL: http://localhost:3500/eth/v1/node/syncing
  interval: 1s
  timeout: 30s

# To Kubernetes
readinessProbe:
  httpGet:
    path: /eth/v1/node/syncing
    port: 3500
  periodSeconds: 1
  timeoutSeconds: 30
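
A sketch of that conversion in Go, using the upstream corev1 types; the readyCheck fields mirror the snippet above, and the URL-parsing details are an assumption:

import (
	"net/url"
	"strconv"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// toReadinessProbe maps a CRD readyCheck onto a Kubernetes readiness probe.
func toReadinessProbe(queryURL string, interval, timeout time.Duration) (*corev1.Probe, error) {
	u, err := url.Parse(queryURL)
	if err != nil {
		return nil, err
	}
	port, err := strconv.Atoi(u.Port())
	if err != nil {
		return nil, err
	}
	return &corev1.Probe{
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Path: u.Path,               // e.g. /eth/v1/node/syncing
				Port: intstr.FromInt(port), // e.g. 3500
			},
		},
		PeriodSeconds:  int32(interval / time.Second), // interval: 1s -> 1
		TimeoutSeconds: int32(timeout / time.Second),  // timeout: 30s -> 30
	}, nil
}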

5. Storage Implementation

With the single-pod approach, storage becomes simpler:

  • Single Shared Volume: All containers in the pod mount the same volume
  • Common Access Path: Each container can access the same files at the same path
  • Storage Options:
    • local-path: Use hostPath volumes for development environments
    • pvc: Use a single PersistentVolumeClaim with ReadWriteOnce access mode for production

This eliminates the need for ReadWriteMany storage classes that would be required in a multi-pod architecture.
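
A sketch of how the operator could map the two storage options onto the pod's shared volume (used by buildPod above); the hostPath type and PVC name are illustrative assumptions:

import (
	corev1 "k8s.io/api/core/v1"
)

// buildVolumeSource picks the backing volume for the shared artifacts mount:
// a hostPath directory for local development, or a single RWO PVC otherwise.
func buildVolumeSource(storage BuilderPlaygroundStorage) corev1.VolumeSource {
	if storage.Type == "local-path" {
		hostPathType := corev1.HostPathDirectoryOrCreate
		return corev1.VolumeSource{
			HostPath: &corev1.HostPathVolumeSource{
				Path: storage.Path, // e.g. /data/builder-playground
				Type: &hostPathType,
			},
		}
	}
	// "pvc": a single ReadWriteOnce claim suffices because all containers
	// run in the same pod (no ReadWriteMany needed).
	return corev1.VolumeSource{
		PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
			ClaimName: "builder-playground-artifacts", // assumed claim name
		},
	}
}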

6. Host Execution in Single-Pod Context

For services with useHostExecution: true, the single-pod approach presents challenges:

  • Container vs. Host: By definition, these services need to run outside the pod
  • Potential Solutions:
    • A coordinating controller that runs some services as pods and others as external processes
    • A hybrid deployment where the operator manages both Kubernetes resources and external processes
    • Using a privileged container with access to the host's process namespace

This remains one of the more challenging aspects to solve and may require further architectural discussions.
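
For reference, the third option can be expressed directly in the pod spec. This is a hedged sketch rather than a recommendation, since host PID/network access plus a privileged container has significant security implications:

import (
	corev1 "k8s.io/api/core/v1"
)

// hostExecutionPodSpec shares the node's PID and network namespaces and runs
// the service in a privileged container, approximating host execution.
func hostExecutionPodSpec(c corev1.Container) corev1.PodSpec {
	privileged := true
	c.SecurityContext = &corev1.SecurityContext{Privileged: &privileged}
	return corev1.PodSpec{
		HostPID:     true, // see the host's processes
		HostNetwork: true, // bind directly to the host's network interfaces
		Containers:  []corev1.Container{c},
	}
}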

7. Testing Recommendations

When testing the operator, focus on:

  1. Local Development Environments:

    • Test with minikube, k3s, and kind
    • Verify all services start in the correct order
    • Check template resolution works correctly
  2. Recipe Compatibility:

    • Test all recipes (l1, opstack, buildernet)
    • Verify service interactions work as expected
  3. Storage Options:

    • Test local-path for development environments
    • Test PVC for testing/staging environments
  4. Template Processing:

    • Test resolution of all template expressions
    • Verify service discovery works correctly

8. Conclusion and Next Steps

The implemented Kubernetes manifest generator provides a solid foundation for building a complete Kubernetes operator for Builder Playground. By generating CRDs that preserve all the necessary information about services and their relationships, we enable a clear path forward for Kubernetes integration.

Current Focus: Matching the Docker Compose Workflow

It's important to emphasize that our current goal is to match the existing workflow with --dry-run and docker-compose.yml, not to build a fully automated deployment system yet. The current approach allows users to:

  1. Generate Kubernetes manifests
  2. Inspect and potentially modify them
  3. Manually deploy to Kubernetes when ready

This matches the developer workflow already familiar to builder-playground users, where the dry-run enables inspection and customization before deployment.

Future Deployment Options

For the longer term, we've considered two main approaches for the operator deployment:

  1. Standalone Operator: Deploy the operator separately, then submit BuilderPlayground manifests. The operator would deploy a one-shot builder-playground container that generates the final service manifests for the operator to apply.

  2. Integrated Operator: Include the operator code directly in the builder-playground binary, with an option to run it as the operator itself. This would create a more integrated solution but add complexity to the codebase.

Either approach would eventually enable a more automated workflow, but they are intentionally not part of the current implementation.

Next Steps

  1. Develop Operator Skeleton (see the controller-runtime sketch after this list):

    • Register the CRD type
    • Create basic controller structure
    • Define reconciliation logic
  2. Implement Template Handling:

    • Create a template processor for builder-playground templates
    • Test with different service configurations
  3. Resource Management:

    • Implement single pod with multiple containers
    • Use init containers for proper ordering
    • Manage shared PVC for storage
  4. Testing:

    • Begin with simple recipes and progressively test more complex ones
    • Verify all features work as with docker-compose
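
As referenced in step 1, a minimal controller-runtime skeleton could look like the following; scheme wiring and API-group registration are omitted, and it assumes the CRD type is (re)generated as a client.Object (e.g. with kubebuilder):

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// BuilderPlaygroundReconciler reconciles BuilderPlaygroundDeployment resources
// into a pod, services, config maps and a PVC as described above.
type BuilderPlaygroundReconciler struct {
	client.Client
}

func (r *BuilderPlaygroundReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// 1. Fetch the BuilderPlaygroundDeployment named in req.
	// 2. Resolve template expressions in service args.
	// 3. Create or update the pod, services, config maps and PVC.
	// 4. Requeue if dependencies are not yet ready.
	return ctrl.Result{}, nil
}

func (r *BuilderPlaygroundReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&BuilderPlaygroundDeployment{}). // requires a kubebuilder-style API type
		Complete(r)
}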

This hybrid approach maintains the simplicity of builder-playground while enabling Kubernetes deployment, providing a flexible solution that supports both development and production use cases, with a clear path for evolution toward more automated deployment in the future.

- Created k8s_generator.go with Kubernetes manifest generation logic
- Added CLI flags for Kubernetes manifest generation
- Integrated manifest generation into recipe workflow
- Added CRD definition to examples folder

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@@ -173,6 +178,11 @@ func main() {
recipeCmd.Flags().BoolVar(&bindExternal, "bind-external", false, "bind host ports to external interface")
recipeCmd.Flags().BoolVar(&withPrometheus, "with-prometheus", false, "whether to gather the Prometheus metrics")
recipeCmd.Flags().StringVar(&networkName, "network", "", "network name")
recipeCmd.Flags().BoolVar(&k8sFlag, "k8s", false, "Generate Kubernetes manifests")
Collaborator

These were the things I wanted to avoid for the integration. K8s must be a separate lifecycle from the normal playground. Playground should generate an artifacts folder which gets consumed by the k8s generator in a second step.

Author

I'm not sure I follow. The idea of the --k8s flag is to use it only with dry run, i.e.

$ playground cook l1 --dry-run --k8s --output ./output --storage-type local-path

What the flag does is create a k8s-manifest.yaml, which will then be consumed by the k8s generator/operator.
We need to convert the internal playground manifest to the k8s manifest. We shouldn't work with the docker-compose.yml as that has already abstracted away some of the core information needed.

Author

Oh I see, just saw your manifest.json patch. You would like the k8s impl to use this manifest.json and not include the k8s manifest logic in playground at all?
