Builder Playground Kubernetes Integration
Design Analysis and Implementation Report
This report summarizes our discussions, design analysis, and implementation approach for enabling Kubernetes support in builder-playground. It serves as a comprehensive guide to continue development of the Kubernetes operator.
1. Context and Background
Builder Playground is a tool to deploy end-to-end environments for blockchain builder testing. Currently, it uses Docker Compose for local deployments, but there's a need to support Kubernetes for more scalable and production-ready deployments.
Current Architecture Overview
Builder Playground follows a clear architecture.
Integration Goals
2. Design Approaches Considered
We analyzed two primary approaches for Kubernetes integration:
Approach 1: K8s Runner Implementation
This approach would add a new runner to the existing architecture:
Pros:
Cons:
Approach 2: Kubernetes Operator
This approach creates a dedicated Kubernetes operator that watches for custom resources describing a Builder Playground deployment:
Pros:
Cons:
Hybrid Approach (Selected)
After careful analysis, we chose a hybrid approach: rather than a full runner or a standalone operator workflow, builder-playground generates Kubernetes manifests that a future operator consumes. This approach enables users to opt in to manifest generation via the `--k8s` flag.
3. CRD Design
We designed a comprehensive CRD structure that captures all aspects of the builder-playground manifest:
Key design decisions:
Self-Contained Definition: All services are contained within a single CRD, similar to how docker-compose works, making the resource easier to understand and manage.
Template Preservation: The manifest preserves templating expressions like `{{.Dir}}` and `{{Service "el" "authrpc"}}` to be resolved by the operator.
Service Relationships: Dependencies between services are explicitly defined, allowing the operator to manage deployment order.
Storage Abstraction: Two storage options (local-path for development, PVC for production) provide flexibility for different environments.
No Status Field: The status field is omitted from the generated CRD as it should be managed by the Kubernetes controller.
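As a rough illustration of the shape this CRD takes, the following Go types sketch the schema in kubebuilder style. The type and field names here are assumptions for illustration, not the actual generated definitions:

```go
// builderplayground_types.go: illustrative sketch of the CRD schema as Go
// types (kubebuilder style). Names are hypothetical, not the shipped CRD.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// BuilderPlaygroundSpec is self-contained: every service lives in one CRD,
// mirroring the docker-compose model.
type BuilderPlaygroundSpec struct {
	// Storage selects "local-path" (development) or "pvc" (production).
	Storage string `json:"storage,omitempty"`
	// Services holds all services of the environment in a single resource.
	Services []PlaygroundService `json:"services"`
}

type PlaygroundService struct {
	Name  string `json:"name"`
	Image string `json:"image"`
	// Args may contain unresolved template expressions such as
	// {{.Dir}} or {{Service "el" "authrpc"}}; the operator resolves them.
	Args []string `json:"args,omitempty"`
	// DependsOn lists services that must be ready first.
	DependsOn []string `json:"dependsOn,omitempty"`
	// UseHostExecution marks services meant to run on the host machine.
	UseHostExecution bool `json:"useHostExecution,omitempty"`
}

// BuilderPlayground deliberately omits a Status field in the generated
// manifest; status is owned by the Kubernetes controller at runtime.
type BuilderPlayground struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              BuilderPlaygroundSpec `json:"spec,omitempty"`
}
```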
4. Implementation Details
Integration Approach
The implementation adds a `--k8s` flag to the existing `cook` command that triggers generation of a Kubernetes manifest alongside the regular artifacts.
command that triggers generation of a Kubernetes manifest alongside the regular artifacts:Key Components
Error Handling
The implementation includes proper validation and error handling:
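The concrete checks aren't reproduced here, but a validation pass before the manifest is written might look roughly like this; the `Manifest` and `Service` stand-ins are assumptions about the real types:

```go
package operator

import "fmt"

// Minimal stand-ins for the real manifest types (names assumed).
type Service struct {
	Name, Image string
	DependsOn   []string
}

type Manifest struct{ Services []Service }

// validateForK8s sketches the kind of checks performed before writing the
// Kubernetes manifest: required fields, unique names, resolvable deps.
func validateForK8s(m *Manifest) error {
	seen := map[string]bool{}
	for _, svc := range m.Services {
		if svc.Image == "" {
			return fmt.Errorf("service %q: missing image", svc.Name)
		}
		if seen[svc.Name] {
			return fmt.Errorf("duplicate service name %q", svc.Name)
		}
		seen[svc.Name] = true
	}
	// Reject dependencies on services that don't exist in the manifest.
	for _, svc := range m.Services {
		for _, dep := range svc.DependsOn {
			if !seen[dep] {
				return fmt.Errorf("service %q depends on unknown service %q",
					svc.Name, dep)
			}
		}
	}
	return nil
}
```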
5. Current Limitations and Future Development
Current Limitations
Our current CRD implementation doesn't yet address several important aspects:
Host Execution: There's currently no concrete solution for running services on the host machine when deploying to Kubernetes. The CRD includes a `useHostExecution` flag, but implementing this in Kubernetes requires careful consideration.
Complex Networking: The current design doesn't fully address complex networking scenarios like exposing specific ports to external clients or handling cross-service communication with specific protocols.
Resource Limits: The CRD doesn't yet include configurations for CPU/memory limits and requests.
Security Considerations: JWT tokens and other sensitive information aren't properly handled through Kubernetes secrets yet.
Node Affinity and Placement: For testing/staging deployments, node selection and affinity rules would be needed.
Decision to Prioritize Core Functionality
We've deliberately chosen to focus first on covering the core functionality.
This approach allows us to get a working implementation more quickly, with a plan to address the more complex aspects in future iterations. The current implementation provides a solid foundation that captures the essential structure of builder-playground environments, while leaving room for enhancements.
6. Operator Implementation Guidelines
For developing the Kubernetes operator that will consume these CRDs, consider the following key aspects in the context of our single-pod approach:
1. Template Resolution in Single-Pod Context
Template resolution becomes more straightforward in a single-pod architecture:
`{{.Dir}}` resolves to a fixed artifacts path (e.g., `/artifacts`).
`{{Service "el" "authrpc"}}` resolves to `localhost:8551` (direct container-to-container communication).
`{{Port "http" 8545}}` resolves to the port itself, since all containers share the pod's network namespace.
This simplifies the implementation as services can use localhost networking rather than requiring Kubernetes DNS for service discovery.
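A sketch of how such a resolver could be built on Go's text/template, assuming port information is available from the CRD; `resolveArg` and its inputs are illustrative names, not the operator's actual API:

```go
package operator

import (
	"fmt"
	"strings"
	"text/template"
)

// resolveArg resolves manifest template expressions for the single-pod case.
// Because all containers share the pod's network namespace, Service lookups
// collapse to localhost; port numbers are assumed to come from the CRD.
func resolveArg(arg string, ports map[string]map[string]int) (string, error) {
	funcs := template.FuncMap{
		// {{Service "el" "authrpc"}} -> localhost:<port>
		"Service": func(svc, portName string) string {
			return fmt.Sprintf("localhost:%d", ports[svc][portName])
		},
		// {{Port "http" 8545}} -> the port itself; containers bind directly.
		"Port": func(name string, p int) int { return p },
	}
	tmpl, err := template.New("arg").Funcs(funcs).Parse(arg)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	// .Dir resolves to a fixed artifacts path mounted into every container.
	data := struct{ Dir string }{Dir: "/artifacts"}
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```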
2. Resource Creation Strategy
Given our single-pod approach, the operator should create a small, fixed set of resources: essentially one multi-container pod plus its supporting storage objects.
This simplifies resource management compared to a multi-pod approach while maintaining the relationship structure between services.
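As one possible shape for this, the operator could fold every service into a container of a single pod; the `svc` struct below is a stand-in for the real CRD service type, and the mount path is illustrative:

```go
package operator

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// svc is a stand-in for the CRD's service type.
type svc struct {
	Name, Image string
	Args        []string
}

// buildPod maps every service in the CRD onto one container of a single Pod.
func buildPod(name string, services []svc) *corev1.Pod {
	containers := make([]corev1.Container, 0, len(services))
	for _, s := range services {
		containers = append(containers, corev1.Container{
			Name:  s.Name,
			Image: s.Image,
			Args:  s.Args,
			// Every container mounts the shared artifacts volume.
			VolumeMounts: []corev1.VolumeMount{
				{Name: "artifacts", MountPath: "/artifacts"},
			},
		})
	}
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: corev1.PodSpec{
			Containers: containers,
			Volumes: []corev1.Volume{{
				Name: "artifacts",
				// The volume source is chosen from the CRD's storage option;
				// see the storage sketch in section 5 below.
				VolumeSource: corev1.VolumeSource{
					EmptyDir: &corev1.EmptyDirVolumeSource{},
				},
			}},
		},
	}
}
```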
3. Dependency Management
With the single-pod approach, dependency management becomes primarily an initialization concern: dependencies determine the order in which containers are started and become ready.
This is simpler than managing dependencies across multiple independent deployments, as all containers are guaranteed to run on the same node and can communicate directly.
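One way to derive that initialization order is a topological sort over the dependsOn edges, which the operator could use for container ordering and for wrapping entrypoints in wait loops; this is a sketch of one option, not the decided mechanism:

```go
package operator

import "fmt"

// topoSort orders services so that dependencies come before dependents.
// deps maps a service name to the names it depends on.
func topoSort(deps map[string][]string) ([]string, error) {
	const (
		white = iota // unvisited
		gray         // on the current DFS path
		black        // done
	)
	state := map[string]int{}
	var order []string
	var visit func(string) error
	visit = func(n string) error {
		switch state[n] {
		case gray:
			return fmt.Errorf("dependency cycle through %q", n)
		case black:
			return nil
		}
		state[n] = gray
		for _, d := range deps[n] {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[n] = black
		order = append(order, n) // deps are already appended
		return nil
	}
	for n := range deps {
		if err := visit(n); err != nil {
			return nil, err
		}
	}
	return order, nil
}
```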
4. Readiness Checks
Transform the readyCheck definitions into Kubernetes readiness probes:
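A hedged sketch of that translation, with the `ReadyCheck` fields assumed rather than taken from the actual manifest schema, and the probe timings as placeholder defaults:

```go
package operator

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// ReadyCheck is an assumed shape for the manifest's readyCheck definition.
type ReadyCheck struct {
	Path string // e.g. "/health"
	Port int    // 0 means no check defined
}

// toReadinessProbe converts a readyCheck into a Kubernetes readiness probe.
func toReadinessProbe(rc ReadyCheck) *corev1.Probe {
	if rc.Port == 0 {
		return nil
	}
	return &corev1.Probe{
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Path: rc.Path,
				Port: intstr.FromInt(rc.Port),
			},
		},
		// Placeholder timings; the real values should come from the manifest.
		InitialDelaySeconds: 5,
		PeriodSeconds:       5,
		FailureThreshold:    12,
	}
}
```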
5. Storage Implementation
With the single-pod approach, storage becomes simpler:
`local-path`: Use hostPath volumes for development environments.
`pvc`: Use a single PersistentVolumeClaim with ReadWriteOnce access mode for production.
This eliminates the need for ReadWriteMany storage classes that would be required in a multi-pod architecture.
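A small sketch of how the operator might select the volume source from the storage option; the hostPath location is illustrative, not a path the project defines:

```go
package operator

import corev1 "k8s.io/api/core/v1"

// artifactsVolume picks the pod's artifacts volume source from the CRD
// storage option. ReadWriteOnce suffices because all containers share one
// pod on one node.
func artifactsVolume(storage, claimName string) corev1.Volume {
	v := corev1.Volume{Name: "artifacts"}
	switch storage {
	case "local-path":
		hostPathType := corev1.HostPathDirectoryOrCreate
		v.VolumeSource = corev1.VolumeSource{
			HostPath: &corev1.HostPathVolumeSource{
				Path: "/var/lib/builder-playground", // illustrative path
				Type: &hostPathType,
			},
		}
	case "pvc":
		v.VolumeSource = corev1.VolumeSource{
			PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
				ClaimName: claimName,
			},
		}
	}
	return v
}
```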
6. Host Execution in Single-Pod Context
For services with `useHostExecution: true`, the single-pod approach presents challenges. This remains one of the more challenging aspects to solve and may require further architectural discussions.
7. Testing Recommendations
When testing the operator, focus on:
Local Development Environments:
Recipe Compatibility:
Storage Options:
Template Processing:
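For template processing in particular, table-driven unit tests keep regressions visible. This sketch assumes the `resolveArg` helper from the earlier template-resolution example:

```go
package operator

import "testing"

// TestResolveArg exercises the template resolver against the single-pod
// resolution rules described in section 6.
func TestResolveArg(t *testing.T) {
	ports := map[string]map[string]int{"el": {"authrpc": 8551}}
	cases := []struct{ in, want string }{
		{`{{.Dir}}/genesis.json`, `/artifacts/genesis.json`},
		{`{{Service "el" "authrpc"}}`, `localhost:8551`},
	}
	for _, c := range cases {
		got, err := resolveArg(c.in, ports)
		if err != nil {
			t.Fatalf("resolveArg(%q): %v", c.in, err)
		}
		if got != c.want {
			t.Errorf("resolveArg(%q) = %q, want %q", c.in, got, c.want)
		}
	}
}
```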
8. Conclusion and Next Steps
The implemented Kubernetes manifest generator provides a solid foundation for building a complete Kubernetes operator for Builder Playground. By generating CRDs that preserve all the necessary information about services and their relationships, we enable a clear path forward for Kubernetes integration.
Current Focus: Matching the Docker Compose Workflow
It's important to emphasize that our current goal is to match the existing workflow with `--dry-run` and docker-compose.yml, not to build a fully automated deployment system yet. The current approach matches the developer workflow already familiar to builder-playground users, where the dry-run enables inspection and customization before deployment.
Future Deployment Options
For the longer term, we've considered two main approaches for the operator deployment:
Standalone Operator: Deploy the operator separately, then submit BuilderPlayground manifests. The operator would deploy a one-shot builder-playground container that generates the final service manifests for the operator to apply.
Integrated Operator: Include the operator code directly in the builder-playground binary, with an option to run it as the operator itself. This would create a more integrated solution but add complexity to the codebase.
Either approach would eventually enable a more automated workflow, but they are intentionally not part of the current implementation.
Next Steps
Develop Operator Skeleton:
Implement Template Handling:
Resource Management:
Testing:
This hybrid approach maintains the simplicity of builder-playground while enabling Kubernetes deployment, providing a flexible solution that supports both development and production use cases, with a clear path for evolution toward more automated deployment in the future.