This document provides technical details about the infrastructure implementation, OrbStack integration, and inventory management in the Splunk Platform Automator.
The system processes Ansible inventory in multiple stages:
- Initial YAML Configuration:

```yaml
# Example configuration (idx_sh_uf_orbstack.yml)
plugin: splunk-platform-automator
orbstack:
  image: alma:9          # Base image for all machines
  ansible_user: root     # Default user for Ansible connections
splunk_hosts:
  - name: idx1           # Machine name (used for DNS)
    roles:               # Splunk roles determine machine purpose
      - indexer          # This machine will be a Splunk indexer
```
Key Points:
- `plugin`: Identifies this as a Splunk Platform Automator config
- `orbstack`: Provider-specific settings that apply to all machines
- `splunk_hosts`: List of machines with their roles and configurations
- Core Processing:

```python
def _populate(self):
    # Process each configuration section (general, custom, os, etc.)
    for section in ['general', 'custom', 'os', 'splunk_dirs', ...]:
        # Initialize empty defaults if section not found
        if section not in self.defaults:
            self.defaults[section] = {}
        # If section exists in config, merge with defaults
        if isinstance(self.configfiles.get(section), dict):
            merged_section = self._merge_dict(self.defaults[section], self.configfiles[section])
```
Understanding the Code:
- Iterates through predefined configuration sections
- Handles missing sections by initializing empty defaults
- Merges user configuration with defaults when present
- Ensures all required settings have values
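The defaults merge can be sketched as a recursive dict merge in which user-supplied values win over defaults. This is an illustration of the pattern, not the project's actual `_merge_dict`:

```python
def merge_dict(defaults, overrides):
    """Recursively merge overrides into defaults, returning a new dict.

    Values from overrides win; nested dicts are merged rather than replaced.
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_dict(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"os": {"time_zone": "UTC", "packages": ["curl"]}}
user = {"os": {"time_zone": "Europe/Berlin"}}
print(merge_dict(defaults, user))
# {'os': {'time_zone': 'Europe/Berlin', 'packages': ['curl']}}
```

Note that keys absent from the user config (here `packages`) survive the merge, which is what "ensures all required settings have values" means in practice.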
The inventory plugin was modified to support OrbStack as a new provider:
- Plugin Registration:

```python
DOCUMENTATION = r'''
    # Added OrbStack support to plugin options
    orbstack:
        description: orbstack settings
        type: dict
        required: false
'''

class InventoryModule(BaseInventoryPlugin):
    NAME = 'splunk-platform-automator'
```
- Provider Detection:

```python
def _set_virtualization(self, splunk_config):
    '''Detect provider from config file'''
    supported_virtualizations = ['virtualbox', 'aws', 'orbstack']
    for virtualization in supported_virtualizations:
        if virtualization in splunk_config:
            setattr(self, 'virtualization', virtualization)
```
- Configuration Processing:

```python
def _process_orbstack_configs(self):
    """
    Process OrbStack-specific variables with precedence:
      1. Host-specific settings
      2. Global OrbStack settings
      3. Default settings
    """
    # Start with defaults
    default_config = self.defaults.get('orbstack', {})
    global_config = self.configfiles.get('orbstack', {})
    for hostname in self.inventory.hosts:
        # Layer configurations: defaults < global < host-specific
        config = default_config.copy()
        config.update(global_config)
        host_vars = self.inventory.get_host(hostname).vars
        host_config = host_vars.get('orbstack', {})
        config.update(host_config)
        # Set connection variables
        ansible_user = config.get('ansible_user', 'ansible')
        self.inventory.set_variable(hostname, 'ansible_host',
                                    f"{ansible_user}@{hostname}@orb")
        self.inventory.set_variable(hostname, 'ansible_ssh_user',
                                    f'{ansible_user}@{hostname}')
```
```yaml
# Global OrbStack settings
orbstack:
  image: alma:9          # Default image for all machines
  ansible_user: root     # Default user for connections
splunk_hosts:
  - name: idx1
    roles:
      - indexer
    orbstack:            # Host-specific override
      image: rocky:9     # Overrides global image
```
Configuration Layers:
- Defaults provide baseline settings
- Global settings apply to all OrbStack machines
- Host-specific overrides allow customization
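The precedence of these layers can be shown with plain dict updates, the same copy-then-update pattern the plugin uses (the values here are hypothetical):

```python
defaults = {"image": "alma:9", "ansible_user": "ansible"}  # baseline settings
global_cfg = {"ansible_user": "root"}   # from the top-level orbstack: block
host_cfg = {"image": "rocky:9"}         # from the host's orbstack: override

config = defaults.copy()
config.update(global_cfg)   # global settings override defaults
config.update(host_cfg)     # host settings override everything
print(config)
# {'image': 'rocky:9', 'ansible_user': 'root'}
```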
Connection Variables:
- `ansible_host`: `{user}@{hostname}@orb`
  - Required format for OrbStack's SSH implementation
  - Enables direct container access
- `ansible_ssh_user`: `{user}@{hostname}`
  - Ensures correct user context
  - Maintains Ansible compatibility
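Assembling these two variables is plain string formatting; a minimal sketch (the helper name is made up):

```python
def orbstack_connection_vars(hostname, ansible_user="ansible"):
    """Build the Ansible connection variables for one OrbStack machine."""
    return {
        "ansible_host": f"{ansible_user}@{hostname}@orb",
        "ansible_ssh_user": f"{ansible_user}@{hostname}",
    }

print(orbstack_connection_vars("idx1", "root"))
# {'ansible_host': 'root@idx1@orb', 'ansible_ssh_user': 'root@idx1'}
```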
Integration Points:
- Inventory plugin detects OrbStack configuration
- Processes settings before Terraform reads inventory
- Ensures consistent host configuration
The project uses a clear environment separation:

```
terraform/
├── environments/
│   ├── dev/                 # Development environment (OrbStack)
│   │   ├── main.tf          # Environment-specific configuration
│   │   └── variables.tf
│   └── prod/                # Production environment (AWS - planned)
```
Currently focused on the development environment:
```hcl
# dev/main.tf
terraform {
  required_version = ">= 0.13"
}

module "platform" {
  source           = "../../modules/platform"
  environment_name = terraform.workspace
  inventory_file   = "${path.module}/../../../config/inventory_output.yml"
  provider_config  = var.provider_config
}
```
The project demonstrates effective use of Terraform modules for code organization and reusability:
```
terraform/
├── modules/
│   ├── platform/                  # High-level orchestration
│   │   ├── main.tf                # Coordinates other modules
│   │   └── variables.tf           # Common variables
│   ├── orbstack/                  # OrbStack-specific logic
│   │   ├── main.tf                # Provider configuration
│   │   └── variables.tf           # OrbStack variables
│   └── orbstack_linux_machine/    # Low-level machine management
│       ├── main.tf                # Machine creation/configuration
│       └── variables.tf           # Machine-specific variables
```
Platform Module (Top Level):

```hcl
# platform/main.tf
module "provider" {
  source = "../orbstack"   # Use OrbStack provider
  hosts  = local.hosts     # Pass processed inventory
}
```
- Acts as the main orchestrator
- Reads and processes inventory
- Delegates to appropriate provider module
- Manages environment-specific settings
Provider Module (Middle Layer):

```hcl
# orbstack/main.tf
locals {
  # Transform inventory hosts into provider format
  machines = {
    for hostname, config in var.hosts : hostname => {
      name   = hostname
      distro = try(config.orbstack_image, var.default_image)
    }
  }
}
```
- Implements provider-specific logic
- Transforms generic config to provider format
- Handles provider-specific resources
- Manages network and DNS configuration
Machine Module (Bottom Layer):

```hcl
# orbstack_linux_machine/main.tf
resource "null_resource" "machine" {
  for_each = var.machines

  provisioner "local-exec" {
    # Create machine using OrbStack CLI
    command = "orb create -u ${var.username} ${each.value.distro} ${each.value.name}"
  }
}
```
- Handles individual machine lifecycle
- Implements provider-specific commands
- Manages machine-level configuration
- Ensures proper cleanup on destroy
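The create command the provisioner runs can also be assembled from the machine attributes directly; a sketch (the helper is hypothetical, the `orb create` flags mirror the snippet above):

```python
def build_orb_create(name, distro, username):
    """Build the `orb create` command line the provisioner above executes."""
    return ["orb", "create", "-u", username, distro, name]

print(" ".join(build_orb_create("idx1", "alma:9", "root")))
# orb create -u root alma:9 idx1
```

Keeping the command as a list (rather than an interpolated string) makes it safe to hand to `subprocess.run` without shell quoting concerns.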
Module Composition:

```hcl
module "orbstack_machines" {
  source   = "../orbstack_linux_machine"
  machines = local.machines
  username = var.ansible_user
}
```
Local Variables:

```hcl
locals {
  machines = {
    for hostname, host_config in var.hosts : hostname => {
      name         = try(host_config.machine_name, hostname)
      distro       = split(":", try(host_config.orbstack_image, var.default_image))[0]
      architecture = try(host_config.architecture, "amd64")
    }
  }
}
```
Resource Management:

```hcl
resource "null_resource" "orbstack_machine" {
  for_each = var.machines

  triggers = {
    name         = each.value.name
    distro       = each.value.distro
    architecture = each.value.architecture
  }
}
```
The project has evolved from using Vagrant to Terraform for infrastructure management, bringing several improvements:
- Better state management
- More flexible provider support
- Cleaner separation of concerns
```yaml
# Example configuration (idx_sh_uf_orbstack.yml)
plugin: splunk-platform-automator
orbstack:
  image: alma:9
  ansible_user: root
splunk_hosts:
  - name: idx1
    roles:
      - indexer
  - name: sh1
    roles:
      - search_head
  - name: uf1
    roles:
      - universal_forwarder
```
The configuration flows through several stages:
- YAML parsing and validation
- Environment-specific processing
- Host configuration generation
```hcl
# orbstack/main.tf
module "orbstack_machines" {
  source   = "../orbstack_linux_machine"
  machines = local.machines
  username = var.ansible_user
}

# orbstack_linux_machine/main.tf
resource "null_resource" "orbstack_machine" {
  for_each = var.machines

  # Destroy-time provisioners may only reference self, so the machine
  # name must be captured in triggers at create time.
  triggers = {
    name = each.value.name
  }

  # Machine Creation
  provisioner "local-exec" {
    command = "orb create --arch ${each.value.architecture == "aarch64" ? "arm64" : each.value.architecture} -u ${var.username} ${each.value.distro} ${each.value.name}"
    when    = create
  }

  # Machine Cleanup
  provisioner "local-exec" {
    command = "orbctl delete ${self.triggers.name}"
    when    = destroy
  }
}
```
Machine Creation:
- Uses `null_resource` since OrbStack lacks a native Terraform provider
- Executes OrbStack CLI commands via the local-exec provisioner
- Handles architecture mapping (aarch64 → arm64)
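The architecture mapping is a one-entry lookup with a pass-through default; a minimal sketch equivalent to the ternary in the Terraform command:

```python
def orb_arch(machine_arch):
    """Map an inventory architecture name to what `orb create --arch` expects."""
    return {"aarch64": "arm64"}.get(machine_arch, machine_arch)

print(orb_arch("aarch64"))  # arm64
print(orb_arch("amd64"))    # amd64
```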
State Management:
- Uses triggers to track machine configuration
- Enables proper update and destroy operations
- Maintains idempotency through resource tracking
Network Configuration:

```hcl
data "external" "machine_ips" {
  program = ["bash", "-c", "IP=$(orb run -m \"${each.value.name}\" hostname -I | cut -d' ' -f1)"]
}
```
- Collects IP addresses after machine creation
- Uses external data source for dynamic information
- Updates host files across all machines
Integration with Ansible:
- Generates inventory in required format
- Provides necessary connection information
- Ensures proper DNS resolution
The system collects IP addresses through Terraform's external data source:
```hcl
data "external" "machine_ips" {
  program = ["bash", "-c", <<-EOT
    IP=$(orb run -m "${each.value.name}" hostname -I | cut -d' ' -f1)
    echo "{\"ip\": \"$${IP:-}\"}"
  EOT
  ]
}
```
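The shell pipeline takes the first address reported by `hostname -I` and wraps it in the JSON the external data source requires; the equivalent parsing in Python (the sample output is invented):

```python
import json

def first_ip(hostname_i_output):
    """Return the first address from `hostname -I` output, or '' if none."""
    fields = hostname_i_output.split()
    return fields[0] if fields else ""

raw = "198.19.249.2 fd07:b51a:cc66::2\n"  # hypothetical orb machine output
print(json.dumps({"ip": first_ip(raw)}))
# {"ip": "198.19.249.2"}
```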
Terraform manages the host file entries:
```hcl
resource "null_resource" "update_hosts" {
  provisioner "local-exec" {
    command = <<-EOT
      # Add new protected segment
      echo "# BEGIN TERRAFORM MANAGED BLOCK
      ${local.hosts_entries}
      # END TERRAFORM MANAGED BLOCK" | orb -m "${each.value.name}" sudo tee -a /etc/hosts
    EOT
  }
}
```
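Because `tee -a` appends, re-running would duplicate the managed segment; an idempotent variant can be sketched in Python, replacing whatever sits between the same marker comments (the helper is illustrative, not part of the project):

```python
BEGIN = "# BEGIN TERRAFORM MANAGED BLOCK"
END = "# END TERRAFORM MANAGED BLOCK"

def update_hosts(current, entries):
    """Replace (or append) the managed block in an /etc/hosts-style string."""
    block = f"{BEGIN}\n{entries}\n{END}"
    if BEGIN in current and END in current:
        head, rest = current.split(BEGIN, 1)
        _, tail = rest.split(END, 1)
        return head + block + tail          # replace the existing segment
    return current.rstrip("\n") + "\n" + block + "\n"  # first run: append

hosts = "127.0.0.1 localhost\n"
hosts = update_hosts(hosts, "198.19.249.2 idx1")
hosts = update_hosts(hosts, "198.19.249.3 idx1")  # second run replaces, not appends
print(hosts)
```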
IP Addresses:
- Required for direct network communication
- Used by Ansible for SSH connections
- Needed for Splunk's network protocols
DNS Names:
- Required for Splunk clustering
- Used for service discovery
- Provides stable naming across restarts
Default Settings:
- Defined in the `defaults/` directory
- Provide baseline configuration
Global OrbStack Settings:
- Defined in main config file
- Apply to all OrbStack hosts
Host-specific Settings:
- Override global settings
- Allow per-host customization
Development Focus:
- Currently using only the dev environment
- OrbStack is optimized for Apple Silicon (ARM-based) macOS
- Future AWS support planned
Key Files:
- `terraform/environments/dev/main.tf`: Environment configuration
- `terraform/modules/orbstack/main.tf`: OrbStack implementation
- `ansible/plugins/inventory/splunk-platform-automator.py`: Inventory management
Important Considerations:
- Always use fully qualified paths in Terraform
- Maintain host file consistency across all nodes
- Consider DNS resolution requirements for Splunk services