Our platform team was a bottleneck. Every feature squad, each running a Python backend and a collection of micro-frontends, needed isolated environments for development, staging, and QA. The process was a tangled mess of JIRA tickets, semi-automated Terraform modules, and manual kubectl commands. A request for a new “staging” environment for the “payments” squad would trigger a week-long process involving at least three different engineers. It was slow, error-prone, and wildly inconsistent. The goal became clear: create a fully self-service, declarative API for application environments. A developer should be able to define their entire stack—backend service, database, and frontend hosting—in a single YAML file, commit it to git, and have a fully provisioned, ready-to-use environment materialize within minutes.
The Initial Concept and Technology Rationale
The core idea was to build an Internal Developer Platform (IDP) fronted by a simple, Kubernetes-native API. We didn’t want developers to write Terraform or understand the intricacies of AWS IAM policies. They should declare what they need, not how to build it.
Our technology choices were deliberate:
Crossplane: We evaluated standard Infrastructure as Code tools like Terraform. The problem is that they are primarily one-shot execution tools. They don’t maintain a constant reconciliation loop. We chose Crossplane because it extends the Kubernetes API, turning our cluster into a universal control plane. We could define our infrastructure as custom Kubernetes resources, and Crossplane’s controllers would work relentlessly to ensure the real-world state matched our declared state. This continuous reconciliation is the key to a truly declarative system.
Python: While Crossplane provides the declarative infrastructure layer, some procedural logic is unavoidable. We needed a robust scripting layer within our CI/CD pipelines to validate developer manifests, apply them to the cluster, and orchestrate subsequent steps like application deployment. Python, with its excellent kubernetes client and boto3 libraries, was the lingua franca of our backend teams, making it the obvious choice for this orchestration logic.
Micro-frontends: This was our existing architectural pattern. Each frontend is a separate, buildable artifact of static files (JS, CSS, HTML). The challenge was to integrate the deployment of these static assets into the same unified, declarative workflow that managed the backend infrastructure.
Phase 1: Defining the Platform API with a CompositeResourceDefinition (XRD)
The first step was to define the “shape” of our self-service API. This is what developers would interact with. In Crossplane, this is done using a CompositeResourceDefinition (XRD). It’s analogous to creating a custom resource definition (CRD), but for a higher-level abstraction that will be composed of other resources.
Our UnifiedEnvironment XRD needed to capture the essential inputs from a developer: the application name, the desired version of the Python backend image, the size of the database, and any frontend-specific configurations.
Here is the complete XRD. A real-world project would have many more options, but this captures the core structure.
---
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
name: unifiedenvironments.platform.acme.com
spec:
group: platform.acme.com
names:
kind: UnifiedEnvironment
listKind: UnifiedEnvironmentList
plural: unifiedenvironments
singular: unifiedenvironment
claimNames:
kind: EnvironmentClaim
listKind: EnvironmentClaimList
plural: environmentclaims
singular: environmentclaim
versions:
- name: v1alpha1
served: true
referenceable: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
# --- Backend Configuration ---
backend:
type: object
description: Configuration for the Python backend service.
properties:
image:
type: string
description: The full container image URI for the backend service.
port:
type: integer
description: The port the backend application listens on.
default: 8000
replicas:
type: integer
description: Number of replicas for the backend deployment.
default: 1
required:
- image
# --- Database Configuration ---
database:
type: object
description: Configuration for the required PostgreSQL database.
properties:
size:
type: string
description: The desired size of the database.
enum: ["small", "medium", "large"]
default: "small"
required:
- size
# --- Frontend Configuration ---
frontend:
type: object
description: Configuration for the micro-frontend static assets.
properties:
appName:
type: string
description: A unique name for the frontend application, used for resource naming.
required:
- appName
required:
- backend
- database
- frontend
status:
type: object
properties:
databaseHost:
type: string
description: The connection host for the provisioned database.
frontendBucketName:
type: string
description: The name of the S3 bucket created for the frontend assets.
ready:
type: boolean
description: Indicates if all underlying resources are provisioned and ready.
The critical parts here are the spec and status fields. The spec is the developer’s desired state. We use OpenAPI v3 schema validation to enforce rules, like ensuring database.size is one of the allowed values. The status sub-resource is where Crossplane will write back the outcomes of the provisioning process, like the generated database hostname or the S3 bucket name. Our Python orchestrator will poll this status to know when to proceed.
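Because the XRD declares claimNames, developers interact with a namespaced EnvironmentClaim rather than the cluster-scoped UnifiedEnvironment itself. For illustration, a claim might look like the following (the names and image URI are hypothetical):
---
apiVersion: platform.acme.com/v1alpha1
kind: EnvironmentClaim
metadata:
  name: payments-staging   # illustrative environment name
  namespace: payments
spec:
  backend:
    image: registry.acme.com/payments-api:1.4.2  # illustrative image URI
    replicas: 2
    # port is omitted; the schema default of 8000 applies
  database:
    size: small
  frontend:
    appName: payments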
Phase 2: Implementing the Provisioning Logic with a Composition
With the API defined, we now need to implement the logic that translates a UnifiedEnvironment resource into actual infrastructure. This is done with a Composition. It’s a template that maps the inputs from the XRD to a set of concrete managed resources from Crossplane providers (like provider-aws and provider-kubernetes).
This is where the real power lies. A single Composition can create resources across multiple clouds and providers.
---
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
name: unifiedenvironment.aws.platform.acme.com
labels:
provider: aws
spec:
compositeTypeRef:
apiVersion: platform.acme.com/v1alpha1
kind: UnifiedEnvironment
# Define the resources to be created when a UnifiedEnvironment is instantiated.
resources:
  # 1. A dedicated Kubernetes Namespace for isolation, created via
  #    provider-kubernetes (raw cluster resources are wrapped in an Object).
  - name: namespace
    base:
      apiVersion: kubernetes.crossplane.io/v1alpha1
      kind: Object
      spec:
        # Provider config for provider-kubernetes; the name is illustrative.
        providerConfigRef:
          name: kubernetes-default
        forProvider:
          manifest:
            apiVersion: v1
            kind: Namespace
            metadata:
              labels:
                team: platform-team
    patches:
      - fromFieldPath: "metadata.name"
        toFieldPath: "spec.forProvider.manifest.metadata.name"
        transforms:
          - type: string
            string:
              fmt: "%s-environment"
  # 2. An S3 bucket for the micro-frontend static assets.
  - name: frontend-bucket
    base:
      apiVersion: s3.aws.upbound.io/v1beta1
      kind: Bucket
      spec:
        forProvider:
          region: us-east-1
          acl: public-read
        # We must specify a provider config to use for AWS credentials.
        providerConfigRef:
          name: aws-default
    patches:
      # Combine the app name with the composite's UID for global uniqueness,
      # e.g. acme-mfe-payments-a1b2c3d4.
      - type: CombineFromComposite
        combine:
          strategy: string
          variables:
            - fromFieldPath: "spec.frontend.appName"
            - fromFieldPath: "metadata.uid"
          string:
            fmt: "acme-mfe-%s-%s"
        toFieldPath: "metadata.name"
        policy:
          fromFieldPath: Required
      # Write the bucket name back to the status of the UnifiedEnvironment CR.
      - type: ToCompositeFieldPath
        fromFieldPath: "status.atProvider.id"
        toFieldPath: "status.frontendBucketName"
  # 3. A PostgreSQL instance via the official Crossplane SQL provider.
  #    In a real scenario, this would likely be an RDSInstance or CloudSQL resource.
  - name: postgres-db
    base:
      apiVersion: postgresql.sql.crossplane.io/v1alpha1
      kind: Database
      spec:
        # This provider also needs a config pointing to a PostgreSQL server.
        providerConfigRef:
          name: postgresql-provider
        # This is the most critical part for application connectivity.
        # We instruct Crossplane to write the connection secret into the
        # newly created namespace (the namespace is patched in below).
        writeConnectionSecretToRef:
          name: "db-connection-details"
          namespace: "placeholder" # patched below
    patches:
      - fromFieldPath: "metadata.name"
        toFieldPath: "metadata.name"
      - fromFieldPath: "metadata.name"
        toFieldPath: "spec.writeConnectionSecretToRef.namespace"
        transforms:
          - type: string
            string:
              fmt: "%s-environment"
    connectionDetails:
      - fromConnectionSecretKey: "username"
      - fromConnectionSecretKey: "password"
      - fromConnectionSecretKey: "endpoint"
      - fromConnectionSecretKey: "port"
  # 4. The Kubernetes Deployment for the Python backend, again via provider-kubernetes.
  - name: backend-deployment
    base:
      apiVersion: kubernetes.crossplane.io/v1alpha1
      kind: Object
      spec:
        providerConfigRef:
          name: kubernetes-default
        forProvider:
          manifest:
            apiVersion: apps/v1
            kind: Deployment
            metadata:
              name: backend
            spec:
              replicas: 1
              selector:
                matchLabels:
                  app: backend
              template:
                metadata:
                  labels:
                    app: backend
                spec:
                  containers:
                    - name: backend-container
                      image: placeholder-image # patched below
                      ports:
                        - containerPort: 8000
                      envFrom:
                        - secretRef:
                            name: "db-connection-details" # Consume the secret!
    patches:
      - fromFieldPath: "metadata.name"
        toFieldPath: "spec.forProvider.manifest.metadata.namespace"
        transforms:
          - type: string
            string:
              fmt: "%s-environment"
      - fromFieldPath: "spec.backend.replicas"
        toFieldPath: "spec.forProvider.manifest.spec.replicas"
      - fromFieldPath: "spec.backend.image"
        toFieldPath: "spec.forProvider.manifest.spec.template.spec.containers[0].image"
      - fromFieldPath: "spec.backend.port"
        toFieldPath: "spec.forProvider.manifest.spec.template.spec.containers[0].ports[0].containerPort"
A common pitfall here is managing secrets. How does the Python application get the credentials for the database that Crossplane just created? The writeConnectionSecretToRef field is the answer. It tells the PostgreSQL provider to take the connection details it generates (host, user, password) and write them into a standard Kubernetes Secret named db-connection-details inside the newly created namespace. The Deployment resource then uses envFrom to mount this secret as environment variables, making them available to the Python application seamlessly.
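On the application side, consuming the injected variables is a one-liner. A minimal sketch follows; the variable names match the connectionDetails keys above, and the database name "app" is an assumption for illustration:
# settings.py, inside the Python backend -- a minimal sketch.
import os

def database_dsn() -> str:
    """Build a PostgreSQL DSN from the env vars mounted via envFrom."""
    user = os.environ["username"]
    password = os.environ["password"]
    host = os.environ["endpoint"]
    port = os.environ.get("port", "5432")
    # "app" as the database name is an assumption for illustration.
    return f"postgresql://{user}:{password}@{host}:{port}/app"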
Phase 3: The Python Orchestrator for CI/CD
Now we have the declarative layer. The next piece is the procedural “glue” that runs in our CI pipeline. This Python script is responsible for taking a developer’s manifest, performing sanity checks, and interacting with the Kubernetes API to manage the UnifiedEnvironment custom resource.
# orchestrator.py
import os
import sys
import time
import logging
import yaml
from kubernetes import client, config
from kubernetes.client.rest import ApiException
# --- Configuration ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
API_GROUP = "platform.acme.com"
API_VERSION = "v1alpha1"
RESOURCE_PLURAL = "unifiedenvironments"
POLL_INTERVAL_SECONDS = 15
MAX_WAIT_MINUTES = 20
# --- Kubernetes Client Setup ---
def get_k8s_client():
"""Initializes and returns a Kubernetes API client."""
try:
# Assumes running inside a pod with a service account
config.load_incluster_config()
except config.ConfigException:
try:
# Fallback for local development
config.load_kube_config()
except config.ConfigException:
raise RuntimeError("Could not configure Kubernetes client.")
return client.CustomObjectsApi()
def validate_manifest(manifest_path):
"""
Performs basic validation on the developer's manifest.
A production system would use a schema validation library like pykwalify.
"""
if not os.path.exists(manifest_path):
raise FileNotFoundError(f"Manifest file not found: {manifest_path}")
with open(manifest_path, 'r') as f:
data = yaml.safe_load(f)
if not all(k in data for k in ["apiVersion", "kind", "metadata", "spec"]):
raise ValueError("Manifest missing required top-level keys.")
logging.info(f"Manifest {manifest_path} passed basic validation.")
return data
def apply_environment(api_client, manifest_data):
    """
    Applies the UnifiedEnvironment manifest to the cluster using server-side apply.
    Composite resources (XRs) are cluster-scoped, so no namespace is involved;
    the namespaced EnvironmentClaim would use the *_namespaced_* variants instead.
    """
    name = manifest_data["metadata"]["name"]
    try:
        logging.info(f"Applying UnifiedEnvironment '{name}'...")
        api_client.patch_cluster_custom_object(
            group=API_GROUP,
            version=API_VERSION,
            plural=RESOURCE_PLURAL,
            name=name,
            body=manifest_data,
            field_manager="PipelineOrchestrator",
            force=True,  # Take ownership of any conflicting fields
            # Server-side apply requires this content type; a plain merge
            # patch would 404 when the resource does not exist yet. Needs a
            # recent kubernetes client that accepts the _content_type kwarg.
            _content_type="application/apply-patch+yaml",
        )
        logging.info(f"Successfully applied UnifiedEnvironment '{name}'.")
    except ApiException as e:
        logging.error(f"Failed to apply manifest for '{name}': {e.body}")
        raise
def wait_for_ready(api_client, resource_name):
    """
    Polls the status of the UnifiedEnvironment resource until it is ready.
    """
    start_time = time.time()
    max_wait_seconds = MAX_WAIT_MINUTES * 60
    logging.info(f"Waiting for UnifiedEnvironment '{resource_name}' to become ready...")
    while time.time() - start_time < max_wait_seconds:
        try:
            resource = api_client.get_cluster_custom_object(
                group=API_GROUP,
                version=API_VERSION,
                plural=RESOURCE_PLURAL,
                name=resource_name
            )
# Check the status conditions populated by Crossplane
if 'status' in resource and 'conditions' in resource['status']:
is_ready = any(
cond['type'] == 'Ready' and cond['status'] == 'True'
for cond in resource['status']['conditions']
)
if is_ready:
logging.info(f"UnifiedEnvironment '{resource_name}' is ready.")
return resource # Return the full object for later use
logging.info(f"'{resource_name}' not ready yet. Polling again in {POLL_INTERVAL_SECONDS}s.")
except ApiException as e:
if e.status == 404:
logging.warning(f"Resource '{resource_name}' not found yet. Retrying...")
else:
logging.error(f"API error while polling '{resource_name}': {e.body}")
raise
time.sleep(POLL_INTERVAL_SECONDS)
raise TimeoutError(f"Timed out after {MAX_WAIT_MINUTES} minutes waiting for '{resource_name}'.")
if __name__ == "__main__":
if len(sys.argv) != 2:
print(f"Usage: python {sys.argv[0]} <path_to_manifest.yaml>")
sys.exit(1)
manifest_file = sys.argv[1]
try:
api = get_k8s_client()
manifest = validate_manifest(manifest_file)
resource_name = manifest["metadata"]["name"]
apply_environment(api, manifest)
ready_resource = wait_for_ready(api, resource_name)
# This is where we would trigger the next stage of the pipeline
bucket_name = ready_resource.get('status', {}).get('frontendBucketName')
if bucket_name:
logging.info(f"NEXT_STEP: Trigger frontend deployment to S3 bucket: {bucket_name}")
            # Expose the value to the CI/CD system; on GitHub Actions, e.g.:
            # with open(os.environ["GITHUB_OUTPUT"], "a") as fh:
            #     fh.write(f"bucket_name={bucket_name}\n")
else:
logging.warning("Could not find frontendBucketName in status.")
except (FileNotFoundError, ValueError, RuntimeError, ApiException, TimeoutError) as err:
logging.error(f"Orchestration failed: {err}")
sys.exit(1)
This script is designed to be executed by a CI/CD runner (like GitLab CI or GitHub Actions). The wait_for_ready function is critical. A common mistake is to kubectl apply and immediately move to the next step. Infrastructure provisioning takes time. This function polls the custom resource’s .status.conditions field, which Crossplane updates as it works. Only when the Ready condition is True can we be certain that the database is available and the S3 bucket exists.
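For reference, a successfully provisioned resource carries a status roughly like the following; the exact reason strings vary by provider, and the bucket suffix is illustrative:
status:
  conditions:
    - type: Synced
      status: "True"
      reason: ReconcileSuccess
    - type: Ready
      status: "True"
      reason: Available
  frontendBucketName: acme-mfe-payments-a1b2c3d4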
Phase 4: Integrating the Micro-frontend Deployment
The final piece of the puzzle is deploying the static frontend assets. After the Python orchestrator confirms the UnifiedEnvironment is ready, it extracts the provisioned S3 bucket name from the resource’s status. The CI pipeline can then proceed to a stage that builds the React/Vue/Angular application and syncs the output to this bucket.
A small Python script using boto3 can handle this upload.
# deploy_frontend.py
import boto3
import logging
import mimetypes
import os
import sys
from botocore.exceptions import ClientError
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
def sync_to_s3(source_dir, bucket_name):
"""
Syncs a local directory to an S3 bucket.
Assumes AWS credentials are configured via environment variables or IAM role.
"""
s3_client = boto3.client('s3')
logging.info(f"Starting sync of '{source_dir}' to bucket '{bucket_name}'...")
    # Guess each file's content type so browsers render the assets correctly.
    # A real implementation would be more robust: cache headers, deletions, etc.
    for root, _, files in os.walk(source_dir):
        for filename in files:
            local_path = os.path.join(root, filename)
            relative_path = os.path.relpath(local_path, source_dir)
            s3_key = relative_path.replace("\\", "/")  # For Windows compatibility
            content_type, _encoding = mimetypes.guess_type(local_path)
            extra_args = {"ContentType": content_type} if content_type else {}
            try:
                s3_client.upload_file(local_path, bucket_name, s3_key, ExtraArgs=extra_args)
                logging.info(f"Uploaded {local_path} to s3://{bucket_name}/{s3_key}")
except ClientError as e:
logging.error(f"Failed to upload {local_path}: {e}")
return False
logging.info("Sync completed successfully.")
return True
if __name__ == "__main__":
try:
build_directory = os.environ['BUILD_DIR']
target_bucket = os.environ['S3_BUCKET_NAME']
except KeyError as e:
logging.error(f"Missing required environment variable: {e}")
sys.exit(1)
if not sync_to_s3(build_directory, target_bucket):
sys.exit(1)
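In the pipeline, the script is driven entirely by environment variables; an invocation might look like this (the bucket name is illustrative):
BUILD_DIR=dist S3_BUCKET_NAME=acme-mfe-payments-a1b2c3d4 python deploy_frontend.py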
The CI/CD pipeline now has a clear, ordered flow:
1. Developer commits a my-feature-env.yaml manifest.
2. Pipeline triggers and executes orchestrator.py with the manifest path.
3. orchestrator.py applies the manifest and waits for the UnifiedEnvironment to be Ready.
4. orchestrator.py extracts the bucket name from the status and passes it to the next stage.
5. The next stage runs the frontend build, then executes deploy_frontend.py to upload the assets.
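For teams on GitHub Actions, the provisioning stage can be wired up roughly like this; the workflow layout, paths, and names are illustrative, not a drop-in pipeline:
# .github/workflows/environments.yaml -- an illustrative sketch
on:
  push:
    paths:
      - "environments/**"

jobs:
  provision:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install kubernetes pyyaml
      # Assumes the runner already has cluster access (kubeconfig or
      # in-cluster service account) for the orchestrator to use.
      - run: python orchestrator.py environments/my-feature-env.yaml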
A major pitfall we hit here was permissions. The CI runner’s service account needs Kubernetes permissions to manage our custom resources, but it also needs AWS permissions to upload to S3. We solved this using IAM Roles for Service Accounts (IRSA) on our EKS cluster. This allows us to associate a Kubernetes service account with an AWS IAM role, providing secure, keyless access to AWS resources from within the pod running the pipeline job.
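In practice, the IRSA binding is just an annotation on the pipeline’s service account; a sketch with a placeholder account ID and role name:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-runner       # placeholder name
  namespace: ci
  annotations:
    # IRSA: EKS injects temporary credentials for this IAM role into pods
    # running under this service account. The ARN is a placeholder.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/ci-environment-deployer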
The Final Workflow
The result is a powerful, fully automated system that abstracts away massive complexity from our developers.
graph TD
  subgraph "Git Repository"
    A[Developer commits env.yaml]
  end
  subgraph "CI/CD Pipeline"
    B[Trigger Pipeline] --> C{Run Python Orchestrator}
    C --> D[1. Validate Manifest]
    D --> E[2. Apply UnifiedEnvironment CR to K8s]
    E --> F{3. Poll CR Status}
    F -- Not Ready --> F
    F -- Ready --> G[4. Extract Bucket Name from Status]
    G --> H{Run Frontend Deploy}
    H --> I[Build Static Assets]
    I --> J[Sync to S3 Bucket via Python/boto3]
  end
  subgraph "Kubernetes Cluster"
    K[API Server]
    subgraph Crossplane
      L[Crossplane Controller]
      M[Provider-AWS Controller]
      N[Provider-K8s Controller]
    end
  end
  subgraph AWS
    O[S3 Bucket]
    P[RDS Database]
  end
  E --> K
  K --> L
  L -- Reads Composition --> M & N
  M --> O & P
  N --> Q[K8s Namespace, Deployment, Secret]
  J --> O
  A --> B
This system turned a week-long, manual process into a five-minute, git-driven workflow. It improved consistency, reduced errors, and freed up the platform team to work on higher-value problems.
The current Python script is a pragmatic solution that lives within the CI/CD pipeline, but it has limitations. It’s procedural and only runs on a git commit. A more advanced architecture would involve replacing this script with a dedicated Kubernetes Operator, perhaps written using a framework like Kopf. This operator would watch for changes to UnifiedEnvironment resources and orchestrate the frontend deployment reactively, providing a true control loop for the entire application stack, not just the infrastructure. Furthermore, while Crossplane handles provisioning, a robust de-provisioning strategy with finalizers is needed to ensure that external resources, like backups or DNS entries not directly managed by the Composition, are cleaned up properly when an environment is deleted.
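As a sketch of that direction, a minimal Kopf handler reacting to the status field might look like this; the module name is hypothetical, and the sync step is a stand-in for the boto3 logic above:
# operator_sketch.py -- a minimal Kopf sketch of the reactive alternative.
import kopf

@kopf.on.field("platform.acme.com", "v1alpha1", "unifiedenvironments",
               field="status.frontendBucketName")
def on_bucket_ready(old, new, name, logger, **_):
    """Fires whenever Crossplane writes the bucket name into the status."""
    if new and new != old:
        logger.info(f"Bucket '{new}' is ready for environment '{name}'.")
        # From here we would locate the latest frontend build for this
        # environment and sync it to S3, reusing the deploy_frontend.py logic.
Started with kopf run operator_sketch.py, this provides the continuous watch loop that the commit-triggered CI script cannot.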