The primary source of deployment anxiety for our front-end platform was the all-or-nothing nature of our release process. Pushing a new version of our Angular application, built with UnoCSS for its utility-first styling, was a high-stakes event. A bug slipping through manual QA could impact 100% of our users, forcing a frantic, disruptive rollback. The core technical pain point was a lack of a reliable, automated mechanism to de-risk releases by exposing a new version to a subset of traffic before a full rollout. This led to a decision to build a robust, zero-downtime canary deployment pipeline.
Our initial concept was to leverage a toolchain that could handle complex deployment strategies natively, manage infrastructure declaratively, and provide bulletproof automated validation. Simple CI tools, while excellent for building and testing, often require extensive and brittle scripting to orchestrate canary releases. We needed an orchestrator.
This is why we selected Spinnaker. Its entire design philosophy revolves around continuous delivery and sophisticated deployment patterns like canaries. It treats cloud resources as first-class citizens, which was critical for our infrastructure on DigitalOcean. For infrastructure, DigitalOcean Kubernetes (DOKS) provided the necessary managed environment, abstracting away the control plane’s complexity while giving us direct access to Kubernetes APIs. The final piece was validation. A canary is useless if you can’t verify its correctness. Playwright was chosen for its robust end-to-end testing capabilities, allowing us to simulate real user flows against the canary instance and make an automated go/no-go decision.
Phase 1: Production-Grade Application Containerization
Before any deployment orchestration, the application itself needs to be correctly containerized. A common mistake is to create bloated Docker images that include build-time dependencies. We employ a multi-stage Dockerfile to ensure our final image is lean, containing only the Nginx server and the compiled static assets from our Angular and UnoCSS build.
# Dockerfile
# ---- Stage 1: Build ----
# Use a specific Node.js version for build reproducibility.
FROM node:18.18.0-alpine AS build
WORKDIR /app
# Copy package files and install dependencies.
# This layer is cached unless package.json or package-lock.json changes.
COPY package.json package-lock.json ./
RUN npm ci --loglevel warn
# Copy the rest of the application source code.
COPY . .
# Run the production build. The Angular CLI handles tree-shaking and optimization.
# UnoCSS is integrated into the build process via its webpack plugin.
# The --configuration=production flag is critical.
RUN npm run build -- --configuration=production
# ---- Stage 2: Serve ----
# Use a minimal, hardened Nginx image.
FROM nginx:1.25.3-alpine
# Remove the default Nginx configuration.
RUN rm /etc/nginx/conf.d/default.conf
# Copy our custom Nginx configuration.
# This config is tailored for a Single Page Application (SPA).
COPY nginx.conf /etc/nginx/conf.d/
# Copy the compiled application artifacts from the 'build' stage.
COPY --from=build /app/dist/my-angular-app /usr/share/nginx/html
# Expose the standard HTTP port.
EXPOSE 80
# The default Nginx command will start the server.
CMD ["nginx", "-g", "daemon off;"]
The accompanying nginx.conf is crucial for an Angular SPA, as it must handle client-side routing by serving index.html for any request path that does not match a file on disk.
# nginx.conf
server {
    listen 80;
    server_name localhost;

    # Root directory for the static files.
    root /usr/share/nginx/html;
    index index.html index.htm;

    # Enable Gzip compression for better performance.
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/javascript application/x-javascript application/xml;
    gzip_disable "MSIE [1-6]\.";

    location / {
        # This is the key for SPA routing.
        # If a file is not found, fall back to index.html.
        try_files $uri $uri/ /index.html;
    }

    # Add long-lived cache headers for static assets.
    # Angular CLI adds hashes to filenames, so we can cache them aggressively.
    location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
        expires 1y;
        add_header Cache-Control "public";
    }
}
With this setup, we can build and push a versioned, production-ready image to our DigitalOcean Container Registry:
# Authenticate with DigitalOcean Container Registry
doctl registry login
# Build and tag the image
VERSION=$(node -p "require('./package.json').version")
IMAGE_NAME="registry.digitalocean.com/my-registry/my-angular-app:${VERSION}"
docker build -t ${IMAGE_NAME} .
# Push the image
docker push ${IMAGE_NAME}
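Before wiring the image into any pipeline, a quick local smoke test catches obvious packaging mistakes. A minimal sketch; the port mapping and container name here are arbitrary choices, not part of the pipeline:
# Run the freshly built image locally and exercise the SPA fallback.
docker run --rm -d --name angular-app-smoke -p 8080:80 ${IMAGE_NAME}
# The root document and an arbitrary client-side route should both return 200 with index.html.
curl -sI http://localhost:8080/ | head -n 1
curl -sI http://localhost:8080/some/client-side/route | head -n 1
docker rm -f angular-app-smoke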
Phase 2: Kubernetes Manifests for Canary Traffic Management
The core of our canary strategy lies in how we structure our Kubernetes deployments. Instead of a single deployment, we manage two: a baseline
(the stable production version) and a canary
(the new, unverified version). A single Kubernetes Service
will select pods from both deployments, allowing the standard kube-proxy
load balancing to split traffic based on the number of running pods in each deployment.
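Because kube-proxy distributes connections roughly evenly across ready endpoints, the expected canary share is simply the canary replica count divided by the total replica count. Once both deployments exist, a quick sketch like the following (assuming kubectl access to the production namespace) makes the ratio explicit:
# Approximate canary traffic share = canary replicas / (baseline + canary replicas)
BASELINE=$(kubectl get deployment angular-app-baseline -n production -o jsonpath='{.spec.replicas}')
CANARY=$(kubectl get deployment angular-app-canary -n production -o jsonpath='{.spec.replicas}')
awk -v b="$BASELINE" -v c="$CANARY" 'BEGIN { printf "canary share ~ %.0f%%\n", 100 * c / (b + c) }'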
Here is the manifest for the baseline deployment. Spinnaker will manage the image tag in this file.
# k8s/deployment-baseline.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: angular-app-baseline
  namespace: production
  labels:
    app: angular-app
    track: stable
spec:
  replicas: 4 # Maintain a healthy number of baseline pods
  selector:
    matchLabels:
      app: angular-app
      track: stable
  template:
    metadata:
      labels:
        app: angular-app
        # The 'track' label distinguishes stable pods from canary pods
        track: stable
    spec:
      containers:
        - name: angular-app
          # Spinnaker will replace this with the specific stable version
          image: registry.digitalocean.com/my-registry/my-angular-app:placeholder
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "250m"
              memory: "256Mi"
The canary deployment is almost identical, but it has a different name and track label (so its selector only ever matches canary pods), and it is initially deployed with zero replicas.
# k8s/deployment-canary.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: angular-app-canary
  namespace: production
  labels:
    app: angular-app
    track: canary
spec:
  replicas: 0 # Canary starts with no pods
  selector:
    matchLabels:
      app: angular-app
      track: canary
  template:
    metadata:
      labels:
        app: angular-app
        track: canary
    spec:
      containers:
        - name: angular-app
          # Spinnaker will inject the new version to test here
          image: registry.digitalocean.com/my-registry/my-angular-app:placeholder
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "250m"
              memory: "256Mi"
The Service ties them together. The pitfall to avoid here is making the selector too specific: by selecting only on app: angular-app, the Service routes traffic to any pod with that label, regardless of whether it belongs to the stable or the canary track.
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: angular-app-svc
  namespace: production
spec:
  type: LoadBalancer # Exposes the service via a DigitalOcean Load Balancer
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  selector:
    # This selector is key: it targets pods from BOTH deployments.
    app: angular-app
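A quick way to confirm the selector behaves as intended (a sketch assuming kubectl access to the production namespace) is to list the pods with their track label and check that the Service's endpoints span both deployments:
# Show every pod the Service can route to, with its track label as a column.
kubectl get pods -n production -l app=angular-app -L track
# The endpoint list should contain addresses from both baseline and canary pods once both are scaled up.
kubectl get endpoints angular-app-svc -n production -o wide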
Phase 3: The Spinnaker Pipeline Orchestration
With the building blocks in place, we can construct the Spinnaker pipeline. This is defined as a JSON or YAML file. Spinnaker’s UI is useful for building it, but in a real-world project, the pipeline definition should be version-controlled.
The pipeline flow is visualized below:
graph TD
    A[Trigger: New Image in Registry] --> B{Find Baseline Image};
    B --> C[Deploy Canary];
    C --> D[Run Playwright E2E Tests];
    D --> E{Tests Passed?};
    E -- Yes --> F[Promote to Production];
    F --> G[Cleanup Canary];
    E -- No --> H[Rollback: Destroy Canary];
Here’s a breakdown of the key stages:
1. Configuration: Trigger and Parameters
The pipeline triggers automatically when a new image tag appears in our DigitalOcean Container Registry. Spinnaker uses this trigger to extract the image digest and tag, which are then available throughout the pipeline execution.
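The exact trigger configuration lives in the version-controlled pipeline definition; conceptually, Spinnaker polls the registry for new tags on the repository. You can inspect the same tag list it sees with doctl (a sketch, assuming a reasonably recent doctl and the repository name used earlier):
# List the tags the Docker Registry trigger watches for this repository.
doctl registry repository list-tags my-angular-app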
2. Stage: Deploy Canary
This stage uses Spinnaker’s “Deploy (Manifest)” step. It takes the k8s/deployment-canary.yaml manifest text as input.
- Action: It applies the canary deployment manifest to our DOKS cluster.
- Overrides: Critically, it overrides two values (a rough kubectl equivalent is sketched below):
  - spec.replicas: Set to 1. This spins up a single canary pod; if the baseline has 4 replicas, roughly 20% of traffic is directed to the new version.
  - spec.template.spec.containers[0].image: Set to the image reference from the pipeline trigger. This ensures we are deploying the new version.
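For intuition, what this stage effectively does is roughly the following (a sketch, not how Spinnaker applies the manifest internally; <new-tag> stands in for the tag from the trigger):
# Point the canary Deployment at the freshly pushed image and give it one pod.
kubectl set image deployment/angular-app-canary -n production angular-app=registry.digitalocean.com/my-registry/my-angular-app:<new-tag>
kubectl scale deployment/angular-app-canary -n production --replicas=1
kubectl rollout status deployment/angular-app-canary -n production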
3. Stage: Run Playwright E2E Validation
This is the most complex integration. Spinnaker does not run tests directly. Instead, we use a “Run Job (Manifest)” stage to launch a Kubernetes Job that executes our Playwright test suite.
Here is the manifest for the validation Job:
# k8s/playwright-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: playwright-validation-run-${#uuid()} # Unique name for each run
  namespace: testing
spec:
  template:
    spec:
      containers:
        - name: playwright-runner
          image: mcr.microsoft.com/playwright:v1.39.0 # Official Playwright image
          command: ["/bin/sh", "-c"]
          args:
            - |
              # 1. Clone the test repository
              git clone https://<PAT>@github.com/my-org/my-e2e-tests.git
              cd my-e2e-tests
              # 2. Install dependencies
              npm ci
              # 3. Run the tests against the production service endpoint.
              #    Because the service load balances, some tests will hit the canary.
              #    The exit code determines the success or failure of the Spinnaker stage.
              npx playwright test --reporter=line
          env:
            - name: PLAYWRIGHT_BASE_URL
              # The service endpoint in the 'production' namespace
              value: "http://angular-app-svc.production.svc.cluster.local"
      restartPolicy: Never
  backoffLimit: 0 # Do not retry the job on failure
The Playwright test itself needs to be robust. Rather than merely checking that an element exists, it should verify a critical user journey, such as the checkout flow below.
// tests/checkout.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Critical Checkout Flow', () => {
  test('should allow a user to add an item to the cart and proceed to checkout', async ({ page }) => {
    // This assumes the baseURL is set in the config to our service endpoint
    await page.goto('/');

    // 1. Verify a key element rendered by the new UnoCSS styles
    const heroButton = page.locator('button.bg-primary-500');
    await expect(heroButton).toBeVisible({ timeout: 10000 });
    await expect(heroButton).toHaveText('Shop Now');

    // 2. Navigate and perform an action
    await page.locator('.product-card[data-product-id="abc-123"]').click();
    await page.getByRole('button', { name: 'Add to Cart' }).click();

    // 3. Verify state change
    await expect(page.locator('.cart-item-count')).toHaveText('1');

    // 4. Intercept a critical API call to ensure the payload is correct.
    //    This guards against front-end/back-end contract regressions.
    let checkoutApiRequest: any = null;
    page.on('request', request => {
      if (request.url().includes('/api/v2/checkout')) {
        checkoutApiRequest = request.postDataJSON();
      }
    });

    await page.getByRole('link', { name: 'Checkout' }).click();
    await page.getByRole('button', { name: 'Confirm Purchase' }).click();

    // 5. Assert on the intercepted network call
    await expect(page.locator('.order-confirmation')).toBeVisible();
    expect(checkoutApiRequest).not.toBeNull();
    expect(checkoutApiRequest.items[0].id).toBe('abc-123');
    expect(checkoutApiRequest.metadata.clientVersion).toBeDefined(); // Check for new metadata
  });
});
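The same spec can be run from a laptop against any environment, assuming the Playwright config maps the PLAYWRIGHT_BASE_URL environment variable onto use.baseURL (that mapping is our own convention, not a Playwright default):
# Run only the checkout spec against a locally served build (the port is arbitrary).
PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test tests/checkout.spec.ts --reporter=line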
The key is that the exit code of the Playwright command (0 for success, non-zero for failure) determines whether the Kubernetes Job succeeds or fails, and that outcome is automatically propagated back to Spinnaker, which then decides whether to proceed or halt the pipeline.
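When a run needs debugging, the Job status and logs can be inspected directly; a sketch, with <generated-job-name> standing in for the unique name generated for that run:
# A non-zero exit code from 'npx playwright test' marks the Job as Failed, which fails the Spinnaker stage.
kubectl get jobs -n testing
kubectl describe job <generated-job-name> -n testing
kubectl logs job/<generated-job-name> -n testing --tail=100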
4. Stage: Promote to Production
If the Playwright job succeeds, the pipeline continues to the promotion stage. This is another “Deploy (Manifest)” stage, but this time it modifies the angular-app-baseline deployment.
- Action: It applies the k8s/deployment-baseline.yaml manifest.
- Overrides: It only overrides one value:
  - spec.template.spec.containers[0].image: Set to the new image from the pipeline trigger.
Kubernetes’ rolling update strategy will then safely replace the old baseline pods with the new version, one by one, ensuring no downtime.
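The promotion can be observed, or audited after the fact, with the usual rollout commands, for example:
# Watch the baseline roll over to the new image without dropping available replicas.
kubectl rollout status deployment/angular-app-baseline -n production
kubectl rollout history deployment/angular-app-baseline -n production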
5. Stage: Cleanup Canary
Once the baseline is fully updated, the canary is no longer needed. A final “Scale (Manifest)” or “Delete (Manifest)” stage is executed. We prefer scaling the canary deployment down to zero replicas.
- Target: the angular-app-canary deployment in the production namespace.
- Action: Scale to 0 replicas.
This preserves the canary deployment object for potential reuse but removes its pods, stopping it from receiving any traffic and consuming resources. If the pipeline had failed at the testing stage, the failure path would have triggered this same cleanup stage, effectively rolling back the change by simply removing the canary.
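The cleanup stage boils down to a single scale operation, which is also the manual escape hatch if a canary ever needs to be pulled out of rotation by hand:
# Remove the canary pods from the Service's endpoint pool without deleting the Deployment object.
kubectl scale deployment/angular-app-canary -n production --replicas=0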
The resulting system provides a high degree of confidence. A developer commit triggers a process that results in a new version being tested against a live, small percentage of production traffic with a comprehensive E2E suite. Only upon successful validation is the change promoted. The entire flow is hands-off, reducing both the risk and the toil associated with releases.
This pipeline architecture, while effective, is not the final evolution. The current validation is a binary pass/fail from Playwright. A more sophisticated approach would integrate Spinnaker’s automated canary analysis engine, Kayenta, to monitor Prometheus metrics from both the baseline and canary pods over a period of time, making a data-driven decision based on error rates, latency percentiles, and business metrics. Furthermore, traffic splitting is currently coarse-grained and dependent on pod ratios; for fine-grained control (e.g., 1%, 5%, 10% traffic), integrating a service mesh like Istio or Linkerd would be the next logical step, allowing Spinnaker to manipulate traffic routing rules directly. Finally, this workflow does not address database schema migrations, which must be carefully managed as a separate, but coordinated, process to ensure backward compatibility during the canary period.