Establishing End-to-End Encrypted Observability from a Swift Client to a Go Backend with mTLS and the ELK Stack


Our initial problem wasn’t a lack of logging. We had logs. Terabytes of them. The problem was that they were disconnected silos of information. When a user of our macOS Swift client reported a critical failure, the support ticket would land on a developer’s desk with a vague error message and a client-side log file. The developer would then have to manually grep through backend Go service logs, trying to correlate timestamps. This process was brittle, time-consuming, and often fruitless. Compounding this, our security posture relied on API keys, which felt insufficient for the zero-trust environment we were moving towards. We needed to solve two problems simultaneously: establish ironclad, identity-based security between client and server, and create a unified, queryable narrative for every single user action across the entire stack.

This led to a non-negotiable set of requirements:

  1. Every client-server connection must be mutually authenticated via TLS (mTLS). No exceptions.
  2. Every user-initiated action must generate a unique trace context on the Swift client.
  3. This context must be propagated securely to the Go backend.
  4. All logs, from both client and server, must be structured (JSON) and contain this trace context.
  5. All data must flow into our existing ELK Stack, allowing us to retrieve the complete lifecycle of a request with a single query.

Our technology stack was largely fixed: a Swift/AppKit client, a Go backend, and the ELK Stack for observability. PostCSS entered the picture for a small, but necessary, internal admin tool to manage client certificate lifecycles. This is the log of how we built it, the code that made it work, and the pitfalls discovered along the way.

Step 1: Architecting the mTLS Foundation in Go

Before any observability could be implemented, the communication channel had to be secured. mTLS requires a Certificate Authority (CA) to issue and sign certificates for both the server and all legitimate clients. For a production environment, you would use a proper internal CA or a service like HashiCorp Vault. For this build log, we’ll demonstrate the process using Go’s own crypto libraries to generate the necessary artifacts. This is not just a setup step; understanding the certificate generation process is critical for debugging handshake failures.

First, we need a small utility to generate our CA, server certificate, and a client certificate.

// file: cert_gen/main.go
// A command-line utility for generating CA, server, and client certs for mTLS.
// In a real project, this logic would be part of a larger PKI management system.
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"log"
	"math/big"
	"net"
	"os"
	"time"
)

func main() {
	// Generate CA
	caCert, caKey := generateCA()
	writeCert("ca.crt", caCert)
	writeKey("ca.key", caKey)

	// Generate Server Certificate, signed by our CA
	serverCert, serverKey := generateServerCert(caCert, caKey)
	writeCert("server.crt", serverCert)
	writeKey("server.key", serverKey)

	// Generate Client Certificate, signed by our CA
	clientCert, clientKey := generateClientCert(caCert, caKey)
	writeCert("client.crt", clientCert)
	writeKey("client.key", clientKey)

	log.Println("Successfully generated CA, server, and client certificates.")
}

func generateCA() (*x509.Certificate, *rsa.PrivateKey) {
	template := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "My Corp CA"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().AddDate(10, 0, 0),
		IsCA:                  true,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		BasicConstraintsValid: true,
	}

	privKey, err := rsa.GenerateKey(rand.Reader, 4096)
	if err != nil {
		log.Fatalf("Failed to generate CA private key: %v", err)
	}

	certBytes, err := x509.CreateCertificate(rand.Reader, template, template, &privKey.PublicKey, privKey)
	if err != nil {
		log.Fatalf("Failed to create CA certificate: %v", err)
	}

	caCert, _ := x509.ParseCertificate(certBytes)
	return caCert, privKey
}

func generateServerCert(caCert *x509.Certificate, caKey *rsa.PrivateKey) (*x509.Certificate, *rsa.PrivateKey) {
	template := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "localhost"},
		IPAddresses:  []net.IP{net.IPv4(127, 0, 0, 1), net.IPv6loopback},
		DNSNames:     []string{"localhost"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().AddDate(1, 0, 0),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}

	privKey, err := rsa.GenerateKey(rand.Reader, 4096)
	if err != nil {
		log.Fatalf("Failed to generate server private key: %v", err)
	}

	certBytes, err := x509.CreateCertificate(rand.Reader, template, caCert, &privKey.PublicKey, caKey)
	if err != nil {
		log.Fatalf("Failed to create server certificate: %v", err)
	}

	serverCert, _ := x509.ParseCertificate(certBytes)
	return serverCert, privKey
}

func generateClientCert(caCert *x509.Certificate, caKey *rsa.PrivateKey) (*x509.Certificate, *rsa.PrivateKey) {
    // The Common Name for a client certificate often identifies the user or device.
	template := &x509.Certificate{
		SerialNumber: big.NewInt(3),
		Subject:      pkix.Name{CommonName: "client-device-001"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().AddDate(1, 0, 0),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}
	
	privKey, err := rsa.GenerateKey(rand.Reader, 4096)
	if err != nil {
		log.Fatalf("Failed to generate client private key: %v", err)
	}

	certBytes, err := x509.CreateCertificate(rand.Reader, template, caCert, &privKey.PublicKey, caKey)
	if err != nil {
		log.Fatalf("Failed to create client certificate: %v", err)
	}
	
	clientCert, _ := x509.ParseCertificate(certBytes)
	return clientCert, privKey
}

func writeCert(filename string, cert *x509.Certificate) {
	file, err := os.Create(filename)
	if err != nil {
		log.Fatalf("Failed to create file %s: %v", filename, err)
	}
	defer file.Close()
	pem.Encode(file, &pem.Block{Type: "CERTIFICATE", Bytes: cert.Raw})
}

func writeKey(filename string, key *rsa.PrivateKey) {
	// Private keys are written with restrictive (0600) permissions.
	file, err := os.OpenFile(filename, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0600)
	if err != nil {
		log.Fatalf("Failed to create file %s: %v", filename, err)
	}
	defer file.Close()
	pem.Encode(file, &pem.Block{Type: "RSA PRIVATE KEY", Bytes: x509.MarshalPKCS1PrivateKey(key)})
}

With the certificates generated, we configured the Go server. The key is the tls.Config struct. We must load our CA’s certificate to create a client certificate pool. Then, we set ClientAuth to tls.RequireAndVerifyClientCert. This is a non-negotiable setting in a zero-trust model; it enforces that every connecting client must present a certificate, and that certificate must be valid and signed by our trusted CA.

// file: server/main.go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/json"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"time"

	"github.com/google/uuid"
	"github.com/rs/zerolog"
)

// LogEntry documents the structured log shape we ship to ELK.
// (Shown for reference; the middleware below emits these fields via zerolog
// rather than marshalling this struct directly.)
type LogEntry struct {
	Level       string    `json:"level"`
	Timestamp   time.Time `json:"@timestamp"`
	Message     string    `json:"message"`
	TraceID     string    `json:"trace.id,omitempty"`
	SpanID      string    `json:"span.id,omitempty"`
	ClientCN    string    `json:"client.cn,omitempty"`
	RemoteAddr  string    `json:"remote.addr,omitempty"`
	UserAgent   string    `json:"user.agent,omitempty"`
	HTTPMethod  string    `json:"http.method,omitempty"`
	HTTPPath    string    `json:"http.path,omitempty"`
	HTTPStatus  int       `json:"http.status,omitempty"`
	DurationMs  int64     `json:"duration.ms,omitempty"`
}


func main() {
	// Using zerolog for structured JSON logging, which is ideal for Logstash.
	// zerolog's default timestamp field is "time"; rename it so the emitted
	// field matches the "@timestamp" our Logstash pipeline and index template expect.
	zerolog.TimestampFieldName = "@timestamp"
	logger := zerolog.New(os.Stdout).With().Timestamp().Logger()

	caCert, err := os.ReadFile("../certs/ca.crt")
	if err != nil {
		logger.Fatal().Err(err).Msg("Failed to read CA certificate")
	}
	caCertPool := x509.NewCertPool()
	caCertPool.AppendCertsFromPEM(caCert)

	tlsConfig := &tls.Config{
		ClientCAs:  caCertPool,
		// This is the critical setting for mTLS.
		// It requires the client to present a certificate and verifies it against our CA pool.
		ClientAuth: tls.RequireAndVerifyClientCert,
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/status", statusHandler)
	
	// Middleware to inject logger and handle tracing headers
	loggingMiddleware := func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			startTime := time.Now()

			// Extract trace context from headers (sent by the Swift client)
			traceID := r.Header.Get("X-Trace-Id")
			if traceID == "" {
				traceID = uuid.New().String() // Generate if not present
			}
			
			// Extract client identity from the verified certificate
			var clientCN string
			if r.TLS != nil && len(r.TLS.PeerCertificates) > 0 {
				clientCN = r.TLS.PeerCertificates[0].Subject.CommonName
			}
			
			// A real OpenTelemetry integration would be more complex, but this captures the essence.
			// We generate a "span" for this request.
			spanID := uuid.New().String()[0:16]

			logCtx := logger.With().
				Str("trace.id", traceID).
				Str("span.id", spanID).
				Str("client.cn", clientCN).
				Str("remote.addr", r.RemoteAddr).
				Str("user.agent", r.UserAgent()).
				Str("http.method", r.Method).
				Str("http.path", r.URL.Path).
				Logger()
			
			logCtx.Info().Msg("Request started")
			
			// In a real app, you would inject the logger/context into the handler.
			// For simplicity, we just log start and end here.
			
			next.ServeHTTP(w, r)
			
			// This is a simplified stand-in for a real response writer wrapper
			// that would capture the status code. For this example, we assume 200.
			statusCode := http.StatusOK 
			
			logCtx.Info().
				Int("http.status", statusCode).
				Int64("duration.ms", time.Since(startTime).Milliseconds()).
				Msg("Request finished")
		})
	}
	
	server := &http.Server{
		Addr:         ":8443",
		Handler:      loggingMiddleware(mux),
		TLSConfig:    tlsConfig,
		ReadTimeout:  5 * time.Second,
		WriteTimeout: 10 * time.Second,
	}
	
	logger.Info().Msg("Starting mTLS server on :8443")
	// We pass the server cert and key to ListenAndServeTLS.
	// The tls.Config handles the client-side verification.
	err = server.ListenAndServeTLS("../certs/server.crt", "../certs/server.key")
	if err != nil {
		logger.Fatal().Err(err).Msg("Server failed to start")
	}
}

func statusHandler(w http.ResponseWriter, r *http.Request) {
	// A simple handler that returns some status.
	// In a real app, this would perform business logic.
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	response := map[string]string{"status": "ok"}
	json.NewEncoder(w).Encode(response)
}

A common mistake here is setting ClientAuth to tls.RequestClientCert or tls.VerifyClientCertIfGiven. These are weaker and do not enforce the zero-trust principle. RequireAndVerifyClientCert is the only correct option for this architecture. The log message now includes client.cn, directly linking a request to a cryptographic identity.
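
Before moving on to the Swift client, it is worth sanity-checking the server with curl, which can present a client certificate directly. The paths below assume the generated files sit in a certs/ directory relative to where you run the commands:

# With a valid client certificate, the request succeeds:
curl --cacert certs/ca.crt --cert certs/client.crt --key certs/client.key \
  https://localhost:8443/status
# {"status":"ok"}

# Without a client certificate, the server rejects the TLS handshake:
curl --cacert certs/ca.crt https://localhost:8443/status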

Step 2: The Swift Client - Navigating mTLS and Context Propagation

Getting the Swift client to correctly handle mTLS was more involved than the server. We needed to package the client certificate and its private key into a PKCS#12 (.p12) file, which is a common format for bundling these assets.

# Convert the client cert and key into a .p12 file for use in the Swift client
openssl pkcs12 -export -out client.p12 -inkey client.key -in client.crt -certfile ca.crt -name "Client Identity"
# You'll be prompted for an export password.

This .p12 file is then embedded into the macOS application’s bundle. The core of the Swift implementation lies in conforming to the URLSessionDelegate and implementing the urlSession(_:didReceive:completionHandler:) method. This is where the OS asks our application how to handle the authentication challenge from the server.

// file: mTLSClient/NetworkService.swift
import Foundation

class NetworkService: NSObject, URLSessionDelegate {

    private lazy var urlSession: URLSession = {
        return URLSession(configuration: .default, delegate: self, delegateQueue: nil)
    }()

    func makeRequest() {
        // This is a manual implementation of trace context propagation.
        // A real app should use the OpenTelemetry Swift SDK.
        let traceID = UUID().uuidString
        let spanID = String(UUID().uuidString.prefix(16))

        // Log the start of the operation locally
        log("Request starting", traceID: traceID, spanID: spanID)

        guard let url = URL(string: "https://localhost:8443/status") else {
            log("Invalid URL", traceID: traceID, spanID: spanID, level: "error")
            return
        }

        var request = URLRequest(url: url)
        request.httpMethod = "GET"
        // Inject the trace context into the request headers. This is the crucial link.
        request.setValue(traceID, forHTTPHeaderField: "X-Trace-Id")
        request.setValue(spanID, forHTTPHeaderField: "X-Span-Id")

        let task = urlSession.dataTask(with: request) { data, response, error in
            if let error = error {
                self.log("Request failed: \(error.localizedDescription)", traceID: traceID, spanID: spanID, level: "error")
                return
            }
            guard let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 else {
                self.log("Received non-200 response", traceID: traceID, spanID: spanID, level: "warn")
                return
            }
            self.log("Request successful", traceID: traceID, spanID: spanID)
        }
        task.resume()
    }
    
    // This delegate method is the core of client-side mTLS in Swift.
    func urlSession(_ session: URLSession, didReceive challenge: URLAuthenticationChallenge, completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
        
        // We only handle the client certificate challenge here; any other challenge
        // (including server trust) falls through to the system's default handling.
        guard challenge.protectionSpace.authenticationMethod == NSURLAuthenticationMethodClientCertificate else {
            completionHandler(.performDefaultHandling, nil)
            return
        }

        // Load our PKCS#12 identity file from the app bundle.
        guard let identity = loadClientIdentity() else {
            log("Failed to load client identity from .p12 file.", level: "error")
            completionHandler(.cancelAuthenticationChallenge, nil)
            return
        }

        // Create a URLCredential object with our identity.
        let credential = URLCredential(identity: identity, certificates: nil, persistence: .forSession)
        
        // Provide the credential to the completion handler.
        completionHandler(.useCredential, credential)
    }

    private func loadClientIdentity() -> SecIdentity? {
        guard let p12Path = Bundle.main.path(forResource: "client", ofType: "p12") else {
            print("Error: client.p12 not found in bundle.")
            return nil
        }
        
        guard let p12Data = try? Data(contentsOf: URL(fileURLWithPath: p12Path)) else {
            print("Error: failed to read client.p12 from bundle.")
            return nil
        }
        
        // The password used when exporting with `openssl pkcs12`.
        // In a real app, this MUST be stored securely, e.g., in the Keychain.
        let importOptions = [kSecImportExportPassphrase as String: "your_p12_password"]
        
        var items: CFArray?
        let status = SecPKCS12Import(p12Data as CFData, importOptions as CFDictionary, &items)
        
        guard status == errSecSuccess, let unwrappedItems = items else {
            print("Error importing .p12 file. Status: \(status)")
            return nil
        }

        let firstItem = (unwrappedItems as! Array<Dictionary<String, Any>>).first
        return firstItem?[kSecImportItemIdentity as String] as? SecIdentity
    }

    // A simple structured logger that outputs JSON to the console.
    private func log(_ message: String, traceID: String? = nil, spanID: String? = nil, level: String = "info") {
        var logDict: [String: Any] = [
            "@timestamp": ISO8601DateFormatter().string(from: Date()),
            "level": level,
            "message": message,
            "service.name": "swift-client"
        ]
        if let traceID = traceID { logDict["trace.id"] = traceID }
        if let spanID = spanID { logDict["span.id"] = spanID }

        if let jsonData = try? JSONSerialization.data(withJSONObject: logDict, options: .prettyPrinted),
           let jsonString = String(data: jsonData, encoding: .utf8) {
            print(jsonString)
        }
    }
}

The pitfall here is credential storage. Hardcoding the .p12 password as I have in the example is unacceptable for production; it must be stored in the macOS Keychain and retrieved at runtime (a sketch of that lookup follows below). Another common error is mishandling the URLAuthenticationChallenge: if the logic doesn’t correctly identify the NSURLAuthenticationMethodClientCertificate method, the connection fails with an obscure TLS error. Finally, because the server certificate is signed by our private CA, the OS will not trust it by default; the ca.crt either has to be installed in the trust store, or the server-trust challenge must be evaluated explicitly in the same delegate method.
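
As a pointer in that direction, here is a minimal sketch of what the Keychain lookup could look like. The KeychainHelper type and the service/account identifiers are hypothetical placeholders, and a real implementation would also handle provisioning the secret and surfacing errors:

// file: mTLSClient/KeychainHelper.swift
// A minimal sketch of reading the .p12 passphrase from the macOS Keychain.
import Foundation
import Security

enum KeychainHelper {
    static func loadP12Passphrase() -> String? {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: "com.example.mtls-client", // hypothetical service name
            kSecAttrAccount as String: "p12-passphrase",          // hypothetical account name
            kSecReturnData as String: true,
            kSecMatchLimit as String: kSecMatchLimitOne
        ]
        var result: CFTypeRef?
        let status = SecItemCopyMatching(query as CFDictionary, &result)
        guard status == errSecSuccess, let data = result as? Data else {
            return nil
        }
        return String(data: data, encoding: .utf8)
    }
}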

Step 3: Tying It All Together with the ELK Stack

With structured JSON logs flowing from both the client and server, the final piece was configuring our ELK stack. The goal is to parse these logs and make the trace.id a first-class, searchable field. This is done in the Logstash pipeline configuration.

# file: logstash/pipeline/app-pipeline.conf

input {
  # In a production setup, you would use Beats (Filebeat) to ship logs,
  # or a TCP input for direct logging from services.
  # For this example, we assume logs are being fed into this pipeline.
  # Let's use a simple TCP input for the Go service.
  tcp {
    port => 5044
    codec => json_lines
    type => "go_app_log"
  }
  
  # A separate input might exist for client logs if they were being shipped.
  # tcp { port => 5045 codec => json_lines type => "swift_client_log" }
}

filter {
  # The JSON codec already parses the log line.
  # We just need to ensure core fields are processed correctly.
  
  # The @timestamp field is automatically parsed from our JSON logs.
  # If it weren't, we would use the date filter:
  # date {
  #   match => [ "timestamp", "ISO8601" ]
  # }

  # Mutate filters can be used to clean up or rename fields if needed.
  # For example, if we wanted to standardize on ECS (Elastic Common Schema).
  mutate {
    rename => { "trace.id" => "[trace][id]" }
    rename => { "span.id" => "[span][id]" }
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
    # A proper index template should be used to define mappings.
  }
  # For debugging the pipeline itself:
  # stdout { codec => rubydebug }
}
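
For local testing we fed the Go service’s stdout straight into that TCP input with a plain pipe. This is a quick hack (it assumes Logstash is reachable on localhost:5044), not how logs would be shipped in production, where Filebeat or a similar shipper takes over:

# Quick local wiring: newline-delimited JSON from the server into the Logstash TCP input
./server 2>&1 | nc localhost 5044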

The most critical part for performance and usability in Elasticsearch is the index mapping. Left to dynamic mapping, Elasticsearch indexes trace.id as an analyzed, tokenized text field (with an auto-generated keyword sub-field), which is wasteful and awkward for exact-match lookups. We want it mapped explicitly as a keyword field.

// Elasticsearch Index Template
PUT _index_template/app_logs_template
{
  "index_patterns": ["app-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message": { "type": "text" },
        "level": { "type": "keyword" },
        "service.name": { "type": "keyword" },
        "client.cn": { "type": "keyword" },
        "http.status": { "type": "long" },
        "duration.ms": { "type": "long" },
        "trace": {
          "properties": {
            "id": { "type": "keyword" }
          }
        },
        "span": {
          "properties": {
            "id": { "type": "keyword" }
          }
        }
      }
    }
  }
}

With this in place, a simple query in Kibana’s Discover tab for trace.id: "some-uuid-string" instantly returns all log entries for that transaction, chronologically ordered, from both the Swift client and the Go server. The full story of the request is finally visible.
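
For reference, the same lookup via the Elasticsearch API is a simple term query against the keyword field (the trace ID below is a placeholder):

// Example lookup: every log entry for one trace, oldest first
GET app-logs-*/_search
{
  "query": {
    "term": { "trace.id": "some-uuid-string" }
  },
  "sort": [
    { "@timestamp": "asc" }
  ]
}

The end-to-end flow, from user action to indexed log entries, looks like this: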

sequenceDiagram
    participant SwiftClient as Swift Client
    participant GoServer as Go Backend
    participant ELK as ELK Stack

    SwiftClient->>SwiftClient: User action initiates request (traceID=abc)
    SwiftClient->>SwiftClient: log("Request starting", traceID=abc)
    Note right of SwiftClient: Log shipped to ELK
    SwiftClient->>ELK: {"message": "Request starting", "trace.id": "abc"}

    SwiftClient->>GoServer: HTTPS GET /status (Header: X-Trace-Id: abc)
    Note over SwiftClient,GoServer: mTLS Handshake (Client cert verified)

    GoServer->>GoServer: log("Request started", traceID=abc, client.cn=...)
    Note left of GoServer: Log shipped to ELK
    GoServer->>ELK: {"message": "Request started", "trace.id": "abc"}
    
    GoServer-->>GoServer: Process request
    
    GoServer->>GoServer: log("Request finished", traceID=abc, status=200)
    Note left of GoServer: Log shipped to ELK
    GoServer->>ELK: {"message": "Request finished", "trace.id": "abc"}
    
    GoServer-->>SwiftClient: 200 OK Response
    
    SwiftClient->>SwiftClient: log("Request successful", traceID=abc)
    Note right of SwiftClient: Log shipped to ELK
    SwiftClient->>ELK: {"message": "Request successful", "trace.id": "abc"}

The Ancillary Piece: An Admin UI with PostCSS

We needed a small internal web page to view the status of issued client certificates and potentially revoke them. A full-blown React/Vue frontend was overkill. We opted for a simple Go template served by a separate admin port on our backend service. However, writing vanilla CSS is tedious. This is where PostCSS provided a lightweight solution to use modern CSS features like nesting and auto-prefixing without a complex build system.

The setup was minimal:

package.json:

{
  "name": "admin-ui",
  "version": "1.0.0",
  "scripts": {
    "build:css": "postcss src/styles.css -o static/bundle.css"
  },
  "devDependencies": {
    "autoprefixer": "^10.4.16",
    "cssnano": "^6.0.1",
    "postcss": "^8.4.31",
    "postcss-cli": "^10.1.0",
    "postcss-nesting": "^12.0.1"
  }
}

postcss.config.js:

module.exports = {
  plugins: {
    'postcss-nesting': {},
    'autoprefixer': {},
    'cssnano': { preset: 'default' },
  },
};

An example source CSS file (src/styles.css):

body {
  font-family: system-ui, sans-serif;
  background-color: #f0f2f5;

  .container {
    max-width: 960px;
    margin: 2rem auto;
    padding: 1.5rem;
    background: white;
    box-shadow: 0 2px 4px rgba(0,0,0,0.1);

    h1 {
      color: #333;
      border-bottom: 1px solid #eee;
    }
  }
}

Running npm run build:css transforms this into browser-compatible, minified CSS. This approach kept the admin UI tooling separate and lightweight, fitting the pragmatic engineering goal of using the right tool for the job. It was a small but important part of the overall solution, demonstrating that even ancillary components benefit from a modern, maintainable workflow.
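
For context, the serving side of the admin tool is nothing more than Go’s standard library. The sketch below uses hypothetical paths (static/, templates/admin.html) and an assumed internal port, not the exact layout of our service:

// file: admin/main.go
// A rough sketch of the internal admin UI server that serves the PostCSS output.
package main

import (
	"html/template"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()

	// Serve the compiled bundle.css and any other static assets.
	mux.Handle("/static/", http.StripPrefix("/static/", http.FileServer(http.Dir("static"))))

	// Render the certificate-overview page from a Go template.
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		tmpl := template.Must(template.ParseFiles("templates/admin.html"))
		// In the real tool this data comes from the PKI store; a stub is used here.
		data := map[string]interface{}{"Certificates": []string{"client-device-001"}}
		if err := tmpl.Execute(w, data); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})

	log.Println("Admin UI listening on :9443 (internal only)")
	log.Fatal(http.ListenAndServe(":9443", mux))
}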

Lingering Issues and Future Work

This solution dramatically improved our debugging capabilities and security posture, but it’s not a final state. The most significant operational burden is certificate management. The manual generation process described here is untenable at scale. The immediate next step is to build an automated PKI system using Vault or SPIFFE/SPIRE to handle certificate issuance, renewal, and revocation. This would allow us to programmatically provision new clients and automatically rotate credentials, which is a cornerstone of a truly dynamic zero-trust environment.

Furthermore, shipping logs directly from every client is not always feasible due to network constraints and cost. We are investigating a client-side logging agent that can batch, compress, and intelligently sample logs before sending them to a dedicated intake gateway. This would reduce the load on our central ELK cluster and give us more control over the volume of data ingested from the client fleet. Finally, while logs and traces provide deep context, high-level service health monitoring requires metrics. The Go backend is the next candidate for adding a Prometheus exporter to track key SLIs like request latency, error rates, and the rate of successful vs. failed mTLS handshakes.
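
As a rough sketch of that last item, assuming the standard github.com/prometheus/client_golang library (metric names here are illustrative, not final):

// file: server/metrics.go (planned)
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// Request latency by path and status, for the latency and error-rate SLIs.
	requestDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name: "http_request_duration_seconds",
			Help: "HTTP request latency.",
		},
		[]string{"path", "status"},
	)

	// Count of rejected or failed mTLS handshakes.
	handshakeFailures = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "mtls_handshake_failures_total",
			Help: "Number of failed mTLS handshakes.",
		},
	)
)

func init() {
	prometheus.MustRegister(requestDuration, handshakeFailures)
}

// metricsHandler would be mounted on a separate internal, non-mTLS port.
func metricsHandler() http.Handler {
	return promhttp.Handler()
}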

