Debugging intermittent performance degradation in our React Native application was a recurring nightmare. A user reports a slow screen transition after tapping a button, but the backend metrics—CPU, memory, database query times—all look nominal. The critical gap was correlation. We had client-side logs and backend traces, but no definitive way to link a specific user interaction within the mobile app to the cascade of microservice calls it triggered. The entire flow, from the user’s tap to the final database commit, was a frustrating black box. The root of the problem was that our observability story began at the edge of our infrastructure, not at the true origin of the request: the client itself.
Our initial concept was to invert the traditional tracing model. Instead of the first-hop service generating a trace ID, the React Native client would become the originator of the trace context. Every significant, user-initiated business process, encapsulated as a MobX action, would create the root span of a new distributed trace. This context, containing a unique `traceId` and a root `spanId`, would then be systematically injected into the headers of every subsequent API call made during that action’s execution. This approach would allow us to create a single, unified trace that visualizes the entire lifecycle of a user interaction, from the mobile client’s state change through the API Gateway and across all downstream services.
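Concretely, the injected context boils down to three request headers. A minimal sketch of what each traced request would carry (the `TraceContext` interface here is a stand-in for the store type defined later):

```typescript
// Sketch: the B3 headers attached to every API call during a traced action.
// TraceContext mirrors the shape the trace store will hold.
interface TraceContext {
  traceId: string; // 32 lowercase hex chars
  spanId: string;  // 16 lowercase hex chars
}

function b3Headers(ctx: TraceContext): Record<string, string> {
  return {
    'X-B3-TraceId': ctx.traceId,
    'X-B3-SpanId': ctx.spanId,
    'X-B3-Sampled': '1', // ask the backend to record this trace
  };
}
```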
The technology selection was critical for this to work without adding significant overhead.
- MobX: Its architecture is uniquely suited for this task. The concept of `actions` provides a clean, centralized point of interception. We can wrap or monitor every action to initiate and tear down a trace context, effectively binding the trace’s lifecycle to the business logic’s lifecycle.
- React Native: As the client platform, the main challenge was implementing a reliable, non-intrusive mechanism to inject headers into every outgoing network request, regardless of which component or service initiated it.
- Envoy Proxy: Using Envoy as our API Gateway was the linchpin. Its first-class, built-in support for distributed tracing is far superior to implementing custom logic in each backend service. Envoy can intelligently parse incoming trace headers (like B3 or W3C Trace Context), participate in the trace by creating its own spans, and propagate the context correctly to upstream services. This keeps the tracing concern out of the application logic of the microservices.
Phase 1: Instrumenting the Client-Side Trace Origin
The first step is to establish a mechanism on the client to generate and manage the trace context. We need a way to know when a user-initiated action begins and ends. A dedicated MobX store is the cleanest way to manage this state globally.
The TraceStore
This store will hold the active trace and span IDs. It’s intentionally simple, designed to hold the context for the currently executing root action.
// src/stores/TraceStore.ts
import { makeAutoObservable, runInAction } from 'mobx';
import 'react-native-get-random-values';
import { v4 as uuidv4 } from 'uuid';
// Generates a 16-character hex string for B3-style IDs
const generateTraceIdPart = (): string => {
const buffer = new Uint8Array(8);
crypto.getRandomValues(buffer);
return Array.from(buffer)
.map((b) => b.toString(16).padStart(2, '0'))
.join('');
};
interface TraceContext {
traceId: string;
spanId: string;
}
class TraceStore {
activeTraceContext: TraceContext | null = null;
constructor() {
makeAutoObservable(this);
}
/**
* Starts a new trace, generating new Trace and Span IDs.
* This should be called at the beginning of a root MobX action.
* @param {string} actionName - The name of the action for context.
*/
startTrace(actionName: string): TraceContext {
const newContext = {
traceId: generateTraceIdPart() + generateTraceIdPart(), // 32-char hex
spanId: generateTraceIdPart(), // 16-char hex
};
console.log(`[TraceStore] Starting trace for action "${actionName}": ${newContext.traceId}`);
runInAction(() => {
this.activeTraceContext = newContext;
});
return newContext;
}
/**
* Clears the active trace context.
* This should be called at the end of a root MobX action.
*/
endTrace() {
if (this.activeTraceContext) {
console.log(`[TraceStore] Ending trace: ${this.activeTraceContext.traceId}`);
runInAction(() => {
this.activeTraceContext = null;
});
}
}
get currentContext(): TraceContext | null {
return this.activeTraceContext;
}
}
export const traceStore = new TraceStore();
A key decision here is the format of the trace IDs. We’re generating B3-compatible hex-encoded IDs (`traceId` is 32 hex chars, `spanId` is 16) because the format is widely supported, especially by Envoy and Zipkin.
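For unit tests running under plain Node (where the `react-native-get-random-values` polyfill isn’t loaded), the same ID shapes can be produced with Node’s `crypto` module. This is a test-side sketch, not part of the app code:

```typescript
import { randomBytes } from 'crypto';

// Each part is 8 random bytes rendered as 16 lowercase hex chars.
const idPart = (): string => randomBytes(8).toString('hex');

// B3-compatible IDs: 32-hex traceId, 16-hex spanId.
const makeB3Ids = () => ({
  traceId: idPart() + idPart(),
  spanId: idPart(),
});
```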
Intercepting MobX Actions
Now we need to automatically call `startTrace` and `endTrace`. We can create a higher-order function (or a decorator) that wraps our MobX store actions. This keeps the tracing logic separate from the business logic.
// src/utils/traceAction.ts
import { traceStore } from '../stores/TraceStore';
/**
* A higher-order function that wraps a MobX action to provide
* automatic start and end tracing.
*
* In a real-world project, this would be more robust, potentially using
* MobX's `onAction` for a more global interception approach, but this
* explicit wrapper is clearer for demonstration.
*
* @param {string} actionName - A descriptive name for the action, used for logging.
* @param {Function} fn - The asynchronous action function to be wrapped.
* @returns {Function} The wrapped function.
*/
export function traceAction<T extends any[], R>(
actionName: string,
fn: (...args: T) => Promise<R>
): (...args: T) => Promise<R> {
return async function (...args: T): Promise<R> {
// Avoid nested traces. If a trace is already active, run inside it.
if (traceStore.currentContext) {
console.warn(`[traceAction] Action "${actionName}" called within an existing trace. Reusing context.`);
return await fn(...args);
}
traceStore.startTrace(actionName);
try {
// The core business logic is executed here
const result = await fn(...args);
return result;
} catch (error) {
// Ensure the trace context is cleared even if the action fails.
console.error(`[traceAction] Error in action "${actionName}":`, error);
throw error;
} finally {
traceStore.endTrace();
}
};
}
// Example usage in a MobX store
class UserStore {
// ... other properties
constructor() {
// ...
this.fetchUserProfile = traceAction('fetchUserProfile', this.fetchUserProfile.bind(this));
}
async fetchUserProfile(userId: string) {
// ... actual implementation to fetch data
}
}
The `traceAction` wrapper is the core of our client-side instrumentation. It ensures that a trace context exists only for the duration of the top-level asynchronous operation initiated by the user. The check for `traceStore.currentContext` prevents nested actions from overwriting the root trace ID.
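The lifecycle guarantees can be illustrated with a dependency-free sketch, in which a plain object stands in for the MobX store (names here are illustrative):

```typescript
// Dependency-free sketch of the traceAction lifecycle guarantees.
type Ctx = { traceId: string; spanId: string };

const store = {
  active: null as Ctx | null,
  start() { this.active = { traceId: 'a'.repeat(32), spanId: 'b'.repeat(16) }; },
  end() { this.active = null; },
};

function traceAction<T extends unknown[], R>(
  name: string,
  fn: (...args: T) => Promise<R>
): (...args: T) => Promise<R> {
  return async (...args: T) => {
    if (store.active) return fn(...args); // nested call: reuse the outer context
    store.start();
    try {
      return await fn(...args);
    } finally {
      store.end(); // always cleared, even if fn throws
    }
  };
}
```

Running a wrapped function shows the context is present during execution and cleared afterwards, including when the wrapped function rejects.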
Phase 2: Propagating Context via Network Requests
With the trace context managed, we now need to inject it into every outgoing API request. An `axios` interceptor is the perfect tool for this: a single point of configuration that affects all requests made with that `axios` instance.
// src/api/apiClient.ts
import axios from 'axios';
import { traceStore } from '../stores/TraceStore';
const apiClient = axios.create({
baseURL: 'http://<your-envoy-gateway-ip>:8080', // Point to the Envoy Gateway
timeout: 10000,
});
// Request Interceptor to add Trace Headers
apiClient.interceptors.request.use(
(config) => {
const context = traceStore.currentContext;
if (context) {
// B3 Propagation Headers
// These headers will be picked up by Envoy to continue the trace.
config.headers['X-B3-TraceId'] = context.traceId;
config.headers['X-B3-SpanId'] = context.spanId;
config.headers['X-B3-Sampled'] = '1'; // Signal that this trace should be recorded
console.log(`[APIClient] Injecting trace context: ${context.traceId}`);
} else {
// In a production app, you might want to generate a request ID even
// for requests not part of a user-initiated trace.
// For now, we log a warning.
console.warn(`[APIClient] Making a request outside of a trace context for URL: ${config.url}`);
}
return config;
},
(error) => {
// This part handles errors that occur when setting up the request
console.error('[APIClient] Request Interceptor Error:', error);
return Promise.reject(error);
}
);
// Response Interceptor for logging (optional, but good for debugging)
apiClient.interceptors.response.use(
(response) => {
return response;
},
(error) => {
// Log detailed error information
if (error.response) {
console.error(
`[APIClient] Response Error: ${error.response.status}`,
error.response.data
);
} else if (error.request) {
console.error('[APIClient] No response received:', error.request);
} else {
console.error('[APIClient] Error setting up request:', error.message);
}
return Promise.reject(error);
}
);
export default apiClient;
Now, any code in the React Native application that imports and uses `apiClient` will automatically have the B3 trace headers attached, provided the request is made from within a function wrapped by `traceAction`.
Phase 3: Configuring Envoy Proxy as the Tracing-Aware API Gateway
This is where the client-side work connects to the infrastructure. Envoy will act as the API Gateway, receiving requests from the mobile app, participating in the trace, and forwarding the request with the correct headers to the appropriate backend service.
Here is a complete `envoy.yaml` configuration suitable for a local or staging setup. It sets up a listener on port `8080`, routes traffic to two different backend services (`user_service` and `order_service`), and configures Zipkin tracing.
# envoy.yaml
static_resources:
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 8080
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: ingress_http
# This tells Envoy to generate a request ID if one isn't present.
# More importantly, it respects incoming x-request-id headers.
generate_request_id: true
# Tracing Configuration
tracing:
provider:
name: envoy.tracers.zipkin
typed_config:
"@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
collector_cluster: zipkin_cluster
collector_endpoint: "/api/v2/spans"
collector_endpoint_version: HTTP_JSON
# This is critical. It ensures Envoy joins the trace initiated by the client.
shared_span_context: false
# We can add custom tags to our spans for better context
custom_tags:
- tag: "envoy.node.id"
literal:
value: "api-gateway-node"
- tag: "guid:x-request-id"
request_header:
name: "x-request-id"
default_value: "unknown"
route_config:
name: local_route
virtual_hosts:
- name: local_service
domains: ["*"]
routes:
- match:
prefix: "/users"
route:
cluster: user_service
# Automatically retry on 5xx errors
retry_policy:
retry_on: 5xx
num_retries: 3
per_try_timeout: 2s
- match:
prefix: "/orders"
route:
cluster: order_service
http_filters:
- name: envoy.filters.http.router
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
clusters:
- name: user_service
connect_timeout: 5s
type: LOGICAL_DNS
# For Docker Compose, 'userservice' is the name of the service container.
load_assignment:
cluster_name: user_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: userservice
port_value: 3001
- name: order_service
connect_timeout: 5s
type: LOGICAL_DNS
load_assignment:
cluster_name: order_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: orderservice
port_value: 3002
- name: zipkin_cluster
connect_timeout: 1s
type: LOGICAL_DNS
lb_policy: ROUND_ROBIN
load_assignment:
cluster_name: zipkin_cluster
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: zipkin
port_value: 9411
The most important section is `tracing`:

- `provider`: We configure it to use the `zipkin` tracer.
- `collector_cluster`: This tells Envoy where to send the trace data (to our `zipkin_cluster` definition).
- `shared_span_context: false`: This is a subtle but crucial setting. When `false`, Envoy respects the incoming `X-B3-SpanId` as the parent of the span it creates. This ensures the Envoy span is correctly nested under the client’s root span in the trace hierarchy. If it were `true`, Envoy would reuse the incoming span ID, breaking the parent-child relationship.
To run this, you would typically use Docker Compose to orchestrate Envoy, Zipkin, and the backend services.
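A sketch of that Compose file follows. The service names must match the addresses referenced in `envoy.yaml` (`userservice`, `orderservice`, `zipkin`); the image tags and build paths here are assumptions to adjust for your environment:

```yaml
# docker-compose.yaml (illustrative sketch, not a pinned production setup)
services:
  envoy:
    image: envoyproxy/envoy:v1.28-latest
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml:ro
    ports:
      - "8080:8080"
    depends_on: [userservice, orderservice, zipkin]
  userservice:
    build: ./userservice
    expose: ["3001"]
  orderservice:
    build: ./orderservice
    expose: ["3002"]
  zipkin:
    image: openzipkin/zipkin
    ports:
      - "9411:9411"  # Zipkin UI at http://localhost:9411
```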
sequenceDiagram
    participant RN as React Native App
    participant MobX as MobX Store
    participant Envoy as Envoy API Gateway
    participant UserSvc as User Service
    participant OrderSvc as Order Service
    participant Zipkin
    RN->>MobX: User taps button, calls traced action `loadDashboard()`
    MobX->>MobX: `traceStore.startTrace()` -> Generates TraceID: A, SpanID: B
    RN->>Envoy: GET /users/123 (Headers: X-B3-TraceId: A, X-B3-SpanId: B)
    Envoy->>Envoy: Receives request, creates new SpanID: C (Parent: B)
    Envoy->>Zipkin: Sends span data (TraceID: A, SpanID: C, ParentID: B)
    Envoy->>UserSvc: GET /users/123 (Headers: X-B3-TraceId: A, X-B3-SpanId: C)
    UserSvc->>UserSvc: Processing...
    UserSvc->>Envoy: GET /orders?userId=123 (Headers: X-B3-TraceId: A, X-B3-SpanId: C)
    Envoy->>Envoy: Receives request, creates new SpanID: D (Parent: C)
    Envoy->>Zipkin: Sends span data (TraceID: A, SpanID: D, ParentID: C)
    Envoy->>OrderSvc: GET /orders?userId=123 (Headers: X-B3-TraceId: A, X-B3-SpanId: D)
    OrderSvc-->>Envoy: 200 OK (Order Data)
    Envoy-->>UserSvc: 200 OK (Order Data)
    UserSvc-->>Envoy: 200 OK (User + Order Data)
    Envoy-->>RN: 200 OK
    RN->>MobX: `loadDashboard()` completes
    MobX->>MobX: `traceStore.endTrace()` -> Clears context
Phase 4: Propagating Headers in Backend Services
The final piece of the puzzle is ensuring that backend services continue to propagate the trace headers when they call each other. This is crucial for building a complete trace graph. Here are two simple Express.js services demonstrating this.
// userservice/index.js
const express = require('express');
const axios = require('axios');
const app = express();
const port = 3001;
const B3_HEADERS = [
'x-request-id',
'x-b3-traceid',
'x-b3-spanid',
'x-b3-parentspanid',
'x-b3-sampled',
'x-b3-flags',
];
app.get('/users/:id', async (req, res) => {
console.log(`[UserService] Received request for user ${req.params.id}`);
console.log('[UserService] Incoming headers:', req.headers);
// Propagate trace headers for the downstream call
const propagatedHeaders = {};
B3_HEADERS.forEach(header => {
if (req.headers[header]) {
propagatedHeaders[header] = req.headers[header];
}
});
try {
// This call goes back through Envoy to reach the order service
// In a real K8s setup, this would be `http://orderservice:3002`
// but routing through the gateway ensures tracing continues seamlessly.
const ordersResponse = await axios.get(`http://envoy:8080/orders?userId=${req.params.id}`, {
headers: propagatedHeaders,
});
res.json({
userId: req.params.id,
name: 'John Doe',
orders: ordersResponse.data,
});
} catch (error) {
console.error('[UserService] Failed to fetch orders:', error.message);
res.status(500).send('Error fetching orders');
}
});
app.listen(port, () => {
console.log(`User service listening on port ${port}`);
});
// orderservice/index.js
const express = require('express');
const app = express();
const port = 3002;
app.get('/orders', (req, res) => {
const userId = req.query.userId;
console.log(`[OrderService] Received request for orders for user ${userId}`);
console.log('[OrderService] Incoming headers:', req.headers);
// In a real app, this would query a database.
res.json([
{ orderId: 'abc-123', amount: 100 },
{ orderId: 'def-456', amount: 250 },
]);
});
app.listen(port, () => {
console.log(`Order service listening on port ${port}`);
});
The critical part is the `propagatedHeaders` logic in `userservice`. It extracts all relevant B3 headers from the incoming request and forwards them on its outgoing request to the `orderservice`. This disciplined propagation is what links the service-to-service calls together under the same trace.
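Since every service repeats this extraction, it can be factored into a small helper. A dependency-free sketch (the function name is ours, not part of the services above):

```typescript
// Sketch: pick the B3 headers out of an incoming header map so they
// can be forwarded verbatim on any outgoing service-to-service call.
const B3_HEADERS = [
  'x-request-id', 'x-b3-traceid', 'x-b3-spanid',
  'x-b3-parentspanid', 'x-b3-sampled', 'x-b3-flags',
];

function pickB3(
  incoming: Record<string, string | string[] | undefined>
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const h of B3_HEADERS) {
    const v = incoming[h];
    if (typeof v === 'string') out[h] = v; // ignore missing or multi-valued headers
  }
  return out;
}
```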
When this entire system is running, a single tap in the React Native app produces a comprehensive trace in Zipkin. The trace is rooted in the span ID the MobX action generated (the client doesn’t report its own spans to Zipkin, so it appears as the implicit parent of the Envoy span), followed by a child span from Envoy, which in turn has a child span from the `userservice`, which itself has a child span for the call to the `orderservice`. The black box is gone. We can now pinpoint latency at any stage of the process, whether it’s network latency between the client and the gateway, slow processing in a service, or a delay in a downstream dependency.
This architecture, however, is not without its limitations and areas for future refinement. The current client-side context management is simple and handles a single active action at a time. It does not gracefully handle concurrent, overlapping user actions, which could lead to context clashes. A more advanced implementation might explore a solution similar to AsyncLocalStorage for Node.js, but adapted for the React Native environment, to provide more robust context isolation. Furthermore, this system currently traces all initiated actions. For a high-traffic application, this would be prohibitively expensive. The next logical iteration would be to implement client-side sampling, allowing the mobile app to decide intelligently (e.g., based on a percentage or specific user IDs) which actions should generate a full trace, thus controlling the volume of data sent to the observability backend.
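That sampling decision could be as simple as a per-trace coin flip made when the trace starts; unsampled actions would still get IDs for log correlation but would send `X-B3-Sampled: 0`. A hypothetical sketch (the rate and function name are ours):

```typescript
// Hypothetical client-side sampling sketch: record roughly 10% of traces.
const SAMPLE_RATE = 0.1;

function shouldSample(rate: number = SAMPLE_RATE): boolean {
  return Math.random() < rate;
}

// In startTrace(), the decision would ride along with the context, and the
// axios interceptor would emit it as: X-B3-Sampled: sampled ? '1' : '0'
```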