Implementing End-to-End Distributed Tracing Across a Solid.js and Quarkus Application


The incident report was maddeningly vague: “The main dashboard’s data grid is sometimes slow to load after filtering.” Our backend team, running a suite of Quarkus microservices, saw nothing. P99 latencies were stable, CPU usage was nominal, and no errors were logged. Our frontend team, using Solid.js with Material-UI components, couldn’t replicate it consistently, and when they did, the browser’s profiler showed no long tasks or rendering bottlenecks. The problem existed in the void between the browser’s fetch call and the Quarkus controller entry point. We were flying blind, and the finger-pointing had begun. This is a classic observability gap. Isolated logs and metrics are insufficient; we needed a single, contiguous narrative for each user request.

The initial concept was to adopt distributed tracing. The goal was to generate a unique trace ID on the client the moment a user initiated an action, propagate that ID through every network call and service hop, and visualize the entire lifecycle in a tool like Jaeger. This would expose the hidden latency, whether it was in network transit, an intermediate proxy, or the initial request processing on the server.

Our stack presented a unique challenge. Quarkus has first-class support for OpenTelemetry, making the backend instrumentation straightforward. The real difficulty lay with Solid.js. Unlike React, it lacks a mature ecosystem of official OpenTelemetry instrumentation libraries. This meant we’d be responsible for manually bootstrapping the tracer, instrumenting user interactions, and, most critically, ensuring the trace context was correctly propagated via HTTP headers. We chose OpenTelemetry as our standard to avoid vendor lock-in and leverage its growing ecosystem, particularly the OTel Collector for flexible data processing and routing.

Backend Instrumentation with Quarkus

Getting the Quarkus side operational was the logical first step: it gave the frontend a stable, already-instrumented API to call against. In a real-world project, establishing the backend contract first is paramount.

The required dependencies in the pom.xml are minimal, thanks to the Quarkus extension model.

<!-- pom.xml -->
<dependencies>
    <!-- Core Quarkus REST capabilities -->
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-resteasy-reactive-jackson</artifactId>
    </dependency>

    <!-- Quarkus OpenTelemetry Extension -->
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-opentelemetry</artifactId>
    </dependency>

    <!-- Optional: Allows for programmatic span creation -->
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
    </dependency>

    <!-- Other necessary dependencies -->
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-arc</artifactId>
    </dependency>
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-junit5</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>io.rest-assured</groupId>
        <artifactId>rest-assured</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Configuration is handled in application.properties. Here, we define the service name and point the OTLP exporter to the OTel Collector, which we’ll run locally in Docker. A common mistake is to hardcode production endpoints here; always use environment variables or profile-specific configurations for different stages.

# src/main/resources/application.properties

# Service identifier for traces
quarkus.application.name=product-service
quarkus.application.version=1.0.0

# Configure the OpenTelemetry SDK
# Keep the SDK enabled (this is the default)
quarkus.otel.sdk.disabled=false

# Configure the OTLP exporter
# The traces endpoint of the OpenTelemetry Collector (gRPC)
quarkus.otel.exporter.otlp.traces.endpoint=http://localhost:4317

# Force every trace to be sampled for development purposes.
# In production, this should be a probabilistic sampler, e.g. 'traceidratio'.
quarkus.otel.traces.sampler=always_on

The always_on sampler is convenient for development and debugging but would be financially ruinous in a high-traffic production environment. The default, parentbased_always_on, honors the caller's sampling decision but still samples every root request; for production, parentbased_traceidratio is usually the better choice, sampling root requests at a configurable ratio while continuing any trace that arrives with a trace context header.
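As a sketch of what that could look like, the production profile can combine an environment-supplied endpoint with a probabilistic sampler (the OTEL_COLLECTOR_ENDPOINT variable name and the 10% ratio are illustrative):

# src/main/resources/application.properties (production additions, illustrative)

# Never hardcode the production collector; resolve it from the environment.
%prod.quarkus.otel.exporter.otlp.traces.endpoint=${OTEL_COLLECTOR_ENDPOINT}

# Sample 10% of root requests; requests arriving with a trace context
# keep their parent's sampling decision.
%prod.quarkus.otel.traces.sampler=parentbased_traceidratio
%prod.quarkus.otel.traces.sampler.arg=0.1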

With configuration complete, we can create a simple REST resource. The quarkus-opentelemetry extension automatically instruments JAX-RS endpoints, creating a server-side span for each incoming request. To demonstrate deeper traces, we can inject a service and use the @WithSpan annotation to create a child span for a specific business logic method.

// src/main/java/org/acme/inventory/InventoryService.java
package org.acme.inventory;

import io.opentelemetry.instrumentation.annotations.WithSpan;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import jakarta.enterprise.context.ApplicationScoped;
import java.util.concurrent.ThreadLocalRandom;

@ApplicationScoped
public class InventoryService {

    private static final Logger LOGGER = LoggerFactory.getLogger(InventoryService.class);

    @WithSpan("check-stock-level") // Creates a child span
    public int getStockLevel(String productId) {
        LOGGER.info("Checking inventory for product: {}", productId);
        try {
            // Simulate database call or external service lookup
            Thread.sleep(ThreadLocalRandom.current().nextInt(50, 150));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ThreadLocalRandom.current().nextInt(1, 100);
    }
}

// src/main/java/org/acme/ProductResource.java
package org.acme;

import io.opentelemetry.api.trace.Span;
import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import org.acme.inventory.InventoryService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Map;

@Path("/api/products")
@Produces(MediaType.APPLICATION_JSON)
public class ProductResource {

    private static final Logger LOGGER = LoggerFactory.getLogger(ProductResource.class);

    @Inject
    InventoryService inventoryService;

    @GET
    @Path("/{id}")
    public Map<String, Object> getProductDetails(@PathParam("id") String id) {
        // The quarkus-opentelemetry extension automatically creates a span for this JAX-RS method.
        // We can access the current span to add custom attributes.
        Span currentSpan = Span.current();
        currentSpan.setAttribute("product.id", id);

        LOGGER.info("Fetching details for product ID: {}. TraceID: {}", id, currentSpan.getSpanContext().getTraceId());

        int stock = inventoryService.getStockLevel(id);
        currentSpan.setAttribute("inventory.stock", stock);

        return Map.of(
            "id", id,
            "name", "Solid.js Component Kit",
            "price", 99.99,
            "stock", stock
        );
    }
}

The key takeaway here is the automatic context propagation. When getProductDetails calls inventoryService.getStockLevel, the Quarkus OpenTelemetry extension ensures the span created by @WithSpan is correctly parented to the main request span. The log message explicitly prints the trace ID, which is invaluable for correlating logs with traces during an investigation.
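Beyond logging the ID manually, the extension also places traceId, spanId, and sampled into the logging MDC, so the correlation can be baked into every log line through the log format. A minimal sketch for the console handler (the exact format string is just an example):

# src/main/resources/application.properties

# Pull the MDC keys populated by the OpenTelemetry extension into every log line
quarkus.log.console.format=%d{HH:mm:ss} %-5p traceId=%X{traceId}, spanId=%X{spanId}, sampled=%X{sampled} [%c{2.}] (%t) %s%e%n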

The Observability Pipeline: OTel Collector and Jaeger

Before touching the frontend, we needed a pipeline to receive, process, and store the trace data. A docker-compose.yml file is the most pragmatic way to manage this for local development.

# docker-compose.yml
version: '3.8'

services:
  # Jaeger for trace visualization
  jaeger:
    image: jaegertracing/all-in-one:1.48
    ports:
      - "16686:16686" # Jaeger UI
      # Jaeger's own OTLP ports (4317/4318) stay internal to the Compose network.
      # Publishing them here as well would collide with the collector's mappings below.

  # OpenTelemetry Collector
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.87.0
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      # Applications send data to the collector: the Quarkus app via 4317 (gRPC),
      # the Solid.js app via 4318 (HTTP).
      - "4317:4317"
      - "4318:4318"
    depends_on:
      - jaeger

The collector’s configuration is the brain of the operation. It defines how data is received (receivers), processed (processors), and sent (exporters). For this setup, we receive OTLP via both gRPC (for Quarkus) and HTTP (for the browser), batch the spans for efficiency, and export them to both Jaeger and the console for real-time debugging.

# otel-collector-config.yml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
        # We need to configure CORS for the browser-based OTLP/HTTP exporter
        cors:
          allowed_origins:
            - "http://localhost:3000" # Solid.js dev server
          allowed_headers:
            - "*"

exporters:
  logging:
    verbosity: detailed

  # Export to Jaeger over its native OTLP gRPC receiver (the classic 'jaeger'
  # exporter is no longer shipped in recent collector-contrib releases)
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

processors:
  batch:
    timeout: 100ms

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, otlp/jaeger]

A critical pitfall to avoid is forgetting the CORS configuration on the OTLP/HTTP receiver. Without it, the browser fails the cross-origin preflight and blocks the tracer’s POST requests to /v1/traces, so you get zero traces from the client and nothing more helpful than a cryptic network error in the browser console.
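The CORS setup can be sanity-checked before the frontend exists by replaying the browser’s preflight request by hand; a quick sketch, assuming the Compose stack above is running:

# The response should include an Access-Control-Allow-Origin header for the dev server origin
curl -i -X OPTIONS http://localhost:4318/v1/traces \
  -H "Origin: http://localhost:3000" \
  -H "Access-Control-Request-Method: POST" \
  -H "Access-Control-Request-Headers: content-type"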

Frontend Instrumentation: The Solid.js Challenge

This is where the real work began. We created a Vite + Solid.js project and added the necessary OpenTelemetry dependencies.

npm install @opentelemetry/api \
    @opentelemetry/sdk-trace-web \
    @opentelemetry/context-zone \
    @opentelemetry/instrumentation \
    @opentelemetry/instrumentation-fetch \
    @opentelemetry/exporter-trace-otlp-http \
    @opentelemetry/resources \
    @opentelemetry/semantic-conventions

We also used solid-material for the UI components.
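Assuming the package name matches the import used in App.jsx further down, it is installed like any other dependency:

npm install solid-material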

The core of the frontend setup is a dedicated module, src/tracing.js, responsible for initializing the entire OTel SDK. This must be imported and executed before any other application code.

// src/tracing.js

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { WebTracerProvider, BatchSpanProcessor } from '@opentelemetry/sdk-trace-web';
import { ZoneContextManager } from '@opentelemetry/context-zone';
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

const resource = new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'dashboard-ui',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
});

// The collector endpoint for the browser's HTTP exporter
const collectorOptions = {
    url: 'http://localhost:4318/v1/traces', 
    headers: {},
};

const exporter = new OTLPTraceExporter(collectorOptions);
const provider = new WebTracerProvider({ resource: resource });

// BatchSpanProcessor is crucial for performance. It batches spans before sending.
provider.addSpanProcessor(new BatchSpanProcessor(exporter, {
    scheduledDelayMillis: 500,
}));

// ZoneContextManager is vital for web applications to correctly propagate
// context across asynchronous operations like promises and timers.
provider.register({
    contextManager: new ZoneContextManager(),
});

// Registering instrumentations automatically patches native browser APIs.
registerInstrumentations({
    instrumentations: [
        new FetchInstrumentation({
            // This config is essential. It tells the instrumentation to inject
            // the W3C Trace Context headers (traceparent, tracestate) into
            // outgoing requests to our backend.
            propagateTraceHeaderCorsUrls: [
                'http://localhost:8080'
            ],
        }),
    ],
});

export const tracer = provider.getTracer('dashboard-ui-tracer');

The most important pieces here are:

  1. ZoneContextManager: Without it, the async nature of fetch would cause the active span context to be lost, breaking the parent-child relationship between our custom “user-click” span and the automatic “fetch” span.
  2. FetchInstrumentation: This is the magic that connects frontend and backend. It automatically intercepts all fetch calls.
  3. propagateTraceHeaderCorsUrls: This is the explicit instruction to inject the traceparent header into requests destined for our Quarkus API. If this URL pattern doesn’t match, no header is sent, and the trace context is broken at the network boundary. A manual fallback for URLs the pattern doesn’t cover is sketched after this list.
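For requests that fall outside the configured pattern, or that go through a client FetchInstrumentation does not patch, the same traceparent header can be injected by hand through the propagation API. A minimal sketch (tracedFetch is a hypothetical helper; it relies on the default W3C propagator that provider.register() installs):

// src/tracedFetch.js (illustrative)

import { context, propagation, trace } from '@opentelemetry/api';
import { tracer } from './tracing';

export async function tracedFetch(url, init = {}) {
    const span = tracer.startSpan(`manual-fetch ${url}`);
    return context.with(trace.setSpan(context.active(), span), async () => {
        try {
            const headers = { ...(init.headers || {}) };
            // Writes traceparent (and tracestate) for the currently active span.
            propagation.inject(context.active(), headers);
            return await fetch(url, { ...init, headers });
        } finally {
            span.end();
        }
    });
}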

Next, we initialize this module in our application’s entry point.

// src/index.jsx

/* @refresh reload */
import './tracing'; // IMPORTANT: Import and execute tracing setup before the rest of the app.
import { render } from 'solid-js/web';
import App from './App';

render(() => <App />, document.getElementById('root'));

Finally, we create a Solid.js component that uses an MUI button to trigger an API call. Here, we’ll manually create a span to represent the user’s interaction. This gives us a complete view, starting from the user’s intent.

// src/App.jsx

import { createSignal } from 'solid-js';
import { tracer } from './tracing';
import { trace, context, SpanStatusCode } from '@opentelemetry/api';

// Using solid-material for UI components
import { Button, Card, CardContent, Typography } from 'solid-material';

function App() {
    const [product, setProduct] = createSignal(null);
    const [error, setError] = createSignal(null);

    const fetchProduct = async () => {
        // Step 1: Create a parent span for the entire user interaction.
        const parentSpan = tracer.startSpan('user-clicks-fetch-product');
        
        // Step 2: Set this new span as the active context. Any subsequent spans
        // (like the one from FetchInstrumentation) will be its children.
        await context.with(trace.setSpan(context.active(), parentSpan), async () => {
            try {
                setProduct(null);
                setError(null);
                parentSpan.addEvent('Fetching product details initiated');
                const response = await fetch('http://localhost:8080/api/products/prod-123');

                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }

                const data = await response.json();
                setProduct(data);
                parentSpan.setAttribute('http.response.status_code', response.status);
                parentSpan.addEvent('Product data received and rendered');

            } catch (e) {
                setError(e.message);
                parentSpan.recordException(e);
                parentSpan.setStatus({ code: SpanStatusCode.ERROR, message: e.message });
            } finally {
                // Step 3: Always end the span.
                parentSpan.end();
            }
        });
    };

    return (
        <main style={{ padding: '2rem' }}>
            <Typography variant="h4">Product Dashboard</Typography>
            <Button variant="contained" onClick={fetchProduct} style={{ margin: '1rem 0' }}>
                Load Product Details
            </Button>
            
            <Card variant="outlined">
                <CardContent>
                    {error() && <Typography color="error">Error: {error()}</Typography>}
                    {product() ? (
                        <div>
                            <Typography variant="h6">ID: {product().id}</Typography>
                            <Typography>Name: {product().name}</Typography>
                            <Typography>Price: ${product().price}</Typography>
                            <Typography>Stock: {product().stock}</Typography>
                        </div>
                    ) : (
                        <Typography>Click the button to load product data.</Typography>
                    )}
                </CardContent>
            </Card>
        </main>
    );
}

export default App;

The context.with() pattern is the correct way to manage span lifetimes around asynchronous operations. It ensures that the parentSpan remains active until the entire fetch and state update logic is complete.
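For comparison, tracer.startActiveSpan collapses the startSpan/context.with pair into a single call; a sketch of the same handler in that style (error handling elided, reusing the component’s tracer and setProduct):

// Alternative form of fetchProduct using startActiveSpan (illustrative)
const fetchProduct = () =>
    tracer.startActiveSpan('user-clicks-fetch-product', async (span) => {
        try {
            const response = await fetch('http://localhost:8080/api/products/prod-123');
            span.setAttribute('http.response.status_code', response.status);
            setProduct(await response.json());
        } finally {
            // startActiveSpan does not end the span for you.
            span.end();
        }
    });

Both forms produce the same parent-child structure; the explicit context.with variant simply makes the context boundary easier to see.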

Visualizing the Full Trace

With all components running (docker-compose up, mvn quarkus:dev, npm run dev), clicking the “Load Product Details” button now yields a complete, end-to-end trace in the Jaeger UI.

The flow can be visualized as follows:

sequenceDiagram
    participant User
    participant SolidJS_UI as Solid.js UI (Browser)
    participant OTel_Collector as OpenTelemetry Collector
    participant Quarkus_API as Quarkus REST API
    participant InventorySvc as Inventory Service

    User->>SolidJS_UI: Clicks 'Load Product' Button
    
    SolidJS_UI->>SolidJS_UI: tracer.startSpan('user-clicks-fetch-product')
    
    activate SolidJS_UI
        Note over SolidJS_UI: FetchInstrumentation intercepts fetch()
        SolidJS_UI->>Quarkus_API: GET /api/products/prod-123 (with traceparent header)
    deactivate SolidJS_UI
    
    activate Quarkus_API
        Note over Quarkus_API: OTel extension reads traceparent header
        Quarkus_API->>InventorySvc: getStockLevel('prod-123')
        
        activate InventorySvc
            Note over InventorySvc: @WithSpan creates child span
            InventorySvc-->>Quarkus_API: returns stock level
        deactivate InventorySvc
        
        Quarkus_API->>OTel_Collector: Export Span (gRPC)
        Quarkus_API-->>SolidJS_UI: 200 OK (Product JSON)
    deactivate Quarkus_API
    
    activate SolidJS_UI
        SolidJS_UI->>SolidJS_UI: Updates component state
        SolidJS_UI->>OTel_Collector: POST /v1/traces (HTTP)
    deactivate SolidJS_UI

In Jaeger, this appears as a single trace waterfall, clearly showing the duration of each step:

  1. user-clicks-fetch-product (from our manual Solid.js span)
    • HTTP GET (from FetchInstrumentation, child of the above)
      • GET /api/products/{id} (from Quarkus JAX-RS instrumentation, child of the fetch span)
        • check-stock-level (from our Quarkus @WithSpan annotation, child of the JAX-RS span)

The original problem of “sometimes slow” could now be diagnosed precisely. We could see if the time was spent in the initial fetch span (indicating network latency) or within one of the backend spans (indicating an application performance issue). The blind spot was gone.

The primary limitation of this implementation is its reliance on manual instrumentation for user interactions within Solid.js. A more mature ecosystem would provide an instrumentation library that could automatically create spans for component lifecycle events (mount, update) or router transitions. Furthermore, our current setup does not employ any sampling strategy. In a production environment, tracing every single request would be prohibitively expensive. The next logical iteration would involve implementing a probabilistic or tail-based sampling strategy in the OTel Collector to capture representative data without overwhelming the system or the budget.
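As a starting point for that next iteration, head-based probabilistic sampling can be applied entirely inside the collector; a sketch of the relevant excerpt, assuming the contrib image used above (the 10% figure is illustrative):

# otel-collector-config.yml (excerpt)
processors:
  batch:
    timeout: 100ms
  # Keep roughly 10% of traces; the decision is derived from the trace ID,
  # so all spans belonging to one trace share the same fate.
  probabilistic_sampler:
    sampling_percentage: 10

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler, batch]
      exporters: [logging, otlp/jaeger]

Tail-based sampling (keeping only traces with errors or unusually high latency) requires the tail_sampling processor and more careful tuning, but the pipeline shape stays the same.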

