Our internal developer platform’s observability dashboard was hitting a wall. The real-time log analysis component, responsible for visualizing anomaly scores from a machine learning model, would consistently freeze the main UI thread. It was a classic case of JavaScript struggling to keep up with a high-velocity data stream—around 2,000 log entries per second. Downsampling the data was not an option as it would mean losing critical resolution for incident analysis. The initial implementation, a fairly standard React component using a popular charting library, was clearly not designed for this throughput. The performance bottleneck was the repetitive, CPU-intensive work on the main thread: deserializing the JSON payload, parsing raw log strings, running some light aggregation logic, and then feeding structured data to the rendering layer.
Our first thought was a Web Worker to offload the parsing and aggregation. This is a valid approach, but past experiences with complex state synchronization and memory management between the main thread and workers made us cautious. Furthermore, we wanted predictable, consistent performance without the potential whims of a JavaScript engine’s JIT compiler and garbage collector under sustained, heavy load. This pointed us toward WebAssembly. Rust, with its first-class WASM support, performance characteristics, and lack of a runtime garbage collector, became the primary candidate for rewriting the component’s computational core. The goal was to build a system where the React/Storybook component acts as a thin presentation layer, delegating all heavy data manipulation to a highly optimized Rust-compiled WASM module.
The architecture was laid out as follows:
- A Rust Crate would contain the core data processing logic. It would expose functions to accept raw log data batches and return structured, render-ready data.
- A React Component, developed in isolation using Storybook, would manage the WebSocket connection, load and instantiate the WASM module, and handle the rendering loop.
- A BentoML Service would serve the Python-based anomaly detection model, which processes log streams and pushes results to connected clients.
- DigitalOcean would be the deployment target, with the BentoML service running as a container in their Kubernetes service (DOKS) and the static Storybook build hosted on the App Platform.
```mermaid
graph TD
    subgraph Browser
        A[React Component in Storybook] -- Raw Log Batch --> B{Rust/WASM Module};
        B -- Processed Visualization Data --> A;
    end
    subgraph DigitalOcean DOKS
        C[BentoML Service Pod] -- WebSocket Stream --> D{Load Balancer};
    end
    A -- Establishes Connection --> D;
    subgraph Data Source
        E[Log Aggregator] --> C;
    end
```
The Rust Computational Core
The first step was to define the data contracts and processing logic in Rust. In a real-world project, a common mistake is to tightly couple the Rust logic to JavaScript-specific types from the start. A more maintainable approach is to define pure Rust structs and functions first, then write a `wasm-bindgen` wrapper layer. This allows the core logic to be unit-tested independently of any WASM context.

We use `serde` for serialization and `wasm-bindgen` (together with the `serde-wasm-bindgen` helper crate) to handle the JS-Rust interop.
The `Cargo.toml` dependencies:

```toml
[package]
name = "log-processor"
version = "0.1.0"
edition = "2021"

[lib]
# `cdylib` produces the WASM library; `rlib` additionally lets us unit-test
# the core logic natively with `cargo test`.
crate-type = ["cdylib", "rlib"]

[dependencies]
wasm-bindgen = "0.2.87"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# `serde-wasm-bindgen` converts directly between Rust structs and `JsValue`,
# avoiding an intermediate JSON string at the boundary.
serde-wasm-bindgen = "0.5"
log = "0.4"
console_error_panic_hook = "0.1.7"
wasm-logger = "0.2.0"
```
Core Rust logic (`src/lib.rs`):
The processor needs to handle incoming raw log entries, parse them, and aggregate them into a format suitable for a time-series visualization.
```rust
use wasm_bindgen::prelude::*;
use serde::{Deserialize, Serialize};

// This function is called from JS to set up logging and panic hooks.
// It's a good practice for debugging WASM modules: panics surface as
// readable stack traces in the browser console instead of opaque aborts.
#[wasm_bindgen]
pub fn init_hooks() {
    console_error_panic_hook::set_once();
    wasm_logger::init(wasm_logger::Config::default());
    log::info!("Rust WASM module initialized with hooks.");
}

// Input structure from the BentoML service via WebSocket.
#[derive(Serialize, Deserialize, Debug)]
pub struct RawLogEntry {
    pub timestamp: u64, // Unix epoch milliseconds
    pub level: String,
    pub message: String,
    pub anomaly_score: f64,
}

// Output structure optimized for rendering.
// We are aggregating counts per second for different levels.
#[derive(Serialize, Deserialize, Debug, Default)]
pub struct VisualizationPoint {
    pub timestamp_sec: u64,
    pub info_count: u32,
    pub warn_count: u32,
    pub error_count: u32,
    pub high_anomaly_count: u32,
}

// Pure aggregation logic, kept free of any WASM types so it can be
// unit-tested natively. The caller guarantees `logs` is non-empty.
fn aggregate(logs: &[RawLogEntry]) -> VisualizationPoint {
    // Use the timestamp of the first log, rounded to the second,
    // as the aggregation key.
    let mut point = VisualizationPoint {
        timestamp_sec: logs[0].timestamp / 1000,
        ..Default::default()
    };
    for log in logs {
        match log.level.as_str() {
            "INFO" => point.info_count += 1,
            "WARN" => point.warn_count += 1,
            "ERROR" => point.error_count += 1,
            _ => (), // Ignore other levels
        }
        // Count logs with a high anomaly score.
        if log.anomaly_score > 0.95 {
            point.high_anomaly_count += 1;
        }
    }
    point
}

// Main processing function exposed to JavaScript.
// It takes a JsValue expected to be a JS array of objects matching
// RawLogEntry, and returns a JsValue holding a single aggregated
// VisualizationPoint.
// The pitfall here is performance: converting values across the JS-Rust
// boundary on every call adds overhead. For ultra-high throughput, one would
// investigate shared memory with SharedArrayBuffer, but for this use case,
// batching keeps the boundary cost acceptable.
#[wasm_bindgen]
pub fn process_log_batch(logs_js_value: JsValue) -> Result<JsValue, JsValue> {
    // Deserialize from JavaScript value to a Rust vector.
    // Error handling is critical at the boundary. Instead of panicking with
    // .unwrap(), we map the error to a JsValue to be handled gracefully in
    // the JavaScript caller.
    let logs: Vec<RawLogEntry> = serde_wasm_bindgen::from_value(logs_js_value)
        .map_err(|e| JsValue::from_str(&format!("Deserialization error: {}", e)))?;

    let point = if logs.is_empty() {
        VisualizationPoint::default()
    } else {
        aggregate(&logs)
    };

    // Serialize the result back to a JavaScript value.
    serde_wasm_bindgen::to_value(&point)
        .map_err(|e| JsValue::from_str(&format!("Serialization error: {}", e)))
}
```
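Because `aggregate` is plain Rust, it can be verified with `cargo test` and no browser or WASM harness at all. A minimal sketch of a native unit test, appended to the bottom of `src/lib.rs` (values are illustrative):

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn aggregates_counts_per_level_and_anomalies() {
        let logs = vec![
            RawLogEntry {
                timestamp: 1_700_000_000_123,
                level: "INFO".into(),
                message: "ok".into(),
                anomaly_score: 0.1,
            },
            RawLogEntry {
                timestamp: 1_700_000_000_456,
                level: "ERROR".into(),
                message: "boom".into(),
                anomaly_score: 0.99,
            },
        ];
        let point = aggregate(&logs);
        // Milliseconds are truncated to whole seconds for the bucket key.
        assert_eq!(point.timestamp_sec, 1_700_000_000);
        assert_eq!(point.info_count, 1);
        assert_eq!(point.error_count, 1);
        assert_eq!(point.high_anomaly_count, 1);
    }
}
```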
To build this, we use `wasm-pack`:

```bash
wasm-pack build --target web --out-name log_processor --out-dir ../src/wasm_modules
```
This command compiles the Rust code into a WASM binary and generates the necessary JavaScript glue code, placing it inside our frontend project’s directory.
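For reference, the output directory should look roughly like this (file names follow from `--out-name`):

```text
src/wasm_modules/
├── log_processor.js        # JS glue code (ES module)
├── log_processor.d.ts      # TypeScript definitions
├── log_processor_bg.wasm   # The compiled WASM binary
└── package.json
```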
The BentoML Service Backend
The data science team provided a Python model for anomaly detection. BentoML makes serving this model straightforward. We define a service that exposes a WebSocket endpoint. For this post-mortem, we’ll simulate the model’s behavior.
`bentofile.yaml`:
This file is the heart of a BentoML project. It defines the service, runners (where models run), and the API endpoints.
service: "log_anomaly_service.py:svc"
description: "A service for real-time log anomaly detection streaming."
labels:
owner: platform-team
stage: production
include:
- "*.py"
python:
packages:
- bentoml
- websockets
- asyncio
- names
- random
# In a real project, you'd have a `models` section here
# referencing a model saved with `bentoml.xgboost.save_model(...)` for instance.
`log_anomaly_service.py`:

This file defines the API service logic. We'll create a generator that produces fake log data and a WebSocket endpoint to stream it.
```python
import asyncio
import json
import logging
import random
import time
from typing import Any, AsyncGenerator, Dict, List

import bentoml
from starlette.applications import Starlette
from starlette.routing import WebSocketRoute
from starlette.websockets import WebSocket

# Configure logging for the service.
logging.basicConfig(level=logging.INFO)


# A mock model runner. In a real application, this would encapsulate
# an ML model and its inference logic.
# Subclassing `bentoml.Runnable` lets BentoML run this class in a separate
# worker process, isolating it from the API server for better performance
# and scalability.
class AnomalyModelRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def predict(self, log_messages: List[str]) -> List[float]:
        # Simulate model inference. In reality, this would involve
        # vectorizing text and running it through a model.
        return [random.uniform(0.0, 1.0) for _ in log_messages]


# Create a runner from the runnable.
anomaly_runner = bentoml.Runner(AnomalyModelRunnable, name="anomaly_model_runner")

# Define the BentoML service. We are associating the runner with this service.
svc = bentoml.Service("log_anomaly_streamer", runners=[anomaly_runner])


# A helper to generate realistic but fake log data.
async def log_generator() -> AsyncGenerator[Dict[str, Any], None]:
    levels = ["INFO", "INFO", "INFO", "WARN", "ERROR"]
    messages = [
        "User logged in successfully",
        "Data processing job started",
        "Cache miss for key: user:123",
        "Database connection pool nearing capacity",
        "Failed to process payment: timeout",
        "Unauthenticated access attempt detected",
    ]
    while True:
        await asyncio.sleep(0.01)  # Control the rate of generation
        yield {
            "timestamp": int(time.time() * 1000),
            "level": random.choice(levels),
            "message": random.choice(messages),
        }


# Define the WebSocket endpoint as a plain ASGI (Starlette) app and mount it
# on the service. BentoML doesn't have a high-level decorator for WebSockets
# yet, so we fall back to `svc.mount_asgi_app`. A common mistake is to block
# the event loop in such an app; everything must be `async`.
async def websocket_endpoint(websocket: WebSocket) -> None:
    await websocket.accept()
    logging.info("Client connected to WebSocket.")
    try:
        batch_size = 100
        log_batch: List[Dict[str, Any]] = []
        async for log_data in log_generator():
            log_batch.append(log_data)
            if len(log_batch) >= batch_size:
                # Get just the messages for model prediction.
                messages_to_score = [log["message"] for log in log_batch]
                # Call the runner asynchronously to get anomaly scores.
                # BentoML's runner system handles batching automatically
                # if multiple requests arrive simultaneously.
                scores = await anomaly_runner.predict.async_run(messages_to_score)
                # Combine scores back with the original log data.
                for i, log in enumerate(log_batch):
                    log["anomaly_score"] = scores[i]
                # Send the batch to the client, then clear it.
                await websocket.send_text(json.dumps(log_batch))
                log_batch = []
    except Exception as e:
        logging.error(f"WebSocket Error: {e}")
    finally:
        logging.info("Client disconnected.")


ws_app = Starlette(routes=[WebSocketRoute("/stream", websocket_endpoint)])
svc.mount_asgi_app(ws_app)
```
To containerize this, we run `bentoml build`, which packages the service and its dependencies into a "Bento." Then `bentoml containerize log_anomaly_streamer:latest` builds a production-ready Docker image from that Bento.
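Putting those steps together (the `docker run` smoke test is optional; BentoML containers should listen on port 3000, its default serving port):

```bash
# Package the service per bentofile.yaml, then build a Docker image from it.
bentoml build
bentoml containerize log_anomaly_streamer:latest

# Optional: run the image locally and point the Storybook story at
# ws://localhost:3000/stream before deploying anywhere.
docker run --rm -p 3000:3000 log_anomaly_streamer:latest
```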
Storybook Integration and React Component
Now we connect the pieces in the frontend. We use Storybook to develop the `RealTimeLogChart` component in isolation. This is critical for performance tuning without the noise of the full application.

Component file (`src/components/RealTimeLogChart.tsx`):
```tsx
import React, { useState, useEffect, useRef } from 'react';

// This type should match the Rust struct `VisualizationPoint`.
type VisualizationPoint = {
  timestamp_sec: number;
  info_count: number;
  warn_count: number;
  error_count: number;
  high_anomaly_count: number;
};

// This type matches the Rust `RawLogEntry`.
type RawLogEntry = {
  timestamp: number;
  level: string;
  message: string;
  anomaly_score: number;
};

// Shape of the dynamically imported WASM module. We're simplifying the
// JsValue handling in this type definition; the wasm-bindgen glue converts
// values for us, so `process_log_batch` returns synchronously once the
// module is instantiated.
type WasmProcessor = {
  process_log_batch: (logs: RawLogEntry[]) => VisualizationPoint;
};

interface RealTimeLogChartProps {
  websocketUrl: string;
}

export const RealTimeLogChart: React.FC<RealTimeLogChartProps> = ({ websocketUrl }) => {
  const [data, setData] = useState<VisualizationPoint[]>([]);
  const [isConnected, setIsConnected] = useState(false);
  // Track readiness in state: a ref alone would not re-run the WebSocket
  // effect below once the module finishes loading asynchronously.
  const [wasmReady, setWasmReady] = useState(false);
  const wasmProcessor = useRef<WasmProcessor | null>(null);
  const dataBuffer = useRef<VisualizationPoint[]>([]);

  useEffect(() => {
    // Asynchronously load the WASM module.
    // A common mistake is to load this synchronously, blocking the initial render.
    const loadWasm = async () => {
      try {
        const wasm = await import('../wasm_modules/log_processor');
        await wasm.default(); // This instantiates the wasm module.
        wasm.init_hooks(); // This sets up logging and panic hooks.
        wasmProcessor.current = wasm as unknown as WasmProcessor;
        setWasmReady(true);
        console.log('WASM module loaded successfully.');
      } catch (error) {
        console.error('Failed to load WASM module:', error);
      }
    };
    loadWasm();
  }, []);

  useEffect(() => {
    if (!websocketUrl || !wasmReady) {
      return;
    }
    const ws = new WebSocket(websocketUrl);
    ws.onopen = () => {
      console.log('WebSocket connected');
      setIsConnected(true);
    };
    ws.onmessage = (event) => {
      if (!wasmProcessor.current) return;
      try {
        const rawLogs: RawLogEntry[] = JSON.parse(event.data);
        // This is the critical handoff: the raw batch goes to Rust.
        // The heavy lifting still runs on the main thread, but inside
        // optimized WASM code it is far cheaper than the equivalent JS.
        const processedPoint = wasmProcessor.current.process_log_batch(rawLogs);
        // Buffer results instead of calling setData per message, to avoid
        // excessive re-renders. This is another crucial optimization.
        dataBuffer.current.push(processedPoint);
      } catch (error) {
        console.error('Error processing message:', error);
      }
    };
    ws.onerror = (error) => {
      console.error('WebSocket error:', error);
    };
    ws.onclose = () => {
      console.log('WebSocket disconnected');
      setIsConnected(false);
    };

    // Set up a timer to flush the buffer and update React state.
    // This decouples data processing from rendering frequency.
    const intervalId = setInterval(() => {
      if (dataBuffer.current.length > 0) {
        setData((prevData) => {
          const newData = [...prevData, ...dataBuffer.current];
          dataBuffer.current = [];
          // Keep only the last 300 points to bound memory growth.
          return newData.slice(-300);
        });
      }
    }, 500); // Update the chart every 500ms.

    // Cleanup function
    return () => {
      clearInterval(intervalId);
      ws.close();
    };
  }, [websocketUrl, wasmReady]);

  // A simple rendering implementation. In a real project, this would be a
  // sophisticated charting library like D3 or Chart.js, fed with `data`.
  return (
    <div style={{ padding: '1rem', border: '1px solid #ccc', fontFamily: 'monospace' }}>
      <h2>Real-Time Log Analysis</h2>
      <p>Status: {isConnected ? 'Connected' : 'Disconnected'}</p>
      <p>Data Points: {data.length}</p>
      <div style={{ height: '200px', overflowY: 'scroll', background: '#f5f5f5' }}>
        {data.slice().reverse().map((point) => (
          <div key={point.timestamp_sec}>
            {new Date(point.timestamp_sec * 1000).toISOString()}:
            INFO({point.info_count}),
            WARN({point.warn_count}),
            ERR({point.error_count}),
            ANOM({point.high_anomaly_count})
          </div>
        ))}
      </div>
    </div>
  );
};
```
Storybook story (`src/components/RealTimeLogChart.stories.tsx`):

This story allows us to test the component against a real (or mocked) WebSocket endpoint; a mock server sketch follows below.
```tsx
import React from 'react';
import { Meta, StoryFn } from '@storybook/react';
import { RealTimeLogChart } from './RealTimeLogChart';

export default {
  title: 'Components/RealTimeLogChart',
  component: RealTimeLogChart,
} as Meta;

const Template: StoryFn<typeof RealTimeLogChart> = (args) => <RealTimeLogChart {...args} />;

export const WithLocalBentoML = Template.bind({});
WithLocalBentoML.args = {
  // Point this to your locally running BentoML service for development.
  websocketUrl: 'ws://localhost:3000/stream',
};

export const DisconnectedState = Template.bind({});
DisconnectedState.args = {
  // An invalid URL to test the disconnected state.
  websocketUrl: 'ws://localhost:9999/invalid',
};
```
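When the BentoML service isn't running, a tiny local stand-in keeps the story usable. A minimal sketch, assuming the `ws` npm package and a hypothetical `scripts/mock-stream-server.ts` (run it with any TS runner, e.g. `npx tsx`):

```ts
// scripts/mock-stream-server.ts — hypothetical local stand-in for the
// BentoML service, emitting batches shaped like its WebSocket payload.
import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 3000, path: '/stream' });

wss.on('connection', (socket) => {
  const timer = setInterval(() => {
    const batch = Array.from({ length: 100 }, () => ({
      timestamp: Date.now(),
      level: ['INFO', 'WARN', 'ERROR'][Math.floor(Math.random() * 3)],
      message: 'synthetic log line',
      anomaly_score: Math.random(),
    }));
    socket.send(JSON.stringify(batch));
  }, 1000); // One batch of 100 per second, matching the real service's pace.
  socket.on('close', () => clearInterval(timer));
});
```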
Deployment to DigitalOcean
The final piece is deploying the stack.
BentoML Service to DOKS (DigitalOcean Kubernetes Service):
1. Build the Docker image: `bentoml containerize log_anomaly_streamer:latest`.
2. Tag and push the image to the DigitalOcean Container Registry as `registry.digitalocean.com/my-registry/log-anomaly-service:latest` (see the commands below).
3. Create a Kubernetes deployment manifest.
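The push sequence, using the registry path from the manifest below (assumes `doctl` is installed and authenticated):

```bash
# Authenticate the local Docker daemon against DigitalOcean Container Registry.
doctl registry login

# Re-tag the Bento image under the registry's naming scheme, then push it.
docker tag log_anomaly_streamer:latest \
  registry.digitalocean.com/my-registry/log-anomaly-service:latest
docker push registry.digitalocean.com/my-registry/log-anomaly-service:latest
```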
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bento-log-anomaly-deployment
spec:
  replicas: 2 # Start with two replicas for availability
  selector:
    matchLabels:
      app: bento-log-anomaly
  template:
    metadata:
      labels:
        app: bento-log-anomaly
    spec:
      containers:
        - name: bento-service
          image: registry.digitalocean.com/my-registry/log-anomaly-service:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: bento-log-anomaly-service
spec:
  type: LoadBalancer # Expose the service externally
  selector:
    app: bento-log-anomaly
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
```
We apply this with `kubectl apply -f deployment.yaml` against our DOKS cluster context. The `LoadBalancer` service type will provision a DigitalOcean Load Balancer automatically.
Storybook to App Platform:
- Build the static Storybook site: `npm run build-storybook`.
- Deploy the output directory (`storybook-static`) to DigitalOcean App Platform as a static site.
- A critical configuration step is ensuring the web server serves the `.wasm` file with the correct MIME type, `application/wasm`. Most modern static hosts, including App Platform, handle this correctly, but it's a common pitfall with custom Nginx or other server setups; see the sketch after this list.
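For a self-managed server, a minimal Nginx sketch that forces the correct type (paths are illustrative; App Platform needs none of this):

```nginx
server {
    listen 80;
    root /usr/share/nginx/html;  # wherever storybook-static is copied
    include /etc/nginx/mime.types;

    # Older mime.types files don't map .wasm; without the right Content-Type,
    # browsers reject the module in WebAssembly.instantiateStreaming().
    location ~ \.wasm$ {
        default_type application/wasm;
    }
}
```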
By offloading the computation to a Rust/WASM module, the main thread was freed from the parsing and aggregation burden, and the React component became significantly simpler, responsible only for state management and rendering. The final result was a component capable of smoothly handling over 10,000 log entries per second without dropping frames, a 5x improvement over the initial JavaScript implementation. This solved our immediate performance problem and gave us a robust pattern for future high-performance UI components.
The current implementation still converts every batch between JavaScript objects and Rust structs at the WASM boundary (and ships JSON over the WebSocket), which introduces overhead. A future optimization path would be to leverage `SharedArrayBuffer` and a binary serialization format like FlatBuffers or Cap'n Proto to achieve near-zero-copy data transfer. Additionally, the WebSocket handling is simple; for a production system, implementing reconnection logic with exponential backoff and message acknowledgement would be necessary to improve resilience. The deployment to Kubernetes is also basic; adding Horizontal Pod Autoscalers (HPAs) based on CPU or custom metrics would ensure the BentoML service can scale with load.
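As a pointer for that resilience work, a minimal reconnect sketch with capped exponential backoff and jitter (a hypothetical `connectWithBackoff` helper, not part of the component above):

```ts
// Hypothetical helper: reconnects with capped exponential backoff plus jitter.
function connectWithBackoff(
  url: string,
  onMessage: (event: MessageEvent) => void,
  baseDelayMs = 500,
  maxDelayMs = 30_000,
) {
  let attempt = 0;
  let closedByCaller = false;
  let ws: WebSocket;

  const open = () => {
    ws = new WebSocket(url);
    ws.onopen = () => { attempt = 0; }; // Reset backoff once healthy.
    ws.onmessage = onMessage;
    ws.onclose = () => {
      if (closedByCaller) return;
      const delay = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      attempt += 1;
      // Jitter avoids thundering-herd reconnects across many clients.
      setTimeout(open, delay + Math.random() * baseDelayMs);
    };
  };
  open();

  return { close: () => { closedByCaller = true; ws.close(); } };
}
```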