Real-Time Ops
A high-throughput monitoring suite processing 4M electrical signals per second with sub-10ms alert latency. Built on Go, Redis Streams, and Kafka — each layer decoupled so no consumer path can block another.
Millions of Events Per Second
The platform needed to ingest and process electrical signal data from thousands of sensors simultaneously. A Node.js monolith was saturating CPU at 12% of the target ingest rate — a structural ceiling, not a tuning problem.
Sub-10ms Alert Latency
Operators required alerts within 10ms of a threshold crossing. The existing architecture had alert latency measured in seconds, rendering the alerting system unreliable for safety-critical use cases and failing SLA requirements.
Stateful Fan-Out at Scale
Each incoming signal needed to be routed to potentially hundreds of downstream consumers — dashboards, alert evaluators, loggers — without any consumer blocking others or causing head-of-line delays under burst conditions.
Go Ingest Service
Rewrote the ingest layer in Go with goroutine-per-connection handling and a lock-free ring buffer for batching. The new service handles 4M events/sec on 4 vCPUs with median ingest latency of 1.2ms — a 30× throughput improvement with no hardware change.
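To make the batching idea concrete, here is a minimal sketch of a lock-free ring buffer in Go. It assumes a single producer and a single consumer per ring (for example, one ring per connection goroutine). The source does not say which ring design the service uses, so the `Ring` type, its methods, and the `uint64` payload are all illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Ring is a fixed-size single-producer/single-consumer queue (hypothetical type).
// Capacity must be a power of two so index wrapping is a cheap bit mask.
type Ring struct {
	buf  []uint64
	mask uint64
	head atomic.Uint64 // next slot to read; written only by the consumer
	tail atomic.Uint64 // next slot to write; written only by the producer
}

func NewRing(capacity uint64) *Ring {
	if capacity == 0 || capacity&(capacity-1) != 0 {
		panic("capacity must be a power of two")
	}
	return &Ring{buf: make([]uint64, capacity), mask: capacity - 1}
}

// Push stores v and returns true, or returns false without blocking if the ring is full.
func (r *Ring) Push(v uint64) bool {
	tail := r.tail.Load()
	if tail-r.head.Load() == uint64(len(r.buf)) {
		return false
	}
	r.buf[tail&r.mask] = v
	r.tail.Store(tail + 1) // publish only after the slot has been written
	return true
}

// PopBatch copies up to len(dst) pending items into dst and returns how many were copied.
func (r *Ring) PopBatch(dst []uint64) int {
	head := r.head.Load()
	avail := r.tail.Load() - head
	n := uint64(len(dst))
	if avail < n {
		n = avail
	}
	for i := uint64(0); i < n; i++ {
		dst[i] = r.buf[(head+i)&r.mask]
	}
	r.head.Store(head + n) // hand the drained slots back to the producer
	return int(n)
}

func main() {
	r := NewRing(4)
	for i := uint64(1); i <= 5; i++ {
		fmt.Println("push", i, "accepted:", r.Push(i))
	}
	batch := make([]uint64, 8)
	n := r.PopBatch(batch)
	fmt.Println("drained batch:", batch[:n])
}
```

The producer never takes a lock. When the ring is full, `Push` reports failure and the caller decides whether to drop, retry, or apply backpressure. Draining in batches lets the consumer amortize per-item overhead across many events.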
Redis Streams Fan-Out
Implemented a Redis Streams-based fan-out layer for real-time consumers. Each consumer type — dashboards, alert evaluators, audit loggers — reads from its own consumer group. A slow dashboard never blocks an alert evaluator.
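The isolation property comes from each consumer group keeping its own read position over a shared log. The sketch below models that behavior in memory. It is not Redis client code, and the `Stream` type and its methods are hypothetical stand-ins for the roles that `XADD`, `XGROUP CREATE`, and `XREADGROUP` play in Redis. It is also not safe for concurrent use.

```go
package main

import "fmt"

// Stream is an in-memory stand-in for a Redis stream: an append-only log
// plus one read cursor per consumer group.
type Stream struct {
	entries []string
	cursor  map[string]int // group name -> index of the next undelivered entry
}

func NewStream() *Stream {
	return &Stream{cursor: make(map[string]int)}
}

// Add appends an entry to the shared log.
func (s *Stream) Add(entry string) {
	s.entries = append(s.entries, entry)
}

// CreateGroup registers a group that starts reading from the beginning of the log.
func (s *Stream) CreateGroup(group string) {
	s.cursor[group] = 0
}

// ReadGroup returns up to max unread entries for the group and advances
// only that group's cursor. Other groups are unaffected.
func (s *Stream) ReadGroup(group string, max int) []string {
	start := s.cursor[group]
	end := start + max
	if end > len(s.entries) {
		end = len(s.entries)
	}
	s.cursor[group] = end
	return append([]string(nil), s.entries[start:end]...) // copy so callers cannot alias the log
}

// Lag reports how many entries the group has not yet read.
func (s *Stream) Lag(group string) int {
	return len(s.entries) - s.cursor[group]
}

func main() {
	s := NewStream()
	s.CreateGroup("alert-evaluators")
	s.CreateGroup("dashboards")
	for i := 1; i <= 5; i++ {
		s.Add(fmt.Sprintf("signal-%d", i))
	}
	s.ReadGroup("alert-evaluators", 100) // fast consumer drains everything
	s.ReadGroup("dashboards", 1)         // slow consumer reads one entry
	fmt.Println("alert lag:", s.Lag("alert-evaluators"))
	fmt.Println("dashboard lag:", s.Lag("dashboards"))
}
```

Because the slow group's backlog lives in its own cursor, it accumulates lag without delaying any other group.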
Kafka Durable Event Log
All events are durably committed to Kafka before acknowledgment. Downstream consumers — data lake writer, ML feature pipeline, audit log — replay from Kafka independently with no coupling to the real-time path. Retention configured at 7 days.
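The commit-before-acknowledge ordering can be sketched as follows. `DurableLog` is a hypothetical interface standing in for a Kafka producer call that returns only after the broker confirms the write, and `memLog` is an in-memory test double, not a real client. None of these names come from the original system.

```go
package main

import (
	"errors"
	"fmt"
)

// DurableLog is a hypothetical abstraction over a synchronous, confirmed write.
type DurableLog interface {
	Append(event []byte) error
}

// Ingest returns nil only after the event has been durably appended.
// The caller should acknowledge the sender only when Ingest returns nil.
func Ingest(dl DurableLog, event []byte) error {
	if err := dl.Append(event); err != nil {
		return fmt.Errorf("not acknowledged: %w", err)
	}
	return nil
}

// memLog is an in-memory test double; failNext simulates one broker error.
type memLog struct {
	events   [][]byte
	failNext bool
}

func (m *memLog) Append(event []byte) error {
	if m.failNext {
		m.failNext = false
		return errors.New("broker unavailable")
	}
	m.events = append(m.events, event)
	return nil
}

func main() {
	dl := &memLog{}
	fmt.Println("first event:", Ingest(dl, []byte("signal-1")))

	dl.failNext = true
	fmt.Println("second event:", Ingest(dl, []byte("signal-2")))
	fmt.Println("events stored:", len(dl.events))
}
```

With this ordering, a sender that never receives an acknowledgment can safely retry. The trade-off is that downstream replay consumers may see duplicates and should be idempotent.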
Grafana Operations Stack
Built a Grafana-based operations stack with real-time signal dashboards, alert history, on-call routing via PagerDuty, and SLA tracking — all backed by metrics exported from the Go service via Prometheus.
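As a rough illustration of the metrics path, the sketch below serves a single counter in the Prometheus text exposition format using only the Go standard library. A production service would more likely use an official client library. The metric name `ingest_events_total` and the handler are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// eventsTotal is a hypothetical counter incremented by the ingest path.
var eventsTotal atomic.Uint64

// renderMetrics formats one counter in the Prometheus text exposition format.
func renderMetrics(events uint64) string {
	return "# HELP ingest_events_total Total events accepted by the ingest service.\n" +
		"# TYPE ingest_events_total counter\n" +
		fmt.Sprintf("ingest_events_total %d\n", events)
}

// metricsHandler is what a Prometheus scrape of /metrics would hit.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
	fmt.Fprint(w, renderMetrics(eventsTotal.Load()))
}

func main() {
	eventsTotal.Add(3)

	// Exercise the handler in-process instead of binding a real port.
	rec := httptest.NewRecorder()
	metricsHandler(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	fmt.Print(rec.Body.String())
}
```

Grafana panels and alert rules can then query the scraped counter, for example as a per-second rate, without touching the ingest hot path.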
- Go 1.22
- Goroutines
- Lock-free ring buffer
- AWS EKS
- Redis 7 Streams
- Consumer groups
- AWS ElastiCache
- Apache Kafka 3.x
- AWS MSK
- 7-day retention
- S3 + Parquet
- Grafana
- Prometheus
- PagerDuty
- TimescaleDB