A comprehensive, modular stress-testing suite designed for high-performance MQTT brokers. Developed in Python
with uv, it validates latency, concurrency, and
stability thresholds.
The ProtoMQ Benchmarking Suite was built on a key realization: performance validation should be broker-agnostic. Although developed alongside ProtoMQ, the suite is entirely decoupled from the server implementation.
It speaks the standard MQTT protocol to any broker, making it a valuable tool for testing any MQTT implementation (Mosquitto, EMQX, VerneMQ, etc.) against the same rigorous scenarios used to optimize ProtoMQ.
Each benchmark targets a specific stress vector, from connection churn to payload size variance.
Verifies sub-millisecond round-trip latency under steady-state load with 100+ concurrent connections.
Metrics: p50/p99 Latency, Connection Time, Memory (RSS)
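As a sketch of how the p50/p99 figures could be derived from collected round-trip samples (the `percentile` helper and the sample values are illustrative, not the suite's actual implementation):

```python
import statistics

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile over a sorted copy of the samples."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

# Hypothetical round-trip latencies in milliseconds from one steady-state run.
latencies_ms = [0.42, 0.38, 0.55, 0.47, 0.91, 0.44, 0.40, 0.52, 0.61, 0.39]

report = {
    "p50_latency_ms": percentile(latencies_ms, 50),
    "p99_latency_ms": percentile(latencies_ms, 99),
    "mean_latency_ms": statistics.fmean(latencies_ms),
}
```

With so few samples the p99 collapses onto the worst observation; a real run would aggregate thousands of samples per connection.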
Simulates 10,000 clients connecting in bursts and publishing simultaneously. Tests massive fan-out and connection pressure.
Metrics: Fan-out Time, Connection Failures, Message Loss, Peak CPU/Memory
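One way to drive "bursts" deterministically is to precompute a connection schedule; the `burst_schedule` helper below is a hypothetical sketch, not the suite's API:

```python
def burst_schedule(total_clients: int, burst_size: int, interval_s: float):
    """Yield (start_time_s, client_ids) for each connection burst."""
    for burst_index, start in enumerate(range(0, total_clients, burst_size)):
        ids = list(range(start, min(start + burst_size, total_clients)))
        yield burst_index * interval_s, ids

# 10,000 clients connecting in bursts of 500, one burst every 250 ms.
schedule = list(burst_schedule(10_000, 500, 0.25))
```

Each entry can then be handed to an async task group that opens the listed client connections at the scheduled offset.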
10-minute endurance test at 10,000 msg/s, monitoring long-term stability and performance degradation.
Metrics: Messages/sec, p99 Latency Stability, Memory Growth, Avg CPU
Stress-tests the topic matching engine with complex overlapping patterns like sensor/+/temp.
Metrics: Topic Matching Latency (µs), Routing Correctness, Peak Memory
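The semantics being stress-tested are the standard MQTT wildcard rules: `+` matches exactly one topic level, `#` matches the remainder. A minimal reference matcher (illustrative only; it omits spec corner cases such as `$`-prefixed topics):

```python
def topic_matches(filter_: str, topic: str) -> bool:
    """Minimal MQTT topic-filter match supporting '+' and a trailing '#'."""
    f_parts = filter_.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":                      # multi-level wildcard: matches the rest
            return True
        if i >= len(t_parts):             # filter is longer than the topic
            return False
        if f != "+" and f != t_parts[i]:  # '+' matches any single level
            return False
    return len(f_parts) == len(t_parts)
```

A broker's matching engine replaces this linear scan with a trie or similar index; the benchmark measures how well that index holds up under many overlapping filters.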
Compares binary vs JSON processing overhead. Measures efficiency gains of the Protobuf layer.
Metrics: Bandwidth Savings %, Decoding Latency, CPU Overhead
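The bandwidth-savings metric boils down to comparing encoded sizes. The sketch below uses `struct` as a stand-in for the Protobuf layer (the telemetry fields and format are hypothetical), since a fixed-width binary encoding exhibits the same size advantage over JSON text:

```python
import json
import struct

# Hypothetical telemetry reading.
reading = {"device_id": 4021, "temp_c": 21.5, "humidity_pct": 48.0}

json_bytes = json.dumps(reading, separators=(",", ":")).encode()
# uint32 + two float32s stand in for a Protobuf-encoded message.
binary_bytes = struct.pack("<Iff", reading["device_id"],
                           reading["temp_c"], reading["humidity_pct"])

savings_pct = 100.0 * (1 - len(binary_bytes) / len(json_bytes))
```

JSON repeats every field name in every message, while binary encodings carry only values (plus, for Protobuf, compact field tags), which is where the savings come from.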
100,000 rapid connect/disconnect cycles. Essential for detecting socket leaks in edge/mobile scenarios.
Metrics: Connection Rate (conn/s), Memory Leak (MB), FD Leak Count
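Leak detection here reduces to sampling resource counters before and after the churn run and flagging growth beyond a tolerance. A hedged sketch (the counter names and tolerances are illustrative):

```python
def leak_report(before: dict, after: dict, tolerances: dict) -> dict:
    """Flag any counter whose growth across the run exceeds its tolerance."""
    return {
        key: {
            "growth": after[key] - before[key],
            "leak_suspected": after[key] - before[key] > tolerances.get(key, 0),
        }
        for key in before
    }

# Hypothetical samples taken around 100,000 connect/disconnect cycles.
before = {"rss_mb": 142.0, "open_fds": 37}
after = {"rss_mb": 149.5, "open_fds": 37}
report = leak_report(before, after, {"rss_mb": 5.0, "open_fds": 0})
```

A non-zero FD tolerance is rarely justified: after all clients disconnect, the broker's descriptor count should return exactly to its baseline.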
Tests performance across a spectrum of payloads, from 10-byte telemetry to 64KB binary images.
Metrics: Throughput vs Size, p99 Latency per Size, Memory Scaling
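A payload spectrum like this is typically swept geometrically rather than linearly, so small and large sizes get equal coverage. A sketch of generating the sweep (the growth factor is an assumption, not the suite's setting):

```python
def payload_sweep(min_bytes: int = 10, max_bytes: int = 64 * 1024, factor: int = 4):
    """Geometric sweep of payload sizes, always ending at max_bytes."""
    sizes = []
    size = min_bytes
    while size < max_bytes:
        sizes.append(size)
        size *= factor
    sizes.append(max_bytes)
    return sizes

sizes = payload_sweep()
# → [10, 40, 160, 640, 2560, 10240, 40960, 65536]
```

Per-size throughput and p99 latency can then be recorded against each entry to expose where memory scaling or fragmentation kicks in.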
The suite leverages Python 3.11+ and the uv package manager for lightning-fast environment setup and execution.
A common/ library provides shared logic for connection tracking, resource
monitoring (CPU/Memory), and a standardized BenchmarkRunner. This allows new
scenarios to be added by writing only the logic for the specific test case.
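The shape of that shared runner might look like the following sketch (class and method names are assumptions, not the suite's actual `common/` API):

```python
import time
from abc import ABC, abstractmethod

class BenchmarkRunner(ABC):
    """Shared harness: subclasses supply only the scenario-specific logic."""

    name = "unnamed"

    @abstractmethod
    def run_scenario(self) -> dict:
        """Execute the scenario and return raw metric values."""

    def execute(self) -> dict:
        # Shared concerns (timing here; connection tracking and CPU/memory
        # sampling would hook in the same way) wrap the scenario logic.
        start = time.perf_counter()
        metrics = self.run_scenario()
        metrics["wall_time_s"] = time.perf_counter() - start
        return {"benchmark": self.name, "metrics": metrics}

class NoopBenchmark(BenchmarkRunner):
    """Trivial scenario demonstrating the subclassing contract."""
    name = "noop"

    def run_scenario(self) -> dict:
        return {"messages_sent": 0}

result = NoopBenchmark().execute()
```

Keeping measurement in the base class means every scenario reports the same envelope of metrics for free.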
Raw performance numbers are only useful when they can be measured against expectations. The suite therefore includes a thresholding mechanism that converts benchmark results into actionable status reports.
Every metric is defined with a target "direction" (lower is better for latency, higher is better for
throughput). The ThresholdChecker automatically interprets the results based
on these semantics.
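A minimal stand-in for that interpretation logic (the function name and "pass"/"warn"/"fail" statuses are illustrative, not the ThresholdChecker's actual interface):

```python
def check_metric(value: float, spec: dict) -> str:
    """Interpret a metric value against its threshold spec.

    direction "lower" means lower is better (latency): exceeding "max" fails,
    exceeding "warn" warns. direction "higher" inverts the comparisons
    (throughput, connection counts).
    """
    if spec.get("direction", "lower") == "lower":
        if "max" in spec and value > spec["max"]:
            return "fail"
        if "warn" in spec and value > spec["warn"]:
            return "warn"
    else:
        if "min" in spec and value < spec["min"]:
            return "fail"
        if "warn" in spec and value < spec["warn"]:
            return "warn"
    return "pass"
```

For example, a 0.7 ms p99 against `{"max": 0.8, "warn": 0.6, "direction": "lower"}` warns without failing, while 9,400 connections against `{"min": 9500, "direction": "higher"}` fails outright.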
Because thresholds are declared in thresholds.json files, the suite is ideal
for CI/CD pipelines. A build can be automatically failed if a commit introduces a performance regression,
ensuring the "sub-millisecond" promise of ProtoMQ is never broken.
# Example threshold definitions (thresholds.json)
{
  "p99_latency_ms": { "max": 0.8, "warn": 0.6, "direction": "lower" },
  "concurrent_connections": { "min": 9500, "direction": "higher" }
}