Theodolite’s Stream Processing Benchmarks

Theodolite comes with four application benchmarks, which are based on typical use cases for stream processing within microservices. A corresponding load generator is provided for each benchmark. Currently, Theodolite provides benchmark implementations for Apache Kafka Streams, Apache Flink, Hazelcast Jet, and Apache Beam (with Samza and Flink).

Theodolite’s benchmarks (labeled UC1–UC4) each represent an event-driven microservice performing Industrial Internet of Things data analytics. Specifically, they are derived from the Titan Control Center, a microservice-based research software for analyzing industrial power consumption data streams.

Stream processing engine	UC1	UC2	UC3	UC4
Apache Kafka Streams	✓	✓	✓	✓
Apache Flink	✓	✓	✓	✓
Hazelcast Jet	✓	✓	✓	✓
Apache Beam (Samza/Flink)	✓	✓	✓	✓

Installation

When Theodolite is installed with Helm and the default configuration, our stream processing benchmarks are installed automatically. You can verify this by running kubectl get benchmarks, which should yield something like:

NAME                AGE     STATUS
uc1-beam-flink      2d20h   Ready
uc1-beam-samza      2d20h   Ready
uc1-flink           2d20h   Ready
uc1-hazelcastjet    2d16h   Ready
uc1-kstreams        2d20h   Ready
uc2-beam-flink      2d20h   Ready
uc2-beam-samza      2d20h   Ready
uc2-flink           2d20h   Ready
uc2-hazelcastjet    2d16h   Ready
uc2-kstreams        2d20h   Ready
uc3-beam-flink      2d20h   Ready
uc3-beam-samza      2d20h   Ready
uc3-flink           2d20h   Ready
uc3-hazelcastjet    2d16h   Ready
uc3-kstreams        2d20h   Ready
uc4-beam-flink      2d20h   Ready
uc4-beam-samza      2d20h   Ready
uc4-flink           2d20h   Ready
uc4-hazelcastjet    2d16h   Ready
uc4-kstreams        2d20h   Ready
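
With 20 benchmarks in the list, checking every status by eye is error-prone. The following is a minimal sketch that counts the entries reporting Ready. It assumes the three-column layout shown above (NAME, AGE, STATUS); the function name count_ready_benchmarks is illustrative and not part of Theodolite.

```shell
#!/usr/bin/env bash
# Sketch only: count_ready_benchmarks is a hypothetical helper, not a Theodolite command.
# It reads the output of `kubectl get benchmarks` on stdin, skips the header line,
# and counts the rows whose third column (STATUS) is exactly "Ready".
count_ready_benchmarks() {
  awk 'NR > 1 && $3 == "Ready" { n++ } END { print n + 0 }'
}
```

Piping kubectl get benchmarks into count_ready_benchmarks should then print 20 for a default installation in which all benchmarks listed above are ready.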

Alternatively, all benchmarks can be found on GitHub and installed manually with kubectl apply -f <benchmark-yaml-file>. In that case, you also need to package the benchmarks’ Kubernetes resources into a ConfigMap by running:

kubectl create configmap <configmap-name-required-by-benchmark> --from-file <directory-with-benchmark-resources>

See the install-configmaps.sh script for examples.

Running Benchmarks

To run a benchmark, you need to create and apply an Execution YAML file as described in the running benchmarks documentation. Some preliminary results of our benchmarks can be found in our publication:

S. Henning and W. Hasselbring. “Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures”. In: Big Data Research 25. 2021. DOI: 10.1016/j.bdr.2021.100209.

Control the Number of Load Generator Instances

Depending on the load to be generated, the Theodolite benchmarks create multiple load generator instances. By default, a single instance generates up to 150 000 messages per second; for higher loads, correspondingly more instances are deployed. However, the load that a single instance can actually generate depends on the cluster configuration and might be lower. To change the maximum number of messages per instance, run the following commands, setting the MAX_RECORDS_PER_INSTANCE variable to the number of messages a single instance can generate in your cluster (use our Grafana dashboard to determine that value).

export MAX_RECORDS_PER_INSTANCE=150000 # Change to your desired value
kubectl patch benchmarks uc1-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
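
Because the 20 commands above differ only in the benchmark name, they can also be generated in a loop. The sketch below does that; the function names and the DRY_RUN switch are illustrative and not part of Theodolite. It also includes a small helper that estimates how many load generator instances a given load requires. That helper assumes the instance count is the load divided by the per-instance maximum, rounded up, which is one reading of the description above and not a confirmed implementation detail.

```shell
#!/usr/bin/env bash
# Sketch only: all function names and the DRY_RUN switch are hypothetical helpers.

# Print the 20 benchmark names (4 use cases x 5 engine variants), one per line.
benchmark_names() {
  local uc engine
  for uc in uc1 uc2 uc3 uc4; do
    for engine in beam-flink beam-samza flink hazelcastjet kstreams; do
      echo "${uc}-${engine}"
    done
  done
}

# Estimate the number of load generator instances for a target load.
# ASSUMPTION: ceiling division of the load by the per-instance maximum.
instances_needed() {
  local load="$1" max_per_instance="$2"
  echo $(( (load + max_per_instance - 1) / max_per_instance ))
}

# Apply the loadGenMaxRecords patch from the commands above to every benchmark.
# With DRY_RUN=1, the kubectl commands are only printed, not executed.
patch_all_benchmarks() {
  local max_records="$1" name patch
  patch="[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: ${max_records}}]"
  for name in $(benchmark_names); do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "kubectl patch benchmarks ${name} --type json --patch \"${patch}\""
    else
      kubectl patch benchmarks "${name}" --type json --patch "${patch}"
    fi
  done
}
```

For example, running DRY_RUN=1 patch_all_benchmarks 100000 prints the 20 commands for review, and patch_all_benchmarks 100000 applies them. Under the stated assumption, instances_needed 400000 150000 yields 3.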
