Performance Benchmarks

Benchmark results demonstrating the scalability of Obsrv

Cluster Size

Config Name	Config Value
Number of Nodes	4
Node Size	4 cores, 16 GB
PV size	1 TB
Installation Mode	Obsrv with monitoring and real-time storage

Processing Benchmarks

The processing benchmark is independent of the number of datasets created, so the strategy is to test at volume with all configurations enabled. Disabling any configuration should only improve throughput.

Configuration 1

Dedup turned on
De-normalization configured on 2 master datasets
Transformations configured on 2 fields
Event size of 1 KB

Results

Flink Configuration	Events per Min	Events per Hour	Events per Day
1 CPU, 1 GB, 1 task slot, 1 parallelism	~ 13k \| 13 MB	~ 750k \| 780 MB	~ 18 Million \| 18 GB
2 CPU, 2 GB, 2 task slots, 2 parallelism	~ 30k \| 30 MB	~ 1.8 Million \| 1.8 GB	~ 40 Million \| 40 GB
4 CPU, 4 GB, 4 task slots, 4 parallelism	`In Progress`	`In Progress`	`In Progress`
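
The hourly and daily figures follow from the per-minute rate and the 1 KB event size. The sketch below shows that arithmetic for the two measured rows; the function name `project_throughput` is illustrative, and decimal units (1 GB = 1,000,000 KB) are an assumption. The unrounded projections come out slightly different from the rounded values reported in the table.

```python
EVENT_SIZE_KB = 1  # event size used in Configuration 1


def project_throughput(events_per_min: int, event_size_kb: float = EVENT_SIZE_KB) -> dict:
    """Project a per-minute event rate to hourly and daily counts and daily volume."""
    per_hour = events_per_min * 60
    per_day = per_hour * 24
    return {
        "events_per_hour": per_hour,
        "events_per_day": per_day,
        # Assumes decimal units: 1 GB = 1,000,000 KB.
        "gb_per_day": per_day * event_size_kb / 1_000_000,
    }


one_slot = project_throughput(13_000)   # 1 CPU, 1 task slot row
two_slots = project_throughput(30_000)  # 2 CPU, 2 task slots row

print(one_slot)
print(two_slots)
```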

Secor Backups Benchmark

To ensure there is no data loss across the Obsrv pipeline, all data is backed up to an S3 object store. The following are the benchmark results of Secor backups running in real time.

Configuration 1

Total Secor processes - 7
Total CPU Allocated - 1.5 CPU
Event size of 1 KB

Results

Events per Min	Events per Hour	Events per Day	Events per Process
~ 1.6 Million \| 1.6 GB	~ 100 Million \| 100 GB	~ 2.4 Billion \| 2.4 TB	~ 300 Million \| 300 GB
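
As a rough cross-check of the table above, the per-minute rate can be projected to a day and divided across the 7 Secor processes. This is only a sketch: it assumes the "Events per process" column is a per-day share split evenly across processes, which the source does not state, and it uses decimal units.

```python
SECOR_PROCESSES = 7         # from Configuration 1
EVENTS_PER_MIN = 1_600_000  # measured rate from the results table
EVENT_SIZE_KB = 1

events_per_day = EVENTS_PER_MIN * 60 * 24

# Assumption: load is spread evenly across all Secor processes.
events_per_process_per_day = events_per_day / SECOR_PROCESSES

# Assumes decimal units: 1 TB = 1,000,000,000 KB.
tb_per_day = events_per_day * EVENT_SIZE_KB / 1_000_000_000

print(f"{events_per_day:,} events/day, {tb_per_day:.3f} TB/day")
print(f"{events_per_process_per_day:,.0f} events/day per process")
```

The even split gives a little over 300 million events per process per day, in line with the rounded figure in the table.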

Druid Indexing Benchmark

The Druid indexing benchmark depends on the number of datasets created and the number of aggregate tables. This benchmark was run with the minimal configuration only; throughput is expected to scale linearly with the number of CPUs provided.

Minimum Configuration

Config Name	Config Value
Process Name	Druid Indexer
CPU	0.5
Direct Memory	2Gi
Heap	9Gi
GlobalIngestionHeap	8Gi
Workers Count	30
Pod Memory	11Gi

Results

Num of Tables	Events per Min	Events per Hour	Events per Day
1	~ 80k \| 80 MB	~ 4.8 Million \| 4.8 GB	~ 110 Million \| 110 GB
2	~ 40k \| 40 MB	~ 2.4 Million \| 2.4 GB	~ 55 Million \| 55 GB
3	`In Progress`	`In Progress`	`In Progress`
4	`In Progress`	`In Progress`	`In Progress`
5	~ 35k \| 35 MB	~ 2.1 Million \| 2.1 GB	~ 50 Million \| 50 GB
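
If indexing throughput does scale linearly with CPU, as stated above, the measured rows can be used to estimate the CPU needed for a target ingestion rate. The sketch below is a back-of-the-envelope estimate only: the helper `estimate_indexer_cpu` is hypothetical, linear scaling is taken as an assumption rather than a measured result, and rows still marked `In Progress` are deliberately rejected instead of interpolated.

```python
BASELINE_CPU = 0.5  # CPU in the minimum Druid Indexer configuration

# Measured events per minute at BASELINE_CPU, keyed by number of tables.
MEASURED_EVENTS_PER_MIN = {1: 80_000, 2: 40_000, 5: 35_000}


def estimate_indexer_cpu(target_events_per_min: int, num_tables: int) -> float:
    """Estimate CPU for a target rate, assuming throughput is linear in CPU."""
    if num_tables not in MEASURED_EVENTS_PER_MIN:
        raise ValueError(f"no measured baseline for {num_tables} tables")
    baseline_rate = MEASURED_EVENTS_PER_MIN[num_tables]
    return BASELINE_CPU * target_events_per_min / baseline_rate


print(estimate_indexer_cpu(400_000, num_tables=1))
print(estimate_indexer_cpu(400_000, num_tables=2))
```

Treat the output as a starting point for capacity planning and confirm it with a benchmark run at the chosen CPU allocation.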

Query Benchmark

As with processing, the query benchmark depends on the volume of data but not on the number of datasets (or tables) created. Query performance is expected to improve linearly with the amount of CPU and memory assigned to the Druid Historical process.

Minimum Configuration

Config Name	Config Value
Process Name	Druid Historical
CPU	2
Direct Memory	4608Mi
Heap	1Gi
Pod Memory	5700Mi
Segment Size	4.77Gi
No. of rows per segment	5000000
processing.numThreads	2
processing.numMergeBuffers	6
Concurrency	100
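
The memory values in this configuration can be sanity-checked against Druid's documented sizing rule, where required direct memory is `druid.processing.buffer.sizeBytes * (druid.processing.numThreads + druid.processing.numMergeBuffers + 1)`. The sketch below works backwards from the table; the resulting processing buffer size is an inference, since the table does not list it.

```python
MIB = 1024 * 1024


def required_direct_memory(buffer_size_bytes: int, num_threads: int, num_merge_buffers: int) -> int:
    """Druid sizing rule: buffer size * (numThreads + numMergeBuffers + 1)."""
    return buffer_size_bytes * (num_threads + num_merge_buffers + 1)


# Values from the minimum configuration table.
direct_memory_bytes = 4608 * MIB
heap_bytes = 1024 * MIB  # 1Gi
pod_memory_bytes = 5700 * MIB
num_threads = 2
num_merge_buffers = 6

# Inferred, not stated in the table: the largest buffer size that fits.
implied_buffer_bytes = direct_memory_bytes // (num_threads + num_merge_buffers + 1)

# Memory left in the pod after heap and direct memory are accounted for.
headroom_bytes = pod_memory_bytes - (heap_bytes + direct_memory_bytes)

print(f"implied processing buffer: {implied_buffer_bytes // MIB} MiB")
print(f"pod headroom: {headroom_bytes // MIB} MiB")
```

Under these assumptions the direct memory divides evenly into nine buffers, which suggests the allocation was sized from this rule.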

RAW Table Results

Query	Query Interval	Throughput	Response Times in ms (Avg \| Min \| Max \| 90th)
Group by on Raw Data	1 Day	25 r/s	392 \| 80 \| 686 \| 472
Group by on Raw Data	7 Days	4 r/s	4933 \| 1277 \| 8382 \| 5154
Group by on Raw Data	30 Days	`In Progress`	`In Progress`

Aggregate (Rollup) Table Results

Query	Query Interval	Throughput	Response Times (in ms)
Group by on Aggregate Data	1 Day	`In Progress`	`In Progress`
Group by on Aggregate Data	7 Days	`In Progress`	`In Progress`
Group by on Aggregate Data	30 Days	`In Progress`	`In Progress`