Calculating TPS & Resource Estimations
1) Basic definition (the “count & divide” way)
TPS = (number of completed transactions) ÷ (time window in seconds)
Use a steady window (exclude warm-up/cool-down) and count successful transactions separately from failures.
Worked example (steady window)
- Test ran 10 minutes (= 600 s) and completed 120,000 transactions → raw TPS = 120,000 ÷ 600 = 200 TPS.
- Exclude the first 2 minutes and the last 1 minute to keep only the steady 7 minutes (= 420 s).
- In that 7-minute slice you see 105,000 completed transactions → TPS = 105,000 ÷ 420 = 250 TPS.
- If 2% of those failed, TPS(success only) = 250 × 0.98 = 245 TPS.
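The same count-and-divide arithmetic as a quick sketch, plugging in the numbers from this example (the function name and inputs are illustrative):

```python
# TPS from a steady measurement window (count & divide).
def tps_from_window(completed: int, window_seconds: float, success_ratio: float = 1.0) -> float:
    """Raw TPS over the window, optionally scaled to successes only."""
    return completed / window_seconds * success_ratio

raw = tps_from_window(120_000, 600)                               # 200.0 TPS over the full 10 minutes
steady = tps_from_window(105_000, 420)                            # 250.0 TPS in the steady 7-minute slice
success_only = tps_from_window(105_000, 420, success_ratio=0.98)  # 245.0 TPS
print(raw, steady, success_only)
```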
2) Using Little’s Law (from concurrency & latency)
When you know average in-flight requests and average latency, you can infer TPS:
TPS ≈ Concurrency ÷ Average Latency (seconds)
Worked example
- Avg in-flight = 800 requests.
- Avg end-to-end latency = 80 ms = 0.08 s.
- TPS ≈ 800 ÷ 0.08 = 10,000 TPS.
(Tip: use average latency for this method; using p90/p95 gives a conservative estimate.)
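A minimal sketch of the Little's Law estimate; the 120 ms figure in the second call is an invented p95 value, included only to show how a higher latency yields a more conservative TPS:

```python
# Little's Law: TPS ≈ average in-flight requests / average latency (seconds).
def tps_little(avg_in_flight: float, avg_latency_s: float) -> float:
    return avg_in_flight / avg_latency_s

print(tps_little(800, 0.080))  # 10000.0 TPS
# Using a higher latency (e.g. a hypothetical 120 ms p95) gives a more conservative estimate:
print(tps_little(800, 0.120))  # ≈ 6,667 TPS
```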
3) Mixed request types (weighted latency)
If you have light & heavy transactions, compute a weighted average latency.
Worked example
- Mix: 70% light @ 30 ms, 30% heavy @ 120 ms.
- Weighted avg latency = 0.7×0.030 + 0.3×0.120 = 0.021 + 0.036 = 0.057 s.
- With concurrency 500, TPS ≈ 500 ÷ 0.057 ≈ 8,772 TPS.
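The same weighted-latency calculation as a sketch (the mix list mirrors the example above):

```python
# Weighted average latency for a mix of light and heavy transactions, then Little's Law.
mix = [(0.70, 0.030), (0.30, 0.120)]  # (traffic share, latency in seconds)

weighted_latency = sum(share * latency for share, latency in mix)  # 0.057 s
tps = 500 / weighted_latency                                       # ≈ 8,772 TPS at concurrency 500
print(round(weighted_latency, 3), round(tps))
```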
4) Scaling out: per-instance → cluster TPS
Measure per instance first, then multiply by instances and apply a safety factor.
Worked example
- One pod sustains 1,500 TPS at safe utilization.
- You run 12 pods, and keep a 20% headroom (×0.8).
- Cluster TPS ≈ 1,500 × 12 × 0.8 = 14,400 TPS.
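A small helper for the scale-out math, assuming a simple multiplicative headroom factor as in the example:

```python
# Scale a measured per-instance TPS to a cluster, keeping a headroom factor.
def cluster_tps(per_instance_tps: float, instances: int, headroom_factor: float = 0.8) -> float:
    return per_instance_tps * instances * headroom_factor

print(cluster_tps(1_500, 12))  # 14400.0 TPS with 20% headroom
```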
5) From TPS to Kafka capacity (handy for planning)
Once you know TPS, translate to data rate and partitions.
Worked example
- TPS = 20,000; avg payload 500 B; compression ≈ 2× → 250 B on the wire.
- Broker ingest ≈ 20,000 × 250 = 5,000,000 B/s ≈ 5 MB/s (~4.77 MiB/s).
- Partitions needed (rule of thumb):
- Target ~1,000 msgs/s per partition → ~20 partitions.
- If you prefer ~500 msgs/s/partition → ~40 partitions.
(Adjust for your broker hardware and replication settings.)
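The Kafka planning arithmetic from this example as a sketch; the ~2× compression and the per-partition targets are the assumptions stated above:

```python
import math

# Translate TPS into broker ingest and a rough partition count.
tps = 20_000
wire_bytes = 500 / 2            # 500 B payload, ~2x compression -> ~250 B on the wire

ingest_bps = tps * wire_bytes
print(ingest_bps / 1e6, "MB/s", round(ingest_bps / 2**20, 2), "MiB/s")  # 5.0 MB/s, ~4.77 MiB/s

for target in (1_000, 500):     # target msgs/s per partition
    print(target, "->", math.ceil(tps / target), "partitions")
```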
6) Quick checklist to get a reliable TPS number
- Pick a steady 5–10 min window from your test.
- Count completed (and successful) transactions only.
- Report: TPS (success), error rate, p95 latency, and CPU/memory for context.
- Cross-check with Little’s Law to ensure numbers are consistent.
Plug-and-play mini-templates
Basic: TPS = successes_in_window / seconds_in_window
Little's Law: TPS ≈ average_in_flight / average_latency_seconds
Data rate: MB/s ≈ TPS × avg_payload_bytes / 1,000,000
Kafka partitions (rough): partitions ≈ TPS / target_msgs_per_partition_per_sec
Step-by-Step Resource Estimation
1) Define inputs
- T = target TPS (success path)
- L = avg (or p50) service latency in seconds (for capacity math; keep p95 for SLOs)
- B = avg payload size (bytes) before compression
- CR = compression ratio on producer (e.g., 2× → wire = B/2)
- D_r/D_w = DB reads/writes per transaction (if DB is in scope)
- H = cache hit rate for reads (0–1)
- cpu_ms = CPU milliseconds consumed per transaction (measure with a quick load test)
- U = planned CPU utilization cap (e.g., 0.6–0.7)
- P_target = target msgs/sec per Kafka partition (500–2,000 typical)
2) Concurrency (Little’s Law)
Concurrent_in_flight ≈ T × L
Use this to size threads/event-loops, connection pools, and buffer depths.
3) CPU cores / pods
Cores_needed ≈ (T × cpu_ms) / (1000 × U)
Pods ≈ ceil(Cores_needed / vCPU_per_pod)
Tip: Get cpu_ms by running a 2–5 min load at a known TPS and sampling CPU% on a single pod.
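A sketch covering steps 2 and 3 together (function names are illustrative; the sample inputs are the ones used in Worked Example A below):

```python
import math

# Steps 2-3: in-flight concurrency (Little's Law) and CPU cores/pods from per-tx CPU cost.
def concurrency(tps: float, latency_s: float) -> float:
    return tps * latency_s

def cores_needed(tps: float, cpu_ms_per_tx: float, utilization_cap: float) -> float:
    return (tps * cpu_ms_per_tx) / (1000 * utilization_cap)

def pods_needed(cores: float, vcpu_per_pod: int) -> int:
    return math.ceil(cores / vcpu_per_pod)

print(concurrency(10_000, 0.08))                   # 800.0 in-flight
cores = math.ceil(cores_needed(10_000, 2.0, 0.6))  # 33.3 -> 34 cores
print(cores, pods_needed(cores, 2))                # 34 cores -> 17 pods at 2 vCPU each
```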
4) Memory
- Blocking model: threads ≈ concurrency → memory = baseline + threads × (stack size + per-request heap).
- Async (Netty/WebFlux/Vert.x): memory dominated by request buffers & queues.
A quick safe bound:
Mem_needed ≈ Baseline + (Concurrent_in_flight × bytes_per_inflight)
where bytes_per_inflight is typically 16–64 KB for lean payloads. Add 30% headroom.
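A quick sketch of this memory bound, applying the 30% headroom to the in-flight portion as the fill-in template at the end does; the baseline and 32 KB per-request figures are sample values:

```python
# Step 4: quick memory bound for an async service (baseline and per-request bytes are sample values).
def memory_bound_mb(baseline_mb: float, in_flight: float, bytes_per_inflight: int,
                    headroom: float = 1.3) -> float:
    return baseline_mb + in_flight * bytes_per_inflight * headroom / 1e6

print(memory_bound_mb(300, 800, 32_000))  # ≈ 333 MB for 800 in-flight requests at 32 KB each
```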
5) Kafka throughput & partitions
Wire_Bps ≈ T × (B / CR)
Wire_MBps ≈ Wire_Bps / 1,000,000
Partitions ≈ ceil( T / P_target )
Start with RF=3, min.insync.replicas=2, compression zstd/lz4.
Broker count: aim for ≤ ~200 partitions per broker and ample NIC headroom; usually 3–5 brokers to start.
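A sketch of this step under the stated rules of thumb; the broker figure here is only the coarse partitions-per-broker bound plus the RF=3 minimum, not a full broker sizing:

```python
import math

# Step 5: wire rate to Kafka plus a starting partition and broker count (coarse rules of thumb).
def kafka_sizing(tps: float, payload_bytes: float, compression_ratio: float,
                 target_msgs_per_partition: float, max_partitions_per_broker: int = 200):
    wire_mbps = tps * (payload_bytes / compression_ratio) / 1e6
    partitions = math.ceil(tps / target_msgs_per_partition)
    brokers = max(3, math.ceil(partitions / max_partitions_per_broker))  # RF=3 needs at least 3 brokers
    return wire_mbps, partitions, brokers

print(kafka_sizing(20_000, 500, 2, 1_000))   # (5.0 MB/s, 20 partitions, 3 brokers)
```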
6) Network (service side)
Ingress_MBps ≈ T × (req_bytes + resp_bytes) / 1e6
Egress_to_Kafka_MBps ≈ Wire_MBps
NIC headroom ≥ 2× peak (bursts, retries)
7) Database & cache (if in path or using Outbox)
DB_read_QPS ≈ T × D_r × (1 - H)
DB_write_QPS ≈ T × D_w
DB_total_QPS ≈ DB_read_QPS + DB_write_QPS
DB connection pool size (rule-of-thumb):
Conn_needed ≈ (DB_read_QPS × avg_read_time_s) + (DB_write_QPS × avg_write_time_s)
(That’s concurrent DB work.) Split across pods; add 20–30% headroom.
If you need atomic DB+Kafka, use Transactional Outbox (and size outbox writer similarly).
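The DB math as a sketch; the sample call uses Worked Example A's inputs so the pool size can be checked against that example:

```python
# Step 7: database QPS and a rule-of-thumb connection pool size.
def db_sizing(tps, reads_per_tx, writes_per_tx, cache_hit,
              avg_read_s, avg_write_s, headroom=1.3):
    read_qps = tps * reads_per_tx * (1 - cache_hit)
    write_qps = tps * writes_per_tx
    concurrent_work = read_qps * avg_read_s + write_qps * avg_write_s
    return read_qps, write_qps, round(concurrent_work * headroom)

# Worked Example A inputs: D_r=2, D_w=1, H=0.6, 5 ms reads, 10 ms writes at 10k TPS.
print(db_sizing(10_000, 2, 1, 0.6, 0.005, 0.010))  # (8000.0, 10000.0, 182) -> ~180 connections
```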
8) Queues & backpressure
- In-process buffer (bounded):
buffer_capacity ≥ expected_burst_tps × burst_duration_s
- Shed load at the high watermark (return 429) and increase linger.ms slightly to batch more.
9) Autoscaling signals
- HPA on CPU% and producer queue depth (custom metric).
- Alert on producer error rate, retries, batch.size.avg, and p95 latency.
Worked Example A — Payments Ingest (10k TPS)
Assumptions
T=10,000 TPS, L=0.08 s, B=400 B, CR=2×, cpu_ms=2.0, U=0.6, P_target=1,000 msg/s.
DB per tx: D_r=2, D_w=1, cache hit H=0.6, avg_read=5 ms, avg_write=10 ms.
Concurrency = 10,000 × 0.08 = 800 in-flight
CPU / Pods
Cores = (10,000 × 2) / (1000 × 0.6) = 33.3 → ~34 cores
If pod = 2 vCPU → Pods ≈ ceil(34/2) = 17
Memory (async, 32 KB/req)
Req bytes ≈ 800 × 32 KB ≈ 25.6 MB; add baseline ~300 MB/pod and headroom.
Per-pod request share ≈ 25.6/17 ≈ 1.5 MB, so 512–768 MB per pod is ample → set 1–2 GiB for safety.
Kafka
Wire rate = 10,000 × (400/2) = 2,000,000 B/s ≈ 2 MB/s
Partitions = ceil(10,000 / 1,000) = 10 → start with 24 for headroom & hot-key smoothing.
Brokers: 3 (RF=3). Plenty of capacity for 2 MB/s.
Network
Client I/O (say req+resp ~1.2 KB): 10,000 × 1,200 / 1e6 ≈ 12 MB/s ingress
Kafka egress: ~2 MB/s. NIC budget per pod is tiny; cluster-level is fine.
Database
Reads to DB: 10k × 2 × (1-0.6) = 8,000 QPS
Writes: 10k × 1 = 10,000 QPS → 18k QPS total
Concurrent DB work: 8,000×0.005 + 10,000×0.010 = 40 + 100 = 140 conns
Add 30% → ~180 connections across 17 pods → ~10–12 per pod.
(If this is too high for your DB, increase the cache hit rate, add read replicas, or move to the Outbox pattern.)
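A short script that re-derives this example's headline numbers from the assumptions, handy as a sanity check (variable names mirror the inputs above):

```python
import math

# Cross-check of Worked Example A's headline numbers.
T, L, B, CR = 10_000, 0.08, 400, 2
cpu_ms, U, vcpu_per_pod = 2.0, 0.6, 2
P_target = 1_000

in_flight = T * L                              # 800 concurrent requests
cores = math.ceil((T * cpu_ms) / (1000 * U))   # 33.3 -> 34 cores
pods = math.ceil(cores / vcpu_per_pod)         # 17 pods at 2 vCPU
wire_mbps = T * (B / CR) / 1e6                 # 2.0 MB/s to Kafka
partitions = math.ceil(T / P_target)           # 10 (the example starts with 24 for headroom)

print(in_flight, cores, pods, wire_mbps, partitions)
```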
Worked Example B — Telemetry (100k TPS, Kafka-only)
Assumptions
T=100,000 TPS, L=0.04 s, B=200 B, CR=2×, cpu_ms=0.5, U=0.65, P_target=1,000 msg/s.
Concurrency = 100,000 × 0.04 = 4,000 in-flight
CPU / Pods
Cores = (100,000 × 0.5) / (1000 × 0.65) ≈ 76.9 → ~77 cores
If pod = 8 vCPU → Pods ≈ ceil(77/8) = 10 (leave 1–2 extra for headroom → run 12)
Memory (async, 16 KB/req)
Req bytes ≈ 4,000 × 16 KB = 64 MB (+ baseline). With 12 pods that's ~5–6 MB per pod plus baseline → 512–1024 MB per pod is plenty.
Kafka
Wire rate = 100,000 × (200/2) = 10,000,000 B/s ≈ 10 MB/s
Partitions = ceil(100,000 / 1,000) = 100 → choose 128 partitions.
Brokers: 4–5 brokers (RF=3), keeping partitions/broker ≤ ~200 and NIC headroom.
Network
Kafka egress ≈ 10 MB/s; client ingress depends on request size, but typically < 20–30 MB/s total. Comfortable on 10–25GbE.
Quick “fill-in” template
Inputs:
T=____ TPS, L=____ s, B=____ B, CR=____×, cpu_ms=____, U=____
D_r=____, D_w=____, H=____, P_target=____ msg/s
Concurrency = T × L = ______
Cores = (T × cpu_ms) / (1000 × U) = ______ → Pods = ______ (_____ vCPU each)
Mem ≈ Baseline + (Concurrency × bytes_per_inflight) × 1.3 = ______
Wire_MBps_to_Kafka = T × (B/CR) / 1e6 = ______
Partitions = ceil(T / P_target) = ______ → pick ______ (headroom)
DB_read_QPS = T × D_r × (1-H) = ______
DB_write_QPS = T × D_w = ______
DB_concurrency ≈ (read_QPS×read_s + write_QPS×write_s) × 1.3 = ______ → pool size
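If it helps, here is the fill-in template wrapped into a single function. Parameter names mirror the inputs above; the baseline memory and bytes-per-in-flight defaults are assumptions, and the sample call is Worked Example A:

```python
import math

# The fill-in template as one reusable function (sketch, not a definitive sizing tool).
def estimate(T, L, B, CR, cpu_ms, U, vcpu_per_pod,
             D_r, D_w, H, read_s, write_s, P_target,
             baseline_mb=300, bytes_per_inflight=32_000):
    concurrency = T * L
    cores = (T * cpu_ms) / (1000 * U)
    pods = math.ceil(cores / vcpu_per_pod)
    mem_mb = baseline_mb + concurrency * bytes_per_inflight * 1.3 / 1e6
    wire_mbps = T * (B / CR) / 1e6
    partitions = math.ceil(T / P_target)
    db_read_qps = T * D_r * (1 - H)
    db_write_qps = T * D_w
    db_pool = round((db_read_qps * read_s + db_write_qps * write_s) * 1.3)
    return {
        "concurrency": concurrency, "cores": round(cores, 1), "pods": pods,
        "memory_mb": round(mem_mb), "kafka_mbps": wire_mbps, "partitions": partitions,
        "db_read_qps": db_read_qps, "db_write_qps": db_write_qps, "db_pool": db_pool,
    }

# Worked Example A as the sample inputs:
print(estimate(T=10_000, L=0.08, B=400, CR=2, cpu_ms=2.0, U=0.6, vcpu_per_pod=2,
               D_r=2, D_w=1, H=0.6, read_s=0.005, write_s=0.010, P_target=1_000))
```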