Calculating TPS & Resource Estimations


1) Basic definition (the “count & divide” way)

TPS = (number of completed transactions) ÷ (time window in seconds)
Use a steady window (exclude warm-up/cool-down) and count successful transactions separately from failures.

Worked example (steady window)

  • Test ran 10 minutes (= 600 s), completed 120,000 transactions → raw TPS = 120000 ÷ 600 = 200 TPS.
  • You exclude the first 2 minutes and last 1 minute to keep only the steady 7 minutes (= 420 s).
  • In that 7-minute slice you count 105,000 completed transactions → TPS = 105000 ÷ 420 = 250 TPS.
  • If 2% of those failed, TPS(success only) = 250 × 0.98 = 245 TPS.
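
A minimal Python sketch of the count-and-divide math above, using the worked example's numbers (the helper name is illustrative):

  # Count-and-divide TPS over a steady window.
  def tps(completed: int, window_seconds: float, success_ratio: float = 1.0) -> float:
      return completed / window_seconds * success_ratio

  print(tps(120_000, 600))          # 200.0 — raw TPS over the full 10 minutes
  print(tps(105_000, 420))          # 250.0 — steady 7-minute slice
  print(tps(105_000, 420, 0.98))    # 245.0 — success-only TPS with a 2% failure rate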

2) Using Little’s Law (from concurrency & latency)

When you know average in-flight requests and average latency, you can infer TPS:

TPS ≈ Concurrency ÷ Average Latency (seconds)

Worked example

  • Avg in-flight = 800 requests.
  • Avg end-to-end latency = 80 ms = 0.08 s.
  • TPS ≈ 800 ÷ 0.08 = 10,000 TPS.
    (Tip: use average latency for this method; using p90/p95 gives a conservative estimate.)
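
As a sketch, the same inference in Python; the 800 in-flight and 0.08 s values come from the example, while the 0.12 s line is an assumed p95 just to show the conservative variant:

  # Little's Law: throughput ≈ in-flight requests ÷ average latency.
  def tps_from_littles_law(avg_in_flight: float, avg_latency_s: float) -> float:
      return avg_in_flight / avg_latency_s

  print(tps_from_littles_law(800, 0.08))   # 10000.0 TPS using average latency
  print(tps_from_littles_law(800, 0.12))   # ≈ 6667 TPS if you plug in an assumed p95 instead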

3) Mixed request types (weighted latency)

If you have light & heavy transactions, compute a weighted average latency.

Worked example

  • Mix: 70% light @ 30 ms, 30% heavy @ 120 ms.
  • Weighted avg latency = 0.7×0.030 + 0.3×0.120 = 0.021 + 0.036 = 0.057 s.
  • With concurrency 500, TPS ≈ 500 ÷ 0.057 ≈ 8,772 TPS.
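
A small sketch of the weighted-latency calculation, with the traffic shares and latencies from the example (the function name is illustrative):

  # Weighted average latency for a mixed workload, then Little's Law.
  def weighted_latency(mix: list[tuple[float, float]]) -> float:
      # mix = [(share_of_traffic, latency_seconds), ...]; shares should sum to 1.0
      return sum(share * latency for share, latency in mix)

  lat = weighted_latency([(0.7, 0.030), (0.3, 0.120)])   # 0.057 s
  print(500 / lat)                                       # ≈ 8772 TPS at concurrency 500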

4) Scaling out: per-instance → cluster TPS

Measure per instance first, then multiply by instances and apply a safety factor.

Worked example

  • One pod sustains 1,500 TPS at safe utilization.
  • You run 12 pods, and keep a 20% headroom (×0.8).
  • Cluster TPS ≈ 1,500 × 12 × 0.8 = 14,400 TPS.
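
The same scale-out math as a tiny helper (a sketch; the 20% headroom is expressed as a 0.8 factor):

  # Per-instance TPS × instance count × headroom factor.
  def cluster_tps(per_instance_tps: float, instances: int, headroom: float = 0.8) -> float:
      return per_instance_tps * instances * headroom

  print(cluster_tps(1_500, 12))   # 14400.0 TPS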

5) From TPS to Kafka capacity (handy for planning)

Once you know TPS, translate to data rate and partitions.

Worked example

  • TPS = 20,000; avg payload 500 B; ≈ 250 B on the wire after compression.
  • Broker ingest ≈ 20,000 × 250 = 5,000,000 B/s ≈ 5 MB/s (~4.77 MiB/s).
  • Partitions needed (rule-of-thumb):
    • Target ~1,000 msgs/s per partition → ~20 partitions.
    • If you prefer ~500 msgs/s/partition → ~40 partitions.
      (Adjust for your broker hardware and replication settings.)
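
A rough sizing helper for this step (rule-of-thumb only; it takes the already-compressed wire bytes per message as input):

  import math

  # Translate TPS into wire rate (MB/s) and a rough partition count.
  def kafka_sizing(tps: float, wire_bytes_per_msg: float, target_msgs_per_partition: float):
      wire_mbps = tps * wire_bytes_per_msg / 1_000_000
      partitions = math.ceil(tps / target_msgs_per_partition)
      return wire_mbps, partitions

  print(kafka_sizing(20_000, 250, 1_000))   # (5.0 MB/s, 20 partitions)
  print(kafka_sizing(20_000, 250, 500))     # (5.0 MB/s, 40 partitions)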

6) Quick checklist to get a reliable TPS number

  • Pick a steady 5–10 min window from your test.
  • Count completed (and successful) transactions only.
  • Report: TPS (success), error rate, p95 latency, and CPU/memory for context.
  • Cross-check with Little’s Law to ensure numbers are consistent.

Plug-and-play mini-templates

Basic:
TPS = successes_in_window / seconds_in_window

Little’s Law:
TPS ≈ average_in_flight / average_latency_seconds

Data rate:
MB/s ≈ TPS × avg_payload_bytes / 1,000,000

Kafka partitions (rough):
partitions ≈ TPS / target_msgs_per_partition_per_sec


Step-by-Step Resource Estimation

1) Define inputs

  • T = target TPS (success path)
  • L = avg (or p50) service latency in seconds (for capacity math; keep p95 for SLOs)
  • B = avg payload size (bytes) before compression
  • CR = compression ratio on producer (e.g., 2× → wire = B/2)
  • D_r/D_w = DB reads/writes per transaction (if DB is in scope)
  • H = cache hit rate for reads (0–1)
  • cpu_ms = CPU milliseconds consumed per transaction (measure with a quick load test)
  • U = planned CPU utilization cap (e.g., 0.6–0.7)
  • P_target = target msgs/sec per Kafka partition (500–2,000 typical)
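
If you want to carry these inputs around in code, a small Python dataclass like the sketch below keeps the symbols straight (field names simply mirror the list above):

  from dataclasses import dataclass

  @dataclass
  class EstimationInputs:
      T: float          # target TPS (success path)
      L: float          # avg service latency, seconds
      B: float          # avg payload bytes before compression
      CR: float         # compression ratio (2.0 means wire = B/2)
      D_r: float        # DB reads per transaction
      D_w: float        # DB writes per transaction
      H: float          # cache hit rate, 0–1
      cpu_ms: float     # CPU milliseconds per transaction
      U: float          # planned CPU utilization cap (e.g., 0.6–0.7)
      P_target: float   # target msgs/s per Kafka partition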


2) Concurrency (Little’s Law)

Concurrent_in_flight ≈ T × L

Use this to size threads/event-loops, connection pools, and buffer depths.


3) CPU cores / pods

Cores_needed ≈ (T × cpu_ms) / (1000 × U)
Pods ≈ ceil(Cores_needed / vCPU_per_pod)

Tip: Get cpu_ms by running a 2–5 min load at a known TPS and sampling CPU% on a single pod.
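
A sketch combining the two formulas, plus one way to back out cpu_ms from such a load test (the 2 vCPU / 50% CPU / 500 TPS sample figures are assumptions, not measurements from this article):

  import math

  # Cores and pods from the CPU cost of one transaction.
  def cores_needed(tps: float, cpu_ms_per_tx: float, util_cap: float) -> float:
      return (tps * cpu_ms_per_tx) / (1000 * util_cap)

  def pods_needed(cores: float, vcpu_per_pod: float) -> int:
      return math.ceil(cores / vcpu_per_pod)

  # Backing out cpu_ms from a short load test on one pod (assumed sample numbers):
  # a 2 vCPU pod averaging 50% CPU while sustaining 500 TPS.
  cpu_ms = (0.50 * 2 * 1000) / 500            # = 2.0 ms of CPU per transaction
  cores = cores_needed(10_000, cpu_ms, 0.6)   # ≈ 33.3 cores at a 0.6 utilization cap
  print(cores, pods_needed(cores, 2))         # → 17 pods of 2 vCPU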


4) Memory

  • Blocking model: threads ≈ concurrency → memory = baseline + threads × (thread stack + per-request heap).
  • Async (Netty/WebFlux/Vert.x): memory dominated by request buffers & queues.

A quick safe bound:

Mem_needed ≈ Baseline + (Concurrent_in_flight × bytes_per_inflight)

Where bytes_per_inflight is typically 16–64 KB for lean payloads. Add 30% headroom.
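
A quick sketch of that bound in Python, following the fill-in template's convention of applying the 1.3 headroom factor to the in-flight term (the baseline and per-request sizes are assumed example values):

  # Rough memory bound: baseline + in-flight buffers × headroom.
  def memory_bytes(baseline_bytes: float, in_flight: float,
                   bytes_per_inflight: float, headroom: float = 1.3) -> float:
      return baseline_bytes + in_flight * bytes_per_inflight * headroom

  # 300 MB baseline, 800 in-flight requests at 32 KB each:
  print(memory_bytes(300e6, 800, 32 * 1024) / 1e6)   # ≈ 334 MB before splitting across pods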


5) Kafka throughput & partitions

Wire_Bps ≈ T × (B / CR)
Wire_MBps ≈ Wire_Bps / 1,000,000
Partitions ≈ ceil( T / P_target )

Start with RF=3, min.insync.replicas=2, compression zstd/lz4.
Broker count: aim for ≤ ~200 partitions per broker and ample NIC headroom; usually 3–5 brokers to start.
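
A sketch that strings these together, including a minimum broker count derived from the ≤ ~200 partitions-per-broker guideline (that broker bound is one way to read the rule, not a hard formula):

  import math

  # Wire rate, partition count, and a rough minimum broker count.
  def kafka_plan(tps: float, payload_bytes: float, compression_ratio: float,
                 target_msgs_per_partition: float, rf: int = 3,
                 max_partitions_per_broker: int = 200):
      wire_mbps = tps * (payload_bytes / compression_ratio) / 1_000_000
      partitions = math.ceil(tps / target_msgs_per_partition)
      min_brokers = max(rf, math.ceil(partitions * rf / max_partitions_per_broker))
      return wire_mbps, partitions, min_brokers

  print(kafka_plan(20_000, 500, 2, 1_000))   # (5.0 MB/s, 20 partitions, 3 brokers minimum)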


6) Network (service side)

Ingress_MBps ≈ T × (req_bytes + resp_bytes) / 1e6
Egress_to_Kafka_MBps ≈ Wire_MBps
NIC headroom ≥ 2× peak (bursts, retries)


7) Database & cache (if in path or using Outbox)

DB_read_QPS  ≈ T × D_r × (1 - H)
DB_write_QPS ≈ T × D_w
DB_total_QPS ≈ DB_read_QPS + DB_write_QPS

DB connection pool size (rule-of-thumb):

Conn_needed ≈ (DB_read_QPS × avg_read_time_s) + (DB_write_QPS × avg_write_time_s)

(That’s concurrent DB work.) Split across pods; add 20–30% headroom.
If you need atomic DB+Kafka, use Transactional Outbox (and size outbox writer similarly).
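
The same formulas as a small helper (a sketch; the 1.3 factor is the 30% headroom mentioned above):

  import math

  # DB load and a rule-of-thumb connection-pool size.
  def db_sizing(tps: float, reads_per_tx: float, writes_per_tx: float, cache_hit: float,
                avg_read_s: float, avg_write_s: float, headroom: float = 1.3):
      read_qps = tps * reads_per_tx * (1 - cache_hit)
      write_qps = tps * writes_per_tx
      pool = math.ceil((read_qps * avg_read_s + write_qps * avg_write_s) * headroom)
      return read_qps, write_qps, pool

  print(db_sizing(10_000, 2, 1, 0.6, 0.005, 0.010))   # (8000.0, 10000, 182 connections)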


8) Queues & backpressure

  • In-process buffer (bounded):
    buffer_capacity ≥ expected_burst_tps × burst_duration_s
  • Shed load at high watermark (429) and increase linger.ms slightly to batch more.
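
As an illustration of the bounded-buffer idea, a minimal in-process sketch that sizes the queue from an assumed burst and rejects new work at a high watermark so the caller can answer 429 (the burst figures and the 90% watermark are assumptions):

  from queue import Full, Queue

  EXPECTED_BURST_TPS = 5_000                        # assumed burst rate
  BURST_DURATION_S = 2                              # assumed burst length
  CAPACITY = EXPECTED_BURST_TPS * BURST_DURATION_S
  HIGH_WATERMARK = int(CAPACITY * 0.9)

  buffer: Queue = Queue(maxsize=CAPACITY)

  def try_enqueue(event) -> bool:
      """Return False once the buffer is ~90% full; the caller should respond 429."""
      if buffer.qsize() >= HIGH_WATERMARK:
          return False
      try:
          buffer.put_nowait(event)
          return True
      except Full:
          return False

  print(try_enqueue({"txn_id": 1}))   # True while the buffer is below the watermark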


9) Autoscaling signals

  • HPA on CPU% and producer queue depth (custom metric).
  • Alert on producer error rate, retries, batch.size.avg, and p95 latency.


Worked Example A — Payments Ingest (10k TPS)

Assumptions
T=10,000 TPS, L=0.08 s, B=400 B, CR=2×, cpu_ms=2.0, U=0.6, P_target=1,000 msg/s,
DB per tx: D_r=2, D_w=1, cache hit H=0.6, avg_read=5 ms, avg_write=10 ms.

Concurrency
= 10,000 × 0.08 = 800 in-flight

CPU / Pods
Cores = (10,000 × 2) / (1000 × 0.6) = 33.3 → ~34 cores
If pod = 2 vCPU → Pods ≈ ceil(34/2) = 17

Memory (async, 32 KB/req)
Req bytes ≈ 800 × 32 KB ≈ 25.6 MB; add baseline ~300 MB/pod and headroom.
Per pod request share ≈ 25.6/17 ≈ 1.5 MB; so 512–768 MB per pod is ample → set 1–2 GiB for safety.

Kafka
Wire rate = 10,000 × (400/2) = 2,000,000 B/s ≈ 2 MB/s
Partitions = ceil(10,000 / 1,000) = 10 → start with 24 for headroom & hot-key smoothing.
Brokers: 3 (RF=3). Plenty of capacity for 2 MB/s.

Network
Client I/O (say req+resp ~1.2 KB): 10,000 × 1,200 / 1e6 ≈ 12 MB/s ingress
Kafka egress: ~2 MB/s. The per-pod NIC budget is tiny, and cluster-level bandwidth is comfortable.

Database
Reads to DB: 10k × 2 × (1-0.6) = 8,000 QPS
Writes: 10k × 1 = 10,000 QPS → 18k QPS total
Concurrent DB work: 8,000×0.005 + 10,000×0.010 = 40 + 100 = 140 conns
Add 30% → ~180 connections across 17 pods → ~10–12 per pod.
(If this is too high for your DB, increase cache hit, use read replicas, or move to Outbox.)
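
To sanity-check the arithmetic, here is a self-contained Python run of Example A's figures (variable names are illustrative; the connection count prints as 182 before rounding down to "~180"):

  import math

  T, L, B, CR, cpu_ms, U, P_target = 10_000, 0.08, 400, 2, 2.0, 0.6, 1_000
  D_r, D_w, H, read_s, write_s = 2, 1, 0.6, 0.005, 0.010

  in_flight = T * L                                  # 800 in-flight
  cores = (T * cpu_ms) / (1000 * U)                  # ≈ 33.3 cores
  pods = math.ceil(cores / 2)                        # 17 pods of 2 vCPU
  wire_mbps = T * (B / CR) / 1e6                     # 2.0 MB/s to Kafka
  partitions = math.ceil(T / P_target)               # 10 (bumped to 24 for headroom)
  db_read_qps = T * D_r * (1 - H)                    # 8,000 read QPS
  db_write_qps = T * D_w                             # 10,000 write QPS
  db_conns = math.ceil((db_read_qps * read_s + db_write_qps * write_s) * 1.3)  # 182

  print(in_flight, cores, pods, wire_mbps, partitions, db_read_qps, db_write_qps, db_conns)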


Worked Example B — Telemetry (100k TPS, Kafka-only)

Assumptions
T=100,000, L=0.04 s, B=200 B, CR=2×, cpu_ms=0.5, U=0.65, P_target=1,000

Concurrency
= 100,000 × 0.04 = 4,000

CPU / Pods
Cores = (100,000 × 0.5) / (1000 × 0.65) ≈ 76.9 → ~77 cores
If pod = 8 vCPU → Pods ≈ ceil(77/8) = 10 (leave 1–2 extra for headroom → run 12)

Memory (async, 16 KB/req)
Req bytes ≈ 4,000 × 16 KB = 64 MB (+ baseline). With 12 pods that’s ~5–6 MB per pod plus baseline → 512–1024 MB per pod is plenty.

Kafka
Wire rate = 100,000 × (200/2) = 10,000,000 B/s ≈ 10 MB/s
Partitions = ceil(100,000 / 1,000) = 100 → choose 128 partitions.
Brokers: 4–5 brokers (RF=3), keeping partitions/broker ≤ ~200 and NIC headroom.

Network
Kafka egress ≈ 10 MB/s; client ingress depends on request size, but typically < 20–30 MB/s total. Comfortable on 10–25 GbE.


Quick “fill-in” template

Inputs:
T=____ TPS, L=____ s, B=____ B, CR=____×, cpu_ms=____, U=____
D_r=____, D_w=____, H=____, P_target=____ msg/s

Concurrency = T × L = ______
Cores = (T × cpu_ms) / (1000 × U) = ______ → Pods = ______ (_____ vCPU each)
Mem ≈ Baseline + (Concurrency × bytes_per_inflight) × 1.3 = ______
Wire_MBps_to_Kafka = T × (B/CR) / 1e6 = ______
Partitions = ceil(T / P_target) = ______ → pick ______ (headroom)
DB_read_QPS = T × D_r × (1-H) = ______
DB_write_QPS = T × D_w = ______
DB_concurrency ≈ (read_QPS×read_s + write_QPS×write_s) × 1.3 = ______ → pool size

