Latency vs Throughput
Both latency and throughput are performance metrics, but they measure very different aspects of a system, especially in networking, APIs, and distributed systems.
1. What is Latency?
Latency is the time delay between a request and its response.
- Measured in: milliseconds (ms) or microseconds (µs)
- Focus: speed of a single operation
- Example: time taken from clicking a video's play button to the video starting.
Low latency = faster response.
High latency causes lag or slow reactions.
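Latency for a single operation can be measured by timing one request end to end. This is a minimal sketch; `handle_request` is a hypothetical stand-in for a real network call, with its delay simulated by `time.sleep`:

```python
import time

def timed(fn, *args):
    """Call fn once and return (result, latency in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms

# Hypothetical operation standing in for an API/network request.
def handle_request(x):
    time.sleep(0.05)  # simulate ~50 ms of work
    return x * 2

result, latency_ms = timed(handle_request, 21)
print(f"result={result}, latency={latency_ms:.1f} ms")
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic clock intended for measuring short intervals.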
2. What is Throughput?
Throughput is the amount of data or number of operations handled per second.
- Measured in: requests/sec, MBps, Mbps
- Focus: volume over time
- Example: number of video chunks delivered per second in a stream.
High throughput = more data processed.
Low throughput = bottlenecks under load.
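Throughput, by contrast, is measured over a window: run operations for a fixed duration and count completions. A minimal sketch, reusing a hypothetical `handle_request` with simulated work:

```python
import time

def handle_request(x):
    time.sleep(0.001)  # simulate ~1 ms of work per request
    return x * 2

# Run requests back-to-back for a fixed window and count completions.
window_s = 0.5
done = 0
start = time.perf_counter()
while time.perf_counter() - start < window_s:
    handle_request(done)
    done += 1
elapsed = time.perf_counter() - start
throughput = done / elapsed
print(f"{done} requests in {elapsed:.2f} s -> {throughput:.0f} req/sec")
```

Note that this serial loop ties throughput directly to per-request latency; real services raise throughput further by handling many requests concurrently.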
Analogy:
Latency = How fast a single person can get through the drive-thru.
Throughput = How many people the drive-thru can serve in a minute.
Side-by-Side Comparison:
| Metric | Latency | Throughput |
|---|---|---|
| Definition | Time per request | Requests per unit time |
| Goal | Minimize | Maximize |
| Unit | ms, µs | req/sec, MBps, Mbps |
| Use Case | Real-time apps, gaming | Data-intensive apps, streaming |
| Indicator | Speed of response | Capacity under load |
Example in an API:
- A fast API with 50 ms latency but only 10 requests/sec: low latency, low throughput.
- A batch-processing service with 1 s latency but 1,000 requests/sec: high latency, high throughput.
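These two quantities are connected through concurrency by Little's Law: in steady state, throughput ≈ in-flight requests / latency. A minimal sketch (the concurrency figures below are illustrative assumptions, not from the examples above):

```python
# Little's Law in steady state: throughput = concurrency / latency.
def steady_state_throughput(concurrency, latency_s):
    """Requests/sec a system sustains with `concurrency` requests in flight."""
    return concurrency / latency_s

# One request at a time at 50 ms latency caps throughput at 20 req/sec:
print(steady_state_throughput(1, 0.050))    # -> 20.0
# A 1 s latency service with 1,000 requests in flight reaches 1,000 req/sec:
print(steady_state_throughput(1000, 1.0))   # -> 1000.0
```

This is why a batch service can have far higher throughput than a "fast" API: it trades per-request latency for many requests in flight.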