Latency vs Throughput

Both latency and throughput are performance metrics, but they measure very different aspects of a system, especially in networking, APIs, and distributed systems.


📌 1. What is Latency?

Latency is the time delay between a request and its response.

  • 🕐 Measured in: milliseconds (ms) or microseconds (µs)
  • 🔍 Focus: Speed of a single operation
  • 📦 Example: Time taken from clicking a video play button to the video starting.

✅ Low latency = faster response
🚫 High latency causes lags or slow reactions
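Latency is measured by timing a single operation end to end. A minimal sketch, assuming a hypothetical `handle_request()` that stands in for real work (here simulating roughly 50 ms of processing):

```python
import time

def handle_request():
    # Hypothetical stand-in for a real operation (e.g. an API call).
    time.sleep(0.05)  # simulate ~50 ms of work

# Time one request from start to response.
start = time.perf_counter()
handle_request()
latency_ms = (time.perf_counter() - start) * 1000
print(f"latency: {latency_ms:.1f} ms")
```

In practice you would measure many requests and report percentiles (p50, p99), since a single sample hides variability.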


📌 2. What is Throughput?

Throughput is the amount of data or number of operations handled per second.

  • 📏 Measured in: requests/sec, MBps, Mbps
  • 🔍 Focus: Volume over time
  • 📦 Example: Number of video chunks delivered per second in a stream.

✅ High throughput = more data processed
🚫 Low throughput = bottlenecks under load
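Throughput, by contrast, is measured by counting how many operations complete within a window of time. A minimal sketch, reusing the same hypothetical `handle_request()` (here simulating roughly 10 ms of work per request):

```python
import time

def handle_request():
    time.sleep(0.01)  # simulate ~10 ms of work per request

# Count completed requests over a fixed measurement window.
completed = 0
start = time.perf_counter()
while time.perf_counter() - start < 0.5:  # measure for half a second
    handle_request()
    completed += 1
elapsed = time.perf_counter() - start
throughput = completed / elapsed
print(f"throughput: {throughput:.0f} requests/sec")
```

This sequential loop caps throughput at 1/latency; real servers raise throughput beyond that by handling requests concurrently.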


🧠 Analogy:

Latency = How fast a single person can get through the drive-thru.
Throughput = How many people the drive-thru can serve in a minute.


βš–οΈ Side-by-Side Comparison:

| Metric | Latency | Throughput |
| --- | --- | --- |
| Definition | Time per request | Requests per unit time |
| Goal | Minimize | Maximize |
| Unit | ms, µs | req/sec, MBps, Mbps |
| Use Case | Real-time apps, gaming | Data-intensive apps, streaming |
| Indicator | Speed of response | Capacity under load |

🚦 Example in an API:

  • A fast API with 50 ms latency but only 10 requests/sec = low throughput.
  • A batch-processing service with 1 s latency but 1,000 requests/sec = high throughput.
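The two metrics are linked by concurrency through Little's Law: the average number of requests in flight equals throughput multiplied by latency. A quick sketch using the illustrative numbers above:

```python
# Little's Law: avg requests in flight = throughput * latency.
scenarios = {
    "fast API": {"latency_s": 0.05, "throughput_rps": 10},
    "batch service": {"latency_s": 1.0, "throughput_rps": 1000},
}

in_flight = {}
for name, m in scenarios.items():
    # Average number of requests being processed at any moment.
    in_flight[name] = m["throughput_rps"] * m["latency_s"]
    print(f"{name}: ~{in_flight[name]:g} requests in flight on average")
```

So the fast API keeps only ~0.5 requests in flight, while the batch service sustains ~1,000 at once: high throughput despite much higher latency.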