Latency vs Throughput
Both latency and throughput are performance metrics, but they measure very different aspects of a system, especially in networking, APIs, and distributed systems.
1. What is Latency?
Latency is the time delay between a request and its response.
- Measured in: milliseconds (ms) or microseconds (µs)
- Focus: speed of a single operation
- Example: time taken from clicking a video's play button to the video starting.
Low latency = faster response.
High latency causes lag or slow reactions.
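Latency for a single operation can be measured by timing one request end to end. This is a minimal sketch; `handle_request` is a hypothetical stand-in for a real network call, with its delay simulated by `time.sleep`:

```python
import time

def timed(fn, *args):
    """Call fn once and return (result, latency in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms

# Hypothetical operation standing in for an API/network request.
def handle_request(x):
    time.sleep(0.05)  # simulate ~50 ms of work
    return x * 2

result, latency_ms = timed(handle_request, 21)
print(f"result={result}, latency={latency_ms:.1f} ms")
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic clock intended for measuring short intervals.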
2. What is Throughput?
Throughput is the amount of data or number of operations handled per second.
- Measured in: requests/sec, MBps, Mbps
- Focus: volume over time
- Example: number of video chunks delivered per second in a stream.
High throughput = more data processed.
Low throughput = bottlenecks under load.
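Throughput, by contrast, is measured over a window: run operations for a fixed duration and count completions. A minimal sketch, reusing a hypothetical `handle_request` with simulated work:

```python
import time

def handle_request(x):
    time.sleep(0.001)  # simulate ~1 ms of work per request
    return x * 2

# Run requests back-to-back for a fixed window and count completions.
window_s = 0.5
done = 0
start = time.perf_counter()
while time.perf_counter() - start < window_s:
    handle_request(done)
    done += 1
elapsed = time.perf_counter() - start
throughput = done / elapsed
print(f"{done} requests in {elapsed:.2f} s -> {throughput:.0f} req/sec")
```

Note that this serial loop ties throughput directly to per-request latency; real services raise throughput further by handling many requests concurrently.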
Analogy:
Latency = How fast a single person can get through the drive-thru.
Throughput = How many people the drive-thru can serve in a minute.
Side-by-Side Comparison:
| Metric | Latency | Throughput |
|---|---|---|
| Definition | Time per request | Requests per unit time |
| Goal | Minimize | Maximize |
| Unit | ms, µs | req/sec, MBps, Mbps |
| Use Case | Real-time apps, gaming | Data-intensive apps, streaming |
| Indicator | Speed of response | Capacity under load |
Example in an API:
- A fast API with 50 ms latency but only 10 requests/sec: low latency, low throughput.
- A batch-processing service with 1 s latency but 1,000 requests/sec: high latency, high throughput.
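These two quantities are connected through concurrency by Little's Law: in steady state, throughput ≈ in-flight requests / latency. A minimal sketch (the concurrency figures below are illustrative assumptions, not from the examples above):

```python
# Little's Law in steady state: throughput = concurrency / latency.
def steady_state_throughput(concurrency, latency_s):
    """Requests/sec a system sustains with `concurrency` requests in flight."""
    return concurrency / latency_s

# One request at a time at 50 ms latency caps throughput at 20 req/sec:
print(steady_state_throughput(1, 0.050))    # -> 20.0
# A 1 s latency service with 1,000 requests in flight reaches 1,000 req/sec:
print(steady_state_throughput(1000, 1.0))   # -> 1000.0
```

This is why a batch service can have far higher throughput than a "fast" API: it trades per-request latency for many requests in flight.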