- Latency is the time taken to process a single request.
- Throughput is the number of requests processed in a given time (e.g., per second).
- Latency measures delay; throughput measures capacity.
- Low latency = fast response.
- High throughput = more requests handled.
- A system can have low latency but low throughput (e.g., a single-threaded server: each request returns quickly, but only one is handled at a time).
- A system can have high throughput but high latency (e.g., a batch pipeline: many records are processed per second, but any single result takes a long time to arrive).
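A minimal Python sketch of the distinction, assuming a hypothetical `handle_request` that simulates 10 ms of I/O-bound work: adding concurrent workers raises throughput (requests per second) while leaving per-request latency roughly unchanged.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request():
    # Hypothetical request handler: simulate 10 ms of I/O-bound work.
    time.sleep(0.01)

def measure(num_requests, workers):
    """Return (average latency in seconds, throughput in requests/sec)."""
    latencies = []

    def timed():
        t0 = time.perf_counter()
        handle_request()
        latencies.append(time.perf_counter() - t0)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(num_requests):
            pool.submit(timed)
    # Exiting the `with` block waits for all submitted requests to finish.
    elapsed = time.perf_counter() - start

    avg_latency = sum(latencies) / len(latencies)
    throughput = num_requests / elapsed
    return avg_latency, throughput

# Same per-request latency (~10 ms) in both runs, but ~10x the
# throughput once 10 requests are handled concurrently.
avg_1, tp_1 = measure(num_requests=50, workers=1)
avg_10, tp_10 = measure(num_requests=50, workers=10)
```

The sleep-based handler stands in for any I/O-bound work (database call, downstream API); for CPU-bound work, threads would not improve throughput this way and processes would be the more realistic sketch.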