Latency
Latency is the time a request takes to travel from client to server, get processed, and return a response. In API development, latency directly shapes user experience — every additional millisecond of delay means slower interfaces, lower conversion rates, and frustrated users.
Components of API Latency
Total latency is the sum of several stages:
| Stage | Typical Duration |
|---|---|
| DNS resolution | 1–50ms |
| TCP connection | 10–100ms |
| Transport Layer Security (TLS) handshake | 30–100ms |
| Server processing | 1ms–minutes |
| Data transfer | 1–500ms |
For most APIs, server processing time dominates. A database query taking 500ms dwarfs the 50ms of network overhead.
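Because server processing usually dominates, it pays to instrument individual stages inside a handler. A minimal sketch, assuming nothing about your framework (the `timeStage` helper and the simulated database query are illustrative, not part of any library):

```javascript
// Time an async stage in milliseconds using Node's high-resolution clock.
async function timeStage(label, fn) {
  const start = process.hrtime.bigint();
  const result = await fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${elapsedMs.toFixed(1)}ms`);
  return result;
}

// Example: compare a simulated 100ms database query to total handler time.
async function handler() {
  return timeStage('total', () =>
    timeStage('db query', () =>
      new Promise((resolve) => setTimeout(() => resolve(['row']), 100))
    )
  );
}

handler();
```

Logging per-stage timings like this quickly shows whether the database, an upstream call, or serialization is the bottleneck.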
Measuring Latency: Percentiles
Averages hide problems. Use percentiles:
- p50 (median): Half of requests complete faster than this value
- p95: 95% of requests are faster; a user who makes 20 requests in a session will likely hit this latency at least once
- p99: 1 in 100 requests is slower; this catches tail latency
An API with a 100ms average might have a p99 of 3 seconds — meaning 1% of users wait 30x longer than expected.
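These percentiles are easy to compute with the nearest-rank method. A short sketch (the sample latencies are made up to illustrate a tail):

```javascript
// Nearest-rank percentile: sort the samples, then take the value
// at index ceil(p/100 * n) - 1.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// 100 simulated request durations: 98 fast, 2 slow tail requests.
const latencies = Array.from({ length: 98 }, (_, i) => 80 + i); // 80..177ms
latencies.push(3000, 3000);

console.log(percentile(latencies, 50)); // 129 (median looks healthy)
console.log(percentile(latencies, 99)); // 3000 (the tail the average hides)
```

The average of this sample is about 186ms, which looks fine; only the p99 reveals that some users wait 3 seconds.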
How Async Processing Reduces Latency
The fastest API response is one that skips slow operations entirely:
```javascript
// High latency: ~9 seconds (waits for everything)
app.post('/onboard', async (req, res) => {
  const user = await createUser(req.body); // 50ms
  await sendEmail(user);                   // 2,000ms
  await syncCRM(user);                     // 3,000ms
  await generateAvatar(user);              // 4,000ms
  res.json({ user });                      // Total: ~9,050ms
});
```
```javascript
// Low latency: ~80ms (offloads slow work)
app.post('/onboard', async (req, res) => {
  const user = await createUser(req.body); // 50ms
  await queue.add('onboard-tasks', user);  // 30ms
  res.json({ user });                      // Total: ~80ms
});
```
By offloading slow operations to a task queue, the user-facing latency drops from seconds to milliseconds.
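The `queue` above could be any job queue (BullMQ, SQS, and similar systems are common choices). To illustrate the pattern alone, here is a toy in-memory version; it is a sketch, not something to use in production, since jobs are lost if the process dies:

```javascript
// Toy in-memory task queue (illustrative only; real systems need a
// durable store so queued jobs survive a crash or restart).
const jobs = [];

const queue = {
  add(name, payload) {
    jobs.push({ name, payload });
    setImmediate(processJobs); // run slow work off the request path
    return Promise.resolve();  // enqueueing itself is fast
  },
};

async function processJobs() {
  while (jobs.length > 0) {
    const job = jobs.shift();
    // Slow onboarding tasks (email, CRM sync, avatar) would run here.
    console.log(`processing ${job.name} for ${job.payload.id}`);
  }
}

// Usage: the handler enqueues and responds immediately.
queue.add('onboard-tasks', { id: 'user-1' });
```

The key property is that `queue.add` resolves in milliseconds while the actual work runs later, which is exactly what drops the handler's latency from ~9 seconds to ~80ms.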