
Understanding API Response Times

API response times can make or break your user experience. Learn what causes slow responses, how to measure them, and how async task queues solve the problem.

AsyncQueue Team
March 15, 2026
5 min read

Every millisecond counts. Studies show that a 100ms delay in response time can reduce conversion rates by 7%. Yet most modern applications rely on external APIs that take seconds — sometimes minutes — to respond.

What Is API Response Time?

API response time is the total duration between sending a request and receiving the complete response. It includes:

  • Domain Name System (DNS) resolution: Translating the domain name to an IP address (1–50ms)
  • Transmission Control Protocol (TCP) handshake: Establishing the connection (10–100ms)
  • Transport Layer Security (TLS) negotiation: Setting up encryption (30–100ms)
  • Server processing: The actual work being done (variable)
  • Data transfer: Sending the response payload back (variable)

For most APIs, server processing dominates. A database lookup might take 5ms, while an artificial intelligence (AI) inference call could take 30 seconds.

Why Slow APIs Are a Problem

For users

Nobody wants to stare at a loading spinner. If your checkout page takes 10 seconds because the payment processor is slow, users abandon their carts.

For serverless functions

Platforms like Vercel, Netlify, and AWS Lambda have strict timeout limits:

Platform             Timeout
---------------------------------
Vercel (Hobby)       10 seconds
Vercel (Pro)         60 seconds
AWS Lambda           15 minutes
Netlify Functions    10 seconds

If your external API call exceeds these limits, the platform kills your function mid-request.

For costs

Serverless functions are billed per millisecond of execution. A function waiting 30 seconds for an API response costs 30,000x more than one that returns in 1ms.

Measuring API Response Time

Using curl

curl -o /dev/null -s \
  -w "Total: %{time_total}s\nDNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nFirst byte: %{time_starttransfer}s\n" \
  https://api.example.com/endpoint

In your application

const start = performance.now();
const response = await fetch('https://api.example.com/endpoint');
const duration = performance.now() - start;
console.log(`API response time: ${duration.toFixed(0)}ms`);

Percentiles matter more than averages

An API with a 200ms average might have a p99 of 5 seconds — meaning 1 in 100 requests takes 25x longer than expected. Always track p50, p95, and p99 latency.
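As an illustration, percentiles can be computed from raw latency samples with the nearest-rank method (the sample values below are made up):

```javascript
// Nearest-rank percentile: sort the samples, then take the value at
// rank ceil(p/100 * n), converted to a zero-based index.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Ten made-up latency samples (ms) — note the single 4800ms outlier.
const latencies = [120, 95, 210, 4800, 130, 110, 180, 150, 90, 105];
console.log(`p50: ${percentile(latencies, 50)}ms`); // typical request
console.log(`p95: ${percentile(latencies, 95)}ms`); // tail request
```

The average of these samples is skewed heavily by the one outlier, while p50 still reflects what a typical user sees — which is exactly why dashboards should plot percentiles, not means.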

Strategies for Handling Slow APIs

1. Caching

If the data doesn’t change often, cache the response. Redis, Memcached, or HTTP cache headers can eliminate repeated slow calls.

Best for: Read-heavy endpoints with stable data.
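As a rough sketch, even a tiny in-memory cache with a time-to-live (TTL) can absorb repeated slow calls; production systems would typically reach for Redis or HTTP cache headers instead. The factory shape and injected fetcher here are illustrative:

```javascript
// Minimal in-memory cache with a time-to-live (TTL), keyed by URL.
// `fetcher` defaults to the global fetch; it is injectable so the
// cache logic can be exercised without a network.
function makeCachedFetch(fetcher = fetch, ttlMs = 60_000) {
  const cache = new Map();
  return async function cachedFetch(url) {
    const hit = cache.get(url);
    if (hit && Date.now() - hit.storedAt < ttlMs) {
      return hit.value; // fresh entry: skip the slow call entirely
    }
    const value = await fetcher(url); // slow path: actually call the API
    cache.set(url, { value, storedAt: Date.now() });
    return value;
  };
}
```

Note this sketch never evicts expired entries, so a long-running process would also want a size cap or periodic cleanup.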

2. Timeouts and circuit breakers

Set aggressive timeouts and fail fast. A circuit breaker pattern stops calling a failing service after a threshold of errors.

Best for: Protecting your application from cascading failures.
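A minimal sketch of both ideas, using only the Fetch API and no library — `fetchWithTimeout` aborts a request past its deadline, and `makeBreaker` wraps any async call (the names and thresholds are illustrative):

```javascript
// Fail fast: abort a fetch that exceeds `timeoutMs`.
async function fetchWithTimeout(url, timeoutMs = 2000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clean up the pending timer
  }
}

// Minimal circuit breaker: after `threshold` consecutive failures,
// reject immediately for `cooldownMs` instead of calling the service.
function makeBreaker(fn, threshold = 5, cooldownMs = 30_000) {
  let failures = 0;
  let openedAt = 0;
  return async (...args) => {
    if (failures >= threshold && Date.now() - openedAt < cooldownMs) {
      throw new Error('circuit open: failing fast');
    }
    try {
      const result = await fn(...args);
      failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      failures += 1;
      if (failures >= threshold) openedAt = Date.now();
      throw err;
    }
  };
}
```

The two compose naturally: wrap `fetchWithTimeout` in `makeBreaker` so a service that keeps timing out stops being called at all until the cooldown elapses.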

3. Async task queues

For operations where you need the result but not right away, offload the API call to a task queue. Your application responds instantly while the queue handles the slow call in the background.

// Instead of this (blocks for 30+ seconds):
const result = await callSlowAPI(data);
return Response.json(result);

// Do this (responds in ~50ms):
const task = await asyncqueue.tasks.create({
  callbackUrl: 'https://slow-api.example.com/process',
  payload: data,
  webhookUrl: 'https://your-app.com/api/on-complete',
  retries: 3,
});
return Response.json({ taskId: task.id, status: 'processing' });

Best for: Long-running operations, file processing, AI/machine learning (ML) inference, payment processing.
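The task-queue pattern also needs a receiving side: an endpoint where the finished result lands. A sketch of such a handler, assuming the queue POSTs JSON shaped like `{ taskId, status, result }` — the payload shape and the `saveResult` helper are illustrative, not a documented AsyncQueue contract:

```javascript
// Sketch of the completion webhook the task queue calls back into.
// `saveResult` is a hypothetical persistence helper, injected here so
// the handler stays framework-agnostic and easy to test.
async function handleCompletion(request, saveResult) {
  const { taskId, status, result } = await request.json();
  if (status === 'failed') {
    console.error(`Task ${taskId} failed after all retries`);
  } else {
    await saveResult(taskId, result); // persist for later retrieval
  }
  // Always acknowledge with 200 so the queue does not re-deliver.
  return Response.json({ received: true });
}
```

In a route file you would wire this up to your framework's request object and a real datastore, and verify a webhook signature before trusting the payload.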

4. Parallel requests

If you need data from multiple APIs, call them all at once instead of one by one:

// Sequential: 200ms + 300ms + 150ms = 650ms
const a = await fetchA();
const b = await fetchB();
const c = await fetchC();

// Parallel: max(200ms, 300ms, 150ms) = 300ms
const [a, b, c] = await Promise.all([fetchA(), fetchB(), fetchC()]);

Best for: Multiple independent API calls in a single request handler.
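One caveat with `Promise.all`: it rejects as soon as any single call fails, discarding the results that did succeed. When partial results are acceptable, `Promise.allSettled` waits for every call and reports each outcome individually. A small sketch (the wrapper name and result shape are illustrative):

```javascript
// Run independent async tasks in parallel, tolerating partial failure.
// Each entry in the returned array records whether that task succeeded.
async function fetchAllSettled(tasks) {
  const outcomes = await Promise.allSettled(tasks.map((t) => t()));
  return outcomes.map((o) =>
    o.status === 'fulfilled'
      ? { ok: true, value: o.value }
      : { ok: false, error: o.reason }
  );
}
```

This lets a dashboard render the three widgets whose APIs responded and show an error state only for the one that didn't.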

When to Use a Task Queue

A task queue is the right choice when:

  • The API response takes longer than your platform’s timeout allows
  • The user doesn’t need the result right away
  • You need automatic retries if the API fails
  • You want to track and log every call attempt
  • You’re paying per-millisecond and want to minimize costs

AsyncQueue is built for this pattern. Your function creates a task in under 50ms, and AsyncQueue handles the slow API call — with retries, logging, and result storage — regardless of duration.

Conclusion

Slow APIs are a fact of life. The key is choosing the right strategy for each situation: cache what you can, timeout what you must, parallelize where possible, and offload everything else to an async task queue.

Your users get instant responses. Your serverless functions stay within platform limits. And your bill stays under control.
