Don't trust default timeouts

May 03, 2020

Modern applications don’t crash; they hang. One of the main reasons is the assumption that the network is reliable. It isn’t.

When you make a network call without setting a timeout, you are telling your code that you are 100% confident that the call is going to succeed. Would you really take that bet?

If you make a synchronous network call that never returns, then at the very least your thread is blocked forever. Whoops. Asynchronous network calls that don’t return are not free either. Sure, you are not hogging threads, but you are leaking sockets. Any HTTP client library worth its salt uses socket pools to avoid recreating connections. And those pools have a limited capacity. Like any other resource leak, it’s only a matter of time until there are no sockets left. When that happens, your application is going to get stuck waiting for a connection to free up.
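To make the leak concrete, here is a minimal sketch of a bounded pool in Python. The ConnectionPool class is a toy stand-in I made up for illustration, not how any particular HTTP client implements its pool; it only models the "limited capacity" part with a semaphore.

```python
import threading


class ConnectionPool:
    """Toy stand-in for an HTTP client's bounded socket pool (hypothetical)."""

    def __init__(self, capacity):
        self._slots = threading.BoundedSemaphore(capacity)

    def acquire(self, wait_seconds):
        """Take a slot; return False if none frees up within wait_seconds."""
        return self._slots.acquire(timeout=wait_seconds)

    def release(self):
        """Give a slot back to the pool."""
        self._slots.release()


pool = ConnectionPool(capacity=2)

# Two requests go out and never return, so their slots are never released.
pool.acquire(wait_seconds=0.1)
pool.acquire(wait_seconds=0.1)

# The pool is now exhausted. Without a bound on the wait,
# this call would block forever.
third_got_slot = pool.acquire(wait_seconds=0.1)
print(third_got_slot)  # False
```

Had the two hung requests timed out and released their slots, the third caller would have been served.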

If the network is not reliable, why do we keep creating APIs that have infinity as the default timeout? Some APIs don’t even have a way to set a timeout in the first place! A good API should be easy to use the right way and hard to use the wrong way. When the default timeout is infinity, it’s all too easy for a client to shoot itself in the foot.

If you remember one thing from this post, then let it be this: never use “infinity” as a default timeout.

Let’s take a look at some concrete examples.

JavaScript’s XMLHttpRequest is THE web API to retrieve data from a server asynchronously. Its default timeout is zero, which means there is no timeout!

var xhr = new XMLHttpRequest();
xhr.open('GET', '/api', true);

// No timeout by default! Set one explicitly (in milliseconds).
xhr.timeout = 10000;

xhr.onload = function () {
 // Request finished
};

xhr.ontimeout = function (e) {
 // Request timed out
};

xhr.send(null);

Client-side timeouts are as crucial as server-side ones. There is a maximum number of sockets your browser can open for a particular host. If you make network requests that never return, you are going to exhaust the socket pool. When the pool is exhausted, you are no longer able to connect to the host.

The Promise-based fetch web API is a modern replacement for the XMLHttpRequest API. When the API was initially introduced, there was no way to set a timeout at all! Browsers have recently added experimental support for the AbortController API to support timeouts, though.

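const controller = new AbortController();
const signal = controller.signal;

// No timeout by default! Abort the request after 10 seconds.
const timer = setTimeout(() => controller.abort(), 10000);

fetch(url, { signal })
  .then(response => {
    // Request finished
  })
  .catch(error => {
    if (error.name === 'AbortError') {
      // Request timed out
    }
  })
  .finally(() => clearTimeout(timer));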

Things aren’t much rosier in Python-land. The requests library uses a default timeout of infinity.

import requests

# No timeout by default! The value is in seconds.
response = requests.get('https://github.com/', timeout=10)
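To see what a timeout actually buys you, here is a small experiment using only Python's standard library. It is a sketch under a few assumptions: the helper names start_silent_server and get_status are mine, and the "server" is just a local listening socket that never accepts or answers, which is enough on common operating systems for the client to connect and then wait for a reply that never comes.

```python
import http.client
import socket


def start_silent_server():
    """Listen on a free local port and never answer (hypothetical helper)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)  # connections queue up, but nobody ever replies
    return listener


def get_status(host, port, timeout_seconds):
    """Return the HTTP status code, or None if the call timed out."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    except socket.timeout:
        # Raised when connecting or reading takes longer than the timeout.
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    listener = start_silent_server()
    host, port = listener.getsockname()
    # Gives up after roughly half a second instead of hanging forever.
    print(get_status(host, port, timeout_seconds=0.5))
    listener.close()
```

Drop the timeout argument and the same call sits there for as long as the server stays silent.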

What about Go? Go’s HTTP package doesn’t use timeouts by default either.

var client = &http.Client{
  // No timeout by default!
  Timeout: time.Second * 10, 
}

response, _ := client.Get(url)

Modern HTTP clients for Java and .NET do a much better job and usually come with default timeouts. For example, .NET Core’s HttpClient has a default timeout of 100 seconds. It’s lax but much better than no timeout at all. That comes as no surprise since those languages are used to build large-scale distributed systems that need to be robust against network failures. Network requests without timeouts are the top silent killer of distributed systems.

Remember this

As a rule of thumb, always set timeouts when making network calls. And if you build libraries, always set reasonable default timeouts and make them configurable for your clients.
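If you are on the library side of that advice, one way it could look in Python is sketched below. ApiClient and DEFAULT_TIMEOUT_SECONDS are hypothetical names, and the 10-second default is only an illustrative value. The idea is that callers get a finite timeout without doing anything, can override it per client or per call, and cannot opt back into infinity.

```python
import http.client

DEFAULT_TIMEOUT_SECONDS = 10.0  # illustrative value, not a recommendation


class ApiClient:
    """Hypothetical HTTP client wrapper with a finite default timeout."""

    def __init__(self, host, port=80, timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
        self.host = host
        self.port = port
        self.timeout_seconds = self._validate(timeout_seconds)

    @staticmethod
    def _validate(timeout_seconds):
        # Refuse "infinity" in all its spellings: None, zero, negative, inf.
        if timeout_seconds is None:
            raise ValueError("a finite timeout is required")
        timeout_seconds = float(timeout_seconds)
        if not 0 < timeout_seconds < float("inf"):
            raise ValueError("timeout must be a positive, finite number")
        return timeout_seconds

    def get_status(self, path, timeout_seconds=None):
        """GET `path`; a per-call timeout overrides the client default."""
        if timeout_seconds is None:
            timeout = self.timeout_seconds
        else:
            timeout = self._validate(timeout_seconds)
        conn = http.client.HTTPConnection(self.host, self.port, timeout=timeout)
        try:
            conn.request("GET", path)
            return conn.getresponse().status
        finally:
            conn.close()
```

A caller who writes ApiClient("example.com") gets the default, while ApiClient("example.com", timeout_seconds=2) tightens it. Passing None or float("inf") to the constructor raises a ValueError instead of silently waiting forever.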

Do you want to learn more about stability patterns and anti-patterns of distributed systems? Check out my upcoming book.