Back of the envelope estimation hacks

May 19, 2020

There are two types of engineers: those who can quickly do estimates and those who can’t. Are the former just smarter, or is there more to it?

Enrico Fermi was a master at estimation

Back of the envelope estimation is a skill that can be learned with practice. It becomes easier once you get familiar with some tricks of the trade.

Know your numbers

How fast can you read data from the disk? How quickly from the network? You should be familiar with ballpark performance figures for your components. What matters is not the exact numbers but how they differ from one another in orders of magnitude.
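To make the idea of relative differences concrete, here is a minimal sketch that compares a few commonly quoted ballpark latencies by order of magnitude. The figures in `BALLPARK_LATENCY_NS` are rough assumptions for illustration, not measurements from this post; real values depend heavily on your hardware and network, so measure your own components.

```python
import math

# Rough, commonly quoted ballpark latencies in nanoseconds.
# Illustrative assumptions only; actual values vary by hardware.
BALLPARK_LATENCY_NS = {
    "main memory reference": 100,
    "SSD random read": 100_000,
    "round trip within a datacenter": 500_000,
    "spinning disk seek": 10_000_000,
    "intercontinental round trip": 150_000_000,
}

def orders_of_magnitude_apart(a_ns: float, b_ns: float) -> int:
    """Return how many powers of 10 separate two latencies (rounded)."""
    return round(math.log10(b_ns / a_ns))

baseline = BALLPARK_LATENCY_NS["main memory reference"]
for name, latency_ns in BALLPARK_LATENCY_NS.items():
    gap = orders_of_magnitude_apart(baseline, latency_ns)
    print(f"{name}: ~10^{gap}x a memory reference")
```

Even if each figure is off by a factor of two or three, the ordering and the gaps between them stay roughly the same, and that is what an estimate relies on.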

Approximate with powers

The goal of estimation is not to get an exact answer, but to get one in the right ballpark. Using powers of 2 or 10 makes multiplication easy because, in logarithmic space, multiplications become additions.

For example, let’s say you have 800K users watching videos in UHD at 12 Mbps. A machine in your Content Delivery Network can egress at 1 Gbps. How many machines do you need?

Approximating 800K users with $10^{6}$, 12 Mbps with $10^{7}$ bps, and 1 Gbps with $10^{9}$ bps yields:

$$\frac{10^{6} \times 10^{7}\,\text{bps}}{10^{9}\,\text{bps}} = 10^{(6 + 7 - 9)} = 10^{4} = 10\text{K}$$

Adding 6 and 7 and then subtracting 9 is much easier than trying to get to the exact answer:

$$\frac{0.8\text{M} \times 12\,\text{Mbps}}{1000\,\text{Mbps}} = 9.6\text{K}$$

Does the difference really matter? Not really; the precise answer is not exact anyway. The number of users varies over time, the network bandwidth is not constant, etc.
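The same calculation can be sketched in a few lines of Python, rounding each input to its nearest power of 10 and adding exponents. The helper name `nearest_power_of_10` is just an illustration, not part of any library.

```python
import math

def nearest_power_of_10(x: float) -> int:
    """Return the exponent of the power of 10 closest to x on a log scale."""
    return round(math.log10(x))

users = 800_000              # 800K concurrent viewers
bitrate_bps = 12_000_000     # 12 Mbps per UHD stream
egress_bps = 1_000_000_000   # 1 Gbps per CDN machine

# Multiplying and dividing powers of 10 means adding and subtracting exponents.
exponent = (
    nearest_power_of_10(users)
    + nearest_power_of_10(bitrate_bps)
    - nearest_power_of_10(egress_bps)
)
estimate = 10 ** exponent

# The exact figure, for comparison.
exact = users * bitrate_bps / egress_bps

print(f"estimate: 10^{exponent} = {estimate} machines")
print(f"exact:    {exact:.0f} machines")
```

Both results land in the same ballpark, which is all the estimate needs to do.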

The Rule of 72

The rule of 72 is a method to estimate how long it will take for a quantity to double if it grows at a certain percentage rate.

$$\text{time} = \frac{72}{\text{rate}}$$

For example, let’s say the traffic to your web service is increasing by approximately 10% weekly. How long will it take to double?

$$\text{time} = \frac{72}{10} = 7.2$$

It will take approximately 7 weeks for the traffic to double.
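To see how close the rule gets, the sketch below compares it with the exact doubling time under compound growth, $\ln 2 / \ln(1 + r)$. That formula is standard compound-growth math rather than something taken from this post, and both function names are illustrative.

```python
import math

def doubling_time_rule_of_72(rate_percent: float) -> float:
    """Estimate the number of periods to double using the rule of 72."""
    return 72 / rate_percent

def doubling_time_exact(rate_percent: float) -> float:
    """Exact number of periods to double under compound growth."""
    return math.log(2) / math.log(1 + rate_percent / 100)

for rate in (2, 5, 10, 20):
    estimate = doubling_time_rule_of_72(rate)
    exact = doubling_time_exact(rate)
    print(f"{rate:>2}% per period: rule of 72 = {estimate:.1f}, exact = {exact:.1f}")
```

The rule is a good fit for single-digit and low double-digit rates and drifts as the rate grows, so treat it as a ballpark tool only.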

If you combine the rule of 72 with the powers of two, then you can quickly find out how long it will take for a quantity to increase by several orders of magnitude.
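As a worked example, assume the 10% weekly growth from the example above holds steady, which real traffic rarely does. Since $2^{10} = 1024 \approx 10^{3}$, growing by three orders of magnitude takes roughly ten doublings:

$$\text{time to grow } 1000\times \approx 10 \text{ doublings} \times 7.2 \text{ weeks} = 72 \text{ weeks}$$

That is close to the exact compound-growth figure, $\ln(1000) / \ln(1.1) \approx 72.5$ weeks.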

Little’s Law

A queue can be modeled with three parameters:

  • the average rate at which new items arrive, $\lambda$,
  • the average time an item spends in the queue, $W$,
  • and the average number of items in the queue, $L$.

Lots of things can be modeled as queues. A web service can be seen as a queue, for example. The request rate is the rate at which new items arrive. The time it takes for a request to be processed is the time an item spends in the queue. Finally, the number of concurrently processed requests is the number of items in the queue.

Wouldn’t it be great if you could derive one of the three parameters from the other two? It turns out there is a law that relates these three quantities to one another! It’s called Little’s Law:

$$L = \lambda \times W$$

What it says is that the average number of items in the queue equals the average rate at which new items arrive, multiplied by the average time an item spends in the queue.

Let’s try it out. Let’s say you have a service that takes on average 100 ms to process a request. It’s currently receiving about two million requests per second (rps). How many requests are being processed concurrently?

$$\text{requests} = 2\text{M rps} \times 0.1\,\text{s} = 200\text{K}$$

The service is processing 200K requests concurrently. If each request is CPU-heavy and requires a thread, then we will need about 200K threads. If we are using 8-core machines and each core runs one such thread at a time, then to keep up, we will need about 200K / 8 = 25K machines.
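Here is a minimal sketch of the same calculation, along with Little’s Law solved for the time in the queue. The function names are illustrative, and `machines_needed` bakes in the assumption that each core handles exactly one in-flight request, which is a simplification.

```python
import math

def avg_items(arrival_rate: float, avg_time: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate * avg_time

def avg_time_in_queue(items: float, arrival_rate: float) -> float:
    """Little's Law solved for W: W = L / lambda."""
    return items / arrival_rate

def machines_needed(concurrent_requests: float, cores_per_machine: int) -> int:
    """Machines required, assuming one in-flight request per core."""
    # Round to whole requests first so float noise cannot add a machine.
    return math.ceil(round(concurrent_requests) / cores_per_machine)

request_rate = 2_000_000   # requests per second
latency_s = 0.1            # 100 ms average processing time

concurrent = avg_items(request_rate, latency_s)
machines = machines_needed(concurrent, cores_per_machine=8)
print(f"{concurrent:.0f} concurrent requests, about {machines} machines")
```

The same relation works in the other direction: if you can observe the concurrency and the request rate, `avg_time_in_queue` gives you the average latency without measuring individual requests.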

Remember this

Estimation is a vital skill for an engineer. It’s something you can get better at by practicing and using the hacks I have presented in this post:

  • Get familiar with your components’ performance numbers.
  • Approximate with powers of 2 or 10 to transform multiplications into additions.
  • Use the rule of 72 to find out how long it takes for a quantity to double given its growth rate.
  • Model your systems as queues and leverage Little’s Law.

Written by Roberto Vitillo
