The system design interview is a great way to assess a candidate’s seniority. You can find a lot on the Internet on how to prepare for the design interview or which system design interview questions to expect, but very little on how to conduct an actual interview.
In this post, I will describe my experience interviewing senior candidates for distributed systems roles. If you stumbled on this piece because you want to know how to ace the design interview, you should keep reading - a good way to prepare for the interview is to know how interviewers think about it.
A word of advice before I continue: this type of interview is best suited for interviewing engineers who have some experience working in the industry. Engineers straight out of school, or with only a few years of experience, don’t have the skills necessary to ace this interview yet.
A system design interview entails asking a candidate to design a system that solves a specific problem. The interview, which typically takes about an hour, is mostly an open-ended conversation - the goal is to get a sense of the candidate’s experience and problem-solving skills while working with them as if they were your coworker.
You want to pick a system you worked on in the past, or at the very least, know well. Ideally, it should be something relevant to what your team or your company does so that the candidate gets the chance to see the kind of projects they will be working on if they join your team. Don’t forget that it’s not just you evaluating them - they are also evaluating you.
Another reason to pick a system you know well is that you have spent a considerable amount of time internalizing its solution space, scalability limitations, tradeoffs, and failure modes. It’s not only easier for you to prepare for the interview, but it also gives you the ability to compare their solution to what you, or your team, came up with. And who knows, they might even give you some insights that you missed - I call this a win-win.
Whatever you do, don’t ask them how to design a Twitter clone (unless you work at Twitter), or any other typical design interview question. When you are in the hotel booking business and ask the candidate to design a Twitter clone during the interview, you are telling them that there is nothing interesting going on where you work. It’s a big red flag for a senior engineer, and rightly so.
“But what if I don’t have a Twitter-scale problem?” - I hear you ask. That’s perfectly fine, you don’t need to have millions of users to reason about scalability - plus, it’s just one of the many challenges to solve when designing a distributed system.
Spend the first minutes talking to the candidate to get a sense of their past experience, their strengths, and their weaknesses. Then, pick a design question you prepared before the interview and present it to the candidate. Draw a basic sketch of the system you want them to architect on the whiteboard - this will help make them feel more relaxed as empty whiteboards can make people nervous.
When presenting the system they should design, start with a small scale and make sure the candidate is on the same page - there will be time later to dig into scalability. Also, don’t hash out all the requirements in detail, as you want the candidate to ask questions to see whether they can get a clear picture of what they have been asked to design before jumping into the solution.
Finally, ask them how they would design the system. But don’t just sit silently in a corner: talk to the candidate and try to understand why they are making the decisions they are and how they are approaching the problem. At this stage, there is no need to deep dive into a specific component of the design or technology - go broad and give the candidate space to hash out a complete architecture.
If you feel there are too many moving parts, ask whether there is a way to remove specific components without affecting the overall solution. Also, ask the candidate to justify the tradeoffs they are making - for example, why did they pick a NoSQL store rather than a SQL one?
By the end of the first 30 minutes, they should have a simple design sketched out on the whiteboard.
Now that the candidate has an initial design, crank up the scale. Increase the number of requests per second, the volume of the data ingested, or anything else relevant to your specific problem. No matter what the initial design looks like, it’s bound to hit a brick wall as you increase the load it’s under.
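To make the jump in scale concrete, it can help to run a quick back-of-the-envelope calculation with the candidate. The sketch below is only an illustration: the function name, the per-server capacity, and the payload size are assumptions I made up for the example, not numbers from any real system.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class LoadEstimate:
    ingest_bytes_per_s: int      # raw write bandwidth
    storage_bytes_per_day: int   # including replicas
    servers_needed: int          # request-handling servers only


def estimate_load(requests_per_s: int, payload_bytes: int,
                  per_server_rps: int, replication_factor: int = 1) -> LoadEstimate:
    """Rough capacity estimate; every input is a hypothetical assumption."""
    ingest = requests_per_s * payload_bytes
    storage = ingest * SECONDS_PER_DAY * replication_factor
    servers = -(-requests_per_s // per_server_rps)  # ceiling division
    return LoadEstimate(ingest, storage, servers)


# Initial design: 100 requests/s of 1 KB each, one server handles 500 requests/s.
small = estimate_load(100, 1_000, per_server_rps=500)
# Cranked-up scale: 1000x the traffic, data replicated three times.
large = estimate_load(100_000, 1_000, per_server_rps=500, replication_factor=3)

print(small.servers_needed, large.servers_needed)  # 1 200
```

Going from one server to two hundred, and from roughly 8.6 GB to roughly 26 TB of stored data per day, is usually enough to surface the first brick wall in a single-node design.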
Ask them what they think will break as the scale increases and how they would address it. It’s a great sign if the candidate can point out brick walls on their own. Once they have identified the breaking points, work with the candidate to eliminate them. Start with the one that has the highest likelihood of being hit first, as the others might become less relevant once the design evolves. This stage is very much iterative, and as the design changes, so do the tradeoffs - discuss what those are and whether they make sense.
Point out failure modes they missed and ask them what would happen if this component, or that link, were to go down. You can tell a lot about a candidate’s experience by how quickly they can spot failure modes, like single points of failure, and come up with elegant ways to solve them. Because you are asking about a system you have experience with, you can just recall what the most typical issues were in the past and how you mitigated them.
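The "what if this component goes down?" question can be phrased as a small graph exercise. The following is a minimal sketch, assuming the whiteboard design is modeled as a directed graph of hypothetical component names: it removes one component at a time and checks whether a request can still travel from the client to the data store. It is a toy model and ignores things like capacity, partial failures, and link failures.

```python
from collections import deque


def reachable(graph, source, target, removed=None):
    """Breadth-first search from source to target, skipping the removed node."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(graph, source, target):
    """Components (other than the two endpoints) whose loss cuts source off from target."""
    nodes = set(graph) | {n for neighbors in graph.values() for n in neighbors}
    return sorted(n for n in nodes - {source, target}
                  if not reachable(graph, source, target, removed=n))


# Hypothetical design: one load balancer in front of two app servers.
design = {
    "client": ["lb"],
    "lb": ["app1", "app2"],
    "app1": ["db"],
    "app2": ["db"],
}
print(single_points_of_failure(design, "client", "db"))  # ['lb']
```

Note that the endpoints themselves are not evaluated here, so the single database in this example is a single point of failure that the sketch does not report; that is exactly the kind of gap worth discussing with the candidate.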
By the end of the first 50 minutes, you should have discussed scaling and reliability issues at length. If there is time for it, pick a slice of the design and drill down into it. For example, if the candidate’s design requires replicating data, talk in detail about consistency guarantees and how to enforce them.
This is akin to the question: “tell me what happens when you type example.com in your browser,” but tailored to the design at hand. If you are hiring a senior systems engineer, you expect them to have a deep understanding of the basics, like TCP, paging, consistency models, etc. This knowledge is critical when debugging and optimizing complex systems.
By the end of the interview, you should have a good idea of the candidate’s strengths and weaknesses and how they approach problems that don’t have a single best answer. A senior candidate should be able to quickly go through the first part of the interview and have no issues scaling up the system and addressing failure modes in the second part. They should have a good explanation for every decision they made and be able to list the tradeoffs of their design. When interviewing strong candidates, you should leave the interview feeling like you learned something from them.
The system design interview is a great way to get a feel for a senior candidate’s problem-solving skills. Connecting boxes with arrows is just one part of it, though - and not the most challenging one. The tricky part is understanding the requirements, failure modes, and tradeoffs. This type of interview also encourages lively discussions, which gives you a glimpse into the candidate’s soft skills. Can they explain things clearly? Do they receive constructive feedback well?
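As one example of such a drill-down, if the candidate proposes quorum-based replication, you can ask why reads see the latest acknowledged write. A minimal sketch, assuming N replicas, a write quorum of W, and a read quorum of R (the function names are my own): every read quorum overlaps every write quorum exactly when R + W > N, and the brute-force check below confirms the rule for small clusters. This covers only the overlap argument, not concurrent writes or failure recovery.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """The usual rule of thumb: read and write quorums intersect when R + W > N."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("quorum sizes must be between 1 and n")
    return r + w > n


def quorums_overlap_brute_force(n: int, w: int, r: int) -> bool:
    """Check every possible write set against every possible read set."""
    replicas = range(n)
    return all(set(write_set) & set(read_set)
               for write_set in combinations(replicas, w)
               for read_set in combinations(replicas, r))


print(quorums_overlap(3, 2, 2))  # True: any 2 of 3 replicas share at least one node
print(quorums_overlap(3, 1, 1))  # False: a read may hit a replica the write missed
```

A good follow-up is to ask what the candidate gives up for each choice: a larger W makes writes slower and less available, while a larger R does the same for reads.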
If you are interested in learning more about designing distributed systems, not just in the context of interviews, check out the Distributed Systems Manual. It talks at length about network fundamentals, the theory of distributed systems, architectural patterns of scalable systems, resiliency patterns, and operational best practices on how to maintain large-scale systems with a small team.