Throughput and latency: performance tuning made simple
Many of our previous posts have measured the behaviour of certain systems in terms of either latency or throughput. As it can be confusing to understand what either term actually stands for, I decided to write a post explaining them from a performance optimization standpoint.
Let us start with an example to illustrate the concepts. As we are software people, what could be a better example than a factory line? Picture a physical assembly line in a Foxconn factory producing iPad Airs. The work of assembling an iPad is done on the line in a sequential manner. Walking along the line, we measure that it takes four hours to complete an iPad, from the start where the casing is molded to the finish where the acceptance tests have concluded.
Measuring the speed of the very same assembly line, we can see that one completed iPad rolls off the line every second, 24 hours a day, every day. So each day this factory produces 86,400 iPads.
Now, if we were Tim Cook looking at the numbers, we would start panicking. As the CEO of Apple we would not care too much that it takes four hours to complete one iPad, but the total number of iPads produced per day is frightening. Based on previous releases, we could estimate that around 15M iPad Airs would be needed during the first quarter after launch, which is about double what Foxconn is capable of producing. So we would immediately call Pegatron and have them set up an equivalent assembly line.
After Pegatron has completed their duty, we now have two equivalent factories shipping the very same iPads. By doing this, our imaginary Tim Cook has doubled the number of iPads produced per day: we now have 172,800 iPads available to ship each day. It is important to note that we have not reduced the time to complete an individual iPad by even a millisecond. It still takes four hours to complete an iPad from start to finish.
In the example above, Tim Cook carried out a performance optimization task, coincidentally dealing with both throughput and latency. As in any good example, he started by measuring the system’s current performance:
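The effect of adding a second factory can be sketched in a few lines of Python. This is only an illustrative model under the numbers used in this post; the `Factory` class and the helper functions are hypothetical names made up for the sketch.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


@dataclass(frozen=True)
class Factory:
    """Hypothetical model of one assembly line."""
    latency_hours: int      # time to build one iPad from start to finish
    units_per_second: int   # completed iPads leaving the line each second


def daily_throughput(factories):
    # Throughput adds up: every line contributes its own completed units.
    return sum(f.units_per_second * SECONDS_PER_DAY for f in factories)


def latency_hours(factories):
    # Each iPad travels through exactly one line, so extra lines do not
    # shorten the trip. Report the slowest line as the worst case.
    return max(f.latency_hours for f in factories)


foxconn = Factory(latency_hours=4, units_per_second=1)
pegatron = Factory(latency_hours=4, units_per_second=1)

print(daily_throughput([foxconn]))            # one factory
print(daily_throughput([foxconn, pegatron]))  # two factories
print(latency_hours([foxconn, pegatron]))     # latency is unchanged
```

Adding the second `Factory` doubles the daily figure while the reported latency stays at four hours, which is exactly the distinction the example is meant to show.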
- Latency of the system: 4 hours
- Throughput of the system: 86,400 iPads/day
Note that latency is measured in whatever time unit suits the task, so whether you use millennia, hours or milliseconds is up to you. Throughput of a system is measured in completed operations per time unit, with operations being anything relevant to the specific system. In the example case, the operations were the shipped iPads.
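To make the point about units concrete, here is a small Python sketch that expresses the same latency and throughput in different units. The `throughput` helper is a hypothetical function written for this illustration, not part of any library.

```python
from datetime import timedelta

# Latency is a duration, so any time unit works.
latency = timedelta(hours=4)
latency_seconds = latency.total_seconds()
latency_ms = latency / timedelta(milliseconds=1)


def throughput(completed_operations, period, per):
    """Completed operations per `per`, given a count observed over `period`.

    Both `period` and `per` are timedelta values (an assumption of this sketch).
    """
    return completed_operations / (period / per)


one_day = timedelta(days=1)
per_hour = throughput(86_400, one_day, per=timedelta(hours=1))
per_second = throughput(86_400, one_day, per=timedelta(seconds=1))

print(latency_seconds, latency_ms)
print(per_hour, per_second)
```

The numbers change with the unit, but the system being described does not: 86,400 iPads per day, 3,600 per hour and one per second are all the same throughput.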
Next, Tim made another important decision by picking a performance criterion to meet. Based on his historical experience, he extrapolated that during the first quarter after launch he would need approximately 15,000,000 iPads to meet the demand. Working backwards from that figure, he understood that shipping 86,400 units a day was not going to be enough, so he made the call to Pegatron, doubling the capacity to be able to meet the demand.
He also made the important decision not to focus on latency. In his case he was most likely correct, as his system was built in a way where the latency of the manufacturing lines did not matter to him.
So what's the point here? I hope that in this article I was able to explain three concepts of performance optimization that you need to understand in order to succeed:
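The reasoning behind that call can be written down as a short capacity calculation. This is a rough sketch only: the 90-day quarter is my own assumption (the post does not state the length of the quarter), and `lines_needed` is a hypothetical helper.

```python
import math

DEMAND_PER_QUARTER = 15_000_000   # iPads needed in the first quarter
UNITS_PER_LINE_PER_DAY = 86_400   # measured throughput of one assembly line
DAYS_IN_QUARTER = 90              # assumption: a quarter of roughly 90 days


def lines_needed(demand, units_per_line_per_day, days):
    # Capacity of a single line over the whole period.
    capacity_per_line = units_per_line_per_day * days
    # Round up: a fraction of an assembly line cannot be ordered.
    return math.ceil(demand / capacity_per_line)


single_line_capacity = UNITS_PER_LINE_PER_DAY * DAYS_IN_QUARTER
required = lines_needed(DEMAND_PER_QUARTER, UNITS_PER_LINE_PER_DAY, DAYS_IN_QUARTER)

print(single_line_capacity)  # what one line delivers in the quarter
print(required)              # how many equivalent lines the demand requires
```

Under these assumptions a single line delivers a little under 8 million units in the quarter, so two equivalent lines are required. Note that latency does not appear anywhere in the calculation.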
- The difference between throughput and latency
- Current throughput and latency of the system
- The criteria you need to meet, based on business requirements
Then, and only then, are you ready to start actually optimizing the system.
But I doubt many Tim Cooks are actually reading this blog, so in our next posts we need to put this fresh knowledge into the context of software. Stay tuned by subscribing to the RSS feed or Twitter stream.
In your example, I think “4 hours” is the response time. It’s the time it takes the server to complete a request.
Am I correct?
Yes, this is the response time, which can be viewed as the system’s latency from the end user’s perspective.