Nobody needs a million requests/sec
Web benchmarks aren’t about the hugest number
When we see web benchmarks presenting a gazillion requests per second, the most common critique goes something like “this doesn’t matter for any real-world application and people who care are fools”. I think we can all agree that many use cases have other priorities, but that is definitely not automatically true for the entire industry.
If you’re skeptical, you’re probably looking at it from the wrong perspective. A million requests/second sounds ridiculous because the absolute number is higher than any reasonable application will ever require. But this is not the point; nobody needs a million req/sec and the benchmark doesn’t assume so. The point of such a benchmark is to showcase relative web efficiency: it measures how much CPU time is lost to web I/O. A slow web server can steal a significant amount of CPU time doing nothing but web I/O, and every bit of CPU time spent there is time your app doesn’t get to execute.
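To make the "CPU time lost to web I/O" idea concrete, here is a minimal back-of-the-envelope sketch. It assumes a deliberately simple model that is not from any real benchmark: each request costs a constant 1 / benchmark_rps seconds of CPU on the web server side. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def web_io_cpu_share(target_rps: float, benchmark_rps: float) -> float:
    """Estimate the fraction of CPU time spent on web I/O at target_rps.

    Assumption (a simplification): the server's cost per request is
    constant and equals 1 / benchmark_rps seconds of CPU time, where
    benchmark_rps is the peak rate the server reached in a benchmark.
    """
    if benchmark_rps <= 0:
        raise ValueError("benchmark_rps must be positive")
    # The share cannot exceed 100% of the CPU.
    return min(1.0, target_rps / benchmark_rps)


if __name__ == "__main__":
    load = 10_000  # hypothetical load the business actually needs
    # Hypothetical benchmark peaks for two imaginary servers.
    for name, peak in [("fast server", 1_000_000), ("slow server", 20_000)]:
        share = web_io_cpu_share(load, peak)
        print(f"{name}: {share:.0%} of CPU on web I/O, "
              f"{1 - share:.0%} left for the app")
```

Under these assumptions, the same 10,000 req/sec load costs the server with the million-request benchmark about 1% of the CPU, while the slower one burns roughly half of it before your app has run a single line.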
Any web server software can handle 15 trillion requests/second given enough hardware. Throw a nuclear-powered 50 terawatt supercomputer at the problem and you’re there. The real issue is running your service cheaply and efficiently on low-cost instances. You see, there is an implied conclusion: if a desktop tower can do a million requests/second, then a cheap low-power device can do just enough for the business need. Better performance does not mean gigantic numbers; it means cheaper uptime.
There are many web apps that bottleneck on some kind of database; for these problems you are more likely to see gains from looking at optimised databases. But the industry is absolutely full of problems that can be served from memory, where a significant part of the bottleneck is web I/O. Having a web server that can deliver a million requests per second on a desktop tower means you can have an app that delivers 500 requests per second on a cheap low-power device. This is the point. Efficiency.
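The step from a desktop benchmark to a cheap device can be sketched the same way. This is a rough estimate under stated assumptions, not a measurement: the device is assumed to be uniformly `device_slowdown` times slower than the benchmark machine, the app is assumed to run from memory with a fixed CPU cost per request, and every number below is made up.

```python
def device_throughput(benchmark_rps: float,
                      device_slowdown: float,
                      app_seconds: float) -> float:
    """Estimate requests/sec for server + app on a slower device.

    benchmark_rps:   server-only peak on the reference machine
    device_slowdown: how many times slower the device is (assumed uniform)
    app_seconds:     app CPU time per request on the reference machine
    """
    io_seconds = 1.0 / benchmark_rps  # server cost per request
    per_request = (io_seconds + app_seconds) * device_slowdown
    return 1.0 / per_request


if __name__ == "__main__":
    # Hypothetical: a device 20x slower, an app needing 50 microseconds
    # of CPU per request on the reference machine.
    for name, peak in [("fast server", 1_000_000), ("slow server", 20_000)]:
        rps = device_throughput(peak, device_slowdown=20, app_seconds=50e-6)
        print(f"{name}: about {rps:.0f} requests/sec on the small device")
```

With these made-up inputs, the slow server roughly halves what the small device can deliver, because its own per-request cost is as large as the app's. That gap is what a large benchmark number stands for.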
You can also flip the perspective upside down and target the latency aspect. A web server with optimised performance is also very likely to have low latency. Some applications are very sensitive to latency and will gladly use the lowest-latency web server there is. Financial applications are one such example. Gaming is another.
I don’t want this post to be an exhaustive list of all the use cases where web performance matters (hence the very few examples given); instead I want this post to be an introduction to the perspective needed to understand why ridiculous web benchmarks aren’t always... ridiculous. You and your day job do not make up the entire industry. There are many odd parts of the industry where in-memory solutions crave fast web performance:
finance, gaming, signalling, texting, location and collaboration are just a few examples where you aren’t always bottlenecked by an SQL database.
And besides, having the ability to write fast web apps is never a bad thing. If anything, it means you’re investing your time in learning a platform that can (eventually) deliver good performance for (future) projects that aren’t limited by an SQL database the way your current project is.