If you have been around the industry for a year or more, you’ve almost undoubtedly heard someone talk about scale. Perhaps they have said, ‘that feature won’t scale’, or ‘I have to keep in mind that this needs to scale.’ If you were anything like me, the first time you heard about scaling, you wondered how the hell they knew what would and would not scale. It seemed like some sort of black art.
What does it mean to scale, exactly? I think that part is fairly obvious, but let’s go over it just in case: scalability is the ability of a system to handle increasingly large amounts of work or throughput.
Let’s take a simple example familiar in I.T., although not about software development: server names. Many small companies, startups and skunkworks name their servers after themes, often drawn from pop culture. Characters from The Simpsons is a theme I have seen a few times. It’s cool, nerdy and a bit of fun, and in a small team it’s perfectly manageable, but ultimately, it won’t scale. When you have a company with a thousand employees and hundreds of servers (or, in today’s environment, virtual machines), you will have a hard time keeping track of them all without a naming system. ‘ACCTNG_NL_0001’ is a boring name, but its purpose is a whole heck of a lot clearer than ‘LIONELHUTZ’.
I ran into another type of scaling problem at work a few days ago. In this particular role, we try to keep in the back of our minds that we may want to spin up multiple copies of our software on various virtual machines or containers to handle increasing load. This will often scale better than simply throwing more CPU and RAM at a single process. In my case, I had a two-step process that was not ‘atomic’: because the work was done in two steps, something that ran in between those steps could invalidate the result. With a single thread and some care to make sure that everything ran synchronously (using Apache Camel’s ‘direct’ route), I was fine. But when someone mentioned that the operations team might spin up multiple copies of my software, suddenly multiple two-step processes could get interleaved and therefore invalidated. In my particular case, scaling wasn’t a big necessity, so I was able to prohibit running multiple copies and limit scaling to increased resources for a single process. If I hadn’t been able to do that, significant design work would have been required: the processes would have to coordinate, which means inter-process communication, possibly through a database, and that was overhead I did not want to introduce. (Adding more processes is often called ‘scaling out’; adding more resources to an existing process is often called ‘scaling up’.)
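To make the ‘non-atomic two-step’ problem concrete, here is a minimal sketch in Python. The names (`Inventory`, `reserve_unsafe`, `reserve_safe`) are hypothetical, not from my actual system, and it uses a plain in-process lock rather than Camel routes, but the shape of the bug is the same: a check followed by an update, with a window in between.

```python
import threading

class Inventory:
    """Hypothetical shared resource guarded by a two-step check-then-update."""

    def __init__(self, stock):
        self.stock = stock
        self.lock = threading.Lock()

    def reserve_unsafe(self, qty):
        # Step 1: check availability.
        if self.stock >= qty:
            # Another worker running between step 1 and step 2 can also
            # pass the check, and the combined updates oversell the stock.
            # Step 2: update.
            self.stock -= qty
            return True
        return False

    def reserve_safe(self, qty):
        # Holding a lock makes the two steps atomic -- but only within a
        # single process. Once multiple copies of the program run, the
        # coordination has to move to shared infrastructure (for example,
        # a database transaction), which is exactly the overhead I wanted
        # to avoid.
        with self.lock:
            if self.stock >= qty:
                self.stock -= qty
                return True
            return False
```

A single-threaded, single-process caller is safe either way; it is only when callers can interleave that `reserve_unsafe` becomes a problem.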
Scalability and multithreading are closely related in many software systems, so you can often achieve the former by taking care with the latter. One great way of handling multiple threads is to make as many objects as possible ‘read-only’, or immutable. If an object never changes, threads can read it freely without synchronization. Remember, threads that aren’t waiting to acquire a lock on a resource are threads that can be busy processing!
But try as you might, sometimes you can’t avoid problems with scale. You might have a core algorithm or business rule that requires a type of processing that slows as the data grows. This is particularly so when you are looking for patterns and are comparing blocks of data: the more blocks of data there are, the more comparisons have to run. The requirements of your system will often dictate whether a scaling problem is acceptable or a redesign, or sacrifices, must be made.
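The pairwise-comparison case is worth putting a number on. Comparing every block against every other block means n·(n−1)/2 comparisons, so the work grows quadratically with the data. A quick sketch:

```python
from itertools import combinations

def comparison_count(n):
    """Number of pairwise comparisons among n blocks of data."""
    return sum(1 for _ in combinations(range(n), 2))

for n in (10, 100, 1000):
    print(n, comparison_count(n))
```

Ten blocks need 45 comparisons; a hundred blocks need 4,950. Doubling the data roughly quadruples the work, which is why this kind of processing can be fine in testing and painful in production.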