Scaling Considerations and High Availability in Software Architecture and Design

Scalability is one of the major motivations for migrating to the cloud, and one of its biggest benefits. That's what I'll be talking about today.

While scaling is a wonderful thing, there are some things to consider first. Depending on your application, several factors can change how you scale. The ones we're going to look at are scaling constraints, session data management, and the major scaling methodologies, before finishing with high availability.


Constraints are important to cover first because not everything can scale in the same way. If you have a limited number of licenses, and getting more takes a lot of time, approval, or process, scaling dynamically can be very difficult. Or perhaps your application can't tolerate the downtime it would take to change the underlying architecture so that it can handle a larger workload. In these cases, the way to go is to architect the system so that you don't have to scale dynamically.

A good way to design around this is to add other components that help when you can't scale. Resources like managed caches and queues help avoid losing requests when scaling is limited. Offloading unnecessary work from servers that can't scale also helps. If you're working with a caching server whose resources are difficult to increase, it may be easier to life-cycle stale data off the server, which conserves storage and avoids over-provisioning and overspending.
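To make the queue idea concrete, here is a minimal sketch, assuming a fixed pool of workers that cannot be scaled out (for example, because of a license limit). The function name run_fixed_pool, its parameters, and the "handled:" result format are all made up for illustration, and an in-process queue stands in for what would normally be a managed queue service.

```python
import queue
import threading


def run_fixed_pool(requests, worker_count=2, buffer_size=100):
    """Process requests with a fixed number of workers, buffering the backlog.

    The queue absorbs bursts: producers wait when it is full instead of
    dropping requests, and the fixed pool drains it at its own pace.
    """
    buffer = queue.Queue(maxsize=buffer_size)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: no more work for this worker
                break
            with results_lock:
                results.append(f"handled:{item}")

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()

    for request in requests:
        buffer.put(request)  # blocks while the buffer is full
    for _ in threads:
        buffer.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling run_fixed_pool(range(10), worker_count=2, buffer_size=3) handles all ten requests even though only two workers exist and the buffer holds three items at a time. The order of the results depends on thread scheduling.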

Session Data Management

Server-side session storage can give you room to scale in what could otherwise be a very constrained environment. Keeping session data in a data store such as a cache, rather than tying it to one application server, helps maintain functionality without limiting your scaling capabilities.
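As a rough illustration of why this helps, the sketch below keeps all session state in a shared store so that any application instance can serve any request. SharedSessionStore and AppServer are hypothetical names, and a plain dictionary stands in for an external cache; a real deployment would use a networked store instead.

```python
import copy
import uuid


class SharedSessionStore:
    """Stand-in for an external cache (a dict is used purely for illustration)."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = copy.deepcopy(session)

    def load(self, session_id):
        # Raises KeyError for an unknown session id.
        return copy.deepcopy(self._data[session_id])


class AppServer:
    """An application instance that holds no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.save(session_id, {"user": user, "cart": []})
        return session_id

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session["cart"].append(item)
        self.store.save(session_id, session)
        return session["cart"]
```

With two instances sharing one store, a session created on the first can be continued on the second, which is what lets instances be added or removed without logging users out.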

Major Scaling Methodologies

There are two different types of scaling: horizontal and vertical.

Horizontal scaling: 1 x n, where n is the number of instances.

Horizontal scaling: The number of instances increases as needed and can drop back to one later.

Horizontal scaling: It can be costly and needs more attention to the infrastructure.

Vertical scaling: 1 x n, where n is the size of the instance.

Vertical scaling: One instance grows in CPU, memory, or storage on the same infrastructure.

Vertical scaling: Each increase typically means downtime, and scaling back down afterwards is often difficult or not possible.
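The difference between the two can be sketched as two small sizing functions. This is only an illustration: the function names, the capacity figures, and the size labels are assumptions, not taken from any particular cloud provider.

```python
import math


def desired_instances(current_load, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Horizontal: how many equally sized instances cover the load.

    max_instances models a hard constraint such as a license limit.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(current_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def smallest_fitting_size(current_load, size_capacities):
    """Vertical: the smallest single instance size that covers the load.

    Returns None when even the largest size is too small, which is the
    ceiling that vertical scaling eventually runs into.
    """
    for name, capacity in sorted(size_capacities.items(), key=lambda kv: kv[1]):
        if capacity >= current_load:
            return name
    return None
```

For a load of 450 requests per second and 100 per instance, desired_instances(450, 100) gives 5, and it falls back to 1 when the load disappears. With hypothetical sizes {"small": 100, "medium": 400, "large": 1600}, smallest_fitting_size(450, ...) gives "large", and a load of 5000 gives None because no single instance is big enough.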


High Availability

A primary tenet here is to avoid single points of failure. If the failure of any single component or node would negatively impact the system as a whole, look for ways to minimize that impact. In some cases, this means running more than one instance of a component. In others, it means having a replacement that can be launched easily should the primary fail.

Another way to build towards higher availability is to distribute traffic and requests across multiple endpoints or nodes, and load balancers are a major way to accomplish this. You'll also want some scalability, so that capacity can be added when a node fails or demand grows.
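A minimal sketch of the load-balancing idea follows, assuming a simple round-robin policy and a health flag per node. RoundRobinBalancer and its methods are hypothetical names, and real load balancers do considerably more (active health checks, connection draining, weighting).

```python
import itertools


class RoundRobinBalancer:
    """Hand out nodes in turn, skipping any that are marked as down."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        if not self._nodes:
            raise ValueError("at least one node is required")
        self._healthy = set(self._nodes)
        self._cycle = itertools.cycle(self._nodes)

    def mark_down(self, node):
        self._healthy.discard(node)

    def mark_up(self, node):
        if node in self._nodes:
            self._healthy.add(node)

    def next_node(self):
        # Look at each node at most once per call before giving up.
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy nodes available")
```

With nodes "a", "b", and "c", requests rotate through all three. After mark_down("b"), traffic continues to flow to "a" and "c" only, so the loss of one node does not take the system down.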


By: Mutasem Elayyoub