Scalability is a system's ability to keep up as its workload grows.
Types of workload
- Users
- New features
- Data volume
- Complexity in code
- Geographical reach
Ways to scale your system
- Vertical scaling (scale up): Add more or faster resources, such as RAM, to a single machine. It is effective, but a single machine can only be upgraded so far.
- Horizontal scaling (scale out): Add more machines and distribute the work across them, reducing the load on any single machine.
- Partitioning: Split the data across multiple nodes to distribute workload and avoid bottlenecks.
- Load balancing: Running many machines is ineffective if all the traffic lands on a single one. A load balancer distributes the traffic across the machines.
- Auto scaling: Automatically spins up new machines when a burst of traffic appears.
- Caching: The go-to mechanism for improving API latency. It is also effective at reducing load on databases.
- Content Delivery Networks (CDNs): Serve static pages, images, and videos from servers closer to users, which results in faster load times.
- Asynchronous communication: Defer long-running or non-critical tasks to background queues or message brokers. This keeps your main application responsive to users.
- Microservices architecture: Break your service down into small independent services so that each one can be scaled independently.
- Multi-region Deployment: Deploy the application in multiple data centers or cloud regions to reduce latency and improve redundancy.
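
To make the partitioning idea above more concrete, here is a minimal sketch of hash-based partitioning. The names `shard_for` and `partition` and the `user:<id>` key format are made up for illustration; real databases each have their own sharding scheme.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    # hashlib gives the same result across processes, unlike the built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards):
    """Group keys into num_shards buckets, one bucket per node."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    buckets = partition([f"user:{i}" for i in range(20)], num_shards=4)
    for index, bucket in enumerate(buckets):
        print(index, bucket)
```

One caveat with this simple modulo approach: changing `num_shards` remaps most keys to a different shard, which is why consistent hashing is often used instead when nodes are added or removed frequently.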
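
Load balancing, from the list above, can be illustrated with the simplest strategy, round robin. This is only a sketch: the `RoundRobinBalancer` class and the backend names are assumptions for the example, and production load balancers also handle health checks, weights, and connection counts.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order (illustrative only)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # itertools.cycle repeats the sequence forever.
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    for request_id in range(6):
        print(request_id, "->", balancer.next_backend())
```

Each backend receives every third request here, so no single machine takes all the traffic.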
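
For caching, one common pattern is cache-aside: check the cache first, and only go to the database on a miss. The sketch below assumes an in-memory dictionary and a hypothetical `load_from_db` callable; in practice the cache is often a separate service such as Redis or Memcached.

```python
import time


class CacheAside:
    """Tiny in-memory cache-aside wrapper with a TTL (illustrative only)."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # hypothetical slow lookup, e.g. a DB query
        self._ttl = ttl_seconds
        self._clock = clock         # injectable so expiry can be tested
        self._entries = {}          # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]         # cache hit: the database is not touched
        value = self._load(key)     # cache miss or expired: ask the database
        self._entries[key] = (value, now + self._ttl)
        return value


if __name__ == "__main__":
    def slow_lookup(user_id):
        print("querying database for", user_id)
        return {"id": user_id}

    cache = CacheAside(slow_lookup, ttl_seconds=30.0)
    cache.get("42")  # goes to the database
    cache.get("42")  # served from the cache
```

The TTL is a trade-off: a longer value takes more load off the database but serves stale data for longer.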