The Scale Cube


Any software we build starts at the origin of the cube, point 0, where there are no splits: we have a single application.

X-Axis scaling

Horizontal Duplication and Cloning of services and data

X-axis scaling consists of running multiple copies of an application behind a load balancer. If there are N copies and the load balancer spreads requests evenly, each copy handles 1/N of the load.
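
To make the 1/N idea concrete, here is a minimal sketch of a round-robin load balancer. It is an illustration only: the class name RoundRobinBalancer and the replica names are assumptions, and real load balancers usually also handle health checks and uneven request costs.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next replica in a fixed rotation."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._rotation = cycle(list(replicas))

    def route(self, request):
        # Every replica runs the same code, so any of them can serve any request.
        return next(self._rotation)


# Hypothetical replica names; a real deployment would use host:port addresses.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])

# Send nine requests and count how many each replica received.
served = Counter(balancer.route(f"request-{i}") for i in range(9))
print(served)
```

With N = 3 replicas and nine requests, each replica serves three of them, which is the 1/N share described above.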


Pros

  • Easiest form of scaling
  • Quick to implement


Cons

  • Cost of infrastructure
  • Distributing cache
  • Does not help scale teams or the organization

Y-Axis scaling

Functional Decomposition and Segmentation – Microservices (or micro-services)

Y-axis scaling splits the application into multiple, different services. Each service is responsible for one or more closely related functions.
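
As a rough sketch of functional decomposition, the snippet below routes a request to the service that owns its function, based on the URL prefix. The prefixes, the service names, and the function service_for are all hypothetical; in practice an API gateway or reverse proxy would typically play this role.

```python
# Hypothetical mapping from URL prefix to the service that owns that function.
SERVICE_BY_PREFIX = {
    "/orders": "order-service",
    "/customers": "customer-service",
    "/payments": "payment-service",
}


def service_for(path: str) -> str:
    """Return the name of the service responsible for the given request path."""
    for prefix, service in SERVICE_BY_PREFIX.items():
        # Match the prefix itself or anything nested beneath it.
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no service owns {path!r}")


print(service_for("/orders/42"))  # order-service
print(service_for("/customers"))  # customer-service
```

Unlike X-axis copies, these services are not interchangeable: each one handles only its own group of related functions.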


Pros

  • Allows Teams and Organizations to scale
  • Fault Isolation
  • Better caches


Cons

  • Difficult to build
  • Not suited to small teams, for whom it becomes hard to maintain

Z-Axis scaling

Service and Data Partitioning along Customer Boundaries – Shards/Pods

In Z-axis scaling, each server runs an identical copy of the code but is responsible for only a subset (a shard, 1/Nth) of the data. Examples: splitting free from paid customers, or splitting by geography.
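
One possible way to pick the shard for a request is to hash the customer ID, sketched below. This is an assumption for illustration, not the only option: the shard names and the function shard_for are hypothetical, and an attribute-based split (free vs paid, or by region) would use a lookup on that attribute instead of a hash.

```python
import hashlib

# Hypothetical shard names; each shard runs the same code over its own data.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(customer_id: str, shards=SHARDS) -> str:
    """Map a customer to one shard using a stable hash of the customer ID."""
    # hashlib yields the same digest on every run and machine, unlike the
    # built-in hash() for strings, which is randomized per process by default.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same customer always lands on the same shard.
print(shard_for("customer-1001") == shard_for("customer-1001"))  # True
```

Note that simple modulo hashing reassigns most customers when the number of shards changes, so a real system would need a plan for moving data between shards.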


Pros

  • Easy
  • Fault Isolation
  • Better response times


Cons

  • Difficult to build
  • Can’t scale teams or organization
  • Increased need for automation