Load balancing
Load balancing is used whenever Service A sends a request to Service B and Service B is running as more than one instance, as shown in the following figure:

If multiple instances of a service such as Service B are running in our system, we want to make sure that each of those instances gets an equal share of the workload. This task is a generic one, so we don't want the caller to do the load balancing itself; instead, an external service intercepts the call and decides which of the target service instances to forward it to. This external service is called a load balancer. Load balancers can use different algorithms to decide how to distribute the incoming calls to the target service instances. The most common algorithm is called round robin. It assigns requests in a repeating sequence, starting with instance 1, then instance 2, and so on up to instance n. After the last instance has been served, the load balancer starts over with instance 1.
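
To make the round robin idea concrete, here is a minimal sketch in Python. The class name RoundRobinBalancer, the method pick, and the instance names service-b-1 to service-b-3 are made up for illustration; a real load balancer would also forward the request over the network and track which instances are healthy, which this sketch leaves out.

```python
class RoundRobinBalancer:
    """Hands out target instances in a fixed, repeating order (illustrative only)."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._next = 0  # index of the instance that receives the next request

    def pick(self):
        """Return the instance that should handle the next request."""
        instance = self._instances[self._next]
        # Advance by one and wrap around to the first instance after the last.
        self._next = (self._next + 1) % len(self._instances)
        return instance


# Hypothetical instance names standing in for three copies of Service B.
balancer = RoundRobinBalancer(["service-b-1", "service-b-2", "service-b-3"])

# Five incoming calls: the fourth call wraps around to the first instance.
order = [balancer.pick() for _ in range(5)]
print(order)
# ['service-b-1', 'service-b-2', 'service-b-3', 'service-b-1', 'service-b-2']
```

Note that the balancer keeps only a single counter and knows nothing about how busy each instance is. That keeps the algorithm simple, but it spreads the load evenly only when requests cost roughly the same amount of work.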