Load Balancing Concepts
What Is Load Balancing?
Load balancing distributes incoming network traffic across multiple servers to ensure no single server becomes a bottleneck. It improves availability, reliability, and performance.
Load Balancing Algorithms
Round Robin
Assign each new request to the next server in a fixed order, wrapping back to the first server after the last.
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
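The rotation above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server names and the next_server helper are hypothetical.

```python
from itertools import cycle

# Hypothetical server pool, matching the example above.
servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # endless iterator that wraps around the pool


def next_server():
    """Return the next server in the rotation."""
    return next(rotation)


# Four requests: the fourth wraps back to Server A.
assignments = [next_server() for _ in range(4)]
```

Here assignments ends up as Server A, Server B, Server C, Server A, the same order as the listing above.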
Least Connections
Route to the server with the fewest active connections.
Server A: 5 connections
Server B: 2 connections
Server C: 3 connections
→ Route to Server B
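As a rough sketch of that selection, assuming the balancer keeps a per-server count of open connections (the dictionary and the helper name below are hypothetical):

```python
# Hypothetical connection counts, matching the example above.
active_connections = {"Server A": 5, "Server B": 2, "Server C": 3}


def pick_least_connections(connections):
    """Return the server with the fewest active connections.

    Ties go to whichever server appears first in the dictionary.
    """
    return min(connections, key=connections.get)


chosen = pick_least_connections(active_connections)
# The new request now counts against the chosen server.
active_connections[chosen] += 1
```

A real balancer would also decrement the count when a connection closes; that bookkeeping is left out here.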
IP Hash
Route based on a hash of the client IP address, so the same client reaches the same server as long as the server pool does not change.
hash(client_ip) % server_count
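One way to implement the formula above is sketched below. It uses SHA-256 rather than Python's built-in hash(), because the built-in is randomized per process for strings and would not give a stable mapping across restarts. The function name and server pool are hypothetical.

```python
import hashlib

# Hypothetical server pool.
servers = ["Server A", "Server B", "Server C"]


def pick_by_ip(client_ip, pool):
    """Map a client IP to a server using a stable hash modulo the pool size."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# The same IP always maps to the same server while the pool is unchanged.
first = pick_by_ip("203.0.113.7", servers)
second = pick_by_ip("203.0.113.7", servers)
```

Because the result depends on the pool size, adding or removing a server remaps most clients. Consistent hashing is the usual way to reduce that churn.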
Weighted Round Robin
Distribute requests in proportion to each server's assigned weight, which usually reflects its capacity.
Server A (weight: 3) → gets more requests
Server B (weight: 1) → gets fewer requests
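A naive sketch of this idea repeats each server in the rotation according to its weight. This is only one simple approach, with hypothetical names; real balancers typically interleave the servers more smoothly instead of sending bursts to the heaviest one.

```python
from itertools import cycle

# Hypothetical weights, matching the example above.
weights = {"Server A": 3, "Server B": 1}


def build_rotation(server_weights):
    """Build an endless rotation where each server appears `weight` times per cycle."""
    expanded = [
        server
        for server, weight in server_weights.items()
        for _ in range(weight)
    ]
    return cycle(expanded)


rotation = build_rotation(weights)
first_eight = [next(rotation) for _ in range(8)]
```

Over eight requests, Server A receives six and Server B receives two, matching the 3:1 weights.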
Least Response Time
Route to the server that is currently responding fastest.
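How "fastest" is measured varies between products. The sketch below assumes an exponentially weighted moving average of observed response times; the class name, the alpha smoothing factor, and the rule of trying unmeasured servers first are all illustrative assumptions.

```python
class ResponseTimeTracker:
    """Track a moving average of response times and pick the fastest server."""

    def __init__(self, servers, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample
        self.average_ms = {server: None for server in servers}

    def record(self, server, elapsed_ms):
        """Fold a new response-time sample into the server's moving average."""
        previous = self.average_ms[server]
        if previous is None:
            self.average_ms[server] = elapsed_ms
        else:
            self.average_ms[server] = (
                self.alpha * elapsed_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        """Return an unmeasured server if any exist, else the lowest average."""
        for server, average in self.average_ms.items():
            if average is None:
                return server
        return min(self.average_ms, key=self.average_ms.get)


tracker = ResponseTimeTracker(["Server A", "Server B"])
tracker.record("Server A", 120)
tracker.record("Server B", 40)
fastest = tracker.pick()
```

With these samples, fastest is Server B. A single slow response raises its average without discarding its history.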
Load Balancer Types
Layer 4 (Transport Layer)
Operates on TCP/UDP connections, routing by IP address and port without inspecting the payload. Very fast.
Layer 7 (Application Layer)
Operates on HTTP requests, so it can route on URL paths, headers, or cookies. More intelligent routing, at a higher processing cost per request.
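As an illustration of application-layer routing, the sketch below chooses a backend pool from the request path, something a Layer 4 balancer cannot do because it never parses the HTTP request. The prefixes and pool names are hypothetical.

```python
# Hypothetical routing table: path prefix -> backend pool.
routes = [
    ("/api/", ["api-1", "api-2"]),
    ("/static/", ["static-1"]),
]
default_pool = ["web-1", "web-2"]


def select_pool(path):
    """Return the backend pool for the first matching path prefix."""
    for prefix, pool in routes:
        if path.startswith(prefix):
            return pool
    return default_pool


api_pool = select_pool("/api/users")
home_pool = select_pool("/")
```

Any of the algorithms described earlier can then pick a server within the selected pool.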
High Availability
Active-Passive
One primary handles all traffic while a standby waits to take over if it fails. Simple, but the standby sits idle most of the time.
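The failover decision can be sketched as follows, assuming some health check is available. The is_healthy callable is a hypothetical stand-in for whatever probe a real deployment uses.

```python
def choose_active(primary, standby, is_healthy):
    """Prefer the primary; fall back to the standby only if the primary is down."""
    if is_healthy(primary):
        return primary
    if is_healthy(standby):
        return standby
    raise RuntimeError("no healthy server available")


# Hypothetical health status, keyed by server name.
health = {"primary": True, "standby": True}
before_failure = choose_active("primary", "standby", health.get)

health["primary"] = False  # simulate a primary outage
after_failure = choose_active("primary", "standby", health.get)
```

Traffic stays on the primary until its health check fails, then moves to the standby.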
Active-Active
All servers handle traffic at the same time. More efficient, but more complex to operate.
Sticky Sessions
Ensure requests from the same client go to the same server, so session state stored on that server remains available.
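One common way to get this behavior is to remember which server a session was first assigned to, keyed by something like a session cookie. The sketch below assumes a hypothetical session_id string and uses round robin only for the first assignment.

```python
from itertools import cycle

# Hypothetical server pool and session table.
servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)
session_to_server = {}


def route(session_id):
    """Return the server pinned to this session, assigning one on first sight."""
    if session_id not in session_to_server:
        session_to_server[session_id] = next(rotation)
    return session_to_server[session_id]


first_visit = route("session-1")
other_client = route("session-2")
return_visit = route("session-1")
```

Here return_visit equals first_visit, even though another client was routed in between. A real balancer would also need to expire stale entries and reassign sessions whose server has failed.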
Common Solutions
- Nginx: Lightweight, high performance
- HAProxy: Powerful, many algorithms
- AWS Elastic Load Balancer: Cloud-native
- Kubernetes Service: Built-in load balancing across pods in a cluster
