Elastic Load Balancer: Four Types, Four Problems — Choosing ALB, NLB, GWLB, or CLB for the SAA Exam
You just deployed a web app onto a single EC2 instance. Everything runs fine — until the day traffic spikes. One morning that EC2 dies, and because it is the only entry point, the whole system goes down with it. You spin up a few more EC2 instances to share the load, but immediately hit a wall: how does a client know which EC2 to call? If one EC2 dies, who notices and stops sending traffic to it? When traffic drops, who consolidates clients onto fewer machines to save money?
This is exactly the problem a load balancer solves: it sits in front of a group of servers, takes every request from clients, and distributes it to the servers that are currently healthy — so the system both handles load well and survives a single machine dying.
But your needs do not stop at web. Next week, the game team needs to balance UDP traffic with near-zero latency. The month after, a B2B partner asks you for a static IP to whitelist. Then the security team wants to insert a fleet of third-party firewalls into the packet path to inspect every flow. Each of these needs, it turns out, calls for a different type of load balancer in AWS’s Elastic Load Balancing family.
This article is the map that helps you tell those four types apart. For each one, we will cover: how it works under the hood, what problem it solves, when to use it, and the points that tend to get asked in the SAA exam.
Note: This is an overview to build a mental model and quickly recognize the answer in the exam. Most SAA traps live at the boundary between types — ALB or NLB, NLB or GWLB — so this article focuses on why each one exists and how they differ. Two dedicated deep-dives already exist: the mechanics behind ALB (load balancer nodes, scaling, why it has no static IP) in ALB Under The Hood, and the cost comparison of ALB versus NLB in Choosing ALB or NLB.
1. What is Elastic Load Balancing?
Elastic Load Balancing (ELB) is AWS’s fully managed load balancing service. You do not stand up HAProxy or Nginx servers yourself, you do not patch them, and you do not scale them: you declare a load balancer, and AWS handles the infrastructure underneath.
The first thing to pin down: an ELB is not “a single machine.” Under the hood, AWS deploys load balancer nodes spread across the multiple Availability Zones (AZ) you choose. As a result, if an entire AZ fails, the nodes in the remaining AZs keep taking traffic — the load balancer itself comes with high availability built in. These nodes scale up and down with traffic, so an ELB absorbs even sudden traffic surges without you doing anything.
The core value every ELB type delivers:
- Load distribution across many targets so no single machine is overwhelmed.
- Health check: automatically stops sending traffic to a failing target, and resumes when it recovers.
- High availability built in by spreading across multiple AZs.
- Decoupling clients from servers: clients only ever see one entry point, while the servers behind it can be added or removed freely.
- Auto Scaling integration: when an Auto Scaling Group adds or removes EC2 instances, targets are automatically registered with or removed from the load balancer.
AWS splits ELB into four types, each working at a different layer of the networking model. A handy mnemonic about “layers”: the lower the layer, the more “naive” and fast the load balancer (it only looks at IP and port); the higher the layer, the more it “understands” the content and routes intelligently — at the cost of more processing.
| Type | Layer | What it can “see” | The problem it solves |
|---|---|---|---|
| ALB (Application Load Balancer) | Layer 7 | HTTP content: path, host, header | Smart routing for web/microservices |
| NLB (Network Load Balancer) | Layer 4 | IP and port (TCP/UDP) | Extreme performance, static IP, any TCP/UDP protocol |
| GWLB (Gateway Load Balancer) | Layer 3 | Raw IP packets | Insert a fleet of third-party security appliances transparently |
| CLB (Classic Load Balancer) | Layer 4 + 7 | Limited | Legacy — do not pick it for new designs |
Before diving into each type, there is a set of concepts shared by all of them. Nail this framework and learning each type is just filling in the differences.
2. The shared framework of every ELB
Whichever type you use, the logical architecture revolves around three pieces: the Listener accepts connections, the Target Group collects the destinations, and the Health Check decides which destination is healthy enough to receive traffic.
2.1. Listener and Rule
A Listener is a process listening on a specific protocol-and-port pair, for example HTTPS on port 443. Every connection a client makes to the load balancer goes through a listener. An ALB’s listener also holds routing rules ranked by priority: the load balancer evaluates them in priority order (the lowest priority number first), executes the action of the first rule whose conditions match, and falls back to a default rule that catches any request that matched nothing else.
2.2. Target Group and target types
A Target Group is a group of destinations that together receive traffic from the load balancer. A common point of confusion: the load balancer does not point directly at EC2, it points at a target group, and the target group is what holds the real destinations. Depending on the load balancer type, a target can be:
- Instance: a specific EC2 (identified by instance ID).
- IP: an IP address — lets you point to targets outside (on-premises over VPN/Direct Connect, or containers with their own IP).
- Lambda: ALB only.
- Application Load Balancer: NLB only.
2.3. Health Check — the heart of high availability
The load balancer periodically sends a probe request to each target (called an active health check). You configure the protocol, port, path (for example GET /health), the interval between probes, and the threshold of consecutive failures before a target is considered “unhealthy.” When a target crosses that threshold, the load balancer stops sending traffic to it; when it recovers with enough consecutive successes, traffic resumes. This is precisely the mechanism that keeps a dead EC2 from turning into errors for the user.
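The threshold logic above can be sketched as a small state machine. This is a minimal illustration, not how ELB is implemented: the class name `TargetHealth`, the helper `health_timeline`, and the threshold values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    """Tracks one target's state from consecutive probe results (illustrative)."""
    healthy_threshold: int = 3    # consecutive successes needed to recover
    unhealthy_threshold: int = 2  # consecutive failures needed to be pulled
    healthy: bool = True
    _streak: int = 0              # run of results that contradict the current state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state, so the contrary run ends.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= limit:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


def health_timeline(results, **thresholds):
    """Return the health state after each probe in `results`."""
    target = TargetHealth(**thresholds)
    return [target.record(ok) for ok in results]
```

A single failed probe does not remove the target. It takes an unbroken run of failures to pull it, and an unbroken run of successes to bring it back.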
2.4. Cross-zone load balancing — a classic trap
Suppose AZ-a has 2 targets and AZ-b has 8 targets. When cross-zone load balancing is on, each load balancer node distributes evenly across all 10 targets in every AZ. When it is off, a node only distributes to the targets in its own AZ. With traffic split evenly between the two nodes, the 2 targets in AZ-a together carry the same half of the traffic as the 8 targets in AZ-b (25% each versus 6.25% each), causing heavy skew.
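The skew in this example can be worked out with a few lines of arithmetic. The sketch below assumes clients spread evenly across one load balancer node per AZ. The function name `per_target_share` is made up for the illustration.

```python
def per_target_share(targets_per_az, cross_zone):
    """Return {az: fraction of total traffic received by EACH target in that AZ}.

    Assumes one load balancer node per AZ and an even split of clients across nodes.
    """
    azs = [az for az, count in targets_per_az.items() if count > 0]
    if cross_zone:
        # Every node spreads over every target, so all targets get the same share.
        total_targets = sum(targets_per_az[az] for az in azs)
        return {az: 1 / total_targets for az in azs}
    # Each node keeps its share of traffic inside its own AZ.
    node_share = 1 / len(azs)
    return {az: node_share / targets_per_az[az] for az in azs}


fleet = {"az-a": 2, "az-b": 8}
share_on = per_target_share(fleet, cross_zone=True)    # 10% per target everywhere
share_off = per_target_share(fleet, cross_zone=False)  # 25% in az-a, 6.25% in az-b
```

With cross-zone off, each AZ-a target handles four times the traffic of an AZ-b target.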
The point that gets asked is that the default differs by type:
| Type | Cross-zone default | Cost note |
|---|---|---|
| ALB | On (cannot be turned off at the load balancer level) | Free, no inter-AZ data transfer charges |
| NLB | Off | When on, inter-AZ traffic is charged as data transfer |
| GWLB | Off | Same as NLB |
| CLB | Off if created via API/CLI, on if created via Console | — |
2.5. Sticky session
A sticky session forces all requests from the same client to always land on the same target — useful when the server keeps session state in memory (stateful). The mechanism also differs by type:
- ALB and CLB use cookies (ALB generates an `AWSALB` cookie for the duration-based mode, or follows your application cookie for the application-based mode).
- NLB uses no cookie, sticking by the client’s source IP address instead.
2.6. Other shared features
- SSL/TLS termination: the load balancer can hold the TLS certificate (managed through ACM) and decrypt HTTPS, offloading work from the backend. Both ALB and NLB support SNI to serve many domains with many certificates on a single listener.
- Connection draining (called deregistration delay on ALB/NLB, default 300 seconds): when removing a target, the load balancer stops sending new requests but lets in-flight requests finish, avoiding mid-request cutoffs.
- Access logs: record every request to S3 for analysis or incident investigation.
- Deletion protection: guards a production load balancer against accidental deletion.
That covers the framework. Now to each type and the differences that give it its identity.
3. ALB — Application Load Balancer (Layer 7)
3.1. The problem it solves
ALB operates at Layer 7 (the application layer), meaning it can “open” the HTTP/HTTPS packet and read it. Because it understands the content, it routes based on things inside the request — path, hostname, header — not just IP and port. This is the load balancer you put in front of a web app, microservices, or a container cluster.
3.2. The routing mechanism under the hood
The heart of ALB is the set of rules on each listener. Each rule has a condition and an action. People usually name ALB’s routing styles after the very condition the rule is based on:
- Host-based routing: routes by the hostname in the request (the `host-header` condition), for example `api.example.com` goes to the API cluster while `img.example.com` goes to the image-serving cluster.
- Path-based routing: routes by the URL path (the `path-pattern` condition), for example `/api/*` to the API cluster, `/static/*` to the static cluster.
- Header-based routing: routes by an arbitrary HTTP header (`http-header`), for example splitting traffic by an API version header or by `User-Agent`.
- Method-based routing: routes by the HTTP method (`http-request-method`), for example separating `GET` from `POST`.
- Query string-based routing: routes by a parameter on the query string (`query-string`).
- Source IP-based routing: routes by the client’s source IP range (`source-ip`).
The possible actions are:
- `forward`: send to a target group.
- `redirect`: redirect, for example forcing HTTP to HTTPS.
- `fixed-response`: return a fixed response directly, for example a `503` maintenance page.
- `authenticate`: authenticate the user before letting them through (the actual action types are `authenticate-cognito` and `authenticate-oidc`).
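To make the condition/action pairing concrete, here is a minimal sketch of first-match rule evaluation. It is a simulation, not the ALB API: the rule dictionaries, the target group names (`tg-images`, `tg-api`, `tg-static`), and the use of `fnmatchcase` for wildcard matching are all assumptions made for the illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical rule set. A lower priority number is evaluated first.
RULES = [
    {"priority": 10, "host": "img.example.com", "action": ("forward", "tg-images")},
    {"priority": 20, "path": "/api/*", "action": ("forward", "tg-api")},
    {"priority": 30, "path": "/static/*", "action": ("forward", "tg-static")},
]
# The default rule catches whatever no other rule matched.
DEFAULT_ACTION = ("fixed-response", 503)


def matches(rule, host, path):
    """A rule matches only when every condition it declares is satisfied."""
    if "host" in rule and not fnmatchcase(host.lower(), rule["host"]):
        return False
    if "path" in rule and not fnmatchcase(path, rule["path"]):
        return False
    return True


def route(host, path, rules=RULES, default=DEFAULT_ACTION):
    """Return the action of the first matching rule, else the default action."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if matches(rule, host, path):
            return rule["action"]
    return default
```

Because evaluation stops at the first match, a request for `img.example.com/api/x` is forwarded to `tg-images` here, even though the path rule would also match.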
For choosing a target within a group, ALB defaults to round robin (each target in turn), but can switch to least outstanding requests to favor the least-busy target — handy when per-request processing times vary widely.
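The difference between the two algorithms is easy to see in a toy model. The sketch below is illustrative only. The function names and the in-flight counters are assumptions, and the real load balancer tracks outstanding requests internally.

```python
from itertools import cycle, islice


def round_robin_picks(targets, count):
    """Return the first `count` picks of a fixed rotation over `targets`."""
    return list(islice(cycle(targets), count))


def least_outstanding(in_flight):
    """Pick the target with the fewest in-flight requests (ties go to the first listed)."""
    return min(in_flight, key=in_flight.get)


# Hypothetical snapshot: target "a" is stuck on slow requests.
in_flight = {"a": 5, "b": 1, "c": 3}
rr_picks = round_robin_picks(list(in_flight), 4)  # rotation ignores the backlog
lor_pick = least_outstanding(in_flight)           # favors the least-busy target
```

Round robin keeps handing work to `a` on schedule regardless of its backlog, while least outstanding requests steers the next request to `b`.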
A core trait: ALB terminates the connection (connection termination). It acts as a reverse proxy — it ends the connection from the client and opens a new connection to the backend. The consequence is that the backend no longer sees the client’s original IP.
To compensate, ALB inserts the headers X-Forwarded-For (the original client IP), X-Forwarded-Proto (http or https), and X-Forwarded-Port. If your application needs to know the real client, read these headers instead of the connection’s IP.
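On the backend, reading the header might look like the sketch below. It assumes the ALB is the only proxy in front of the application and runs in its default mode, where it appends the address of the connecting client to any `X-Forwarded-For` value already present. The function name `client_ip_behind_alb` is made up for the example.

```python
def client_ip_behind_alb(headers, peer_ip):
    """Return the client address recorded by the load balancer.

    `headers` is a mapping of request headers; `peer_ip` is the address of the
    TCP connection itself (the load balancer node), used only as a fallback.
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    hops = [hop.strip() for hop in normalized.get("x-forwarded-for", "").split(",")]
    hops = [hop for hop in hops if hop]
    # The right-most entry is the one the load balancer appended. Entries to its
    # left arrived with the request and can be forged by the client.
    return hops[-1] if hops else peer_ip
```

If a CDN or another proxy sits in front of the ALB, the right-most entry is that proxy’s address, and the parsing has to account for the extra hop.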
3.3. Distinctive features worth remembering
- Lambda as a target: ALB can invoke a Lambda function directly, turning it into an HTTP-serving backend without any EC2.
- Authentication offload: ALB handles login itself through Cognito or any OIDC provider, so the application does not have to write its own login layer.
- WAF integration: you can attach a WAF to block application-layer web attacks.
- gRPC, HTTP/2, WebSocket: supports modern HTTP-based protocols, fitting microservices and realtime.
- mTLS: supports mutual TLS, useful for B2B APIs that require the client to present a certificate.
One point to remember for the exam: ALB has no static IP, only a DNS name. The reason lies in its mechanism of continuously adding and removing nodes — detailed in ALB Under The Hood. If a question demands a static IP yet still needs Layer 7 routing, that is a hint to put an NLB in front of an ALB (section 4.4).
4. NLB — Network Load Balancer (Layer 4)
4.1. The problem it solves
NLB works at Layer 4: it only looks at IP and port, never opening the packet to read its content. The payoff for that “naivety” is speed: NLB handles millions of requests per second at ultra-low latency, and works with any TCP/UDP-based protocol, not just HTTP.
4.2. Under the hood: flow hashing, no connection termination
Unlike ALB, NLB does not terminate TCP or UDP connections (a TLS listener is the exception, because decrypting requires terminating the TLS session). When a new connection arrives, NLB computes a hash over the 5-tuple (source IP, source port, destination IP, destination port, and protocol) and uses that value to pick a target. Every packet of that connection then flows to the chosen target, without NLB stepping into the middle like a proxy.
Because it neither maintains two connections nor reads the content, NLB reaches extremely low latency. NLB also preserves the client’s source IP (source IP preservation), so the backend sees the client’s real IP directly, without reading an X-Forwarded-For header as it would with ALB. One caveat: for TCP and TLS target groups with IP-type targets, preservation is off by default and has to be enabled (or replaced with Proxy Protocol v2).
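The idea of flow hashing can be sketched in a few lines. This is an illustration of the technique, not AWS’s actual algorithm: the choice of SHA-256, the key format, and the function name `pick_target` are assumptions.

```python
import hashlib


def pick_target(flow, targets):
    """Map a 5-tuple to one target, so every packet of a flow lands in one place.

    `flow` is (src_ip, src_port, dst_ip, dst_port, protocol).
    """
    src_ip, src_port, dst_ip, dst_port, protocol = flow
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a target index.
    return targets[int.from_bytes(digest[:8], "big") % len(targets)]


targets = ["i-aaa", "i-bbb", "i-ccc"]  # hypothetical instance IDs
flow = ("198.51.100.9", 50123, "10.0.0.5", 443, "tcp")
chosen = pick_target(flow, targets)    # identical on every call for this flow
```

A new connection from the same client usually uses a different source port, so it hashes independently and may land on a different target.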
4.3. Static IP — NLB’s specialty
NLB provides one static IP per AZ, and you can attach a fixed Elastic IP for each AZ. This is the number-one reason to choose NLB: when a partner or a corporate firewall requires whitelisting a fixed IP, ALB (DNS only, IP changing constantly) cannot meet it, while NLB has it ready.
4.4. Distinctive features worth remembering
- Fronting PrivateLink: only NLB (and GWLB) can be the front for an endpoint service. To expose an internal service for a customer’s VPC to reach privately over PrivateLink, you put an NLB in front.
- NLB in front of ALB: an NLB’s target can be an ALB. This pairing gives you the best of both — NLB’s static IP on the outside, and ALB’s Layer 7 routing on the inside.
- Security Group support: NLB originally did not support security groups, but it has since 2023. The change is relatively new, so do not rely on the old belief that NLB always lacks this feature.
- Idle timeout defaults to 350 seconds for TCP flows, far longer than ALB’s 60 seconds — fitting long-held connections.
5. GWLB — Gateway Load Balancer (Layer 3)
5.1. The problem it solves
This is the strangest and most-confused type. GWLB does not serve application traffic the way ALB or NLB do.
Instead, it exists to solve a very specific security problem: you want to insert a fleet of third-party virtual appliances — firewalls, intrusion detection systems (IDS), intrusion prevention systems (IPS), or deep packet inspection devices — into the middle of the traffic path for inspection, while still being highly available and auto-scaling with load.
GWLB does two things at once: it is both a transparent network gateway (a single entry/exit point for traffic) and a load balancer that distributes that traffic across the appliance fleet.
5.2. Under the hood: GENEVE and the GWLB Endpoint
GWLB operates at Layer 3 — it works on raw IP packets. Two concepts make up its “transparent” mechanism:
- GENEVE port 6081: GWLB uses the GENEVE protocol to wrap the entire original packet and tunnel it to the appliance. The appliance must “speak” GENEVE to decapsulate and inspect — which is also the main reason it does not serve application traffic the way ALB and NLB do.
- GWLB Endpoint (GWLBe): a kind of VPC endpoint (running on PrivateLink) placed in the application VPC. You edit the route table to steer traffic through the GWLBe; the GWLBe forwards to the GWLB in another VPC (usually a centralized inspection VPC), the GWLB distributes across the appliance fleet, and the traffic then returns along the correct path.
The subtle point that makes GWLB fit stateful firewalls: it maintains flow stickiness — every packet of the same flow (by default the 5-tuple, optionally the 3-tuple or 2-tuple) always lands on the same appliance, and both the forward and return directions go through that same appliance (a symmetric flow). That is how a stateful firewall gets to see both directions of a session to make a decision.
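Symmetric stickiness can be illustrated by hashing a key that ignores direction. The sketch below is not GWLB’s real implementation: the packet dictionaries, the sorting trick, the SHA-256 choice, and the names `flow_key`, `pick_appliance`, and `reverse` are all assumptions for the example.

```python
import hashlib


def flow_key(pkt, tuple_size=5):
    """Build a direction-independent key so both directions of a flow match."""
    if tuple_size == 5:
        ends = sorted([(pkt["src_ip"], pkt["src_port"]), (pkt["dst_ip"], pkt["dst_port"])])
        return (pkt["proto"], ends[0], ends[1])
    ips = sorted([pkt["src_ip"], pkt["dst_ip"]])
    if tuple_size == 3:
        return (pkt["proto"], ips[0], ips[1])  # protocol + both addresses
    if tuple_size == 2:
        return (ips[0], ips[1])                # both addresses only
    raise ValueError("tuple_size must be 5, 3, or 2")


def pick_appliance(pkt, appliances, tuple_size=5):
    """Hash the flow key to one appliance in the fleet."""
    digest = hashlib.sha256(repr(flow_key(pkt, tuple_size)).encode()).digest()
    return appliances[int.from_bytes(digest[:8], "big") % len(appliances)]


def reverse(pkt):
    """Return the same packet as seen in the return direction."""
    return {"proto": pkt["proto"],
            "src_ip": pkt["dst_ip"], "src_port": pkt["dst_port"],
            "dst_ip": pkt["src_ip"], "dst_port": pkt["src_port"]}
```

Sorting the two endpoints before hashing is what makes the forward and return packets produce the same key, so a stateful firewall sees the whole session.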
Why can NLB not replace GWLB here? NLB distributes connections that are addressed to the load balancer itself; it is not a transparent gateway that a route table can steer arbitrary traffic through. Forcing NLB to act as an inline inspection device requires complex routing configuration and usually source NAT, which breaks the transparency the problem demands. GWLB is what AWS purpose-built for exactly this need.
6. CLB — Classic Load Balancer (legacy)
The Classic Load Balancer is AWS’s first-generation load balancer, dating back to the EC2-Classic network era. It can do both Layer 4 (TCP/SSL) and Layer 7 (HTTP/HTTPS) at a basic level, but lacks almost all the features that make ALB and NLB powerful: no rule engine for path/host routing, only a single TLS certificate (no SNI), no Lambda targets, no gRPC, none of the modern protocols.
One frequently misunderstood point needs to be clear: CLB has not been retired. What ended in 2022 was the EC2-Classic network, not the Classic Load Balancer itself — CLB still runs inside a VPC. But it is previous generation, and AWS recommends moving to ALB or NLB for any new design.
For the exam, CLB is almost never the right answer for a new architecture. Its role in questions is usually a distractor carrying the keyword “legacy” or “Classic” — see it and you can almost always eliminate it, unless the question explicitly talks about an old system that needs migrating.
7. Which type? A decision map for the exam
When reading an SAA question, follow the decision tree: start from the clearest need (smart HTTP routing, or inserting a security appliance, or performance/static IP) and eliminate from there.
The table below collects the keywords that often appear in questions and the load balancer type they point to:
| Keyword in the question | Type |
|---|---|
| Routing by path/host/header, microservices, containers, WAF, Cognito, redirect, Lambda target | ALB |
| Static IP / Elastic IP, millions of requests, ultra-low latency, TCP/UDP, gaming, IoT/MQTT, PrivateLink endpoint service | NLB |
| Third-party security appliance, firewall/IDS/IPS, GENEVE, transparent inspection, centralized inspection VPC | GWLB |
| Preserve the original client IP | NLB (and GWLB); ALB does it via X-Forwarded-For |
| Legacy, Classic, EC2-Classic | CLB (almost always eliminate / migrate) |
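The elimination order can be written down as a small function. This is a study aid built from the keyword table above, not an official decision procedure. The keyword strings and the function name `choose_load_balancer` are assumptions for the example.

```python
def choose_load_balancer(needs):
    """Map requirement keywords from a question to a load balancer type."""
    needs = {need.lower() for need in needs}
    inspection = {"third-party appliance", "firewall", "ids/ips", "geneve", "transparent inspection"}
    layer4 = {"static ip", "elastic ip", "udp", "ultra-low latency", "privatelink endpoint service"}
    layer7 = {"path routing", "host routing", "header routing", "waf", "cognito", "redirect", "lambda target"}

    # Inline security appliances are the most specific need, so check them first.
    if needs & inspection:
        return "GWLB"
    wants_l4, wants_l7 = needs & layer4, needs & layer7
    # A fixed IP together with HTTP-aware routing points to the pairing from section 4.4.
    if wants_l7 and needs & {"static ip", "elastic ip"}:
        return "NLB in front of ALB"
    if wants_l4:
        return "NLB"
    if wants_l7:
        return "ALB"
    return "undecided"
```

For example, `choose_load_balancer({"static ip", "path routing"})` returns `"NLB in front of ALB"`, matching the combination described for questions that demand both.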
Conclusion
Back to the story we opened with: you do not need a “do-everything” load balancer, you need the right type for the right problem. When web traffic grows and needs smart routing, that is ALB. When you need speed, a static IP, or TCP/UDP protocols, that is NLB. When you need to insert a fleet of security appliances into the path transparently, that is GWLB. And CLB is just a relic of the past.
What to pin down for the exam:
- Four types, three layers plus one legacy: ALB (Layer 7, HTTP content), NLB (Layer 4, IP and port), GWLB (Layer 3, raw packets), CLB (legacy Layer 4 + 7, avoid).
- The shared framework: Listener accepts connections, Target Group collects destinations, Health Check keeps high availability — learn it once, use it for every type.
- Cross-zone defaults differ: ALB on and free; NLB and GWLB off, and NLB charges inter-AZ data transfer when on.
- ALB has no static IP (DNS only); NLB has a static/Elastic IP and preserves the client IP — this is a contrast pair that gets asked often.
- GWLB is the only one for third-party security appliances: recognize it by GENEVE port 6081, the GWLB Endpoint, and the “transparent” requirement; do not pick NLB for this problem.
- NLB and GWLB are the only two types that can front a PrivateLink endpoint service.
Which load balancer type are you putting in front of your system, and why? Share below.