Most engineers think…
Most engineers first meet Security Groups and NACLs and think they're "the same firewall, just one for instances and one for subnets — so pick whichever." They throw a rule into either and expect it to behave the same way.
Wrong — and this exact assumption is the bug that eats whole afternoons. A Security Group is stateful and allow-only: it remembers a connection, so the reply comes back automatically and there is no such thing as an SG deny rule. A NACL is stateless: it judges every packet on its own with no memory, so you must allow both directions explicitly — and forgetting the outbound ephemeral-port range (1024–65535) for return traffic is the single most common 'it connects but the reply hangs' failure in AWS networking. They are not interchangeable; they are two different machines that you stack on top of each other.
① The two firewalls in a VPC — subnet gate vs instance door
Meet Sneha, an L1 cloud engineer at Infosys. Her team runs a web app in a VPC on account 123456789012. A ticket lands: "users can't reach the new EC2 web server." Sneha opens the security settings and finds two different things that both look like firewalls — a Security Group and a Network ACL. Which one is blocking her? To answer that, she first has to know where each one sits.
Here is the mental model that makes everything else click. A NACL is the building gate: it guards the whole subnet, and every packet entering or leaving that subnet passes through it. A Security Group is the apartment door: it guards one specific instance (technically its ENI). A visitor must clear the building gate to even get into the compound, and then clear your apartment door to get into your flat. Clear the gate but fail at your door, and they're stuck in the corridor — which is exactly what a half-open ruleset feels like.
Put it in packet-path order. Traffic arriving from the internet gateway hits the subnet's NACL first (inbound), then the instance's Security Group (inbound), then finally the OS. On the way back out, the reply leaves the instance through the Security Group (outbound), then the NACL (outbound), then the subnet edge. So the NACL is the outer ring and the SG is the inner ring. Two rings, every packet, both directions — memorise that and half the 2 a.m. tickets explain themselves.
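The ring order can be written down as a small Python sketch. This is a toy model, not an AWS API: the checkpoint names and the `stopped_at` helper are made up for illustration, and the only thing it captures is the order in which the barriers are consulted.

```python
# Toy model of the two rings; checkpoint names are illustrative only.
INBOUND_PATH = ["NACL inbound", "Security Group inbound", "instance OS"]
OUTBOUND_PATH = ["Security Group outbound", "NACL outbound", "subnet edge"]


def stopped_at(path, blocked):
    """Return the first checkpoint on the path that blocks the packet, or None."""
    for checkpoint in path:
        if checkpoint in blocked:
            return checkpoint
    return None


# A packet that clears the subnet gate but is refused at the instance door.
print(stopped_at(INBOUND_PATH, {"Security Group inbound"}))  # Security Group inbound
# A reply that leaves the instance but is refused at the subnet gate.
print(stopped_at(OUTBOUND_PATH, {"NACL outbound"}))          # NACL outbound
```

If both rings would block an inbound packet, the helper reports the NACL, because the outer ring is reached first.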
The two firewalls at a glance
These four facts are the foundation of every NACL/SG question you'll ever get.
Security Group: attached to the ENI of one instance. Stateful, allow-only. So: the apartment door, the last check before the OS.
Network ACL: attached to a whole subnet. Stateless, ALLOW + DENY. So: the building gate, which every packet in or out of the subnet passes.
Packet order: in, NACL then SG; out, SG then NACL. So: NACL is the outer ring, SG the inner, and both are checked.
Always present: an instance gets at least 1 SG; a subnet always has exactly 1 NACL. So: you can't escape either, only configure them.
Think of a Mumbai housing society. The NACL is the society's main gate with the watchman and the gate-pass register: it screens everyone entering or leaving the whole compound, and it can keep a 'banned visitor' list (a DENY). Your Security Group is your own flat's door: even after a visitor clears the gate, they still have to be let in at your door. A delivery boy who clears the gate but stands outside your locked flat is exactly an 'inbound allowed at NACL, blocked at SG' situation — and you debug it by asking 'which barrier stopped them, the gate or my door?'
Rahul at TCS sees a packet from the internet reach his subnet but never reach his EC2 instance. In packet-path terms, which barrier did it pass and which one stopped it?
Pause & Predict
Predict: a subnet has NO custom NACL associated and you never touched the default one. Does that mean traffic into the subnet is unfiltered, or is something still guarding it? Type your guess.
② Stateful vs stateless — and the ephemeral-port trap
This is the single most important idea in the lesson, and the one interviews probe hardest. A Security Group is stateful. When it allows your inbound HTTPS request on port 443, it remembers that connection, so the server's reply is automatically allowed straight back out — you never write a rule for the return traffic. And there is no 'deny' in a Security Group at all: every rule is an allow, and anything not explicitly allowed is implicitly denied.
A NACL is the opposite: stateless. It has no memory of the request, so when the reply tries to leave, the NACL judges it as a brand-new packet with no idea it's a response. That means you must write rules for both directions explicitly. NACLs also support ALLOW and DENY, and they evaluate rules in number order — lowest number first, and the first matching rule wins (then evaluation stops). Rule numbers run 1 to 32766 for the ones you create, with a final catch-all * rule that denies anything unmatched.
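The numbered, first-match-wins evaluation can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `NaclRule` and `evaluate_nacl` are hypothetical names, and the rules match on destination port only, whereas a real NACL entry also matches protocol and CIDR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int      # 1-32766 for rules you create
    port_from: int
    port_to: int
    action: str      # "allow" or "deny"


def evaluate_nacl(rules, port):
    """Return the action of the lowest-numbered rule that matches the port."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.action   # first match wins, evaluation stops
    return "deny"                # the final * catch-all


rules = [
    NaclRule(200, 0, 65535, "allow"),
    NaclRule(90, 22, 22, "deny"),
    NaclRule(100, 443, 443, "allow"),
]
print(evaluate_nacl(rules, 22))    # deny: rule 90 matches before rule 200
print(evaluate_nacl(rules, 443))   # allow: rule 100
print(evaluate_nacl([], 443))      # deny: only the catch-all is left
```

The order in which the rules are listed does not matter; only the rule numbers decide which one is consulted first.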
Here's the trap that catches everyone. When a client opens a connection to your server on port 443, the server's reply doesn't go back to port 443: it leaves from port 443 and is addressed to a high, random ephemeral port the client picked. On a stateful Security Group that reply is auto-allowed, so you never think about it. But on a stateless NACL, if your outbound rules don't allow the ephemeral range, the reply is dropped: the request landed, but the page never loads. AWS's documented advice is to open outbound 1024–65535 on the NACL to cover all client types. The actual ranges vary: many Linux kernels default to 32768–60999, Windows Server 2008+ uses 49152–65535, and a NAT gateway or ELB uses the full 1024–65535, which is why 1024–65535 is the safe blanket.
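To make the difference concrete, here is a minimal Python sketch of the same request and reply passing a stateless check and a stateful one. Everything in it is an illustrative assumption: `nacl_allows`, `StatefulGroup` and the client port 51514 are invented for the example and do not correspond to any AWS API.

```python
def nacl_allows(allow_ranges, port):
    """Stateless check: no memory, just 'is this port inside an allowed range?'."""
    return any(low <= port <= high for low, high in allow_ranges)


class StatefulGroup:
    """Tiny stand-in for a Security Group: inbound allow rules plus connection memory."""

    def __init__(self, inbound_ports):
        self.inbound_ports = set(inbound_ports)
        self.tracked = set()

    def request(self, client_port, server_port):
        allowed = server_port in self.inbound_ports
        if allowed:
            self.tracked.add((client_port, server_port))  # remember the connection
        return allowed

    def reply(self, server_port, client_port):
        return (client_port, server_port) in self.tracked  # no outbound rule consulted


CLIENT_PORT, SERVER_PORT = 51514, 443   # 51514 is an arbitrary ephemeral port

nacl_inbound = [(443, 443)]
nacl_outbound = []                       # the forgotten return path

print(nacl_allows(nacl_inbound, SERVER_PORT))    # True: the request gets in
print(nacl_allows(nacl_outbound, CLIENT_PORT))   # False: the reply is dropped

nacl_outbound.append((1024, 65535))
print(nacl_allows(nacl_outbound, CLIENT_PORT))   # True: the reply now leaves

group = StatefulGroup(inbound_ports=[443])
print(group.request(CLIENT_PORT, SERVER_PORT))   # True
print(group.reply(SERVER_PORT, CLIENT_PORT))     # True, with no outbound rule at all
```

Note that the outbound NACL check is made against the client's port, not 443, which is exactly why an "allow 443 outbound" rule would not rescue the reply.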
▶ Watch the reply die at a stateless NACL
A user in Bengaluru hits Sneha's web server on 443. The request gets in fine, but the reply has to leave through the stateless NACL, and it is dropped unless an outbound rule covers the client's ephemeral port. The entry below opens that return path.
aws ec2 create-network-acl-entry \
    --network-acl-id acl-0a1b2c3d4e5f6a7b8 \
    --rule-number 200 \
    --protocol tcp \
    --port-range From=1024,To=65535 \
    --cidr-block 0.0.0.0/0 \
    --rule-action allow \
    --egress
# (no output on success — create-network-acl-entry returns nothing)
# verify it landed:
$ aws ec2 describe-network-acls --network-acl-ids acl-0a1b2c3d4e5f6a7b8 \
--query 'NetworkAcls[0].Entries[?Egress==`true`]'
[ { "RuleNumber": 200, "Protocol": "6", "RuleAction": "allow",
"Egress": true, "CidrBlock": "0.0.0.0/0",
"PortRange": { "From": 1024, "To": 65535 } } ]
Symptom: you tightened a subnet's NACL, allowed inbound 443, and now curl https://server hangs and then times out, yet the Security Group is untouched and correct.
Cause: the NACL is stateless and you only allowed the inbound direction; the server's reply (starting with the SYN-ACK) goes to a high ephemeral port that your outbound NACL rules don't permit, so it's dropped.
Fix: add an outbound ALLOW rule for TCP 1024–65535.
Verify: with VPC Flow Logs you'll see the inbound packet ACCEPT and the outbound reply REJECT on the ephemeral port.
Priya at HCL allows inbound 443 on a custom NACL but users still can't load the page (request arrives, reply never returns). Security Group is fine. What's the fix?
Pause & Predict
Predict: you do the exact same lockdown on a Security Group instead of a NACL — allow inbound 443, leave outbound at the default. Does the reply get dropped this time? Type your guess.
③ How they combine — pass the NACL AND the SG
Now stack them. For a packet to reach your instance, it must be allowed by the NACL (subnet) AND the Security Group (instance) — both, not either. They're an AND gate, evaluated as the outer and inner rings we saw in section 1. This is genuine defense in depth: if someone fat-fingers an SG and opens SSH to 0.0.0.0/0, a NACL that denies port 22 at the subnet still blocks it. One mistake doesn't become a breach.
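The AND gate is easy to express in code. The sketch below is a deliberately crude model, assuming a NACL that allows everything except a set of denied ports and a Security Group that is just a set of allowed ports; `reaches_instance` is a hypothetical helper, not an AWS call.

```python
def reaches_instance(port, nacl_denied_ports, sg_allowed_ports):
    """A packet gets through only if the NACL lets it pass AND the SG allows it."""
    passes_nacl = port not in nacl_denied_ports
    passes_sg = port in sg_allowed_ports
    return passes_nacl and passes_sg


sg_allowed = {443, 22}      # 22 was opened to the world by mistake
nacl_denied = {22}          # subnet-wide block on SSH

print(reaches_instance(22, nacl_denied, sg_allowed))    # False: the NACL backstops the SG
print(reaches_instance(443, nacl_denied, sg_allowed))   # True: both layers agree
print(reaches_instance(8080, nacl_denied, sg_allowed))  # False: the SG never allowed it
```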
The Security Group has a power the NACL doesn't: it can reference another Security Group instead of an IP range. A rule like 'allow 8080 from sg-app' means 'allow traffic from any instance that carries the app-tier Security Group, whatever its IP is.' This is how you wire tier-to-tier trust that survives auto-scaling — new app servers come and go, their IPs change, but they all carry sg-app, so the rule keeps working. A NACL can only match CIDR ranges; it has no idea what an SG is.
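Why a group reference survives auto-scaling while an address-based rule does not can be shown with a short Python sketch. The instance records, IP addresses and helper names here are invented for illustration; only the standard-library `ipaddress` module is real.

```python
import ipaddress


def allowed_by_cidr(source_ip, cidr):
    """Address-based rule: the sender's IP must fall inside the CIDR."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr)


def allowed_by_group(sender_groups, referenced_group):
    """Group-based rule: the sender only has to carry the referenced group."""
    return referenced_group in sender_groups


# Hypothetical app servers: one original, one launched later by auto scaling.
old_app = {"ip": "10.0.2.15", "groups": {"sg-app"}}
new_app = {"ip": "10.0.3.77", "groups": {"sg-app"}}

# A rule pinned to the old server's address misses the new one...
print(allowed_by_cidr(old_app["ip"], "10.0.2.15/32"))   # True
print(allowed_by_cidr(new_app["ip"], "10.0.2.15/32"))   # False

# ...while a rule that references sg-app keeps working for both.
print(allowed_by_group(old_app["groups"], "sg-app"))    # True
print(allowed_by_group(new_app["groups"], "sg-app"))    # True
```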
Know the defaults cold, because exam questions hinge on them. The default Security Group allows all inbound traffic from other instances in the same SG (a self-reference) and allows all outbound; a brand-new custom SG allows nothing inbound and all outbound. The default NACL allows all inbound and all outbound (rule 100 allow, then the * deny catch-all). But a brand-new custom NACL is the opposite — it denies everything in both directions until you add allow rules. So a fresh custom SG is 'deny-in/allow-out' while a fresh custom NACL is 'deny-all' — mix those up and you'll lock yourself out.
That gives the rule of thumb that experienced engineers live by. Allow-by-design belongs in Security Groups: they're stateful, instance-precise, and can reference SGs, so they're where you express 'the app tier may talk to the db tier.' Deny belongs in NACLs: an SG literally cannot deny, so when you need to block something — a known-bad public IP scanning your subnet, or a blanket 'no SSH from the internet to this whole tier' — you reach for a NACL DENY rule with a low rule number so it's evaluated first.
# allow the app tier (sg-app) to accept 8080 only from the web tier (sg-web)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0app1111appapp11 \
    --protocol tcp --port 8080 \
    --source-group sg-0web2222webweb22
{
"Return": true,
"SecurityGroupRules": [
{ "SecurityGroupRuleId": "sgr-0c0ffee1234567890",
"GroupId": "sg-0app1111appapp11", "IsEgress": false,
"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
"ReferencedGroupInfo": { "GroupId": "sg-0web2222webweb22" } }
]
}
Don't guess which layer blocked it: read VPC Flow Logs. In the default format each record ends with an action of ACCEPT or REJECT, followed by a log status. Flow logs record traffic rejected by either layer, so a REJECT alone doesn't name the culprit; statefulness does. A Security Group auto-allows the reply to a connection it accepted, so an inbound ACCEPT followed by an outbound REJECT can only be the NACL. Console path: VPC Console → Your VPCs → (select VPC) → Flow logs → Create flow log, then read them in CloudWatch Logs. The combination 'inbound ACCEPT, outbound REJECT on a high port' is the fingerprint of the stateless-NACL ephemeral bug.
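Spotting that fingerprint can be automated. The following Python sketch assumes the default flow log record format (version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status); the helper names and the two sample records are made up for the example, so treat it as a starting point rather than a finished tool.

```python
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse(line):
    """Split one default-format flow log record into a dict keyed by field name."""
    return dict(zip(FIELDS, line.split()))


def has_nacl_fingerprint(lines, instance_ip, service_port):
    """True if a request to the service was ACCEPTed and a reply from it was REJECTed."""
    # NODATA / SKIPDATA records carry '-' placeholders, so keep only OK records.
    records = [r for r in map(parse, lines) if r.get("log_status") == "OK"]
    request_accepted = any(
        r["dstaddr"] == instance_ip
        and int(r["dstport"]) == service_port
        and r["action"] == "ACCEPT"
        for r in records
    )
    reply_rejected = any(
        r["srcaddr"] == instance_ip
        and int(r["srcport"]) == service_port
        and 1024 <= int(r["dstport"]) <= 65535
        and r["action"] == "REJECT"
        for r in records
    )
    return request_accepted and reply_rejected


# Made-up sample records in the default field order.
sample = [
    "2 123456789012 eni-0a1b2c3d4e5f6a7b8 203.0.113.12 10.0.1.20 51514 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d4e5f6a7b8 10.0.1.20 203.0.113.12 443 51514 6 10 840 1700000000 1700000060 REJECT OK",
]
print(has_nacl_fingerprint(sample, "10.0.1.20", 443))  # True
```

A flow log published with a custom format would need a different `FIELDS` list.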
Aditya at Wipro must block one specific malicious IP (203.0.113.66) from reaching an entire subnet of app servers. Which control does the job, and why not the other?
Pause & Predict
Predict: you add a NACL rule numbered 50 that ALLOWs 203.0.113.66, and a rule numbered 100 that DENYs the same IP. Does the IP get in or get blocked? Type your guess.
④ Designing it — web/app/db tiers, troubleshooting & the cheat-sheet
Let's build the canonical three-tier design every interview and the SCS-C02 exam expects. Three Security Groups, chained by reference. sg-web: allow inbound 443 from 0.0.0.0/0 (the public can reach the web tier). sg-app: allow inbound 8080 from sg-web only (the app tier trusts the web tier, nobody else). sg-db: allow inbound 3306 (MySQL) from sg-app only (the database trusts the app tier, nobody else, and certainly not the internet). Each tier trusts exactly one source, the tier directly in front of it, by Security Group, so the chain holds even as instances scale in and out.
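One way to sanity-check a chain like this before touching AWS is to write the ingress rules as data and query them. The Python sketch below is a paper model only: the `INGRESS` table, the "internet" label standing in for 0.0.0.0/0, and `can_connect` are all assumptions made for the example.

```python
# Each group lists the (source, port) pairs it accepts.
# "internet" is a stand-in label for 0.0.0.0/0.
INGRESS = {
    "sg-web": {("internet", 443)},
    "sg-app": {("sg-web", 8080)},
    "sg-db": {("sg-app", 3306)},
}


def can_connect(source, target_group, port):
    """True if the target group has an ingress rule for this source and port."""
    return (source, port) in INGRESS.get(target_group, set())


print(can_connect("internet", "sg-web", 443))   # True: public HTTPS
print(can_connect("sg-web", "sg-app", 8080))    # True: web may call app
print(can_connect("sg-app", "sg-db", 3306))     # True: app may call db
print(can_connect("internet", "sg-db", 3306))   # False: no direct path to the db
print(can_connect("sg-web", "sg-db", 3306))     # False: web cannot skip the app tier
```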
Then add the NACL layer for the things SGs can't do. On the public subnet's NACL, add a low-numbered DENY for any CIDR you've seen attacking you (say 198.51.100.0/24), plus the all-important outbound 1024–65535 ephemeral allow so replies flow. On the database subnet's NACL, you might allow inbound 3306 only from the app subnet's CIDR and leave everything else to the catch-all deny as a belt-and-braces subnet-wide block: even if a future SG mistake opens the db, the subnet gate still refuses internet traffic. That's defense in depth made concrete.
Karthik at Flipkart faces this
Karthik, an L2 engineer, gets a ticket: a new analytics box in the app subnet can reach the database fine, but the database's replies to it 'sometimes' stall, and a fresh SSH session into the box from a bastion connects but then freezes mid-session.
'Works one way, breaks the other' is the signature of a stateless return-path problem. Someone tightened the app subnet's custom NACL to allow inbound from the db subnet, but never added the outbound ephemeral allow — so reply packets to high client ports are dropped. The Security Groups are stateful and fine; the NACL is the leaky layer.
He separates the layers: SGs are stateful so they can't be the asymmetry. That points straight at the stateless NACL's outbound rules. He pulls VPC Flow Logs for the analytics box's ENI and looks for inbound ACCEPT paired with outbound REJECT on ports in the 1024–65535 range.
AWS Console → VPC → Network ACLs → (app subnet's NACL) → Outbound rules · and VPC → Flow logs in CloudWatch Logs
Add an outbound NACL ALLOW rule (e.g. rule 200) for TCP 1024–65535 to the destination CIDRs the box replies to (or 0.0.0.0/0 for a public-facing tier). Keep any DENY rules at lower numbers so they still win.
Re-run the SSH session and the db query — both now complete without freezing. Flow Logs show the outbound ephemeral packets flipping from REJECT to ACCEPT, confirming the return path is open.
# sg-web, sg-app and sg-db are placeholders; substitute your real security group IDs
# web tier: public HTTPS
aws ec2 authorize-security-group-ingress --group-id sg-web \
    --protocol tcp --port 443 --cidr 0.0.0.0/0
# app tier: only from the web tier's SG
aws ec2 authorize-security-group-ingress --group-id sg-app \
    --protocol tcp --port 8080 --source-group sg-web
# db tier: MySQL only from the app tier's SG
aws ec2 authorize-security-group-ingress --group-id sg-db \
    --protocol tcp --port 3306 --source-group sg-app
{ "Return": true } # web rule added (443 from 0.0.0.0/0)
{ "Return": true } # app rule added (8080 from sg-web)
{ "Return": true } # db rule added (3306 from sg-app)
# net result: internet → web → app → db, each tier trusts only the one above
Your society's main gate (the NACL) keeps two lists: an allow list (residents and approved visitors get in) and a ban list (this delivery agent is barred, which is a DENY). Your flat door (the Security Group) only has a guest list: you can invite people in, but you can't 'ban' someone at your door, you just don't add them. So when you need to actively block a known troublemaker for the whole building, that goes on the gate's ban list (NACL DENY), not your flat's guest list (SG).
For the cert, this lesson sits squarely in the SCS-C02 'Infrastructure Security' domain — Security Groups and NACLs are bread-and-butter VPC controls the exam loves to test with 'request arrives, reply doesn't' trick scenarios and 'which control can DENY a subnet-wide IP' questions. If you can state statefulness, the ephemeral-port rule, the AND-gate combination, and the SG-to-SG reference without hesitating, you've got the network-security questions in the bag. Career-wise this is day-one knowledge for any AWS support or cloud-ops role in India — it comes up in nearly every interview.
An interviewer asks Meera: "Give me the single cleanest way to let an auto-scaling app tier reach a database, without updating rules every time an app server's IP changes." Best answer?
🧠 In your own words
In one line: why does a request reach your EC2 instance while the reply never comes back, even though your Security Group is correct?
📖 Glossary
- Security Group (SG): Stateful virtual firewall attached to an instance's ENI. Allow-rules only; remembers connections so replies are automatic.
- Network ACL (NACL): Stateless firewall at the subnet boundary. Supports ALLOW and DENY, numbered rules evaluated in order, no memory of past traffic.
- Stateful: The firewall remembers each allowed connection, so the return traffic is auto-permitted regardless of outbound rules. (Security Groups.)
- Stateless: The firewall keeps no memory; each packet is judged alone, so request and reply must each be explicitly allowed. (NACLs.)
- Ephemeral port: A short-lived high source port (1024–65535) the client picks; the server's reply is addressed to it. Must be allowed outbound on a NACL.
- ENI: Elastic Network Interface, the virtual NIC of an instance. A Security Group attaches here, which is why the SG is 'instance-level'.
- Subnet: An IP range inside a VPC, tied to one availability zone. Every subnet is associated with exactly one NACL.
- Rule number (NACL): 1–32766 for your rules; evaluated lowest-first and the first match wins, then evaluation stops. Put DENY rules at low numbers.
- Default NACL: The NACL a subnet gets if you attach none. Out of the box it allows ALL inbound and outbound (rule 100 allow + the * deny catch-all).
- SG reference: A Security Group rule whose source is another SG, not a CIDR. Means 'allow any instance carrying that SG', so it survives auto-scaling.
- Defense in depth: Layering independent controls so a gap in one is caught by another; here, the subnet NACL backstops an instance-level SG mistake.
- VPC Flow Logs: Per-flow records showing ACCEPT/REJECT for traffic at an ENI/subnet/VPC; the tool that proves which layer dropped a packet.
📚 Sources
- Amazon VPC User Guide — "Control subnet traffic with network access control lists" (NACL basics: stateless; rules numbered 1–32766, lowest-first, first match wins; default vs custom NACL behaviour; the * catch-all deny). docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html
- Amazon VPC User Guide — "Control traffic to your AWS resources using security groups" + "Default security groups" (SGs are stateful, allow-only; default SG self-reference inbound + all outbound; new custom SG denies inbound). docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html · docs.aws.amazon.com/vpc/latest/userguide/default-security-group.html
- Amazon VPC User Guide — "Ephemeral ports" / NACL examples (documented client ranges: Linux 32768–60999, Windows Server 2008+ 49152–65535, NAT gateway & ELB 1024–65535; AWS recommends opening 1024–65535 outbound on a NACL). docs.aws.amazon.com/vpc/latest/userguide/nacl-ephemeral-ports.html
- AWS re:Post — "Resolve inbound traffic ACL connections to AWS" + AWS blog "Building three-tier architectures with security groups" (real practitioner pain: stateless NACL must allow ephemeral return range; SG-to-SG referencing for web/app/db tiers). repost.aws/knowledge-center/resolve-connection-sg-acl-inbound · aws.amazon.com/blogs/aws/building-three-tier-architectures-with-security-groups/
- r/aws + Medium (Remy NTSHAYKOLO, "Why the need to set ephemeral ports range for NACL outbound rules") — the community gotcha: request arrives but reply is dropped when the outbound 1024–65535 NACL rule is missing; diagnose with VPC Flow Logs (inbound ACCEPT, outbound REJECT). remy-nts.medium.com/aws-nacl-why-the-need-to-set-ephemeral-ports-range-for-outbound-rules-50ee93986555
- AWS Certified Security – Specialty (SCS-C02) Exam Guide — Domain: Infrastructure Security (VPC security groups & network ACLs, defense-in-depth, designing secure network controls for multi-tier apps). aws.amazon.com/certification/certified-security-specialty/
What's next?
You can now read the packet path and stack SGs on NACLs. Next we move up a layer to the storage that leaks the most in the headlines: an S3 bucket that's one checkbox away from public — and exactly how to slam that door shut.