Virtual Private Cloud¶
Service ownership
Owner: network-platform (network-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
The tenant-isolated network that wraps every Cloud Digit compute resource. Standard VPC primitives — subnets, route tables, security groups, NACLs — over an SDN overlay.
What it is¶
A virtual network you control. CIDR ranges of your choice; subnets per availability zone; route tables you can attach to subnets; security groups for stateful per-resource policy; network ACLs for stateless per-subnet policy.
Default VPC¶
Every project gets a default VPC at creation: 10.0.0.0/16, one subnet per AZ. Sufficient for prototypes; for anything production-bound you should create a custom VPC with your own CIDR plan.
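A minimal sketch of creating a custom VPC, assuming hypothetical `vpc create` and `subnet create` subcommands alongside the `cd network` commands shown later on this page; the flags are assumptions, not confirmed CLI:

```bash
# Assumed subcommands and flags; verify with `cd network --help`.
cd network vpc create --name acme-prod-vpc --cidr 10.10.0.0/16
cd network subnet create --vpc acme-prod-vpc --name public-az1 \
  --cidr 10.10.1.0/24 --az az1
```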
Building blocks¶
| Object | Scope | What it does |
|---|---|---|
| VPC | Region | Top-level container, /16 to /24 supported |
| Subnet | AZ | A slice of a VPC bound to one AZ |
| Route table | VPC | Per-subnet routing rules |
| Security group | VPC | Stateful, attached to ENIs |
| Network ACL | Subnet | Stateless, attached to subnets |
| Internet gateway | VPC | Default-route to the public Internet |
| NAT gateway | Subnet | Egress-only to public Internet for private subnets |
| VPC peering | Inter-VPC | Same- or cross-region private routing |
| Endpoint | VPC | Private link to S3, DBs, services |
Common patterns¶
Public/private split¶
```
VPC 10.10.0.0/16
├─ public  10.10.1.0/24  (az1) → IGW
├─ public  10.10.2.0/24  (az2) → IGW
├─ private 10.10.11.0/24 (az1) → NAT GW (in public subnet, az1)
└─ private 10.10.12.0/24 (az2) → NAT GW (in public subnet, az2)
```
Run web/LB tier in public subnets, app + DB tiers in private subnets, NAT gateway for outbound updates.
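Wiring the split up might look roughly like this. The `igw create` and `nat create` subcommands are assumptions (only `route add` appears elsewhere on this page):

```bash
# Assumed subcommand names, illustrative only.
cd network igw create --vpc acme-prod-vpc --name igw-main
cd network nat create --subnet public-az1 --name nat-az1

# Public subnets: default route to the Internet gateway
cd network route add --rt rt-public --destination 0.0.0.0/0 --target igw-main

# Private subnets: default route to the AZ-local NAT gateway
cd network route add --rt rt-private-az1 --destination 0.0.0.0/0 --target nat-az1
```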
VPC peering across regions¶
Cloud Digit supports VPC peering across regions over the private backbone. CIDRs must not overlap; transitive peering not supported (use a hub-and-spoke design with VPN or a transit-VPC pattern if needed).
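A sketch of setting one up, where `peering create` is an assumed subcommand and the `route add` syntax matches the troubleshooting example later on this page:

```bash
# Assumed subcommand name; VPC names are illustrative.
cd network peering create --requester acme-prod-vpc --accepter acme-shared-vpc

# Routes are not auto-injected: point each VPC at the other's CIDR, on both sides.
cd network route add --rt rt-private-az1 --destination 10.2.0.0/16 --target peering-ab
```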
Limits¶
| Resource | Default per region | Cap |
|---|---|---|
| VPCs per project | 5 | 50 (bumpable) |
| Subnets per VPC | 50 | 200 |
| Security groups per VPC | 250 | 500 |
| Rules per security group | 60 | 200 |
| NAT gateways per AZ | 5 | 25 |
Pricing¶
VPC itself is free. NAT gateway is metered (per-hour + per-GB), as is inter-region peering. See Pricing.
Operate this service¶
VPC is the network blast-radius boundary. Get it right at design time — retrofitting is painful.
Design principles¶
- One VPC per environment (prod / staging / dev). Cross-env traffic crosses a peering or a transit gateway, never a flat subnet.
- Plan CIDRs for 10 years. A `10.0.0.0/8` carve-out is enough for most orgs; pick a `/16` per VPC and a `/24` per subnet (see the sample plan after this list).
- AZ-symmetric subnets. Every public/private/data subnet exists in all 3 AZs. Makes HA design trivial.
- No overlap with on-prem. Coordinate with the network team before picking CIDRs.
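One possible carve-out under these principles; the numbers are illustrative, not a mandated plan:

```
10.10.0.0/16  prod VPC
  10.10.1.0/24,  10.10.2.0/24,  10.10.3.0/24   public  (az1–az3)
  10.10.11.0/24, 10.10.12.0/24, 10.10.13.0/24  private (az1–az3)
  10.10.21.0/24, 10.10.22.0/24, 10.10.23.0/24  data    (az1–az3)
10.20.0.0/16  staging VPC (same layout)
10.30.0.0/16  dev VPC     (same layout)
```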
IAM¶
| Role | Can do |
|---|---|
| `vpc.viewer` | Read VPC topology |
| `vpc.builder` | Create / modify subnets, route tables, SGs in a VPC |
| `vpc.admin` | Create / delete VPCs, peering, transit gateways |
| `vpc.security-admin` | NACLs, flow logs, security group baselines |

The `vpc.security-admin` role is usually held by a different person than `vpc.builder` — separation of duties.
Default security posture¶
Cloud Digit VPCs default to deny all inbound, allow all outbound. Don't relax this org-wide; tighten per workload.
Recommended baseline NACL applied at every public subnet (a CLI sketch follows the list):

- Allow inbound TCP 443 from 0.0.0.0/0
- Allow inbound TCP 80 from 0.0.0.0/0 (typically only for ACME redirect to 443)
- Deny everything else
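Expressed as rules, the baseline might look like this. The `nacl add-rule` subcommand and its flags are assumptions (this page only documents `nacl show`); because NACLs are stateless, replies to clients need an explicit outbound ephemeral-port allow:

```bash
# Assumed subcommand and flags, modeled on `cd network nacl show`.
cd network nacl add-rule --nacl nacl-public --number 100 --direction inbound \
  --protocol tcp --port 443 --source 0.0.0.0/0 --action allow
cd network nacl add-rule --nacl nacl-public --number 110 --direction inbound \
  --protocol tcp --port 80 --source 0.0.0.0/0 --action allow
# Stateless: replies leave on ephemeral ports and need an explicit allow
cd network nacl add-rule --nacl nacl-public --number 100 --direction outbound \
  --protocol tcp --port 1024-65535 --dest 0.0.0.0/0 --action allow
# Everything else falls through to the default deny
```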
Flow logs¶
Enable on every VPC; the default destination is a project-owned S3 bucket:

```bash
cd network flow-logs enable --vpc acme-prod-vpc --destination s3://acme-flow-logs/
```
Lifecycle: 30 days hot in S3, then transition to Archive; expire at 7 years (compliance default).
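If the lifecycle ever needs to be set explicitly, a hypothetical sketch (the `set-lifecycle` subcommand and its flags are assumptions; 2555 days is roughly 7 years):

```bash
# Hypothetical subcommand; the values mirror the compliance default above.
cd network flow-logs set-lifecycle --vpc acme-prod-vpc \
  --transition-archive-days 30 --expire-days 2555
```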
Peering and transit¶
Peering connects two VPCs (1:1, no transit). Transit gateway is the right primitive when you have ≥3 VPCs or want hub-spoke routing.
Tagging¶
env, cost-center, owner — required on every VPC.
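For example, with a hypothetical `vpc tag` subcommand and illustrative values:

```bash
# Assumed subcommand; env/cost-center/owner are the required keys.
cd network vpc tag --vpc acme-prod-vpc env=prod cost-center=CC-1042 owner=network-platform
```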
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| `vpc.bytes_in` / `vpc.bytes_out` | varies | sudden 10× change (DDoS or runaway) |
| `vpc.flows_per_sec` | varies | sudden change |
| `vpc.nat_translation_count` | within NAT GW capacity | > 80% of capacity |
| `vpc.route_table_misses` | 0 | > 0 (route gap) |
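As a sketch, an alert on the one metric with a hard threshold; the `cd monitor` syntax here is entirely hypothetical, so adapt it to your monitoring stack:

```bash
# Hypothetical alerting CLI: route_table_misses > 0 always indicates a route gap.
cd monitor alert create --metric vpc.route_table_misses --vpc acme-prod-vpc \
  --condition "> 0" --window 5m --notify network-oncall
```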
Route table hygiene¶
Quarterly:
```bash
cd network route-table list --vpc acme-prod-vpc -o table
cd network route-table audit --vpc acme-prod-vpc
```

The audit reports:

- Routes pointing at deleted gateways/instances
- Overlapping CIDRs
- Asymmetric routes between AZs
NAT gateway capacity¶
A single NAT GW handles ~55,000 simultaneous connections. For workloads that exceed this (high-volume API clients), deploy multiple NAT gateways, one per AZ, and split traffic via route tables.
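The split is just one private route table per AZ, each defaulting to its AZ-local NAT gateway (route syntax as documented in the peering fix below; table and gateway names are illustrative):

```bash
# One private route table per AZ, each pointing at the NAT GW in the same AZ
cd network route add --rt rt-private-az1 --destination 0.0.0.0/0 --target nat-az1
cd network route add --rt rt-private-az2 --destination 0.0.0.0/0 --target nat-az2
```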
Security group changes — review process¶
SG changes hit traffic immediately. Reviewable change pattern:
```bash
# Propose
cd network sg propose --sg sg-web --add-rule "tcp/443 from 0.0.0.0/0"
# Returns a proposal-id; PR-style review happens in the console

# Reviewer applies
cd network sg apply --proposal-id <id>
```

The propose-apply gate cuts most production SG mistakes.
Peering vs Transit¶
| Have | Want | Use |
|---|---|---|
| 2 VPCs | direct connectivity | Peering |
| 3+ VPCs | hub-and-spoke routing | Transit gateway |
| 2 VPCs + on-prem | unified routing | Transit + DC interconnect |
| 2 VPCs, transitive traffic forbidden | strict isolation | Peering |
Peering is non-transitive — A↔B + B↔C does not give A↔C. That's a feature.
Inter-AZ vs intra-AZ traffic¶
- Intra-AZ: < 0.5 ms, free
- Inter-AZ: 1–2 ms, billed at modest per-GB rate
- Inter-region (over the private backbone): 5–15 ms, billed higher
Design so that traffic within a tier stays intra-AZ; cross AZs only between tiers, for HA.
Can't reach the internet from a VM¶
Standard diagnostic order:
- Subnet route table — does it have a `0.0.0.0/0` route to an Internet GW (public subnet) or NAT GW (private subnet)?
- Subnet NACL — allows outbound to `0.0.0.0/0`?
- VM security group — outbound to `0.0.0.0/0` allowed?
- VM has an IP address — public IP (public subnet) or NAT GW route (private subnet)?
- DNS — `/etc/resolv.conf` populated by VPC DHCP options?
```bash
# Walks the route, NACL, and SG, and reports the failing hop
cd network reachability test --from vm-web-01 --to 1.1.1.1
```
Cross-subnet traffic blocked¶
| Cause | Check |
|---|---|
| Source SG outbound not allowed | `cd network sg show <id>` |
| Destination SG inbound not allowed | Same |
| NACL on source subnet | `cd network nacl show <id>` |
| NACL on destination subnet | Same |
| Route table missing inter-subnet route | VPC routes are implicit, but custom RT may override |
NACLs are stateless — return traffic needs its own allow rules: an outbound rule on the reply side in addition to the inbound rule that admitted the request.
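A worked example, reusing the hypothetical `nacl add-rule` syntax from the baseline sketch earlier: an app subnet `10.10.11.0/24` (NACL `nacl-a`) talking to a DB subnet `10.10.12.0/24` (NACL `nacl-b`) on tcp/5432 needs four allows:

```bash
# Request path: A to B on tcp/5432
cd network nacl add-rule --nacl nacl-a --number 200 --direction outbound \
  --protocol tcp --port 5432 --dest 10.10.12.0/24 --action allow
cd network nacl add-rule --nacl nacl-b --number 200 --direction inbound \
  --protocol tcp --port 5432 --source 10.10.11.0/24 --action allow
# Reply path: B to A on ephemeral ports (stateless, so NOT automatic)
cd network nacl add-rule --nacl nacl-b --number 210 --direction outbound \
  --protocol tcp --port 1024-65535 --dest 10.10.11.0/24 --action allow
cd network nacl add-rule --nacl nacl-a --number 210 --direction inbound \
  --protocol tcp --port 1024-65535 --source 10.10.12.0/24 --action allow
```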
Peering established but no traffic¶
After establishing peering, routes are not auto-injected. Add manually:
```bash
# In VPC A's route table, route VPC B's CIDR to the peering
cd network route add --rt rt-public-a --destination 10.2.0.0/16 --target peering-ab
# Same for VPC B → VPC A
```
Don't forget security groups — traffic from a peered VPC still has to pass the destination's SG rules.
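For example, admitting HTTPS from the peer CIDR used above through the destination SG, reusing the propose/apply flow documented earlier:

```bash
cd network sg propose --sg sg-app --add-rule "tcp/443 from 10.2.0.0/16"
cd network sg apply --proposal-id <id>
```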
Flow logs missing entries¶
Flow logs are sampled — by default 1:1 (every flow). Verify:
```bash
cd network flow-logs show --vpc acme-prod-vpc
# sample_rate should be 1.0
```
Common gaps:

- Logs were disabled then re-enabled — gap during the off period
- Destination S3 bucket misconfigured (wrong region, wrong IAM)
NAT gateway port exhaustion¶
```
WARN: NAT GW nat-abc port allocation > 90%
```
A NAT GW supports ~55,000 concurrent connections per destination IP. If a workload calls many distinct external endpoints, exhaustion is rare; if it pummels a single endpoint, exhaustion happens fast.
Mitigations:

- Add a second NAT GW in another AZ; split traffic via route tables
- Use connection pooling on the client to reduce concurrent connections
- Move the workload to a dedicated egress (Floating IP)
Asymmetric routing after AZ failover¶
If a subnet's primary AZ goes down and traffic re-routes, return-path may go through a different NAT GW than the egress — stateful firewall drops it. Make NAT GWs per-AZ and route per-AZ to avoid asymmetry.
VPC delete fails¶
```
ERROR: DeleteVPC: 47 dependent resources prevent deletion
```
`cd network vpc dependents --vpc <id>` lists them. Common holdouts: orphaned ENIs, undeleted NAT GWs, and security groups referenced by SGs in other VPCs.