Hybrid Cloud Burst¶
Service ownership
Owner: dc-operations (colo-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
Connect a colocation footprint into Cloud Digit cloud over private fabric — keep the colo as steady-state baseline, burst into cloud for peak.
What it is¶
A networking pattern (and a billing arrangement) that lets you treat your colo cabinets as the primary production environment and Cloud Digit cloud as elastic spillover. Traffic is routed over a private cross-connect, so it never traverses the public Internet.
Topology¶
```mermaid
graph LR
  subgraph "Colo (your steady state)"
    Colo1[App tier]
    Colo2[DB primary]
  end
  subgraph "Cloud Digit cloud"
    LB[Load Balancer]
    ASG[Auto Scaling Group]
  end
  Internet((Internet)) --> LB
  LB --> ASG
  ASG -->|private fabric| Colo1
  Colo1 -->|sync replication| Colo2
  ASG -.peak only.-> Colo2
```
Use cases¶
- Holiday / event-day surge capacity (cricket finals, Eid, election results)
- New-product launch traffic
- Periodic batch (monthly close, end-of-quarter reporting) without permanent cloud footprint
- DR — cloud as the fallback target when colo is unavailable (alternative pattern: see DRaaS)
Network configuration¶
- Cross-connect between your cage/rack and Cloud Digit cloud fabric
- VPC in the cloud side, peered into your colo VLANs via the cross-connect
- Routing — BGP-default; static routes possible for simple cases
- Security — security groups on the cloud side; you operate firewalls on the colo side
- DNS — split-horizon supported; cloud-side resolver can serve colo records
Cost story¶
- Steady-state stays in colo (predictable monthly spend)
- Peak demand is billed at cloud hourly rates
- Cross-connect is a fixed monthly fee
- Net effect: in most months you pay only the flat colo and cross-connect fees; in peak months you also pay for cloud burst usage
Pricing¶
- Cross-connect at standard rates
- Cloud resources at standard hourly / per-second rates
See Pricing.
Operate this service¶
Connect your colo deployment to Cloud Digit cloud services for burst capacity, DR, or hybrid architectures.
When this fits¶
- Steady-state on colo gear, peaks burst to cloud
- Colo as primary, cloud as DR
- Cloud-native services (managed DB, AI) alongside colo'd legacy gear
- Migration in progress (colo → cloud, gradual)
Topology¶
```
[Colo cage / racks]
        ↓ cross-connect
[Cloud Digit VPC]
        ↓
[Cloud services: VMs, K8s, managed DBs, etc.]
```
The cross-connect makes the colo network behave like another VPC subnet: it shares the VPC's private address plan, security groups, and routing.
IAM¶
Standard CD IAM applies — cloud-side resources governed by cd.* roles; colo-side by colo.* roles. Cross-cutting concerns (network policy, audit) integrate.
Network design¶
Colo network as a first-class VPC subnet:
- Allocate CIDR from your VPC supernet (e.g., 10.50.0.0/16 for colo, 10.0.0.0/16 for cloud)
- Routes between colo and cloud subnets are implicit in the VPC
- Security groups apply uniformly
- VPC Flow Logs capture all cross-domain traffic
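Before allocating address space, it can help to confirm that the colo and cloud blocks sit inside one supernet and do not overlap. The following is a minimal sketch using Python's standard `ipaddress` module. The function name `check_cidr_plan` and the `10.0.0.0/8` supernet are illustrative assumptions, not part of any Cloud Digit tooling.

```python
import ipaddress


def check_cidr_plan(supernet: str, blocks: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the plan is consistent."""
    parent = ipaddress.ip_network(supernet)
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    problems = []

    # Every block must be contained in the supernet.
    for name, net in nets.items():
        if not net.subnet_of(parent):
            problems.append(f"{name} ({net}) is outside supernet {parent}")

    # No two blocks may overlap.
    names = sorted(nets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if nets[a].overlaps(nets[b]):
                problems.append(f"{a} ({nets[a]}) overlaps {b} ({nets[b]})")
    return problems


# Example CIDRs from this page; the /8 supernet is a hypothetical choice.
plan = {"colo": "10.50.0.0/16", "cloud": "10.0.0.0/16"}
print(check_cidr_plan("10.0.0.0/8", plan))  # []
```

An empty list means both blocks fit the supernet and are disjoint, so routes between them cannot be ambiguous.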
Burst patterns¶
| Pattern | Use |
|---|---|
| Cloud-burst compute | Colo for baseline, cloud Auto Scaling for peaks |
| Cloud-burst storage | Colo for hot data, cloud S3 for cold/overflow |
| DR-burst | Colo primary, cloud cold standby ready to scale |
| Workload migration | Gradual move; eventually decom colo |
Cost¶
Cross-connect: monthly fee per fiber pair. Burst usage: standard cloud rates. The only hybrid-specific cost is the fixed cross-connect fee; cloud burst capacity is billed only while it runs.
Metrics¶
Cross-domain visibility:
| Metric | Notes |
|---|---|
| hybrid.colo_to_cloud.bytes_per_sec | Inter-domain traffic |
| hybrid.cloud_to_colo.bytes_per_sec | Reverse direction |
| hybrid.cross-connect.utilization_pct | Capacity planning |
| hybrid.burst.cloud_instances | Cloud capacity in use |
| hybrid.failover.last_drill_age_days | DR exercise cadence |
Burst trigger patterns¶
Auto-burst:
- Auto Scaling Groups in cloud configured with hybrid-burst-on-trigger
- Trigger: colo capacity threshold (e.g., > 85% sustained for 10 min)
- Cloud ASG scales out; LB distributes traffic across colo + cloud

Manual burst:
- Operator decides; pre-provisioned cloud capacity scaled up
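The "sustained above threshold" condition can be sketched as follows. This is an illustration of the idea only, not the platform's trigger implementation. The `Sample` type, the `should_burst` function, and the assumption that capacity samples arrive as (timestamp, percent) pairs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp_s: float   # seconds since epoch
    capacity_pct: float  # colo capacity in use, 0-100


def should_burst(samples, threshold_pct=85.0, window_s=600.0):
    """True if the most recent unbroken run above the threshold spans the window."""
    if not samples:
        return False
    ordered = sorted(samples, key=lambda s: s.timestamp_s)

    # Walk back from the newest sample until one is at or below the threshold.
    run_start = None
    for s in reversed(ordered):
        if s.capacity_pct <= threshold_pct:
            break
        run_start = s.timestamp_s

    if run_start is None:
        return False  # the newest sample is already below the threshold
    return ordered[-1].timestamp_s - run_start >= window_s


# One sample per minute for 10 minutes, all at 90%.
steady = [Sample(t, 90.0) for t in range(0, 601, 60)]
print(should_burst(steady))  # True
```

Requiring an unbroken run means a single dip below the threshold resets the timer, which avoids scaling out on short spikes.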
Workload portability¶
Design workloads to run identically in colo and cloud:
- Use container images (run anywhere)
- Avoid hardware dependencies (custom NICs, GPU-direct)
- Externalize state (managed DBs, S3)

If a workload is colo-only by design (e.g., it uses a colo'd FC SAN), it can't burst.
Cost optimization¶
Cloud burst is expensive per hour but cheap overall, because you pay only during peaks:
- For frequent peaks: increase the colo baseline
- For rare peaks: keep colo lean, burst more
- To find the right balance: run a 6-month cost analysis
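The trade-off above can be sketched as a break-even calculation. All prices and hour counts below are made-up illustrations, not Cloud Digit rates, and the model assumes one unit of extra colo capacity replaces one cloud instance.

```python
def break_even_hours(colo_unit_monthly: float, cloud_hourly: float) -> float:
    """Burst hours per month above which extra colo capacity is cheaper."""
    return colo_unit_monthly / cloud_hourly


def compare(burst_hours_by_month, colo_unit_monthly, cloud_hourly):
    """Compare total cost of bursting vs. adding one unit of colo baseline."""
    cloud_total = sum(burst_hours_by_month) * cloud_hourly
    colo_total = len(burst_hours_by_month) * colo_unit_monthly
    choice = "cloud burst" if cloud_total < colo_total else "colo baseline"
    return choice, cloud_total, colo_total


# Hypothetical figures: 300.0 per month per colo unit, 1.5 per cloud instance-hour.
print(break_even_hours(300.0, 1.5))  # 200.0

# Six months of observed burst hours for one instance-equivalent (hypothetical).
hours = [20, 35, 10, 250, 40, 15]
print(compare(hours, 300.0, 1.5))
```

With these numbers only one month exceeds the 200-hour break-even, so bursting stays cheaper over the six-month window.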
DR pattern¶
Colo + cloud DR:
1. Continuous replication (DBs, files, configs) to cloud
2. Cold standby cloud resources
3. Periodic DR drill (failover, validate, fail back)
4. Real failover triggered manually or by health-check
See DRaaS for orchestration.
Burst not triggering¶
An ASG is configured to burst on a colo capacity threshold but isn't scaling. Check:
- Whether the trigger metric is firing (cd colo metric --rack <id> --metric capacity_pct)
- Whether the ASG max instance limit has been reached
- Whether cross-connect capacity is insufficient (burst traffic + existing traffic = saturated)
- Whether cloud VPC capacity is available in the target region
Cross-connect saturated¶
hybrid.cross-connect.utilization_pct > 95%:
- Real burst happening (good; order more cross-connect capacity)
- Asymmetric traffic (one direction hot, other quiet): investigate
- Bug: cloud-burst egressing through cross-connect when cloud-cloud path is shorter
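To tell a real burst from asymmetric traffic, the two directional byte-rate metrics can be compared against link capacity. This is a rough sketch under stated assumptions: the link is full duplex, so each direction gets the full rated capacity, and the 4x asymmetry ratio is an arbitrary illustrative cutoff. How the platform derives `utilization_pct` is not documented here, and both function names are hypothetical.

```python
def direction_utilization(colo_to_cloud_bytes_per_sec,
                          cloud_to_colo_bytes_per_sec,
                          capacity_gbps):
    """Per-direction utilization in percent of a full-duplex link (assumption)."""
    capacity_bits = capacity_gbps * 1e9
    return {
        "colo_to_cloud_pct": colo_to_cloud_bytes_per_sec * 8 / capacity_bits * 100,
        "cloud_to_colo_pct": cloud_to_colo_bytes_per_sec * 8 / capacity_bits * 100,
    }


def diagnose(util, saturated_pct=95.0, asymmetry_ratio=4.0):
    """Return findings: 'saturated' and/or 'asymmetric'."""
    hot = max(util.values())
    quiet = min(util.values())
    findings = []
    if hot > saturated_pct:
        findings.append("saturated")
    # Asymmetric when one direction carries far more traffic than the other.
    if hot > 0 and (quiet == 0 or hot / quiet >= asymmetry_ratio):
        findings.append("asymmetric")
    return findings


# Hypothetical readings on a 10 Gbps cross-connect.
util = direction_utilization(1.2e9, 0.1e9, capacity_gbps=10)
print(diagnose(util))  # ['saturated', 'asymmetric']
```

A link that is saturated and asymmetric at the same time is worth investigating before ordering more capacity.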
Latency higher than expected¶
Expected cross-domain (colo ↔ cloud) latency:
- Same DC: < 1 ms
- Different DC (intra-BD): 5-15 ms

If you observe 30+ ms within the same DC, open a ticket; a routing issue is the likely cause.
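The ranges above can be turned into a simple check over measured round-trip times. This is a sketch only. Using the median of the samples, and the intermediate "investigate" result for values that are above the expected range but below the ticket threshold, are assumptions added for illustration.

```python
import statistics

# Upper bounds of the expected ranges stated on this page, in milliseconds.
EXPECTED_MAX_MS = {"same_dc": 1.0, "different_dc": 15.0}
TICKET_MS = 30.0  # same-DC latency at or above this warrants a ticket


def classify_latency(rtt_ms, placement):
    """Classify measured RTT samples as 'ok', 'investigate', or 'ticket'."""
    median = statistics.median(rtt_ms)
    if median <= EXPECTED_MAX_MS[placement]:
        return "ok"
    if placement == "same_dc" and median >= TICKET_MS:
        return "ticket"
    return "investigate"  # hypothetical middle band, not defined by this page


print(classify_latency([0.4, 0.6, 0.5], "same_dc"))     # ok
print(classify_latency([31.0, 35.0, 40.0], "same_dc"))  # ticket
```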
Workload doesn't work in cloud¶
Bursted to cloud, but the workload fails:
- Hardware dependency (FC, custom NIC)
- Stateful and assumes local storage
- License-tied to specific physical CPU
- Network ACLs allow colo but not cloud subnets
Fix at workload design layer; not all workloads are burst-ready.
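The ACL case is easy to test ahead of a burst: list the source subnets a service must accept and compare them with its allowlist. This is a minimal sketch using the standard `ipaddress` module, with the allowlist supplied as plain CIDR strings. The function name and sample CIDRs are illustrative, and exporting the ACL from your firewall is left out.

```python
import ipaddress


def uncovered_sources(allowed_cidrs, source_cidrs):
    """Return the source CIDRs not fully covered by a single allowlist entry.

    Limitation: a source covered only by the union of several entries
    is still reported as uncovered.
    """
    allowed = [ipaddress.ip_network(c) for c in allowed_cidrs]
    missing = []
    for cidr in source_cidrs:
        net = ipaddress.ip_network(cidr)
        if not any(net.subnet_of(a) for a in allowed):
            missing.append(cidr)
    return missing


# Hypothetical ACL that was written before the workload could burst.
acl = ["10.50.0.0/16"]
sources = ["10.50.0.0/16", "10.0.0.0/16"]  # colo and cloud subnets
print(uncovered_sources(acl, sources))  # ['10.0.0.0/16']
```

Any CIDR in the result is a subnet whose traffic the ACL would reject once instances start there.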
DR failover failed¶
DR drill exposed issues:
- Replication caught up but gap discovered (manual data not replicated)
- DNS cutover incomplete
- Cloud capacity insufficient (didn't pre-warm)
- Cross-cutting auth tokens region-locked
Document, fix, re-drill. The whole point of drills is to find these.
Bill spike from burst¶
Burst hours add up at cloud rates:
- Was the burst justified (real demand) or runaway?
- Verify that the ASG max instances cap is reasonable
- For a sustained higher baseline, increase colo capacity instead
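One way to judge whether the ASG cap is reasonable is to price the worst case: every instance up to the cap running for the whole burst window. The rates, durations, and budget below are hypothetical placeholders, and the function names are for illustration only.

```python
def worst_case_burst_cost(max_instances: int, hourly_rate: float, hours: float) -> float:
    """Cost if the ASG sits at its cap for the entire window."""
    return max_instances * hourly_rate * hours


def max_instances_for_budget(budget: float, hourly_rate: float, hours: float) -> int:
    """Largest cap whose worst-case cost stays within the budget."""
    return int(budget // (hourly_rate * hours))


# Hypothetical: cap of 40 instances, 0.5 per instance-hour, 72-hour event.
print(worst_case_burst_cost(40, 0.5, 72))         # 1440.0
print(max_instances_for_budget(1000.0, 0.5, 72))  # 27
```

If the worst-case figure is far above what the event is worth, lower the cap before the next burst.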
Hybrid network segmentation issues¶
Workloads in cloud can't reach colo'd services:
- Routes between subnets configured?
- Security groups allow cross-domain traffic?
- DNS resolution works (private hostnames need to resolve from both sides)?

```bash
cd network reachability test --from cloud-vm-01 --to colo-server-04
```