VM flavors¶
Service ownership
Owner: compute-platform (compute-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
Three flavor families. Naming convention: `<family>-<vcpu>x<ram-gib>` — e.g., `std-4x16` is a Standard flavor with 4 vCPU and 16 GiB RAM.
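The naming convention is mechanical enough to parse in a script. A minimal sketch in plain shell — the variable names are illustrative, not part of the platform:

```bash
# Parse a flavor name into family, vCPU count, and RAM (GiB).
# Illustrative helper — not part of the platform CLI.
flavor="std-4x16"
family=${flavor%%-*}   # everything before the first "-"  -> std
spec=${flavor#*-}      # everything after it              -> 4x16
vcpu=${spec%x*}        # before the "x"                   -> 4
ram_gib=${spec#*x}     # after the "x"                    -> 16
echo "$family: $vcpu vCPU, $ram_gib GiB"
```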
Standard (std-*) — 1:4 vCPU:RAM ratio¶
The default for general-purpose workloads. Web/app servers, K8s workers, build agents.
| Flavor | vCPU | RAM (GiB) | Boot disk | Net (Gbps) |
|---|---|---|---|---|
| std-1x4 | 1 | 4 | 20 GiB | 1 |
| std-2x8 | 2 | 8 | 20 GiB | 2 |
| std-4x16 | 4 | 16 | 30 GiB | 5 |
| std-8x32 | 8 | 32 | 50 GiB | 10 |
| std-16x64 | 16 | 64 | 50 GiB | 15 |
| std-32x128 | 32 | 128 | 80 GiB | 25 |
| std-48x192 | 48 | 192 | 100 GiB | 25 |
Memory-optimized (mem-*) — 1:8 vCPU:RAM ratio¶
For caches, JVM heaps, in-memory analytics, large Postgres workers, Redis, OpenSearch.
| Flavor | vCPU | RAM (GiB) | Boot disk | Net (Gbps) |
|---|---|---|---|---|
| mem-2x16 | 2 | 16 | 30 GiB | 2 |
| mem-4x32 | 4 | 32 | 50 GiB | 5 |
| mem-8x64 | 8 | 64 | 50 GiB | 10 |
| mem-16x128 | 16 | 128 | 80 GiB | 15 |
| mem-32x256 | 32 | 256 | 100 GiB | 25 |
| mem-48x384 | 48 | 384 | 100 GiB | 25 |
| mem-64x512 | 64 | 512 | 100 GiB | 25 |
CPU-optimized (cpu-*) — 1:2 vCPU:RAM ratio¶
For HPC, encoding, batch jobs, build farms — workloads that lean on the CPU and don't need a lot of RAM per core.
| Flavor | vCPU | RAM (GiB) | Boot disk | Net (Gbps) |
|---|---|---|---|---|
| cpu-2x4 | 2 | 4 | 30 GiB | 2 |
| cpu-4x8 | 4 | 8 | 50 GiB | 5 |
| cpu-8x16 | 8 | 16 | 50 GiB | 10 |
| cpu-16x32 | 16 | 32 | 80 GiB | 15 |
| cpu-32x64 | 32 | 64 | 100 GiB | 25 |
| cpu-64x128 | 64 | 128 | 100 GiB | 25 |
| cpu-96x192 | 96 | 192 | 100 GiB | 25 |
Resize¶
You can resize between flavors of the same family with a reboot. Cross-family resize (e.g., std-4x16 → mem-4x32) is also supported but requires a stop → resize → start cycle.
vNUMA¶
VMs with ≥ 16 vCPU expose a vNUMA topology to the guest. For latency-sensitive workloads, pin processes to NUMA nodes from inside the guest (e.g., with `numactl`).
Operate this service¶
Choosing and governing flavors across an organization. Flavors look simple but drive both performance and ~60% of the BDT bill.
Picking the right family¶
A 30-second decision tree:
| Workload | Family | Why |
|---|---|---|
| Web/app, K8s worker, mixed | std-* | 1:4 is the everyman ratio |
| Postgres/MySQL > 32 GiB shared_buffers | mem-* | Avoid wasting cores you'll never use |
| Redis, OpenSearch JVM heap | mem-* | Same |
| Build farm, video transcode, HPC, compression | cpu-* | Saturates cores; RAM mostly idle |
| Bastion / jump-box | std-1x4 | Minimum viable |
If you're not sure, start std-* and migrate after 7 days of metrics. Cross-family resize is supported (with a reboot).
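If you want the decision tree in automation, it is small enough to inline. A sketch — the workload labels are made up for illustration:

```bash
# Map a coarse workload label to a flavor family — labels are illustrative.
pick_family() {
  case "$1" in
    web|k8s|mixed)        echo std ;;
    db|cache|jvm)         echo mem ;;
    hpc|transcode|build)  echo cpu ;;
    *)                    echo std ;;  # when unsure, start std-* and measure
  esac
}
pick_family db
```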
Right-sizing policy¶
Quarterly, review these metrics per VM:
| Metric | Threshold | Action |
|---|---|---|
| cpu.busy p95 | < 20% for 30 days | Resize down one step |
| cpu.busy p95 | > 75% for 14 days | Resize up, or move to a scaling group |
| mem.committed p95 | > 85% of allocation | Resize up (mem family if 1:4 was wrong) |
| mem.committed p95 | < 40% for 30 days | Cross-family to cpu-* or smaller std-* |
The console Cost Explorer → Right-sizing report does this automatically and proposes a BDT savings figure.
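Outside the console, the same thresholds are easy to apply to an exported metrics file. A sketch assuming a hypothetical CSV of `vm,cpu_busy_p95` rows:

```bash
# Flag VMs below the 20% p95 CPU threshold for a downsize.
# The CSV format and VM names are hypothetical.
flagged=$(awk -F, '$2 < 20 { print $1 }' <<'EOF'
web-01,12.5
db-01,64.0
batch-07,8.1
EOF
)
echo "resize down: $flagged"
```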
Flavor allowlists¶
For larger orgs: restrict which flavors each project can launch. Useful when:
- A small team keeps creating `std-48x192` for "future use"
- You want prod-only access to `cpu-96x192`
Define in Project → Policies → Compute → Allowed flavors, or via API:
```bash
cd project policy set acme-prod \
  --allowed-flavors 'std-*,mem-*,cpu-{2..16}x*'
```
Wildcards are glob-style. Denied flavors return FlavorNotAllowed at create time with the policy name in the error.
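Because the patterns are glob-style, you can sanity-check a flavor against a pattern locally before hitting the API. A sketch using shell `case` globbing — the helper is illustrative:

```bash
# Return success if a flavor matches a single glob pattern.
# Illustrative client-side check; plain `case` globbing handles * and ?
# but not brace ranges like {2..16}, which the policy engine expands.
matches() { case "$1" in $2) return 0 ;; *) return 1 ;; esac; }

matches std-4x16   'std-*' && echo "std-4x16 allowed"
matches cpu-96x192 'std-*' || echo "cpu-96x192 denied"
```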
Commitment plans¶
Commitments are flavor-family-agnostic — you commit vCPU-hours, GiB-RAM-hours, and net-Gbps-hours separately. The platform applies the discount to whichever VMs are running.
| Term | Discount range | Withdraw penalty |
|---|---|---|
| 1-year | 20–30% | 50% of remaining |
| 3-year | 35–55% | 70% of remaining |
Only commit your baseline — the always-on floor. Above-baseline goes on-demand or into scaling groups.
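To see what a commitment is worth, the arithmetic is simple. A sketch with illustrative numbers — the BDT rate and the 25% discount are assumptions for the example, not published prices:

```bash
# 64 committed vCPU at a hypothetical 2.00 BDT/vCPU-h, ~730 h/month, 25% discount.
read -r full saved <<<"$(awk 'BEGIN {
  full = 2.00 * 64 * 730            # monthly on-demand cost
  printf "%.2f %.2f", full, full * 0.25
}')"
echo "on-demand ${full} BDT/month, commitment saves ${saved} BDT/month"
```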
vNUMA and pinning¶
For VMs ≥ 16 vCPU, the guest sees a vNUMA topology that mirrors the hypervisor socket layout. To take advantage:
```bash
# Inside the guest — pin a latency-sensitive process to NUMA node 0
numactl --cpunodebind=0 --membind=0 ./trading-engine
```
The platform does not allow custom vcpupin from the API — that would break live migration. Latency-sensitive workloads should run on Dedicated Hosts with numa=strict.
Run this service¶
Running flavor changes safely: planning the reboot, validating performance afterwards, and tracking commitment utilization.
Resize procedure¶
Same-family (no stop):
```bash
cd compute vm resize --vm web-01 --flavor std-4x16
# VM: Running → Resizing → Rebooting → Running (~45 s)
```
Cross-family (stop required):
```bash
cd compute vm stop --vm db-01
cd compute vm resize --vm db-01 --flavor mem-8x64
cd compute vm start --vm db-01
# Total downtime: ~90 s typical
```
The Console UX hides the stop step, but the API requires it explicitly — automation should `wait --for stopped` before issuing the resize.
Validation checklist after a resize¶
- VM lifecycle is `Running` in console and via `cd compute vm show`.
- Inside the guest:
  ```bash
  nproc    # should match new vCPU count
  free -h  # total RAM matches
  ```
- For mem-family targets: verify your app's heap/buffer config has been updated — Postgres `shared_buffers`, JVM `-Xmx`, etc. The OS sees the new RAM but apps usually don't unless configured.
- Metrics flow — confirm `cpu.busy` and `mem.used` are still emitting in console.
- Network bandwidth — bigger flavors have higher Gbps caps; if you're network-bound this is where you'll see the win.
Live migration and flavor changes¶
The platform may live-migrate a VM during hypervisor maintenance. Same-flavor target host is guaranteed — your VM never lands on a host that can't satisfy it. If no compatible target is available in the AZ, the platform schedules the maintenance for off-hours and notifies the project owner 72 h in advance.
Commitment plan utilization¶
Track via console Financial → Commitments:
| Field | What it means |
|---|---|
| Committed vCPU-h | What you're paying for, per hour |
| Used vCPU-h | What you're actually consuming |
| Utilization % | Used ÷ Committed × 100 |
| Wasted BDT (MTD) | The discount you'd have gotten if you matched usage |
Target utilization > 90%. Below 70% sustained means you over-committed — wait for renewal and recommit lower.
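Utilization % is just used ÷ committed, so you can recompute it from an export to cross-check the console. A sketch with illustrative numbers:

```bash
# 920 used vCPU-h against 1000 committed — numbers are illustrative.
util=$(awk 'BEGIN { printf "%.1f", 920 / 1000 * 100 }')
echo "utilization: ${util}%"
```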
Per-second billing edge cases¶
- VMs stopped via API stop billing for vCPU/RAM immediately. Volumes keep billing — that's separate.
- `Reboot` does not stop billing.
- Resize via stop/resize/start doesn't reset the 60-second billing minimum if you restart within the same second-bucket.
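Per-second billing means a partial hour is prorated. A sketch of the arithmetic with an assumed hourly rate — not a published price:

```bash
# A VM that ran 2 h 30 m 10 s at a hypothetical 8.00 BDT/h.
bill=$(awk 'BEGIN {
  secs = 2*3600 + 30*60 + 10
  printf "%.4f", 8.00 * secs / 3600
}')
echo "${bill} BDT"
```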
Capacity planning¶
Each AZ publishes a public capacity gauge in the status page: green / yellow / red per flavor family. Treat yellow as "spread your fleet across AZs"; red as "don't launch large fleets there this week — open a capacity reservation request instead."
Capacity reservations: paid placeholder reservations that guarantee N flavors of family X in AZ Y for a window. Use for known events (Eid traffic, fiscal-year close).
Troubleshoot this service¶
Resize failures, performance regressions, and quota surprises.
FlavorNotAllowed¶
Error at VM create / resize:
ERROR: FlavorNotAllowed: flavor 'cpu-96x192' is not in the project policy 'acme-prod'
The project has a flavor allowlist. Either:
- Use an allowed flavor (`cd project policy show <project>` to list them)
- Or, if you're the project admin, amend the policy
InsufficientCapacity¶
The chosen flavor isn't available in that AZ right now:
| Action | When to use |
|---|---|
| Retry in another AZ (bd-dha-1-az2 etc.) | Quickest; works most of the time |
| Step down one size (std-32x128 → std-16x64) | When the workload fits |
| Open a Capacity Reservation | For known fleet expansion |
| Open a Support ticket | Sustained red on the status page |
Resize succeeded but app didn't grow¶
After resizing to a mem-* flavor, your app is still using the old memory footprint. Common cases:
- Postgres `shared_buffers` is set in `postgresql.conf` — it doesn't auto-scale. Edit, restart.
- JVM `-Xmx` is hardcoded in a systemd unit / launch script. Edit, restart.
- Redis `maxmemory` config directive needs updating.
Verify the OS sees the RAM (`free -h`), then check the app's runtime config — not the kernel side.
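The kernel-side check can be scripted for post-resize validation. A minimal sketch reading `/proc/meminfo` — the expected size is an assumption for the example:

```bash
# Compare kernel-visible RAM against what the new flavor should provide.
expected_gib=64   # e.g., after resizing to mem-8x64 — illustrative target
total_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
total_gib=$(( total_kib / 1024 / 1024 ))
echo "kernel sees ${total_gib} GiB (expected ~${expected_gib})"
```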
Performance regression after resize¶
| Symptom | Likely cause | Check |
|---|---|---|
| Higher latency on same load | Hypervisor with different CPU SKU | cd compute vm show --vm <id> --json \| jq .placement.cpu_model — compare before/after |
| Lower IOPS on same volume | Volume queue-depth not scaled with vCPU | Inside guest: echo 256 > /sys/block/vda/queue/nr_requests |
| Network throughput plateaus at old cap | App's connection pool too small | Increase pool size, retest |
vNUMA misconfiguration¶
Symptom: a 32-vCPU VM uses only 16 cores effectively, or memory access is 2× slower than expected.
```bash
# Check the vNUMA topology the guest sees
numactl --hardware
# Should show as many nodes as the platform exposed (usually 2 for ≥ 16 vCPU)
```
If you see only 1 node on a ≥ 16 vCPU VM: the guest is ignoring the SLIT/SRAT tables. Common with old kernels (RHEL 7, Ubuntu 18.04). Upgrade, or use NUMA-naive flavors (≤ 8 vCPU).
Quota errors on resize¶
ERROR: QuotaExceeded: project vcpu quota 200, current 196, requested +8
Resize counts the delta against the project quota. Free another VM (or shrink one) or request a quota bump. The error message includes the delta and current usage.
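The numbers in the error are enough to compute the shortfall before retrying. A sketch using the values from the message above:

```bash
# quota 200, current 196, requested +8 — taken from the error message.
quota=200 current=196 requested=8
shortfall=$(( current + requested - quota ))
if (( shortfall > 0 )); then
  echo "need ${shortfall} more vCPU: free/shrink a VM or request a quota bump"
fi
```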
Commitment plan "wasted BDT"¶
If Wasted BDT (MTD) is climbing month-over-month: you're under-utilizing the commitment. Options:
- Wait for renewal, recommit lower
- Migrate eligible workloads from on-demand to the committed family
- Sell the commitment back in a 50–70% penalty exit (rarely worth it)
The platform does not auto-extend commitments. Set a calendar reminder 30 days before expiry.