Block Storage (Provisioned IOPS, NetApp)¶
Service ownership¶
Owner: storage-platform (storage-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
NetApp ONTAP-backed block storage with guaranteed IOPS — pick your number, and we deliver it. The right tier for OLTP databases and other latency-sensitive transactional workloads.
What it is¶
NetApp AFF (All-Flash FAS) systems running ONTAP, presented as iSCSI / NVMe-TCP block volumes to your VMs. Decoupled from compute (separate NetApp HA pair), so you can scale storage independently of the VM fleet.
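From inside a guest you can confirm how a volume is presented (virtio, iSCSI, or NVMe-TCP) with standard Linux tooling; this sketch assumes util-linux and nvme-cli are installed in the VM:

```bash
# List attached block devices with their transport type.
lsblk -o NAME,SIZE,TYPE,TRAN

# For NVMe-TCP volumes, nvme-cli shows controllers and subsystems.
nvme list
nvme list-subsys
```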
Performance model¶
You provision IOPS: pick a target number, and we guarantee it. There are two knobs, volume size and provisioned IOPS, and size caps how many IOPS you can provision:
| Volume size | Max provisionable IOPS | Throughput cap |
|---|---|---|
| 100 GiB | 30,000 | 500 MB/s |
| 500 GiB | 100,000 | 2 GB/s |
| 2 TiB | 250,000 | 4 GB/s |
| 8 TiB | 500,000 | 8 GB/s |
| 16 TiB | 500,000 | 8 GB/s |
Latency target: sub-millisecond at provisioned IOPS, with NetApp HA pair survivability.
When to pick this over NVMe HCI¶
- OLTP databases (PostgreSQL, MySQL, Oracle, SQL Server)
- Workloads that need predictable tail latency, not just average
- ESG / FI workloads under audit where storage performance is contractually defined
Features¶
- ONTAP snapshots (separate from cloud-level snapshots; both available)
- AES-256 at rest, cluster-managed keys; BYOK on roadmap
- Multi-attach (read-write) for clustered filesystems
- Replication to a second NetApp pair (in-region or cross-region)
Pricing¶
Per GiB-month + per provisioned-IOPS-month, billed in BDT. Higher than NVMe HCI but with hard performance guarantees. See Pricing.
Limits¶
- Max 16 TiB per volume
- Up to 8 multi-attach VMs per volume
- Per-region NetApp capacity is finite — quota requests for very large footprints (>100 TiB) get a sizing review
Related¶
- Block Storage (NVMe HCI) — default, hyper-converged
- Managed PostgreSQL — uses Provisioned IOPS by default
- Backup-as-a-Service
Operate this service¶
PIOPS is HCI's predictable sibling: you pay for guaranteed IOPS, not best-effort.
When to choose PIOPS over HCI¶
| Workload | Pick |
|---|---|
| OLTP DB with strict latency SLO | PIOPS |
| Boot disk for typical VM | HCI |
| Logging / time-series ingest hot tier | PIOPS |
| Build artefact cache | HCI |
| Anything where you can quote IOPS budget | PIOPS |
| Anything where you'd shrug at noisy-neighbour | HCI |
IOPS classes¶
| Class | Provisioned IOPS | Use case |
|---|---|---|
| piops-1000 | 1,000 | Small Postgres / MySQL |
| piops-3000 | 3,000 | Mid OLTP |
| piops-10000 | 10,000 | Large OLTP, hot indices |
| piops-30000 | 30,000 | High-throughput time-series |
| piops-100000 | 100,000 | Stadium-scale (rare; talk to SE) |
Volume size is independent of class — you can have a piops-30000 100 GiB volume.
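Creating a volume therefore takes both knobs. The reclass and replicate commands later on this page use the `cd storage volume` CLI; a create call would plausibly look like the sketch below, though the create subcommand and its flags are illustrative, not a confirmed interface:

```bash
# Illustrative only: subcommand and flag names are assumptions.
# Size and class are set independently; a small volume can carry a high class.
cd storage volume create \
  --name db-data \
  --size 100GiB \
  --class piops-30000
```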
IAM¶
Same role model as HCI (block-nvme-administration). PIOPS has a separate quota (default 100,000 total IOPS per project).
Tagging and chargeback¶
The premium over HCI is significant (~3–5×). Tag every PIOPS volume with the business owner so the Finance portal can produce a chargeback report at month close.
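Assuming tags are set through the same volume CLI (the tag subcommand and the key names here are illustrative), the chargeback tagging might look like:

```bash
# Illustrative only: subcommand and tag keys are assumptions.
cd storage volume tag --volume db-data \
  --set business-owner=payments-team \
  --set cost-centre=CC-4102
```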
Capacity reservation¶
Unlike HCI, very-high-class PIOPS volumes (piops-30000+) may need a reservation — open a Support ticket 5+ BWD before provisioning a fleet.
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| disk.iops.used | < 80% of provisioned | sustained > 95% |
| disk.iops.throttled_per_s | 0 | > 0 |
| disk.read_latency_ms p99 | < 3 ms | > 8 ms for 10 min |
| disk.write_latency_ms p99 | < 5 ms | > 10 ms for 10 min |

`disk.iops.throttled_per_s` > 0 means the workload exceeded what you provisioned. Either upgrade the class or shape the workload.
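If the platform exposes these metrics through a query CLI, a quick pre-reclass check could look like the sketch below; the metrics query subcommand and its flags are assumptions for illustration:

```bash
# Illustrative only: the metrics query subcommand is an assumption.
cd metrics query \
  --metric disk.iops.throttled_per_s \
  --filter volume=db-data \
  --range 1h
```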
Right-sizing¶
Quarterly review:
| Pattern (30-day) | Action |
|---|---|
| disk.iops.used p95 < 40% of provisioned | Downgrade one class |
| disk.iops.used p95 > 85% of provisioned | Upgrade one class |
| Brief spikes only | Stay; spikes don't justify upgrade |
Class change (no downtime)¶
```bash
cd storage volume reclass --volume db-data --to piops-10000
# IO continues throughout; billing changes at completion
```
Class downgrades are subject to a 24h commit lock-in (prevents thrash).
Snapshot strategy¶
PIOPS volumes typically back critical DBs — snapshot more aggressively than HCI:
- Hourly snapshots, 7-day retention
- Daily application-consistent snapshots (require qemu-guest-agent)
- BaaS daily for cross-region durability
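A schedule matching this policy might be configured as below; the snapshot schedule subcommand and its flags are illustrative, not a confirmed CLI:

```bash
# Illustrative only: schedule subcommand and flags are assumptions.
cd storage snapshot schedule create --volume db-data --interval 1h --retain 7d

# Application-consistent daily snapshot; requires qemu-guest-agent in the guest.
cd storage snapshot schedule create --volume db-data --interval 24h --app-consistent
```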
Replication for DR¶
PIOPS supports synchronous replication to a peer volume in another AZ (no cross-region sync):
```bash
cd storage volume replicate \
  --primary vol-db-prod \
  --replica-az bd-dha-1-az3
```
Adds ~1.3× cost, ~1 ms write-latency overhead. Worth it for the strict-SLA tier.
Throttling despite provisioned class¶
`disk.iops.throttled_per_s` > 0 even though `disk.iops.used` is below provisioned:
- Queue depth mismatch — Linux block-layer queue too shallow: `echo 256 > /sys/block/vdb/queue/nr_requests` (persisting this is sketched after the list)
- Bursty workload pattern — provisioning is sustained; brief spikes can throttle. Spread the workload or upgrade.
- Block size mismatch — PIOPS is sized for 16 KiB ops. Workloads emitting many small (4 KiB) ops can hit the IOPS cap faster than expected. Tune the application's IO unit.
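To apply the queue-depth fix and keep it across reboots, assuming the volume shows up as a virtio disk such as /dev/vdb:

```bash
# Raise the block-layer queue depth immediately (resets on reboot).
echo 256 > /sys/block/vdb/queue/nr_requests

# Persist it with a udev rule that matches virtio disks.
cat <<'EOF' > /etc/udev/rules.d/60-piops-queue.rules
ACTION=="add|change", KERNEL=="vd[a-z]", ATTR{queue/nr_requests}="256"
EOF
udevadm control --reload-rules && udevadm trigger
```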
Latency higher than spec¶
Spec: < 3 ms read p99. If you see 8+ ms:
| Symptom | Likely cause |
|---|---|
| All ops slow | Volume reclass in progress |
| Read slow, write fine | Page cache cold; warm-up needed |
| Write slow, read fine | Sync replica enabled but partner AZ degraded |
| Spikes only on snapshot creation | Application-consistent freeze hangs |
`cd storage volume diagnose --volume <id>` returns recent latency events.
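To measure p99 latency yourself before escalating, fio can run a non-destructive, read-only check; this assumes fio is installed and the volume is attached as /dev/vdb (adjust to your device):

```bash
# Read-only random-read test at the 16 KiB IO size PIOPS is sized for.
fio --name=p99-read --filename=/dev/vdb --direct=1 --ioengine=libaio \
    --rw=randread --bs=16k --iodepth=32 --time_based --runtime=60 \
    --percentile_list=99
```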
Reclass stuck¶
INFO: Reclass in progress: 67% (~10 min remaining)
Reclass moves data online; throughput depends on volume size and current load. >60 min on a < 500 GiB volume is suspicious — open a Support ticket.
Sync replica out of sync¶
WARN: Replica vol-db-prod-replica is 4.3s behind primary
Sync replication should never lag. If it does:
- Check `disk.iops.used` on the replica side — if maxed, the replica is undersized; upgrade it
- Check inter-AZ link health
- If sustained: temporarily promote to async, fix root cause, then re-sync
`cd storage volume replica status --primary vol-db-prod` shows current lag.
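If sustained lag does force a temporary drop to async (the last item above), the mode switch might look like this sketch; the set-mode subcommand is illustrative, not a confirmed CLI:

```bash
# Illustrative only: the set-mode subcommand is an assumption.
cd storage volume replica set-mode --primary vol-db-prod --mode async
# ...fix the root cause (replica class, inter-AZ link), then re-enable sync...
cd storage volume replica set-mode --primary vol-db-prod --mode sync
```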
Quota exceeded on create¶
ERROR: QuotaExceeded: project piops-iops 100000, current 95000, requested +10000
PIOPS has a separate IOPS-class quota. Request a bump or downgrade an existing volume.
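Before filing the bump, check where the existing IOPS budget is going; the quota inspection subcommand here is an assumption for illustration:

```bash
# Illustrative only: quota inspection subcommand is an assumption.
cd quota show --project <project> --resource piops-iops
```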