Block Storage (Provisioned IOPS, NetApp)¶
Service ownership¶
Owner: storage-platform (storage-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
NetApp ONTAP-backed block storage with guaranteed IOPS — pick your number, and we deliver it. The right tier for OLTP databases and other latency-sensitive transactional workloads.
What it is¶
NetApp AFF (All-Flash FAS) systems running ONTAP, presented as iSCSI / NVMe-TCP block volumes to your VMs. Decoupled from compute (separate NetApp HA pair), so you can scale storage independently of the VM fleet.
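From inside a guest you can confirm how a volume is presented (virtio, iSCSI, or NVMe-TCP) with standard Linux tooling; this sketch assumes util-linux and nvme-cli are installed in the VM:

```bash
# List attached block devices with their transport type.
lsblk -o NAME,SIZE,TYPE,TRAN

# For NVMe-TCP volumes, nvme-cli shows controllers and subsystems.
nvme list
nvme list-subsys
```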
Performance model¶
You provision IOPS: pick a target number, and we guarantee it. There are two knobs, volume size and provisioned IOPS, and size caps how many IOPS you can provision:
| Volume size | Max provisionable IOPS | Throughput cap |
|---|---|---|
| 100 GiB | 30,000 | 500 MB/s |
| 500 GiB | 100,000 | 2 GB/s |
| 2 TiB | 250,000 | 4 GB/s |
| 8 TiB | 500,000 | 8 GB/s |
| 16 TiB | 500,000 | 8 GB/s |
Latency target: sub-millisecond at provisioned IOPS, with NetApp HA pair survivability.
When to pick this over NVMe HCI¶
- OLTP databases (PostgreSQL, MySQL, Oracle, SQL Server)
- Workloads that need predictable tail latency, not just average
- ESG / FI workloads under audit where storage performance is contractually defined
Features¶
- ONTAP snapshots (separate from cloud-level snapshots; both available)
- AES-256 at rest, cluster-managed keys; BYOK on roadmap
- Multi-attach (read-write) for clustered filesystems
- Replication to a second NetApp pair (in-region or cross-region)
Pricing¶
Per GiB-month + per provisioned-IOPS-month, billed in BDT. Higher than NVMe HCI but with hard performance guarantees. See Pricing.
Limits¶
- Max 16 TiB per volume
- Up to 8 multi-attach VMs per volume
- Per-region NetApp capacity is finite — quota requests for very large footprints (>100 TiB) get a sizing review
Related¶
- Block Storage (NVMe HCI) — default, hyper-converged
- Managed PostgreSQL — uses Provisioned IOPS by default
- Backup-as-a-Service
Operate this service¶
PIOPS is HCI's predictable sibling: you pay for guaranteed IOPS, not best-effort.
When to choose PIOPS over HCI¶
| Workload | Pick |
|---|---|
| OLTP DB with strict latency SLO | PIOPS |
| Boot disk for typical VM | HCI |
| Logging / time-series ingest hot tier | PIOPS |
| Build artefact cache | HCI |
| Anything where you can quote IOPS budget | PIOPS |
| Anything where you'd shrug at noisy-neighbour | HCI |
IOPS classes¶
| Class | Provisioned IOPS | Use case |
|---|---|---|
| piops-1000 | 1,000 | Small Postgres / MySQL |
| piops-3000 | 3,000 | Mid OLTP |
| piops-10000 | 10,000 | Large OLTP, hot indices |
| piops-30000 | 30,000 | High-throughput time-series |
| piops-100000 | 100,000 | Stadium-scale (rare; talk to SE) |
Volume size is independent of class — you can have a piops-30000 100 GiB volume.
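Creating a volume therefore takes both knobs. The reclass and replicate commands later on this page use the `cd storage volume` CLI; a create call would plausibly look like the sketch below, though the create subcommand and its flags are illustrative, not a confirmed interface:

```bash
# Illustrative only: subcommand and flag names are assumptions.
# Size and class are set independently; a small volume can carry a high class.
cd storage volume create \
  --name db-data \
  --size 100GiB \
  --class piops-30000
```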
IAM¶
Same role model as HCI (block-nvme-administration). PIOPS has a separate quota (default 100,000 total IOPS per project).
Tagging and chargeback¶
The premium over HCI is significant (~3–5×). Tag every PIOPS volume with the business owner so the Finance portal can produce a chargeback report at month close.
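Assuming tags are set through the same volume CLI (the tag subcommand and the key names here are illustrative), the chargeback tagging might look like:

```bash
# Illustrative only: subcommand and tag keys are assumptions.
cd storage volume tag --volume db-data \
  --set business-owner=payments-team \
  --set cost-centre=CC-4102
```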
Capacity reservation¶
Unlike HCI, very-high-class PIOPS volumes (piops-30000+) may need a reservation — open a Support ticket 5+ BWD before provisioning a fleet.
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| disk.iops.used | < 80% of provisioned | sustained > 95% |
| disk.iops.throttled_per_s | 0 | > 0 |
| disk.read_latency_ms p99 | < 3 ms | > 8 ms for 10 min |
| disk.write_latency_ms p99 | < 5 ms | > 10 ms for 10 min |

`disk.iops.throttled_per_s` > 0 means the workload exceeded what you provisioned. Either upgrade the class or shape the workload.
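If the platform exposes these metrics through a query CLI, a quick pre-reclass check could look like the sketch below; the metrics query subcommand and its flags are assumptions for illustration:

```bash
# Illustrative only: the metrics query subcommand is an assumption.
cd metrics query \
  --metric disk.iops.throttled_per_s \
  --filter volume=db-data \
  --range 1h
```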
Right-sizing¶
Quarterly review:
| Pattern (30-day) | Action |
|---|---|
| disk.iops.used p95 < 40% of provisioned | Downgrade one class |
| disk.iops.used p95 > 85% of provisioned | Upgrade one class |
| Brief spikes only | Stay; spikes don't justify upgrade |
Class change (no downtime)¶
```bash
cd storage volume reclass --volume db-data --to piops-10000
# IO continues throughout; billing changes at completion
```
Class downgrades are subject to a 24h commit lock-in (prevents thrash).
Snapshot strategy¶
PIOPS volumes typically back critical DBs — snapshot more aggressively than HCI:
- Hourly snapshots, 7-day retention
- Daily application-consistent snapshots (require qemu-guest-agent)
- BaaS daily for cross-region durability
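A schedule matching this policy might be configured as below; the snapshot schedule subcommand and its flags are illustrative, not a confirmed CLI:

```bash
# Illustrative only: schedule subcommand and flags are assumptions.
cd storage snapshot schedule create --volume db-data --interval 1h --retain 7d

# Application-consistent daily snapshot; requires qemu-guest-agent in the guest.
cd storage snapshot schedule create --volume db-data --interval 24h --app-consistent
```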
Replication for DR¶
PIOPS supports synchronous replication to a peer volume in another AZ (no cross-region sync):
```bash
cd storage volume replicate \
  --primary vol-db-prod \
  --replica-az bd-dha-1-az3
```
Adds ~1.3× cost, ~1 ms write-latency overhead. Worth it for the strict-SLA tier.
Throttling despite provisioned class¶
`disk.iops.throttled_per_s` > 0 even though `disk.iops.used` is below provisioned:
- Queue depth mismatch — Linux block-layer queue too shallow: `echo 256 > /sys/block/vdb/queue/nr_requests` (persisting this is sketched after the list)
- Bursty workload pattern — provisioning is sustained; brief spikes can throttle. Spread the workload or upgrade.
- Block size mismatch — PIOPS is sized for 16 KiB ops. Workloads emitting many small (4 KiB) ops can hit the IOPS cap faster than expected. Tune the application's IO unit.
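To apply the queue-depth fix and keep it across reboots, assuming the volume shows up as a virtio disk such as /dev/vdb:

```bash
# Raise the block-layer queue depth immediately (resets on reboot).
echo 256 > /sys/block/vdb/queue/nr_requests

# Persist it with a udev rule that matches virtio disks.
cat <<'EOF' > /etc/udev/rules.d/60-piops-queue.rules
ACTION=="add|change", KERNEL=="vd[a-z]", ATTR{queue/nr_requests}="256"
EOF
udevadm control --reload-rules && udevadm trigger
```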
Latency higher than spec¶
Spec: < 3 ms read p99. If you see 8+ ms:
| Symptom | Likely cause |
|---|---|
| All ops slow | Volume reclass in progress |
| Read slow, write fine | Page cache cold; warm-up needed |
| Write slow, read fine | Sync replica enabled but partner AZ degraded |
| Spikes only on snapshot creation | Application-consistent freeze hangs |
`cd storage volume diagnose --volume <id>` returns recent latency events.
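To measure p99 latency yourself before escalating, fio can run a non-destructive, read-only check; this assumes fio is installed and the volume is attached as /dev/vdb (adjust to your device):

```bash
# Read-only random-read test at the 16 KiB IO size PIOPS is sized for.
fio --name=p99-read --filename=/dev/vdb --direct=1 --ioengine=libaio \
    --rw=randread --bs=16k --iodepth=32 --time_based --runtime=60 \
    --percentile_list=99
```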
Reclass stuck¶
INFO: Reclass in progress: 67% (~10 min remaining)
Reclass moves data online; throughput depends on volume size and current load. >60 min on a < 500 GiB volume is suspicious — open a Support ticket.
Sync replica out of sync¶
WARN: Replica vol-db-prod-replica is 4.3s behind primary
Sync replication should never lag. If it does:
- Check `disk.iops.used` on the replica side — if maxed, the replica is undersized; upgrade it
- Check inter-AZ link health
- If sustained: temporarily promote to async, fix root cause, then re-sync
`cd storage volume replica status --primary vol-db-prod` shows current lag.
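If sustained lag does force a temporary drop to async (the last item above), the mode switch might look like this sketch; the set-mode subcommand is illustrative, not a confirmed CLI:

```bash
# Illustrative only: the set-mode subcommand is an assumption.
cd storage volume replica set-mode --primary vol-db-prod --mode async
# ...fix the root cause (replica class, inter-AZ link), then re-enable sync...
cd storage volume replica set-mode --primary vol-db-prod --mode sync
```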
Quota exceeded on create¶
ERROR: QuotaExceeded: project piops-iops 100000, current 95000, requested +10000
PIOPS has a separate IOPS-class quota. Request a bump or downgrade an existing volume.
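Before filing the bump, check where the existing IOPS budget is going; the quota inspection subcommand here is an assumption for illustration:

```bash
# Illustrative only: quota inspection subcommand is an assumption.
cd quota show --project <project> --resource piops-iops
```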