Secrets Management (OpenBao)¶
Service ownership
Owner: security-platform (security-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
A managed OpenBao deployment — the open-source HashiCorp Vault fork — for secrets storage, dynamic credentials, KMS, and PKI.
Why OpenBao¶
OpenBao is the Linux-Foundation-shepherded fork of HashiCorp Vault, post-license-change. It's open-source, API-compatible with Vault, and has the same operational model. We picked it because:
- Open governance under the Linux Foundation
- API-compatible with Vault — every Vault client and CLI works
- Sovereignty-friendly — no commercial-licence-tied closed components
What it does¶
| Engine | What |
|---|---|
| KV v2 | Static secrets, versioned |
| Database secrets engine | On-demand short-lived DB credentials |
| Transit (KMS) | Envelope encryption / decryption as a service |
| PKI | Certificate authority — issue / sign / revoke |
| SSH CA | Sign SSH host and user certificates |
| AWS / Azure / GCP | Issue short-lived cloud creds (for hybrid) |
| Kubernetes auth | K8s service-account authentication |
| JWT / OIDC | Federate with Cloud Digit IAM, Keycloak, etc. |
Topologies¶
- Shared, multi-tenant (default) — namespaced per-customer
- Dedicated cluster (Enterprise) — single-tenant OpenBao on dedicated VMs
High availability¶
3-node Raft-backed cluster across AZs in-region. Auto-unseal via Object Storage-stored sealing key (KMS-encrypted).
Audit¶
Every API call is written to two audit devices by default (file + syslog → SIEM). This is required for BB ICT 4.0 §10 (cryptography) audits.
Pricing¶
- Per-namespace-hour (shared)
- Per-cluster-hour (dedicated)
- Storage at the underlying class
See Pricing.
Related¶
- Managed Kubernetes — common consumer
- Managed PostgreSQL etc. — issue dynamic DB creds
- SIEM — receive audit logs
Operate this service¶
Managed OpenBao (HashiCorp Vault-compatible) for secrets, certificates, and dynamic credentials.
What it stores¶
- Application secrets (DB passwords, API keys)
- TLS certificates (with PKI engine)
- SSH keys
- Cloud-platform credentials (dynamic AWS-equivalent, etc.)
IAM model¶
Two layers:
1. Cloud Digit IAM controls access to the OpenBao cluster
2. OpenBao policies control access to secrets within it
| CD Role | Can do |
|---|---|
| `openbao.viewer` | List clusters and engines |
| `openbao.consumer` | Authenticate and use the cluster |
| `openbao.admin` | Manage cluster, configure engines, root policies |
OpenBao policies are HCL inside OpenBao itself.
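As a rough illustration of the second layer, a read-only policy for one KV v2 subtree could be written like this. The policy name `acme-prod-read` and the helper function are hypothetical; the `kv` mount and `acme/prod` path are taken from the read example on this page. KV v2 policies address the `data/` and `metadata/` API paths, not the shorter path used with `vault kv get`.

```shell
# Sketch: write a read-only policy for a KV v2 subtree.
# Assumes VAULT_ADDR and VAULT_TOKEN are set and the token may manage policies.
write_readonly_policy() {
  policy_name="$1"   # hypothetical, e.g. acme-prod-read
  mount="$2"         # KV v2 mount, e.g. kv
  subtree="$3"       # e.g. acme/prod

  # "-" tells the CLI to read the policy body from stdin.
  vault policy write "$policy_name" - <<EOF
path "${mount}/data/${subtree}/*" {
  capabilities = ["read"]
}

path "${mount}/metadata/${subtree}/*" {
  capabilities = ["list"]
}
EOF
}
```

Usage would look like `write_readonly_policy acme-prod-read kv acme/prod`, after which the policy can be attached to an AppRole or a Kubernetes auth role.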
Engine types¶
| Engine | Use |
|---|---|
| `kv-v2` | Versioned key-value secrets |
| `pki` | Issue X.509 certs |
| `transit` | Encryption-as-a-service (don't store, just encrypt) |
| `ssh` | Issue signed SSH certs |
| `database` | Issue dynamic DB credentials |
Pick the right engine; don't shoehorn certs into kv-v2 or vice versa.
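To make the "don't store, just encrypt" distinction concrete, here is a minimal sketch of a transit call. It assumes the engine is mounted at `transit` and that a key already exists; the key name used in the usage line is hypothetical.

```shell
# Sketch: encrypt a value with the transit engine; nothing is stored server-side.
# Assumes a transit mount at "transit" and an existing encryption key.
transit_encrypt() {
  key_name="$1"
  plaintext="$2"
  # Transit expects base64-encoded plaintext; strip any line wrapping.
  encoded=$(printf '%s' "$plaintext" | base64 | tr -d '\n')
  vault write -field=ciphertext "transit/encrypt/${key_name}" plaintext="$encoded"
}
```

For example, `transit_encrypt acme-app-key "card-number"` prints a ciphertext string that the application stores itself.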
Authentication methods¶
| Method | Use |
|---|---|
| AppRole | Apps in CI/CD (issue role-id + secret-id) |
| JWT/OIDC | Apps with identity tokens (K8s service accounts) |
| Cloud-platform | Apps already authenticated to Cloud Digit IAM |
| Token | Direct human use (short-lived) |
Prefer K8s/cloud-platform auth — no static credentials to manage.
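A Kubernetes login might look like the sketch below. It assumes the auth method is mounted at `auth/kubernetes`; the role name in the usage line is hypothetical, and the default token path is the standard in-pod service-account location.

```shell
# Sketch: log in with the pod's service-account token (Kubernetes auth).
# Assumes the auth method is mounted at auth/kubernetes and the role is
# bound to this pod's service account.
k8s_login() {
  role="$1"
  jwt_file="${2:-/var/run/secrets/kubernetes.io/serviceaccount/token}"
  vault write -field=token auth/kubernetes/login \
    role="$role" jwt="$(cat "$jwt_file")"
}
# Usage: export VAULT_TOKEN=$(k8s_login acme-api)
```

No role-id or secret-id has to be distributed; the pod's own identity token is the credential.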
Secret rotation¶
Most secrets should auto-rotate:
```hcl
# Database engine config
path "database/config/acme-prod-pg" {
  rotation_period = "24h"
}
```
Apps fetch fresh credentials each connection (or every TTL); no manual rotation needed.
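For an account that must keep a fixed username, the Vault-compatible database engine also offers static roles with a `rotation_period`. The sketch below is one way that could be set from the CLI; it assumes the `acme-prod-pg` connection already exists under the `database` mount, and the role and account names are hypothetical.

```shell
# Sketch: have the database engine rotate an existing DB account's password.
# Assumes the connection "acme-prod-pg" is configured under the "database" mount.
configure_static_rotation() {
  role_name="$1"      # hypothetical, e.g. acme-prod-pg-app
  db_username="$2"    # existing database account to manage
  period="${3:-24h}"  # rotation interval
  vault write "database/static-roles/${role_name}" \
    db_name=acme-prod-pg \
    username="$db_username" \
    rotation_period="$period"
}
# Read the current password with: vault read database/static-creds/<role_name>
```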
Audit¶
OpenBao audit logs every secret access. Stream to SIEM:
```bash
cd openbao audit enable --cluster acme-openbao --destination siem://acme-siem/openbao
```
Required for compliance — every secret access tracked.
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| `openbao.requests_per_sec` | matches load | |
| `openbao.latency_ms` p99 | < 50 | > 200 |
| `openbao.leases.active` | varies | sudden 10× jump (leaks) |
| `openbao.unseal_status` | unsealed | sealed (cluster down) |
| `openbao.audit.failures_24h` | 0 | > 0 |
| `openbao.tokens.expiring_24h` | varies | many expiring at once (renewal storm) |
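The latency thresholds can be turned into a small check, for example in a monitoring hook. This is only a sketch: the function name is hypothetical, and the `degraded` band between the two thresholds is an assumption, since the table defines only healthy (< 50) and alert (> 200).

```shell
# Sketch: classify p99 latency (integer milliseconds) against the table thresholds.
# "degraded" for values from 50 to 200 is an assumption, not a documented state.
classify_latency_p99() {
  ms="$1"
  if [ "$ms" -lt 50 ]; then
    echo healthy
  elif [ "$ms" -gt 200 ]; then
    echo alert
  else
    echo degraded
  fi
}
```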
Reading secrets in apps¶
```bash
# Authenticate via AppRole
export VAULT_TOKEN=$(vault write -field=token auth/approle/login \
  role_id="$ROLE_ID" secret_id="$SECRET_ID")
# Read
vault kv get -field=password kv/acme/prod/db
```
Apps usually use a sidecar / agent that handles auth + caching, not raw CLI calls.
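An agent setup could be generated along these lines. The sketch assumes the Vault-compatible agent configuration format (`auto_auth` with an AppRole method and a file sink); every file path in it is hypothetical, and the cluster address is supplied by the caller.

```shell
# Sketch: generate an agent config that logs in with AppRole and keeps a
# token file fresh. All file paths below are hypothetical.
write_agent_config() {
  addr="$1"       # cluster address, supplied by the caller
  out_file="$2"   # where to write the config
  cat > "$out_file" <<EOF
vault {
  address = "${addr}"
}

auto_auth {
  method "approle" {
    config = {
      role_id_file_path   = "/etc/openbao/role-id"
      secret_id_file_path = "/etc/openbao/secret-id"
    }
  }

  sink "file" {
    config = {
      path = "/run/openbao/token"
    }
  }
}
EOF
}
# Start the agent with: vault agent -config=<out_file>
```

The application then reads the token from the sink file and never handles the role-id or secret-id itself.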
Dynamic DB credentials¶
Apps request a fresh credential per connection (or per TTL):
```bash
vault read database/creds/acme-prod-pg-role
# Returns: username, password, valid for TTL
```
Credentials auto-revoke at TTL expiry. No long-lived DB passwords sitting in config.
Certificate issuance¶
Apps issue their own short-lived TLS certs from the PKI engine:
```bash
vault write pki/issue/acme-internal \
  common_name=internal-api.acme.local \
  ttl=24h
```
Auto-renew via the agent; mTLS-everywhere becomes operationally feasible.
Sealing for emergencies¶
In a breach scenario, sealing the cluster locks all secret access:
```bash
cd openbao cluster seal --cluster acme-openbao --reason "incident-2026-05-11"
# Requires multi-person approval
```
Unsealing requires the Shamir key-share holders to provide their shares.
Cluster operations¶
- Replication: enabled to peer region for DR
- Snapshots: hourly automated
- Restore: tested quarterly
```bash
cd openbao cluster snapshot --cluster acme-openbao
cd openbao cluster restore --snapshot <id> --target acme-openbao-recovery
```
Cluster sealed¶
openbao.unseal_status = sealed. Cluster restarted and lost the seal key.
```bash
cd openbao cluster unseal --cluster acme-openbao
# Prompts for Shamir key shares from key-holders
```
Multiple key-holders must each submit a share until quorum (default 3 of 5).
If a key-holder is unreachable, use the next available holder until quorum is reached. If quorum cannot be reached, open a ticket with SRE for emergency recovery (requires multi-party CD authorization).
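Seal state can also be confirmed from any client with the Vault-compatible CLI, which reports it through the exit code of `vault status`. The helper below is a sketch with a hypothetical name; it assumes `VAULT_ADDR` points at the cluster.

```shell
# Sketch: map the exit code of "vault status" to a seal state.
# The CLI exits 0 when unsealed, 2 when sealed, and 1 on error.
seal_state() {
  rc=0
  vault status >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo unsealed ;;
    2) echo sealed ;;
    *) echo error ;;
  esac
}
```

Running it before and after the shares are submitted gives a quick check that quorum was actually reached.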
"Permission denied" on secret read¶
| Cause | Check |
|---|---|
| Token expired | vault token lookup |
| Token's policy doesn't grant access | vault policy read <name> |
| Path doesn't exist | vault kv list kv/acme/prod |
| Path requires different engine type | vault read sys/mounts to find engine path |
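The policy check in particular can be shortened with `vault token capabilities`, which reports what the current token may do on a path. The wrapper below is a sketch with a hypothetical name; it assumes the CLI prints a comma-separated capability list such as `read, list`, or `deny`.

```shell
# Sketch: report whether the current token can read a given API path.
# For KV v2, pass the API path (with data/), e.g. kv/data/acme/prod/db.
can_read() {
  caps=$(vault token capabilities "$1") || return 2
  # Normalise "create, read" to "create,read" so matching is whitespace-safe.
  norm=$(printf '%s' "$caps" | tr -d ' ')
  case ",${norm}," in
    *",read,"*|*",root,"*) echo "allowed: ${caps}" ;;
    *) echo "denied: ${caps}"; return 1 ;;
  esac
}
```

A `denied` result on a path that exists points at the policy, not at the token's expiry.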
Dynamic credential creation slow¶
WARN: openbao.database.credential_create_latency > 5s
The DB engine is creating users on the target DB; this requires a connection to the DB. Slow when:
- Target DB under load
- DB-side CREATE USER permission missing (engine root creds insufficient)
- Network slow between OpenBao cluster and target DB
AppRole secret-id consumed unexpectedly¶
AppRole secret-ids can have limited use counts (secret_id_num_uses). Typical ways the count runs out early:
- A CI job retrying consumed multiple uses
- Multiple instances of the same app each using the secret-id
Fix:
- Use AppRole with secret_id_num_uses=0 (unlimited)
- Or, switch to K8s/cloud auth (no secret-id at all)
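The first fix might be applied as sketched below. It assumes AppRole is mounted at `auth/approle`; the function names and the role name in the usage line are hypothetical.

```shell
# Sketch: inspect and relax the use limit on an AppRole's secret-ids.
# Assumes AppRole is mounted at auth/approle.
show_secret_id_limit() {
  vault read -field=secret_id_num_uses "auth/approle/role/$1"
}

allow_unlimited_secret_id_uses() {
  # 0 means unlimited. Secret-ids issued earlier are expected to keep the
  # limit they were created with, so generate a new one afterwards.
  vault write "auth/approle/role/$1" secret_id_num_uses=0
}
```

For example, `show_secret_id_limit acme-ci` followed by `allow_unlimited_secret_id_uses acme-ci`. An unlimited-use secret-id is a longer-lived credential, so pairing it with a short `secret_id_ttl` is worth considering.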
Lease leak¶
openbao.leases.active climbing without corresponding traffic increase:
- App not releasing leases when done
- App not setting TTL appropriately
- A bug causing infinite credential requests
```bash
cd openbao leases list --filter expired=false --order created-desc
```
Manually revoke leaked leases:
```bash
vault lease revoke -prefix database/creds/acme-prod-pg-role/
```
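Before revoking by prefix, it can help to count how many leases actually sit under that prefix. The sketch below uses the Vault-compatible `sys/leases/lookup` listing; the function name is hypothetical, and the call needs a token with `sudo` capability on that path.

```shell
# Sketch: count active leases under a prefix to confirm a suspected leak.
# Listing sys/leases/lookup requires a token with sudo capability on that path.
count_leases() {
  prefix="$1"   # e.g. database/creds/acme-prod-pg-role
  # Table output starts with a "Keys" header and a "----" rule; skip both.
  n=$(vault list "sys/leases/lookup/${prefix}" | tail -n +3 | wc -l)
  echo $((n))
}
```

Comparing the count a few minutes apart shows whether leases are still accumulating after the application fix.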
Cross-region replication lag¶
For DR: replication lag should be < 5 min. If higher:
- Inter-region link saturated
- Burst of writes on primary
- Replication conflict (rare; needs SRE)
Audit log gaps¶
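The managed service keeps audit devices enabled, and OpenBao will not complete a request unless at least one enabled device records it. If none is configured or every device has failed, requests block (by design).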
```bash
cd openbao audit list --cluster acme-openbao
# Should have at least one device "enabled=true"
```
Enable a secondary device for resilience.