
Managed MySQL / MariaDB

Service ownership

Owner: data-platform (data-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11

Managed MySQL 8 and MariaDB 10/11 clusters with HA and replicas.

What it is

Managed MySQL or MariaDB. We provision and operate the engine; you own the schema and SQL. The operational model matches Managed PostgreSQL: HA via semi-synchronous or asynchronous replication, PITR via binlog archive, and multiple read replicas.

Versions

| Engine | Versions |
|---|---|
| MySQL | 8.0, 8.4 (LTS) |
| MariaDB | 10.6 (LTS), 10.11 (LTS), 11.4 (LTS) |

Topologies

| Topology | Use case |
|---|---|
| Single instance | Dev / non-prod |
| HA (primary + replica with auto-failover) | Production default |
| HA + read replicas | Read scaling |
| Cross-region replica | DR / geo-read |
| Group replication (MySQL InnoDB Cluster, MariaDB Galera) | Multi-writer (advanced) |

Compute & storage

Same flavor families as Managed PostgreSQL. Storage on Provisioned IOPS by default for OLTP; NVMe HCI option for cost-sensitive workloads.

Backup & PITR

  • Daily snapshots, retained 7 days default (configurable to 35)
  • Binlog archived continuously for PITR
  • Manual snapshots, cross-region copy

Maintenance & upgrades

Weekly 4-hour window for minor patches. Major upgrades opt-in.

Pricing

Compute + Provisioned-IOPS storage rates; see Pricing.

Operate this service

InnoDB-backed MySQL 8.x and MariaDB 11.x clusters.

Engine choice

| MySQL 8.x | MariaDB 11.x |
|---|---|
| GTID-based replication | GTID + multi-source replication |
| JSON functions, window functions | JSON, window, sequence engine |
| Default for most apps | Some Drupal/legacy stacks prefer it |

Topology

| Topology | RPO | RTO | Cost |
|---|---|---|---|
| Single instance | 24h (backup) | minutes | |
| Primary + semi-sync replica | < 1 s | < 30 s | 2.1× |
| Primary + 2 replicas (1 sync, 1 async) | < 1 s | < 30 s + read scale | 3×+ |

IAM

Same shape as PostgreSQL: viewer, connector, dba-operator, cluster-admin.

In-database: the platform provisions an admin role (acme_admin) with GRANT OPTION; you create app-scoped users from there. Root is never exposed.
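As an illustration of creating an app-scoped user from the admin role, the sketch below prints the SQL rather than running it. The function name `app_user_sql`, the user `acme_app`, the database `acme`, and the grant list are assumptions chosen for the example; adjust the privileges to what the application needs.

```shell
# Emit SQL that creates an app-scoped user with DML rights on one schema.
# Usage: app_user_sql <user> <database>
# The user and database names are placeholders. '<password>' is a literal
# marker to be replaced by your own secret handling.
app_user_sql() {
  user=$1
  db=$2
  cat <<SQL
CREATE USER '${user}'@'%' IDENTIFIED BY '<password>' REQUIRE SSL;
GRANT SELECT, INSERT, UPDATE, DELETE ON \`${db}\`.* TO '${user}'@'%';
SQL
}
```

The output can be reviewed and then piped into a `mysql` session opened as acme_admin. Printing first keeps the statements auditable before anything changes.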

Parameter groups

cd db mysql param-group create \
  --name acme-prod \
  --params innodb_buffer_pool_size=12G,max_connections=500,innodb_log_file_size=1G

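Set innodb_buffer_pool_size to roughly 70% of instance RAM. A larger innodb_log_file_size improves write performance but lengthens crash recovery. On MySQL 8.0.30 and later, innodb_redo_log_capacity supersedes innodb_log_file_size, which is deprecated.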
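The 70% guideline can be worked through with a small calculation. This is a sketch only: the function name `buffer_pool_mb` is hypothetical, and rounding down to a multiple of 128 MB assumes the default InnoDB buffer pool chunk size.

```shell
# Suggest innodb_buffer_pool_size (in MB) as ~70% of instance RAM,
# rounded down to a multiple of 128 MB (assumed default chunk size).
# Usage: buffer_pool_mb <ram_mb>
buffer_pool_mb() {
  ram_mb=$1
  target=$(( ram_mb * 70 / 100 ))
  echo $(( target / 128 * 128 ))
}
```

For a 16 GB instance, `buffer_pool_mb 16384` prints 11392, which would be passed as `innodb_buffer_pool_size=11392M`.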

Backups & PITR

  • Continuous binary-log archival → PITR within retention
  • Nightly full backup via xtrabackup (no lock for InnoDB)
  • Default retention 7 days; bump per workload

SSL/TLS

Required by default for client connections:

mysql --ssl-mode=REQUIRED --ssl-ca=cd-ca.pem -h cluster-acme.bd-dha-1 -u acme -p

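TLS can be disabled per cluster via the parameter group (not recommended). Note that --ssl-mode=REQUIRED encrypts the connection but does not validate the server certificate; use --ssl-mode=VERIFY_CA or VERIFY_IDENTITY with the MySQL client to check it against cd-ca.pem.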

Metrics

| Metric | Healthy | Alert |
|---|---|---|
| mysql.connections.threads_connected | < 80% of max | > 90% |
| mysql.replication.seconds_behind_master | < 1 s | > 5 s |
| mysql.innodb.buffer_pool.hit_pct | > 99% | < 95% |
| mysql.innodb.row_lock_waits | varies | spike |
| mysql.innodb.deadlocks_per_min | 0 | > 0 |
| mysql.slow_queries_per_min | varies | spike from baseline |
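To make the connection thresholds concrete, the sketch below classifies a threads_connected reading against max_connections. The function name `conn_status` is hypothetical, and the "warn" band between 80% and 90% is an assumption, since the table defines only the healthy and alert ends.

```shell
# Classify connection usage: below 80% of max_connections is healthy,
# above 90% is alert, and anything in between is reported as warn
# (the warn band is an assumption, not a platform-defined state).
# Usage: conn_status <threads_connected> <max_connections>
conn_status() {
  pct=$(( $1 * 100 / $2 ))
  if [ "$pct" -gt 90 ]; then
    echo "alert ${pct}%"
  elif [ "$pct" -lt 80 ]; then
    echo "healthy ${pct}%"
  else
    echo "warn ${pct}%"
  fi
}
```

With max_connections=500, 460 connected threads gives `alert 92%` and 300 gives `healthy 60%`.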

Failover

cd db mysql failover --cluster acme-prod

Promotes the semi-sync replica. RTO is < 30 s. Apps reconnect through the cluster endpoint, which redirects to the new primary automatically.
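Because connections drop during the promotion, clients need to retry rather than fail on the first error. The sketch below shows one generic way to do that in a script, using exponential backoff. The function name `retry` and the `RETRY_BASE_DELAY` variable are hypothetical, not part of the platform tooling.

```shell
# Retry a command until it succeeds or max attempts is reached.
# The delay doubles after each failure, starting at RETRY_BASE_DELAY
# seconds (default 1): 1s, 2s, 4s, ...
# Usage: retry <max_attempts> <command> [args...]
retry() {
  max_attempts=$1
  shift
  attempt=1
  delay=${RETRY_BASE_DELAY:-1}
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

For example, `retry 6 mysql --ssl-mode=REQUIRED -h cluster-acme.bd-dha-1 -u acme -p -e "SELECT 1"` waits up to about 31 seconds in total, which covers the stated RTO. Application drivers usually offer an equivalent reconnect setting.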

Slow query analysis

Enable slow log:

cd db mysql param set --cluster acme-prod --slow_query_log=1 --long_query_time=1

Stream the log to an S3 bucket and analyze it with mysqldumpslow or pt-query-digest. Add indexes for the top offenders.
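Before running a full digest, a quick triage of a downloaded log can show how many slow statements were recorded and how bad the worst one was. The sketch below assumes the standard slow log format, where each entry has a `# Query_time:` header line. The function name `slowlog_summary` is hypothetical.

```shell
# Summarize a MySQL slow query log read from stdin: number of entries
# and the largest Query_time value (seconds).
# Usage: slowlog_summary < slow.log
slowlog_summary() {
  awk '
    /^# Query_time:/ {
      n++
      if ($3 + 0 > max) max = $3 + 0
    }
    END { printf "count=%d max_query_time=%.3f\n", n, max }
  '
}
```

This is only a first look. mysqldumpslow and pt-query-digest group statements by fingerprint, which is what identifies the top offenders.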

Schema migrations

On large tables (roughly > 1 GB), a direct ALTER TABLE that rebuilds the table can hold locks and stall queries for a long time. Use pt-online-schema-change or gh-ost (recommended):

gh-ost \
  --host=cluster-acme.bd-dha-1 \
  --database=acme \
  --table=orders \
  --alter="ADD COLUMN customer_segment VARCHAR(32)" \
  --execute

Both tools copy rows into a shadow table and swap it in at the end, so downtime is near zero. Without --execute, gh-ost performs a dry run only.

Read replicas

cd db mysql replica create --cluster acme-prod --az bd-dha-1-az3

Async; lag typically < 1 s. Route read-only queries via a separate connection string. Apps must understand the consistency tradeoff.

Major version upgrades

cd db mysql upgrade --cluster acme-prod --target-version 8.4 --window <ts>

In-place for minor versions; logical-replication-based for majors (low downtime, same as Postgres upgrade flow).

Too many connections

ERROR 1040 (08004): Too many connections

  • Put a connection pooler in front (ProxySQL recommended) so apps connect to the pooler, not directly to the cluster
  • Drop idle connections: KILL <id> for sessions idle > 1h
  • Raise max_connections (each connection costs RAM)
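To find the sessions worth killing, the sketch below prints a query that generates one KILL statement per connection idle longer than a threshold. The function name `idle_kill_sql` is hypothetical. It reads information_schema.processlist, which exists in both MySQL and MariaDB (MySQL also offers performance_schema.processlist).

```shell
# Emit SQL that lists KILL statements for sessions idle longer than
# the given number of seconds (default 3600, i.e. 1 hour).
# Usage: idle_kill_sql [idle_seconds]
idle_kill_sql() {
  threshold=${1:-3600}
  case $threshold in
    ''|*[!0-9]*) echo "idle_seconds must be a whole number" >&2; return 1 ;;
  esac
  cat <<SQL
SELECT CONCAT('KILL ', id, ';') AS kill_stmt
FROM information_schema.processlist
WHERE command = 'Sleep'
  AND time > ${threshold}
  AND user NOT IN ('system user', 'event_scheduler');
SQL
}
```

Review the generated statements before executing them; the query only proposes candidates and does not kill anything itself.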

Replication broken

mysql.replication.seconds_behind_master = NULL:

The replica IO/SQL thread stopped. Check:

SHOW REPLICA STATUS\G
-- Look at Last_Errno and Last_Error
-- (SHOW SLAVE STATUS is the older spelling; it was removed in MySQL 8.4)

Common causes:

  • A row missing on the replica or a primary-key conflict, usually from a manual change made directly on the replica. Skip the event or reinitialize the replica.
  • Schema drift: the replica's table definition doesn't match the primary
  • Storage full on the replica

cd db mysql replica restart --cluster acme-prod --replica <id>
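When checking many replicas, it helps to reduce the vertical status output (SHOW REPLICA STATUS\G, or SHOW SLAVE STATUS\G on older servers) to the fields that matter. The sketch below is one way to do that. The function name `replica_health` is hypothetical, and it accepts both the Replica_* field names used by recent MySQL and the Slave_* names still printed by MariaDB.

```shell
# Extract IO/SQL thread state and Last_Errno from vertical (\G)
# replica status output read on stdin.
# Usage: mysql ... -e 'SHOW REPLICA STATUS\G' | replica_health
replica_health() {
  awk -F': ' '
    /^ *(Replica|Slave)_IO_Running:/  { io = $2 }
    /^ *(Replica|Slave)_SQL_Running:/ { sql = $2 }
    /^ *Last_Errno:/                  { errno = $2 }
    END { printf "io=%s sql=%s last_errno=%s\n", io, sql, errno }
  '
}
```

A stopped SQL thread with a non-zero Last_Errno points to an apply error such as a missing row or duplicate key, while a stopped IO thread points to connectivity or storage problems.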

Long-running ALTER blocks queries

The straightforward fix: don't run ALTER directly on busy tables. Use gh-ost or pt-online-schema-change. If you must run ALTER directly:

  • Run during off-hours
  • Kill the query if the queue builds; the platform has a query timeout configurable per parameter group

InnoDB deadlocks

cd db mysql innodb status --cluster acme-prod | grep -A 20 "LATEST DETECTED DEADLOCK"

Most common: app code that updates rows in different orders across transactions. Fix at the app — consistent lock ordering, smaller transactions, or SELECT ... FOR UPDATE with ordered iteration.
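Consistent lock ordering can be sketched as follows: whatever order the application receives row IDs in, it sorts them before locking, so two transactions touching the same rows always acquire locks in the same sequence. The function name `ordered_lock_sql` and the `id` column are assumptions for illustration; the `orders` table name is borrowed from the gh-ost example.

```shell
# Emit a transaction skeleton that locks the given rows in ascending
# id order, regardless of the order the ids were passed in.
# Usage: ordered_lock_sql <id> [<id> ...]
ordered_lock_sql() {
  for id in "$@"; do
    case $id in
      ''|*[!0-9]*) echo "non-numeric id: $id" >&2; return 1 ;;
    esac
  done
  echo "START TRANSACTION;"
  for id in $(printf '%s\n' "$@" | sort -n -u); do
    echo "SELECT id FROM orders WHERE id = ${id} FOR UPDATE;"
  done
  echo "-- apply the updates here, then:"
  echo "COMMIT;"
}
```

Calling `ordered_lock_sql 42 7 19` locks rows 7, 19, then 42. The same sorting step belongs in application code wherever a transaction updates several rows.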

Buffer pool hit ratio < 95%

Cold cache after restart is normal — should climb within hours.

If the ratio stays low:

  • Working set > innodb_buffer_pool_size. Resize the cluster or shrink the working set.
  • New query pattern scanning large tables. Add indexes.
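To cross-check the metric by hand, the hit ratio is conventionally derived from two InnoDB status counters: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that went to disk). The sketch below applies that formula. Whether the platform computes mysql.innodb.buffer_pool.hit_pct exactly this way is an assumption, and the function name `buffer_pool_hit_pct` is hypothetical.

```shell
# Compute buffer pool hit ratio as 100 * (1 - disk_reads / read_requests).
# Usage: buffer_pool_hit_pct <read_requests> <disk_reads>
buffer_pool_hit_pct() {
  awk -v req="$1" -v disk="$2" 'BEGIN {
    if (req == 0) { print "n/a"; exit }
    printf "%.2f\n", 100 * (1 - disk / req)
  }'
}
```

With 1,000,000 read requests and 5,000 disk reads the result is 99.50. The counters are cumulative since startup, so compare deltas between two samples when judging recent behavior.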

Slow query log filling disk

The slow query log can grow quickly during bursts of bad queries. Rotate it:

cd db mysql slowlog rotate --cluster acme-prod

Better: stream to S3 instead of local disk via the slow-log shipper.

Replication lag during backup

Backups can cause I/O contention on the replica, which shows up as replication lag. Schedule backups during low-traffic windows, or take them from a dedicated backup replica.