Database Migration Service¶
Service ownership
Owner: data-platform (data-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
Online and offline migrations from on-prem or other clouds into Cloud Digit managed databases — with logical replication for near-zero-downtime cutovers.
What it is¶
A managed orchestration layer that does the heavy lifting of database migration:
- Snapshot the source
- Restore into the target
- Stream changes (logical replication or CDC) until catch-up
- Cut over
Supported source/target combinations span PostgreSQL (logical replication), MySQL (binlog), MongoDB (change streams), and Oracle → PostgreSQL (heterogeneous).
Source/target matrix¶
| Source | Target | Mode |
|---|---|---|
| PostgreSQL | Managed PostgreSQL | Online (logical replication) |
| MySQL / MariaDB | Managed MySQL | Online (binlog) |
| MongoDB | Managed MongoDB | Online (change streams) |
| Oracle | Managed PostgreSQL | Heterogeneous (with schema conversion) |
| Self-managed Postgres on VM | Managed PostgreSQL | Online |
Workflow¶
```mermaid
graph LR
    Source[Source DB<br/>on-prem / other cloud]
    DMS[DMS instance]
    Target[Cloud Digit managed DB]
    Source -->|snapshot + CDC| DMS
    DMS --> Target
    Source -. lag delta .- Target
```
Network connectivity¶
DMS instances are launched into your VPC; they need IP reachability to the source. Common patterns:
- VPN site-to-site — fastest to set up, ~minutes
- BDIX peering or Direct Connect — for very large initial loads
- Public over TLS — when the source has a public endpoint
Schema conversion (Oracle → PostgreSQL)¶
Schema-conversion tooling reports incompatibilities (PL/SQL constructs, types, sequences) before migration. Conversion follows AWS SCT-equivalent rules; complex PL/SQL still needs human review.
Pricing¶
Per-DMS-instance-hour during migration. No charge for the data — only for the orchestration. See Pricing.
Operate this service¶
DMS replicates data between heterogeneous and homogeneous sources/targets — useful for cloud migrations, version upgrades, and continuous replication for DR.
Supported pairs¶
| Source | Target |
|---|---|
| PostgreSQL (11+) | Cloud Digit Managed PostgreSQL |
| MySQL (5.7+) / MariaDB (10.3+) | Cloud Digit Managed MySQL/MariaDB |
| MongoDB (3.6+) | Cloud Digit Managed MongoDB |
| Oracle (11g+) | Managed PostgreSQL (with conversion) |
| SQL Server (2014+) | Managed MySQL or PostgreSQL |
| Cloud Digit DBs | Another region (DR replication) |
Conversion (Oracle → Postgres, SQL Server → MySQL) handles the schema and most syntax; stored procedures still need manual review.
IAM¶
| Role | Can do |
|---|---|
| dms.viewer | List replication tasks, view progress |
| dms.operator | Start / pause / resume replication tasks |
| dms.builder | Create replication tasks and endpoints |
| dms.admin | Above + delete endpoints, manage CDC config |
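If your organization manages IAM through the same CLI, granting one of these roles might look like the sketch below. The cd iam subcommand, its flags, and the member/scope values are assumptions for illustration, not documented DMS surface.

```bash
# Hypothetical sketch: grant the dms.operator role to an on-call engineer.
# "cd iam grant" and every flag here are assumed names, not documented ones.
cd iam grant \
  --role dms.operator \
  --member user:oncall@acme.example \
  --scope project/prod-migrations
```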
Endpoint configuration¶
Source and target endpoints store credentials in Cloud Digit Secrets Manager (no plain-text DSNs in DMS config):
```bash
cd dms endpoint create \
  --name onprem-pg \
  --engine postgres \
  --host 192.168.10.5 --port 5432 \
  --database prod \
  --credential-secret openbao://acme/onprem-pg-rw
```
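A matching target endpoint is created with the same command. In the sketch below, the hostname and secret path are placeholders for your managed instance; the flags are the ones shown above.

```bash
# Target endpoint for the managed PostgreSQL instance. The host and the
# secret path are placeholders -- substitute your own values.
cd dms endpoint create \
  --name managed-pg-prod \
  --engine postgres \
  --host managed-pg-prod.internal.clouddigit.ai --port 5432 \
  --database prod \
  --credential-secret openbao://acme/managed-pg-prod-rw
```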
Migration phases¶
A full DMS migration:
- Full load — bulk copy current data
- CDC — Change Data Capture, which replicates ongoing changes
- Cutover — apps switch from source to target; CDC catches the final delta
Tasks can run all phases or any subset: full load only, CDC only, or full load + CDC.
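For example, a rehearsal bulk copy into a staging target could run the full-load phase on its own. In the sketch below, the full-load type value and the managed-pg-staging target are assumptions; only full-load-and-cdc and cdc-only appear in the examples later on this page.

```bash
# Rehearsal copy: full load only, no CDC, no cutover. The "full-load" type
# value and the staging target name are assumed for illustration.
cd dms task create \
  --name rehearsal-copy \
  --source onprem-pg \
  --target managed-pg-staging \
  --type full-load \
  --tables 'public.*' \
  --start now
```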
Network¶
Source must be reachable from Cloud Digit:
- Public-IP source: the source must allow Cloud Digit egress IPs
- VPN / Direct Connect source: routable from the DMS workload subnet
- Cloud Digit-to-Cloud Digit: automatic
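For a public-IP Postgres source, allowing the DMS egress range might look like the following sketch. The CIDR, the dms_user role, and the config path are placeholders; use the egress IPs published for your region and your distro's paths.

```bash
# Placeholder egress range -- replace with the egress IPs published for
# your region in the Cloud Digit console.
DMS_EGRESS_CIDR="203.0.113.0/28"

# Let the (placeholder) dms_user connect over TLS from that range.
echo "hostssl prod dms_user ${DMS_EGRESS_CIDR} scram-sha-256" | \
  sudo tee -a /etc/postgresql/15/main/pg_hba.conf
sudo systemctl reload postgresql

# Open the Postgres port on the host firewall for the same range.
sudo ufw allow from "${DMS_EGRESS_CIDR}" to any port 5432 proto tcp
```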
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| dms.task.state | running | error, failed |
| dms.full_load.percent_complete | rising | stuck |
| dms.cdc.lag_seconds | < 5 s | > 30 s |
| dms.cdc.events_per_sec | matches source rate | < 50% of source rate (lag building) |
| dms.errors_24h | 0 | > 0 |
Pre-migration assessment¶
Before the real cutover, do a dry-run:
```bash
cd dms assess --source onprem-pg --target managed-pg-prod
```
Reports:
- Schema objects (tables, views, sequences, functions)
- Datatype compatibility issues
- Estimated full-load duration
- Required source modifications (e.g., wal_level = logical for CDC)
Full load + CDC¶
```bash
cd dms task create \
  --name onprem-to-cloud \
  --source onprem-pg \
  --target managed-pg-prod \
  --type full-load-and-cdc \
  --tables 'public.*' \
  --start now
```
Full load uses parallel workers per table (configurable). For 1 TB+ databases, expect hours.
Cutover ceremony¶
- Quiesce app writes (read-only mode, or scheduled maintenance)
- Wait for dms.cdc.lag_seconds = 0
- Final consistency check: row counts, checksum sample
- Stop the task: cd dms task stop --task <id>
- Repoint app to new database
- Resume traffic
Total downtime typically 5–10 minutes for an organized cutover.
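A minimal cutover helper, assuming a hypothetical cd dms task status subcommand for reading the lag metric (only task stop and validate appear elsewhere on this page):

```bash
#!/usr/bin/env bash
set -euo pipefail
TASK_ID="$1"

# 1. App writes should already be quiesced before running this.

# 2. Wait for CDC to drain. "cd dms task status --metric ..." is an assumed
#    subcommand; adapt it to however you read dms.cdc.lag_seconds.
until [ "$(cd dms task status --task "$TASK_ID" --metric dms.cdc.lag_seconds)" = "0" ]; do
  sleep 5
done

# 3. Spot-check consistency on a busy table before committing to the cutover.
cd dms validate --task "$TASK_ID" --tables 'public.orders' --sample 10000

# 4. Stop replication; repointing the app and resuming traffic stay manual.
cd dms task stop --task "$TASK_ID"
```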
Continuous replication for DR¶
Don't stop the task post-cutover — use CDC for ongoing replication to a DR target:
```bash
cd dms task create --type cdc-only --source managed-pg-prod-dha --target managed-pg-prod-ctg
```
dms.cdc.lag_seconds becomes the RPO metric.
Endpoint test fails¶
```bash
cd dms endpoint test --endpoint onprem-pg
```
| Symptom | Likely cause |
|---|---|
| Connection refused | Network / SG / firewall |
| Authentication failed | Wrong credential in Secrets Manager |
| must be superuser to access | DMS user needs replication permissions |
| WAL level must be logical | Source Postgres not configured for CDC |
For Postgres CDC: wal_level = logical, max_replication_slots ≥ 10, max_wal_senders ≥ 10. Restart required.
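On a self-managed Linux source, applying those settings could look like this sketch (the postgresql service name and psql invocation vary by distro and version):

```bash
# Set the logical-decoding prerequisites on the source.
sudo -u postgres psql -c "ALTER SYSTEM SET wal_level = 'logical';"
sudo -u postgres psql -c "ALTER SYSTEM SET max_replication_slots = 10;"
sudo -u postgres psql -c "ALTER SYSTEM SET max_wal_senders = 10;"

# All three are postmaster-level parameters, so a full restart is needed.
sudo systemctl restart postgresql

# Confirm the new value took effect.
sudo -u postgres psql -c "SHOW wal_level;"
```

After the restart, re-run the endpoint test above to confirm the CDC prerequisites are satisfied.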
Full load slow¶
Tune parallelism:
```bash
cd dms task update --task <id> --parallel-workers 8
```
Bottleneck identification:
- Source disk read saturated → can't go faster
- Target write saturated → resize target
- Network plane → check inter-region link
- Single huge table → table-level parallelism with WHERE slicing
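To check whether a single huge table dominates the load, a standard size query against the source catalog is enough; a sketch, with dms_user and the connection details as placeholders:

```bash
# Largest tables on the source -- if one table dominates, slice it with
# WHERE ranges instead of relying on per-table parallelism alone.
psql -h 192.168.10.5 -U dms_user -d prod -c "
  SELECT c.relname,
         pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
  FROM   pg_class c
  JOIN   pg_namespace n ON n.oid = c.relnamespace
  WHERE  c.relkind = 'r' AND n.nspname = 'public'
  ORDER  BY pg_total_relation_size(c.oid) DESC
  LIMIT  10;"
```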
CDC lag climbing¶
dms.cdc.lag_seconds > 30 and rising:
- Source generating more events than target can apply
- A specific large transaction stuck (DDL on a huge table)
- Replication slot conflict (Postgres) — another consumer reading the WAL
Pause writes briefly to let CDC catch up; investigate; resume.
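On a Postgres source, one catalog query shows whether another consumer holds a slot or whether a slot has stalled; a sketch:

```bash
# Inspect replication slots on the source: an unexpected second slot, or a
# slot retaining a lot of WAL, points at the conflict / stuck-transaction case.
sudo -u postgres psql -c "
  SELECT slot_name, plugin, active, active_pid,
         pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
  FROM   pg_replication_slots;"
```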
"Replication slot was removed"¶
The Postgres replication slot was lost during a primary failover or a manual cleanup. CDC cannot resume from a missing slot. To recover:
- Re-create the slot on the source
- Re-bootstrap the task (full load again, then CDC)
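Re-creating the slot on the source is one call. In the sketch below, the slot name and the pgoutput plugin are assumptions; match whatever the original task created (visible in pg_replication_slots before the loss).

```bash
# Slot name and output plugin are assumed -- use the ones the DMS task
# originally created.
sudo -u postgres psql -d prod -c \
  "SELECT pg_create_logical_replication_slot('dms_onprem_to_cloud', 'pgoutput');"
```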
Data drift after cutover¶
Source and target counts don't match, or specific rows differ:
```bash
cd dms validate --task <id> --tables 'public.orders' --sample 10000
```
Common causes:
- Trigger on source modified data after the task started; trigger fired on target too (double-apply)
- Sequence values not migrated; new inserts on target reused IDs
- Default values differed between Oracle and Postgres conversion
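The sequence case has a standard fix on the target; a sketch for one table, with the host, role, table, and column names as placeholders:

```bash
# Re-sync a sequence with the migrated data so new inserts stop reusing IDs.
# Host, user, table, and column are placeholders.
psql -h managed-pg-prod.internal.clouddigit.ai -U app_user -d prod -c \
  "SELECT setval(pg_get_serial_sequence('public.orders', 'id'),
                 (SELECT COALESCE(MAX(id), 1) FROM public.orders));"
```

Re-run cd dms validate afterwards to confirm the drift is resolved.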
Schema conversion gaps¶
After Oracle → Postgres conversion, things that often need manual review:
- Stored procedures with Oracle-specific syntax
- Materialized views
- Indexes with function expressions
- Trigger logic
DMS reports unresolved issues in the assessment; address them before cutover.