Managed SIEM

Service ownership

Owner: security-platform (security-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11

Centralized log analytics and threat detection on top of Managed OpenSearch — sovereign-resident, with retention you can defend in audit.

What it is

A SIEM service that:

  1. Collects events from Cloud Digit services (audit logs, VPC flow logs, WAF logs, LB access logs, K8s audit)
  2. Collects from your workloads (syslog, agents, application logs)
  3. Stores in OpenSearch with configurable retention tiers
  4. Runs detections (Sigma rule-compatible) and raises alerts

Sources, out of the box

| Source | Mode |
| --- | --- |
| Cloud Digit account audit log | Native (no setup) |
| VPC Flow Logs | Native |
| WAF logs | Native |
| Load Balancer access logs | Native |
| Managed Kubernetes audit | Native |
| Managed databases slow / error logs | Native |
| Custom: syslog, OTel, Fluent Bit | Agents shipped |

Detection content

  • Sigma-rule library — open-source detection content, kept current
  • Cloud Digit-authored rules — for misuse of platform APIs (mass-snapshot-export, etc.)
  • Customer-authored rules — your own Sigma / DSL rules, version-controlled
  • Threat-intel feeds — community + private (Enterprise tier)

Retention tiers

| Tier | Storage | Search latency | Use case |
| --- | --- | --- | --- |
| Hot | Provisioned IOPS | ms | Last 7–30 days, active hunt |
| Warm | NVMe HCI | 100s of ms | 30–365 days |
| Cold | Object Archive | seconds (rehydrate) | > 1 year, compliance |

Tier transition happens on a per-index lifecycle policy you set.

Alerting

  • Out to email / webhook / Slack / Microsoft Teams / your ITSM (PagerDuty, Opsgenie, ServiceNow)
  • Alert grouping, deduplication, suppression windows
  • On-call runbooks attached to alerts

Pricing

  • Ingest — per GiB-day (low; we don't gouge ingest like off-shore vendors)
  • Storage — at the tier rate (Provisioned IOPS / NVMe / Archive)
  • Detection content — Sigma is included; premium feeds are an add-on

See Pricing.

Operate this service

Security Information & Event Management — log aggregation, correlation, and alerting across Cloud Digit + your apps.

Architecture

Sources → Ingestion → Storage (hot/warm/cold) → Correlation → Alerts/Dashboards

Sources include:

  • Audit logs (all CD API calls)
  • WAF events
  • VPC flow logs
  • DDoS mitigation events
  • CSPM findings
  • Application logs (your apps)
  • Endpoint logs (where applicable)

IAM

| Role | Can do |
| --- | --- |
| siem.viewer | Search logs, view dashboards |
| siem.analyst | Create alerts, build dashboards |
| siem.responder | Acknowledge alerts, triage cases |
| siem.admin | Configure sources, retention, integrations |

Retention tiers

| Tier | Retention | Searchable | Cost |
| --- | --- | --- | --- |
| Hot | 7 days | Instant | High |
| Warm | 30 days | < 1 minute | Medium |
| Cold | 365+ days | Hours (rehydrate) | Low (Archive class) |

Standard: 7d hot, 30d warm, 365d cold (compliance default).
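As a rough illustration of the standard policy, the sketch below maps an index's age to a tier and builds a policy document shaped like an OpenSearch Index State Management policy. It assumes the durations are cumulative age boundaries (hot until day 7, warm until day 30, cold afterwards), which the table does not confirm. `tier_for_age` and `build_lifecycle_policy` are hypothetical helper names, not part of the service API.

```python
# Assumed age boundaries in days, taken from the standard policy above.
# Treating them as cumulative index age is an assumption of this sketch.
HOT_DAYS = 7
WARM_DAYS = 30


def tier_for_age(age_days: float) -> str:
    """Return the tier an index of the given age would sit in."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < HOT_DAYS:
        return "hot"
    if age_days < WARM_DAYS:
        return "warm"
    return "cold"


def build_lifecycle_policy(hot_days: int = HOT_DAYS, warm_days: int = WARM_DAYS) -> dict:
    """Build a policy shaped like an ISM document (states plus transitions).

    Actions are left empty: how the service moves data between storage
    classes is not described in this document.
    """
    return {
        "policy": {
            "description": "hot -> warm -> cold by index age (illustrative)",
            "default_state": "hot",
            "states": [
                {"name": "hot", "actions": [], "transitions": [
                    {"state_name": "warm",
                     "conditions": {"min_index_age": f"{hot_days}d"}}]},
                {"name": "warm", "actions": [], "transitions": [
                    {"state_name": "cold",
                     "conditions": {"min_index_age": f"{warm_days}d"}}]},
                {"name": "cold", "actions": [], "transitions": []},
            ],
        }
    }
```

Calling `tier_for_age(12)` under these assumptions returns `"warm"`. Adjust the two constants if your policy uses different boundaries.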

Detection content

Pre-built detections for:

  • Common attack patterns (privilege escalation, lateral movement, data exfil)
  • Bangladesh-specific threats
  • Compliance violations (e.g., access to PII outside business hours)

Custom rules:

```sql
SELECT principal, count(*) AS attempts
FROM events
WHERE action = 'iam.login' AND result = 'failed'
GROUP BY principal
HAVING attempts > 10
WINDOW '5m'
```
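To make the rule's logic concrete, here is a minimal Python sketch of the same idea: count failed `iam.login` events per principal in 5-minute windows and report those above 10. It assumes `WINDOW '5m'` means fixed, epoch-aligned (tumbling) windows, which the rule syntax does not state. The event dict keys mirror the column names in the rule, and `failed_login_bursts` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def failed_login_bursts(events, threshold=10, window=timedelta(minutes=5)):
    """Return {(principal, window_start): attempts} where attempts > threshold.

    Each event is a dict with 'principal', 'action', 'result' and a
    timezone-aware 'timestamp'. Windows are fixed, epoch-aligned buckets
    (an assumption about what WINDOW '5m' means).
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    counts = Counter()
    for ev in events:
        if ev["action"] != "iam.login" or ev["result"] != "failed":
            continue
        # Floor the timestamp to the start of its window.
        bucket = (ev["timestamp"] - epoch) // window
        counts[(ev["principal"], epoch + bucket * window)] += 1
    return {key: n for key, n in counts.items() if n > threshold}
```

A sliding window would catch bursts that straddle a bucket boundary. The tumbling version is shown only because it is the simplest reading of the rule.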

Alerting

Tiered:

  • info: dashboard only
  • low: email
  • medium: email + Slack
  • high: paging
  • critical: paging + auto-escalation after 15 min
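The tiers above amount to a small routing table plus a timer. The sketch below is one way to express that. The channel labels are illustrative, and it assumes auto-escalation fires when a critical alert is still unacknowledged after 15 minutes, since the trigger condition is not spelled out. `ROUTES`, `channels_for` and `should_escalate` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Severity -> notification channels, mirroring the tiers listed above.
ROUTES = {
    "info": ["dashboard"],
    "low": ["email"],
    "medium": ["email", "slack"],
    "high": ["pager"],
    "critical": ["pager"],
}
ESCALATE_AFTER = timedelta(minutes=15)


def channels_for(severity: str) -> list[str]:
    """Return the channels for a severity; raises KeyError if unknown."""
    return list(ROUTES[severity])


def should_escalate(severity, opened_at, acked_at, now) -> bool:
    """True when a critical alert has gone unacknowledged for 15 min or more.

    Escalating on missing acknowledgement is an assumption of this sketch.
    """
    if severity != "critical" or acked_at is not None:
        return False
    return now - opened_at >= ESCALATE_AFTER
```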

Metrics

| Metric | Healthy | Alert |
| --- | --- | --- |
| siem.ingest.events_per_sec | matches sources | sudden drop (ingestion broken) |
| siem.ingest.bytes_per_day | within plan | spikes (chatty source) |
| siem.alerts.open_count | < target | climbing |
| siem.alerts.mean_time_to_ack_min | < 15 (critical) | breach |
| siem.search.query_latency_ms | p95 < 5000 | > 30000 |

Daily SOC routine

  1. Triage overnight alerts
  2. Investigate high + critical
  3. Sweep medium for patterns
  4. Update detections based on findings

A SIEM with no daily attention is just expensive log storage.

Threat hunting

Periodic proactive search beyond alerts:

```bash
cd siem search --query '
  SELECT principal, action, resource, count(*) AS events
  FROM events
  WHERE source = "iam.role_assumption"
    AND timestamp > now() - interval 7 day
    AND principal NOT IN (SELECT principal FROM expected_assumptions)
  GROUP BY 1, 2, 3
'
```

Hypothesis-driven queries find issues the canned detections miss.
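The hunt above is essentially an anti-join against an allow-list followed by a count. A small Python sketch of that logic, for events already exported and already limited to the time range of interest, might look like the following. The dict keys mirror the query's column names, and `unexpected_assumptions` is a hypothetical helper.

```python
from collections import Counter


def unexpected_assumptions(events, expected_principals):
    """Count role assumptions by principals that are not on the allow-list.

    Returns [((principal, action, resource), count), ...], most frequent
    first. Events are assumed to be pre-filtered to the hunt's time range.
    """
    counts = Counter(
        (ev["principal"], ev["action"], ev["resource"])
        for ev in events
        if ev["source"] == "iam.role_assumption"
        and ev["principal"] not in expected_principals
    )
    return counts.most_common()
```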

Tuning detections

Per detection, weekly review:

  • True-positive count
  • False-positive count
  • Mean time to resolution

```bash
cd siem detection stats --name 'unusual-login-location' --since 30d
```

Detections with > 50% false-positive rate need tuning or retirement.
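The 50% threshold can be checked mechanically once the weekly counts are in hand. The sketch below assumes the false-positive rate is defined as false positives divided by all triaged alerts (TP + FP). The definition is not given in this document, and both function names are hypothetical.

```python
def false_positive_rate(true_positives: int, false_positives: int) -> float:
    """Share of triaged alerts that were false positives (0.0 if none fired)."""
    total = true_positives + false_positives
    if total == 0:
        return 0.0
    return false_positives / total


def needs_tuning(stats: dict, max_fp_rate: float = 0.5) -> list:
    """Return detection names whose FP rate exceeds max_fp_rate, worst first.

    `stats` maps detection name -> (true_positives, false_positives).
    """
    rates = {name: false_positive_rate(tp, fp) for name, (tp, fp) in stats.items()}
    flagged = [name for name, rate in rates.items() if rate > max_fp_rate]
    return sorted(flagged, key=lambda name: rates[name], reverse=True)
```

A detection that never fires shows a rate of 0.0 here, so it will not be flagged. It may still deserve a separate review for coverage.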

For investigations spanning months:

```bash
cd siem search rehydrate --query --since 2026-01-01 --until 2026-04-30
# Returns a job ID; result available in 1-4 hours
```

Rehydrate is expensive — use only for genuine investigations.

Compliance retention

Verify quarterly:

```bash
cd siem retention audit --tier all
# Should show no gaps in any tier per the configured policy
```

Ingestion dropped

WARN: siem.ingest.events_per_sec dropped from 12k to 4k

A source stopped sending. Diagnose:

```bash
cd siem source status --all
# Lists sources with last_received_at
```

Common causes:

  • Source IAM principal credentials expired
  • Source app crashed or stopped logging
  • Network path broken (VPC route, SG)
  • Source rate-limited by the SIEM ingest quota (raise the quota)
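Once the source listing is available as structured data, finding the silent source is a simple comparison on `last_received_at`. The sketch below assumes each source is a dict with a `name` key and a timezone-aware `last_received_at`, and uses a 15-minute silence threshold. The `name` key, the threshold and the function name are illustrative assumptions, not the CLI's output format.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(sources, now, max_silence=timedelta(minutes=15)):
    """Return (name, silence) pairs for sources quiet longer than max_silence.

    Results are ordered with the longest-silent source first.
    """
    stale = [
        (src["name"], now - src["last_received_at"])
        for src in sources
        if now - src["last_received_at"] > max_silence
    ]
    return sorted(stale, key=lambda pair: pair[1], reverse=True)
```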

Alert flood

siem.alerts.open_count climbing fast:

  • A detection misfiring (regex too broad)
  • Real attack in progress
  • Routine event mis-classified

Triage by priority; deduplicate via grouping (group by principal reduces 100 alerts to 1 case).
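The grouping step can be sketched in a few lines: bucket open alerts by a shared key so each bucket becomes one case. The alert dict shape and the `group_alerts` name are assumptions made for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, key="principal"):
    """Group alerts into cases keyed by the given field.

    Returns {key_value: [alert, ...]}; each entry is one case to triage.
    """
    cases = defaultdict(list)
    for alert in alerts:
        cases[alert[key]].append(alert)
    return dict(cases)
```

With 100 alerts that all share one principal, `group_alerts` returns a single case containing all 100. This is the reduction described above.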

Query timeouts

Searches > 30 s:

  • Time range too wide
  • Query missing index hints
  • Hot/warm tier overloaded

Optimize:

  • Narrow time range
  • Use index columns (timestamp, principal, action)
  • Pre-aggregate for dashboard queries

False negatives

Real attack happened but SIEM didn't alert:

  • Source wasn't ingesting that event type
  • Detection rule had a gap (specific user-agent or pattern)
  • Alert routed but suppressed (over-aggressive deduplication)

Add a detection covering the specific pattern; replay historical events to verify the new rule would have caught it.
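A replay check can be as simple as running the new rule over exported historical events and confirming that every event known to belong to the incident matches. In the sketch below a rule is modelled as a plain Python predicate and events carry an `id` field. Both are assumptions for illustration, not the service's rule format.

```python
def replay(rule, events):
    """Return the historical events the rule would have matched."""
    return [ev for ev in events if rule(ev)]


def missed_incident_events(rule, events, incident_ids):
    """Return the incident event ids the rule would still have missed.

    An empty result means the rule covers every known incident event.
    """
    matched_ids = {ev["id"] for ev in replay(rule, events)}
    return sorted(set(incident_ids) - matched_ids)
```

It is also worth looking at how many non-incident events `replay` returns, since a rule that matches everything passes this check while being useless in practice.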

Storage cost spike

siem.ingest.bytes_per_day 2× normal:

  • Chatty source (a new app emitting verbose logs)
  • Debug-level logging accidentally enabled in production
  • A loop in the app producing repeating logs

```bash
cd siem ingest top-sources --since 24h
```

Filter at source or sample heavily for verbose-but-low-value events.
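One common way to sample at the source is deterministic hash-based sampling, so the same event is always kept or dropped no matter which shipper handles it. The sketch below is a generic illustration under that assumption, not a built-in feature of the agents. `keep_event` and the use of an event id as the sampling key are hypothetical.

```python
import hashlib


def keep_event(event_id: str, sample_rate: float) -> bool:
    """Deterministically keep roughly `sample_rate` of events.

    The decision depends only on event_id, so retries and multiple
    shippers agree on which events survive.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction < sample_rate
```

Apply sampling only to event classes you have judged low-value. Audit and authentication events should generally be shipped in full.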

Cold-tier rehydrate slow

Rehydrate is bandwidth- and storage-limited; 1–4 h is normal for a 30-day window. To speed it up, narrow the time range and add principal or action filters.

Compliance retention gap

cd siem retention audit reports a gap (events missing from a window):

  • Ingestion was down during that window
  • Bug in lifecycle policy moved data prematurely
  • Account misconfiguration

Engage SRE; rehydrate from cold may recover; if not, document the gap for the auditor.
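When documenting a gap for the auditor, it helps to state exactly which days are affected. Given the set of calendar days that have at least one stored event (obtained however you export it), the sketch below lists the missing days in a period. Day-level granularity and the `missing_days` name are assumptions of this example.

```python
from datetime import date, timedelta


def missing_days(days_with_events, start: date, end: date) -> list:
    """Return the dates in [start, end] that have no events, in order."""
    if end < start:
        raise ValueError("end must not be before start")
    present = set(days_with_events)
    span = (end - start).days
    all_days = (start + timedelta(days=i) for i in range(span + 1))
    return [day for day in all_days if day not in present]
```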