Agentic AI-as-a-Service¶
Service ownership
Owner: application-services (apps-pm@clouddigit.ai) — Status: GA — Last audited: 2026-05-11
Build tool-using agents on sovereign-resident LLMs — with built-in connectors, sandboxes, audit trails, and approval workflows.
What it is¶
Where Chatbot-as-a-Service is "have a conversation," Agentic AI is "go do things." An agent platform that:
- Plans multi-step actions
- Calls tools (APIs, databases, your internal systems)
- Executes (with approval gates for risky actions)
- Reports back
- Logs every step for audit
Built on LLM-as-a-Service, so the model never sees data outside Bangladesh, and on standard agent patterns (ReAct, tool-calling, function-calling), so behaviour is debuggable.
Built-in connectors¶
| Category | Examples |
|---|---|
| Cloud Digit | Every Cloud Digit API (compute, storage, K8s, DB) |
| Productivity | Microsoft 365, Google Workspace |
| Communication | Slack, Teams, WhatsApp, email |
| CRM / Ticketing | Salesforce, Zoho, Freshdesk, Zendesk |
| Code / VCS | GitHub, GitLab |
| Data | Postgres, MySQL, MongoDB, Redis, S3 (read access patterns) |
| Custom | OpenAPI 3.x ingestor — drop a spec, get tools |
Approval gates¶
Each action class is risk-rated. Risky actions (anything mutating, anything spending money, anything sending external messages) require human approval by default. Configurable:
- Always require approval — default for new tools
- Auto-approve in sandbox — let the agent iterate freely in a non-prod env
- Pre-approved patterns — specific argument patterns auto-approved (e.g., "any read-only SELECT against the reporting replica")
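The gating logic above can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the tool names, the regex-based pattern syntax, and the `needs_approval` helper are all assumptions.

```python
import re

# Pre-approved argument patterns: e.g. read-only SELECTs against the
# reporting replica. The regex syntax is illustrative, not the platform's
# actual pattern language.
PRE_APPROVED = {
    "run_sql": re.compile(r"^\s*SELECT\b", re.IGNORECASE),
}

def needs_approval(tool: str, argument: str, sandbox: bool = False) -> bool:
    """True if a human must approve this tool call before execution."""
    if sandbox:
        return False                   # auto-approve in a non-prod environment
    pattern = PRE_APPROVED.get(tool)
    if pattern and pattern.match(argument):
        return False                   # matches a pre-approved pattern
    return True                        # default: always require approval
```

Note the default direction: an unknown tool or unmatched argument falls through to "require approval", mirroring the "always require approval for new tools" default.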
Audit trail¶
Every plan step, tool call, argument, response, and approval is logged to SIEM by default. This is non-negotiable for regulated environments.
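One audit entry per tool call might look like the following sketch. The field names (`ts`, `task_id`, `approved_by`, etc.) are illustrative, not the platform's actual SIEM schema:

```python
import json
import datetime

def audit_record(task_id, step, tool, arguments, response, approved_by=None):
    """Build one JSON audit entry per tool call (field names illustrative)."""
    return json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "task_id": task_id,
        "step": step,
        "tool": tool,
        "arguments": arguments,
        "response": response,
        "approved_by": approved_by,   # None if auto-approved
    })
```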
Use cases¶
- Internal-IT helper that can actually act ("provision a VM", "open a ticket")
- Banking back-office assistants (with strict approval gates)
- Customer-success agents that pull from CRM and act in helpdesk
- Operations assistants for the NOC
Pricing¶
- Per-active-agent-hour
- Plus underlying LLMaaS tokens
- Plus connector fees for premium connectors
See Pricing.
Related¶
- LLM-as-a-Service
- Chatbot-as-a-Service
- SIEM — audit trail destination
Operate this service¶
Autonomous AI agents for back-office automation — multi-step reasoning, tool use, long-horizon tasks.
What "agentic" means¶
Beyond single-turn LLM completion:
- Plan multi-step tasks
- Call tools, observe results, plan the next step
- Persist state across turns
- Recover from errors
- Escalate when stuck
Typical use cases¶
- Loan application processing (gather docs → verify → score → recommend)
- Insurance claims (review → assess → approve/escalate)
- IT operations (diagnose alert → look up runbook → execute fix → verify)
- Procurement (sourcing → quote comparison → recommendation)
Not chat — task automation.
IAM¶
| Role | Can do |
|---|---|
| agent.viewer | View agents, task logs |
| agent.designer | Define agents, tools, workflows |
| agent.deployer | Deploy agents to production |
| agent.admin | Manage system, audit |
Agent definition¶
```yaml
agent:
  name: loan-application-processor
  model: llama-3.1-70b-instruct
  tools:
    - lookup_customer
    - fetch_credit_report
    - calculate_eligibility
    - send_notification
  max_steps: 20
  escalation_criteria: "credit_score < 600 OR loan_amount > 5000000"
```
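An escalation expression like the one above has to be evaluated against the task state at each step. A minimal sketch, assuming the criteria are simple comparisons joined by OR/AND (the platform's real expression syntax and evaluator are not specified here):

```python
def should_escalate(criteria: str, state: dict) -> bool:
    """Evaluate a 'credit_score < 600 OR loan_amount > 5000000'-style
    expression against task state. Illustrative only: a production
    evaluator would parse the expression rather than use eval."""
    expr = criteria.replace(" OR ", " or ").replace(" AND ", " and ")
    # Restrict eval to the task state; no builtins available.
    return bool(eval(expr, {"__builtins__": {}}, dict(state)))
```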
Safety¶
Agents acting autonomously are powerful — and risky:
- Bounded action set (only configured tools)
- Step limit (max_steps prevents runaway)
- Escalation criteria (when to involve humans)
- Audit log per agent run
- Sandboxing (agents can't access arbitrary resources)
Don't grant agents permissions you wouldn't give to a junior employee.
Metrics¶
| Metric | Healthy | Alert |
|---|---|---|
| agent.tasks_per_hour | matches input rate | |
| agent.completion_rate | > 80% | < 60% |
| agent.escalation_rate | 10-25% | > 40% (agent too weak) |
| agent.average_steps | < 10 | > 15 (going in circles) |
| agent.cost_bdt.per_task | within budget | climbing |
| agent.tool_errors_per_hour | low | spikes |
Task lifecycle¶
Input → Plan → Execute → Observe → Next? → Done or escalate
Each step:
1. The agent prompts the LLM with the current state
2. The LLM decides: call a tool, ask the user, or finish
3. The tool result is observed
4. Loop until done or escalation
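The loop above can be sketched as follows. `call_llm` and `tools` are stand-ins for the platform's model endpoint and tool registry, and the decision dict shape is an assumption:

```python
def run_task(task, call_llm, tools, max_steps=20):
    """Plan -> execute -> observe loop with a hard step cap."""
    state = {"task": task, "observations": []}
    for step in range(max_steps):
        decision = call_llm(state)               # 1. prompt LLM with state
        if decision["action"] == "finish":       # 2. LLM may finish...
            return {"status": "done", "result": decision.get("result"),
                    "steps": step + 1}
        if decision["action"] == "escalate":     # ...or hand off to a human
            return {"status": "escalated", "steps": step + 1}
        tool = tools[decision["tool"]]           # 3. call tool, observe result
        state["observations"].append(tool(decision.get("args", {})))
    # 4. step cap reached: force escalation rather than loop forever
    return {"status": "escalated", "reason": "max_steps"}
```

The step cap is what turns a runaway planning loop into an escalation instead of an unbounded bill.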
Tool registry¶
Agents only have access to registered tools:
```bash
cd agent tool register --name lookup_customer \
  --endpoint https://internal.acme.com/customers/:id \
  --schema @lookup_customer_schema.json \
  --auth bearer:acme-internal-token
```
Cost monitoring¶
Per-task cost depends on:
- Number of LLM calls (steps)
- Token volume per call
- Tool execution cost
```bash
cd agent cost report --agent loan-processor --since 24h
```
Outlier expensive tasks usually loop unnecessarily — tune planning.
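Those three factors compose into a simple per-task model. The BDT rates below are placeholders, not Cloud Digit pricing; the point is that step count multiplies everything, which is why looping tasks dominate cost:

```python
# Illustrative rates only -- see the Pricing page for real numbers.
TOKEN_RATE_BDT = 0.0005   # cost per token (assumption)
TOOL_RATE_BDT = 0.10      # flat cost per tool call (assumption)

def task_cost_bdt(steps):
    """steps: list of (tokens_in, tokens_out, tool_calls) per LLM step."""
    return sum((tin + tout) * TOKEN_RATE_BDT + calls * TOOL_RATE_BDT
               for tin, tout, calls in steps)
```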
Human-in-the-loop¶
For uncertain decisions:
```yaml
agent:
  escalation_criteria: "uncertainty > 0.5"
  human_review_required:
    - loan_amount > 1000000
    - first_time_applicant
    - score_borderline (550-650)
```
Escalations route to human queue; humans approve / reject / send back to agent with feedback.
Audit¶
Per task:
- Full step-by-step trace
- Each tool input/output
- LLM prompts and responses
- Final decision
Auditable for compliance (regulated industries) and debugging.
Agent loops¶
agent.average_steps > 15:
- Agent stuck in retry loop on same tool
- Planning logic failing to converge
- Tool returning ambiguous results
Inspect the trace; the pattern is usually obvious. Common fixes:
- Make the tool return a clearer signal (success/failure/retry)
- Direct the agent prompt to a different tool after N failures
- Enforce a hard step cap that forces escalation
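The "stuck in a retry loop on the same tool" case is cheap to detect mechanically. A minimal sketch, assuming the trace is available as a list of (tool, arguments) pairs:

```python
def is_stuck(trace, window=3):
    """True if the last `window` tool calls are identical (same tool, same
    args) -- a signal to force escalation instead of another retry.
    trace: list of (tool_name, args_repr) tuples, most recent last."""
    if len(trace) < window:
        return False
    return len(set(trace[-window:])) == 1
```

Checking this between steps lets the loop escalate on a repeated-call pattern well before the hard step cap is hit.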
Tool call failures¶
agent.tool_errors_per_hour spike:
- Backend service down (transient)
- Auth credentials expired (refresh)
- Schema drift (tool changed; agent expectations didn't)
For transient: agent should retry. For persistent: escalate.
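The transient case is the standard retry-with-backoff pattern; a sketch (the attempt count and delays are arbitrary choices, not platform defaults):

```python
import time

def call_with_retry(tool, args, attempts=3, base_delay=1.0):
    """Retry transient tool failures with exponential backoff; re-raise
    after the final attempt so the persistent case surfaces for escalation."""
    for attempt in range(attempts):
        try:
            return tool(args)
        except Exception:
            if attempt == attempts - 1:
                raise                               # persistent: escalate
            time.sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, ...
```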
Hallucinated tool calls¶
Agent calls a tool that doesn't exist, or with the wrong parameters:
- Make the system prompt clearer about the available tools
- Lower the temperature for tool-call decisions
- Have tool registry validation reject invalid calls
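The registry-validation defence is the most mechanical of the three. A sketch, with hypothetical tool names and required-parameter sets (a real registry would validate against the full JSON schema registered per tool):

```python
# Hypothetical registry: tool name -> required parameter names.
REGISTRY = {
    "lookup_customer": {"id"},
    "send_notification": {"recipient", "message"},
}

def validate_call(tool, args):
    """Return an error string for a hallucinated or malformed call,
    or None if the call is well-formed."""
    if tool not in REGISTRY:
        return f"unknown tool: {tool}"
    missing = REGISTRY[tool] - set(args)
    if missing:
        return f"missing parameters: {sorted(missing)}"
    return None
```

Feeding the error string back into the agent's context usually lets it self-correct on the next step instead of executing a bad call.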
Escalation queue piling up¶
Humans can't keep up with escalations:
- Reduce escalation criteria (handle more autonomously)
- Improve agent quality (fewer false escalations)
- Add reviewers / shifts
Don't loosen the criteria just to relieve the queue if the agent's decisions really are borderline.
Cost runaway¶
agent.cost_bdt.per_task climbing:
- Step count rising (loops)
- Expensive tool chains (one tool calls another that calls another)
- Long context per step
Tune; cap per-task budget:
```bash
cd agent budget set --agent loan-processor --max-bdt-per-task 50
```
Agent decision audited as wrong¶
Compliance / audit reviews and disagrees with an agent decision:
- Inspect the trace; was the logic sound given the inputs?
- If the inputs were wrong: data quality issue
- If the logic was wrong: tune the agent prompt or escalation criteria
- Document and learn
New regulation requires agent change¶
Regulator updates rules; the agent's logic needs an update:
- Update the prompt / tools
- Test in shadow mode (the agent runs but doesn't act)
- Switch to production after validation
Plan for regulatory change as ongoing cost.