Agentic RCA

Incident context becomes async evidence collection.

Agentic RCA expands the landing page flow: an incident or alert becomes a plan, scoped agents collect MCP evidence in parallel, and the results feed back into one reviewable analysis.

RCA flow board

Alert, plan, evidence, review, action.

  1. Input

    Incident / Alert

    The starting point can be an alert, an incident note, a scheduled risk window, or a responder question.

  2. Plan

    Planner

    The planner turns the incident brief into focused investigation tasks.

  3. Context

    Custom Knowledge

    Custom Knowledge adds topology, runbooks, policy context, ownership notes, and service assumptions.

  4. Parallel work

    Async Agents

    Scoped agents collect evidence asynchronously, so responders do not have to hop between dashboards.

  5. Evidence

    MCP Evidence

    MCP servers return logs, metrics, traces, provider state, deploy history, edge rules, and alerts.

  6. Validation

    Reviewer

    The reviewer checks whether the evidence supports the causal chain.

  7. Brief

    RCA Brief

    The brief consolidates likely causes, affected scope, supporting signals, and missing checks.

  8. Decision

    Action Plan

    The output ends with validation checks, rollback candidates, mitigation steps, handoff notes, and prevention items.
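
As a rough illustration of what the last two steps hand back, the brief and the action plan can be pictured as two small records. This is only a sketch: the class names `RcaBrief` and `ActionPlan`, the field names, and the sample values are assumptions made for this example, not the product's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class RcaBrief:
    # Hypothetical record mirroring the four parts of the brief named above.
    likely_causes: list[str] = field(default_factory=list)
    affected_scope: list[str] = field(default_factory=list)
    supporting_signals: list[str] = field(default_factory=list)
    missing_checks: list[str] = field(default_factory=list)


@dataclass
class ActionPlan:
    # Hypothetical record mirroring the five parts of the action plan.
    validation_checks: list[str] = field(default_factory=list)
    rollback_candidates: list[str] = field(default_factory=list)
    mitigation_steps: list[str] = field(default_factory=list)
    handoff_notes: list[str] = field(default_factory=list)
    prevention_items: list[str] = field(default_factory=list)


# Illustrative values only; none of these come from a real incident.
brief = RcaBrief(
    likely_causes=["config change shortly before symptom start"],
    affected_scope=["example-service"],
    supporting_signals=["error rate rises after the deploy marker"],
    missing_checks=["confirm the change reached all regions"],
)
plan = ActionPlan(
    validation_checks=list(brief.missing_checks),  # open checks carry over
    rollback_candidates=["revert the config change"],
)
```

Carrying `missing_checks` into `validation_checks` is one possible way to keep unverified items visible in the final output; the source does not say how the two are linked.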

Async agents

Parallel work without losing one RCA thread.

The flow fans out only after the plan exists; each lane then returns its evidence to the same analysis.

Lane A

Observability agent

Logs, metrics, traces, alerts, and deploy markers are aligned by time and impact.

Lane B

Deploy agent

Rollouts, config changes, release notes, and ownership notes are compared against symptom start.

Lane C

Infrastructure agent

Kubernetes, cloud APIs, autoscaling, ingress, and provider state are checked for affected paths.

Lane D

Edge and alert agent

CDN, WAF, edge rules, blocked requests, traffic shifts, and alert grouping are validated.
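
The fan-out described above can be sketched with Python's `asyncio`. This is a minimal sketch under stated assumptions: `run_lane`, `investigate`, `Evidence`, `Analysis`, the lane keys, and the incident text are all hypothetical names chosen for illustration, and the lane body is a stub where a real agent would query MCP servers.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Evidence:
    lane: str    # which agent produced this item
    signal: str  # short description of what was observed


@dataclass
class Analysis:
    incident: str
    evidence: list[Evidence] = field(default_factory=list)


async def run_lane(lane: str, task: str) -> Evidence:
    # Stub: a real lane would call MCP servers here (assumption).
    await asyncio.sleep(0)
    return Evidence(lane=lane, signal=f"{task}: stub result")


async def investigate(incident: str, plan: dict[str, str]) -> Analysis:
    # The plan must exist before any lane starts.
    analysis = Analysis(incident=incident)
    results = await asyncio.gather(
        *(run_lane(lane, task) for lane, task in plan.items()),
        return_exceptions=True,  # one failing lane should not drop the rest
    )
    # gather() preserves input order, so results line up with plan keys.
    for lane, result in zip(plan, results):
        if isinstance(result, BaseException):
            analysis.evidence.append(Evidence(lane, f"lane failed: {result!r}"))
        else:
            analysis.evidence.append(result)
    return analysis


# Hypothetical plan with one task per lane.
plan = {
    "observability": "align logs, metrics, and traces by time",
    "deploy": "compare rollouts against symptom start",
    "infrastructure": "check provider state for affected paths",
    "edge_and_alert": "validate edge rules and alert grouping",
}
analysis = asyncio.run(investigate("example incident", plan))
```

Every lane writes into the same `Analysis` object, which is one way to keep parallel work on a single RCA thread. A failed lane is recorded as evidence rather than raised, so the reviewer can see the gap.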

Reviewer gate

Proposed causes have to survive evidence review.

Supported

Evidence-backed cause

A likely cause stays in the RCA only when the collected signals explain what changed and what is affected.

Needs check

Validation still required

Weak hypotheses are kept visible as missing checks instead of being promoted into action too early.
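
The two outcomes of the reviewer gate can be sketched as a simple partition. The names `Hypothesis` and `review_all`, the two boolean fields, and the sample causes are assumptions for illustration; how a real reviewer decides whether signals explain a change is not described in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    cause: str
    explains_change: bool  # do the collected signals explain what changed?
    explains_impact: bool  # do they explain what is affected?
    missing_checks: list[str] = field(default_factory=list)


def review_all(hypotheses: list[Hypothesis]) -> dict[str, list[Hypothesis]]:
    """Split hypotheses into the two reviewer-gate buckets.

    Nothing is discarded: weak hypotheses stay visible under "needs_check".
    """
    buckets: dict[str, list[Hypothesis]] = {"supported": [], "needs_check": []}
    for h in hypotheses:
        if h.explains_change and h.explains_impact:
            buckets["supported"].append(h)
        else:
            buckets["needs_check"].append(h)
    return buckets


# Illustrative hypotheses only.
deploy = Hypothesis("recent rollout", explains_change=True, explains_impact=True)
edge = Hypothesis(
    "edge rule change",
    explains_change=True,
    explains_impact=False,
    missing_checks=["confirm blocked requests match the affected paths"],
)
result = review_all([deploy, edge])
```

Here `result["supported"]` holds the rollout hypothesis and `result["needs_check"]` holds the edge rule hypothesis along with the check it still needs.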

Inspect

Read the evidence chain

Every proposed cause should show the signals that support it and the checks still needed before action.

Challenge

Reject unsupported causes

Responders can push back on weak hypotheses when evidence is missing or when another change better explains the symptoms.

Continue

Keep one investigation thread

The same RCA context can continue through chat, scheduled RCA, MCP evidence paths, Codex CLI, or Claude CLI.

Next step

See how this fits your on-call flow.
