Open Source · Claude Code Native · Self-Hosted

From page to root cause.
In minutes, not hours.

RunbookAI investigates production incidents like your best on-call engineer: it forms hypotheses, gathers evidence, and drives to a root cause with recommended fixes. It is built for production ops, with approval gates, audit trails, and cross-system context.

The demo walks through the full workflow: ranked hypotheses with confidence scores, full evidence traces, and remediation steps your team can approve in channel.

npx @runbook-agent/runbook demo
No API keys needed

Works with your stack

AWS, Kubernetes, PagerDuty, OpsGenie, Slack, Datadog, Confluence

One agent, end-to-end incident response

Investigate

Diagnose incidents

Forms hypotheses, gathers evidence, finds root cause automatically.

Execute

Run your runbooks

Step-by-step execution with approval gates for every mutation.

Connect

Query your infra

Natural language queries across AWS, Kubernetes, and CloudWatch.

Learn

Build operational knowledge

Indexes runbooks, postmortems, and architecture docs automatically.

Integrate

Meet your team where they are

Slack, PagerDuty, OpsGenie, Claude Code — all connected.

Protect

Safety by default

Every mutation requires approval. Full audit trail. Always.

Scale

Share knowledge across your team

Self-host a shared knowledge server. One deployment, every engineer connected.
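
To make the approval-gate idea concrete, here is a minimal TypeScript sketch of how a runbook executor could gate mutating steps and record an audit trail. It is an illustration only, not RunbookAI's implementation: `Step`, `AuditEntry`, `runWithApproval`, and the `approve` callback are hypothetical names invented for this example.

```typescript
// Hypothetical types for illustration; not RunbookAI's actual API.
type Step = {
  name: string;
  mutates: boolean; // true if the step changes infrastructure state
  run: () => string;
};

type AuditEntry = {
  step: string;
  action: "executed" | "skipped";
  approved: boolean | null; // null means no approval was required
};

// Runs steps in order. Read-only steps run immediately; mutating steps
// run only if the approve callback returns true. Every step is logged.
function runWithApproval(
  steps: Step[],
  approve: (step: Step) => boolean,
): AuditEntry[] {
  const audit: AuditEntry[] = [];
  for (const step of steps) {
    if (!step.mutates) {
      step.run();
      audit.push({ step: step.name, action: "executed", approved: null });
      continue;
    }
    const ok = approve(step);
    if (ok) {
      step.run();
    }
    audit.push({
      step: step.name,
      action: ok ? "executed" : "skipped",
      approved: ok,
    });
  }
  return audit;
}
```

In a real deployment the `approve` callback would presumably wait on a human decision (for example, in a chat channel) rather than return synchronously; the synchronous form here just keeps the sketch short.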

From alert to resolution

1. Alert

An incident fires from PagerDuty, OpsGenie, or a Slack mention.

2. Hypothesize

Forms ranked hypotheses from symptoms and organizational knowledge.

3. Investigate

Runs targeted queries against your infrastructure for evidence.

4. Resolve

Delivers root cause with confidence scores and remediation steps.
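
The hypothesize, investigate, and resolve steps above can be pictured with a toy ranking routine. This is a sketch under stated assumptions, not how RunbookAI actually scores hypotheses: the `Hypothesis` and `Evidence` shapes, the `confidence` and `rankHypotheses` functions, and the fixed 0.2 adjustment per piece of evidence are all invented for illustration.

```typescript
// Hypothetical shapes for illustration; not RunbookAI's actual data model.
type Evidence = { source: string; supports: boolean };
type Hypothesis = { cause: string; prior: number; evidence: Evidence[] };

// Toy scoring rule (an assumption): start from the prior and move it
// up or down by 0.2 per piece of evidence, clamped to the range [0, 1].
function confidence(h: Hypothesis): number {
  const delta = h.evidence.reduce(
    (sum, e) => sum + (e.supports ? 0.2 : -0.2),
    0,
  );
  return Math.min(1, Math.max(0, h.prior + delta));
}

// Returns hypotheses sorted from most to least confident.
function rankHypotheses(
  hs: Hypothesis[],
): { cause: string; confidence: number }[] {
  return hs
    .map((h) => ({ cause: h.cause, confidence: confidence(h) }))
    .sort((a, b) => b.confidence - a.confidence);
}

// Example input (made-up data, loosely modeled on a payments latency incident).
const ranked = rankHypotheses([
  {
    cause: "bad deploy",
    prior: 0.5,
    evidence: [{ source: "deploy log", supports: false }],
  },
  {
    cause: "connection pool exhaustion",
    prior: 0.4,
    evidence: [
      { source: "db metrics", supports: true },
      { source: "past postmortem", supports: true },
    ],
  },
]);
console.log(ranked[0].cause);
```

With this sample data, the hypothesis with two supporting pieces of evidence overtakes the one with the higher prior, which is the behavior a ranked list with confidence scores is meant to convey.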

Your coding AI meets your ops AI.

Surface relevant runbooks, known issues, and postmortems inside Claude Code sessions — so operational context is already there when you're debugging.

runbook integrations claude enable
Claude Code
> What's causing high latency in payments?
RunbookAI Context
Runbook: payments-high-latency.md
Known issue: Connection pool exhaustion
Last incident: 2024-12-03 (resolved)
Based on the RunbookAI context, this matches a known pattern of connection pool exhaustion under high load...

Up and running in seconds

Try it now — no API keys needed
$ npx @runbook-agent/runbook demo
Install globally
$ npm i -g @runbook-agent/runbook

Pick your path in under 60 seconds

Design Partner

Share your stack and incident pain

Tell us your role, infra setup, top incident pain, and how to reach you. We use this to prioritize roadmap work.

Open intake form

Onboarding

Use the AWS + Kubernetes + PagerDuty + Slack path

Follow an opinionated 3-step setup sequence with exact commands and links to the right docs sections.

Open 3-step onboarding

Tell us where incidents hurt

We are onboarding a small number of teams running production workloads on AWS and Kubernetes. Share a few details and we will follow up with a focused setup session.

Start investigating smarter.

Run a full incident investigation in minutes, then point it at your own stack. Open source. Self-hosted. No vendor lock-in.