On-call integrations: PagerDuty and Opsgenie
Adding PagerDuty and Opsgenie as alert channels with severity mapping and incident deduplication, so Lookout can actually page the on-call engineer.
Next: on-call integrations — PagerDuty and Opsgenie.
The alert engine already routed to Slack, Discord, Teams, Telegram, SMS, and webhooks. But a Slack message at 3am doesn't wake anyone. PagerDuty and Opsgenie do — they're built for on-call: schedules, escalation, acknowledgement. Integrating them is how an alert becomes a page.
Mostly new transports on an old seam
Because the engine was built to be pluggable, this was mostly two new channel transports:
- PagerDuty: POST to the Events API v2 with event_action: trigger, a dedup_key, and a severity.
- Opsgenie: POST to the Alerts API with GenieKey auth, a priority, and an alias for dedup. US and EU regions supported.
No changes to routing, subscriptions, or logging — they just plugged in. That's the dividend of designing the seam right earlier in the sprint.
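To make the two transports concrete, here is a minimal sketch of the requests they might build. The endpoints, the event_action, dedup_key, alias, and priority fields, and the GenieKey header come from the public PagerDuty and Opsgenie APIs. The function names and parameters are hypothetical, not Lookout's actual code. The functions only build the request, so nothing is sent.

```python
import json
import urllib.request

PAGERDUTY_URL = "https://events.pagerduty.com/v2/enqueue"
OPSGENIE_URLS = {
    "us": "https://api.opsgenie.com/v2/alerts",
    "eu": "https://api.eu.opsgenie.com/v2/alerts",
}


def pagerduty_request(routing_key, summary, source, severity, dedup_key):
    """Build (but do not send) an Events API v2 trigger request.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    return urllib.request.Request(
        PAGERDUTY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def opsgenie_request(api_key, message, priority, alias, region="us"):
    """Build (but do not send) an Opsgenie create-alert request.

    Hypothetical helper: the alias plays the role of the dedup key.
    """
    body = {"message": message, "alias": alias, "priority": priority}
    return urllib.request.Request(
        OPSGENIE_URLS[region],  # "us" or "eu"
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"GenieKey {api_key}",
        },
        method="POST",
    )
```

Sending either one is then a call to urllib.request.urlopen(req), plus whatever retry and logging the engine already wraps around its other channels.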
Severity that means something
I added an AlertSeverity map so events arrive at the right urgency instead of all looking identical. A fatal crash, an outage, a missed cron, an SLO burn → critical (it pages). A new issue, a failed job, an anomaly → error. A recovery or a digest → info. PagerDuty gets one of its own severity levels (critical, error, warning, or info); Opsgenie gets a priority from P1 to P5. The on-call platform then applies its escalation policy based on that urgency, which is exactly the division of labor you want: Lookout decides how bad, PagerDuty decides who and when.
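A rough sketch of that mapping, under some assumptions: the event-type strings are made-up names, the exact Opsgenie priorities chosen for error and info (P3 and P5) are my guess at a sensible spread, and the fallback for unknown event types is also an assumption. Only the three severity tiers and the event groupings come from the description above.

```python
from enum import Enum


class AlertSeverity(Enum):
    CRITICAL = "critical"
    ERROR = "error"
    INFO = "info"


# Hypothetical event-type names; the groupings follow the text above.
EVENT_SEVERITY = {
    "fatal_crash": AlertSeverity.CRITICAL,
    "outage": AlertSeverity.CRITICAL,
    "missed_cron": AlertSeverity.CRITICAL,
    "slo_burn": AlertSeverity.CRITICAL,
    "new_issue": AlertSeverity.ERROR,
    "failed_job": AlertSeverity.ERROR,
    "anomaly": AlertSeverity.ERROR,
    "recovery": AlertSeverity.INFO,
    "digest": AlertSeverity.INFO,
}

# PagerDuty accepts critical, error, warning, and info as severity values.
PAGERDUTY_SEVERITY = {
    AlertSeverity.CRITICAL: "critical",
    AlertSeverity.ERROR: "error",
    AlertSeverity.INFO: "info",
}

# Assumed spread across Opsgenie's P1-P5 range.
OPSGENIE_PRIORITY = {
    AlertSeverity.CRITICAL: "P1",
    AlertSeverity.ERROR: "P3",
    AlertSeverity.INFO: "P5",
}


def severity_for(event_type):
    """Look up the severity; unknown types fall back to ERROR (an assumption)."""
    return EVENT_SEVERITY.get(event_type, AlertSeverity.ERROR)
```

For example, OPSGENIE_PRIORITY[severity_for("slo_burn")] gives "P1", while a digest maps to "P5" and never pages anyone.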
Dedup, again
Incident tools dedup by key, so a re-firing alert folds into one open incident instead of spawning a hundred. The engine already carried a dedup key for every event — I just had to pass it through. The groundwork kept paying off.
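To illustrate the folding behavior, here is a toy in-memory model of dedup by key. It is not how PagerDuty or Opsgenie are implemented, and the class and method names are invented for the example. It only shows why passing a stable key through matters.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    dedup_key: str
    summary: str
    count: int = 1  # how many triggers have folded into this incident


class ToyIncidentStore:
    """Toy stand-in for an incident tool: one open incident per dedup key."""

    def __init__(self):
        self.open = {}

    def trigger(self, dedup_key, summary):
        incident = self.open.get(dedup_key)
        if incident is None:
            # First trigger for this key opens a new incident.
            incident = Incident(dedup_key, summary)
            self.open[dedup_key] = incident
        else:
            # A re-fire with the same key folds into the open incident.
            incident.count += 1
        return incident

    def resolve(self, dedup_key):
        # After resolution, the next trigger with this key opens a fresh incident.
        self.open.pop(dedup_key, None)


store = ToyIncidentStore()
for _ in range(100):
    store.trigger("api-outage", "API is down")
```

After the loop, the store holds a single open incident with a count of 100, rather than a hundred separate pages.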
Not everyone uses PagerDuty, though. Next: a native escalation engine for teams that don't.