Lookout
2 min read Tom Shafer

PII redaction and retention

Scrubbing personal data at ingest and pruning old events on a schedule — privacy controls that make an observability tool safe to point at production.

You can't responsibly collect this much data without controls on it. So part of this stretch was PII redaction and retention.

Redact at the chokepoint

The safest place to scrub personal data is at ingest, before it's stored, not in a nightly cleanup that leaves a window of exposure. I added a PiiRedactor that runs at the auth and error ingest chokepoints. Each project has a redaction_settings JSON, and a Privacy tab controls it: you choose what to scrub, such as query bindings, request bodies, and auth headers.

The principle: data you never store can never leak. Redaction at the door beats deletion after the fact every time.
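To make the idea concrete, here is a minimal sketch of settings-driven redaction at ingest. It is written in Python for illustration and is not the actual PiiRedactor; the settings keys, the event field names, and the list of auth header names are all assumptions.

```python
import copy
from typing import Any

REDACTED = "[REDACTED]"

# Assumed set of header names treated as credentials (illustrative only).
AUTH_HEADER_NAMES = {"authorization", "cookie", "x-api-key"}


def redact_event(event: dict[str, Any], settings: dict[str, bool]) -> dict[str, Any]:
    """Return a scrubbed copy of an incoming event, before anything is stored.

    `settings` stands in for a per-project redaction_settings JSON; the key
    names used here are hypothetical.
    """
    clean = copy.deepcopy(event)  # never mutate the caller's payload

    # Replace every bound query value, keeping the count so the query shape survives.
    if settings.get("query_bindings") and "query_bindings" in clean:
        clean["query_bindings"] = [REDACTED for _ in clean["query_bindings"]]

    # Drop the request body wholesale.
    if settings.get("request_body") and "request_body" in clean:
        clean["request_body"] = REDACTED

    # Mask only credential-bearing headers; leave the rest readable for debugging.
    if settings.get("auth_headers") and "headers" in clean:
        clean["headers"] = {
            name: REDACTED if name.lower() in AUTH_HEADER_NAMES else value
            for name, value in clean["headers"].items()
        }

    return clean
```

Calling redact_event on the payload as the first step of the ingest handler means the storage layer only ever sees the scrubbed copy. Turning a setting off simply leaves that field untouched.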

Retention, per signal

The other half is retention — old events shouldn't live forever. I rewrote the lookout:prune command to be driven by the central signal registry (IngestSignalTypes), so each signal type prunes on its own configured retention window using its own time column. Errors might keep 90 days; high-volume traces, far less. One scheduled command, per-signal policy.

Driving prune off the same registry that powers quota and config means there's one source of truth for "what signals exist and where their timestamps live." Add a new watcher and pruning just works — no separate cleanup code to forget.
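A rough sketch of registry-driven pruning, again in Python rather than the project's own code. The SignalType fields, table names, column names, and the 7-day trace window are assumptions made for illustration (only the 90-day error window echoes the example above), and SQLite stands in for whatever database the real command talks to.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SignalType:
    table: str           # where this signal's events live
    time_column: str     # which column holds the event timestamp
    retention_days: int  # how long to keep rows


# Hypothetical stand-in for the central registry (IngestSignalTypes in the post).
SIGNAL_REGISTRY = {
    "errors": SignalType("error_events", "occurred_at", 90),
    "traces": SignalType("trace_spans", "started_at", 7),
}


def prune(conn: sqlite3.Connection, now: datetime,
          registry: dict[str, SignalType] = SIGNAL_REGISTRY) -> dict[str, int]:
    """Delete rows older than each signal's own retention window.

    Assumes timestamps are stored as ISO-8601 text in one timezone with
    second precision, so string comparison matches chronological order.
    """
    deleted = {}
    for name, signal in registry.items():
        cutoff = (now - timedelta(days=signal.retention_days)).isoformat(timespec="seconds")
        # Table and column names come from the trusted registry, never from user
        # input; only the cutoff value is passed as a bound parameter.
        cursor = conn.execute(
            f"DELETE FROM {signal.table} WHERE {signal.time_column} < ?",
            (cutoff,),
        )
        deleted[name] = cursor.rowcount
    conn.commit()
    return deleted
```

Adding a new signal in this sketch means adding one registry entry; the loop picks it up without any new cleanup code, which is the property the registry-driven design is after.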

Why it's not optional

An observability tool asks you to point it at your most sensitive surface: every request, every auth attempt, every error with its context. That's a lot of trust. Redaction and retention are how you earn it — they're the difference between a tool security will approve and one they'll ban. Privacy isn't a feature here; it's a license to operate.

Last piece of this run: anomaly detection.

build-in-public privacy pii retention