monitoring: Observables with Warnings should require a Critical alert to be defined as well

Created by: tyates-indeed

Feature request description

Grafana contains many alerts across the various components of Sourcegraph. These alerts come in two levels: warning and critical. Many alerts define only a warning level and no critical level, which makes it hard for teams that manage Sourcegraph to understand the state of their instance. Every alert with a warning level should also have a corresponding critical level.

Is your feature request related to a problem? If so, please describe.

Because of noise in alerting, we only sent Slack notifications when a critical alert fired. One alert, "gitserver: 25+ repository clone queue size", has only a warning level and no critical level. As a result, the queue grew unbounded and we were never notified of the problem on Slack.

Describe alternatives you've considered.

For every alert in Grafana, ensure that both a warning and critical level are present.
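
To make the request concrete, here is a minimal sketch of the kind of check that could enforce it. It is not taken from Sourcegraph's monitoring code: the `Observable` class, its `warning` and `critical` fields, the `missing_critical` helper, and the observable names are all hypothetical, and Python is used only for illustration. The threshold of 25 comes from the gitserver alert mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observable:
    """Hypothetical stand-in for one monitored metric and its alert thresholds."""
    name: str
    warning: Optional[float] = None   # warning-level threshold, if one is defined
    critical: Optional[float] = None  # critical-level threshold, if one is defined


def missing_critical(observables):
    """Return names of observables that define a warning but no critical alert."""
    return [
        o.name
        for o in observables
        if o.warning is not None and o.critical is None
    ]


# Hypothetical definitions; only the first mirrors the alert from this issue.
observables = [
    Observable("repository_clone_queue_size", warning=25),
    Observable("example_fully_covered", warning=10, critical=50),
    Observable("example_no_alerts"),
]

offenders = missing_critical(observables)
print(offenders)  # ['repository_clone_queue_size']
```

A check like this could run when the dashboards are generated and fail if the returned list is not empty, so a warning-only alert is caught before it ships.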

Additional context

None