monitoring: fix alert state query, fix alert firing threshold

Warren Gifford requested to merge monitoring/alerts into master

Created by: bobheadxi

Follows up https://github.com/sourcegraph/sourcegraph/pull/12395 and https://github.com/sourcegraph/sourcegraph/pull/12483, both of which contained mistakes:

  • the label for alert state is alertstate, not state
  • alertQuery should only fire when the value is >= 1; see this query:
image

The above is critical_frontend_hard_error_search_responses, and:

  • the green line shows when the actual value is above the critical threshold
  • the blue line shows the alerts that the current query triggers incorrectly (excluding times when it should actually be triggered)
  • the orange line shows the new query, which correctly aligns with the value actually being above the critical threshold
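
As a rough illustration of the two fixes described above, here is a minimal Go sketch. It is not the code from this change: firingAlertsQuery and shouldFire are hypothetical helpers, and using critical_frontend_hard_error_search_responses as an alertname label value is an assumption. The only thing taken as given is that Prometheus's built-in ALERTS series carries the state in the alertstate label.

```go
package main

import "fmt"

// firingAlertsQuery is a hypothetical helper that builds a selector for the
// firing instances of a named alert. Prometheus's built-in ALERTS series
// exposes the state under the `alertstate` label, not `state`.
func firingAlertsQuery(alertName string) string {
	return fmt.Sprintf(`ALERTS{alertname=%q,alertstate="firing"}`, alertName)
}

// shouldFire is a hypothetical stand-in for the corrected threshold:
// the alert query result counts as triggered only when it is >= 1.
func shouldFire(value float64) bool {
	return value >= 1
}

func main() {
	// Assumption: the alert name is used as the alertname label value.
	fmt.Println(firingAlertsQuery("critical_frontend_hard_error_search_responses"))
	fmt.Println(shouldFire(0.5), shouldFire(1))
}
```

With these helpers, a value of 0.5 does not count as firing while a value of exactly 1 does, which is the behaviour the orange line is meant to show.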

Some more examples for sanity checks:
