monitoring: fix alert state query, fix alert firing threshold
Created by: bobheadxi
Follows up https://github.com/sourcegraph/sourcegraph/pull/12395, https://github.com/sourcegraph/sourcegraph/pull/12483 both of which contained mistakes:
- the field for alert state is
alertstate
, notstate
-
alertQuery
should only fire on>= 1
- see this query:
The above is critical_frontend_hard_error_search_responses
, and:
- green line indicates actual value above critical threshold
- blue line indicates current, incorrectly triggered alerts (excluding times when it should actually be triggered)
- orange line indicates new query that correctly aligns with the value actually being above the critical threshold
Some more examples for sanity checks: