frontend: hard error alerts as ratio/percentage instead of absolute value
Created by: bobheadxi
Writeup by @slimsag in https://github.com/sourcegraph/sourcegraph/issues/12011 re: frontend: hard_error_search_responses
It's not clear to me whether this one is a fluke, because sourcegraph.com has some quite bad search perf issues we haven't been following up on. However, a more direct query shows that the percentage of requests resulting in a hard error is low, so I presume we do not care and would like to relax the alert to use a percentage instead of an absolute number of failing requests.
shows that the cause is about 300 browser code intel requests per day (find-references, jump-to-def, etc.) end in a hard error, where the user gets an error back instead of a result. If I had to guess, these are something like a timeout edge case we aren't recording properly (e.g. the user navigates away from the page while it is in flight), or correlated with deployments of services.
While that is worrying, the same query above shows that it accounts for only around 0.1% of requests, so I think the TODO here is two-fold:
- File an issue to track turning this into a ratio/percentage instead of absolute-value alert.
- Silence the alert for now on Sourcegraph.com
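
To illustrate the difference between the two alert styles, here is a minimal sketch of a ratio-based check. The function names and the 5% threshold are hypothetical and chosen only for illustration; they are not taken from the Sourcegraph monitoring definitions. The request volume is derived from the figures above (about 300 hard errors per day at around 0.1% implies roughly 300,000 requests per day).

```python
def hard_error_ratio(errors: int, total: int) -> float:
    """Fraction of requests that ended in a hard error (0.0 when there is no traffic)."""
    if total <= 0:
        return 0.0
    return errors / total


def should_alert(errors: int, total: int, max_ratio: float = 0.05) -> bool:
    """Fire only when the error ratio exceeds max_ratio (hypothetical 5% threshold)."""
    return hard_error_ratio(errors, total) > max_ratio


# ~300 hard errors out of ~300,000 requests: ratio is about 0.001, so no alert.
print(should_alert(300, 300_000))

# The same 300 errors out of only 1,000 requests: ratio is 0.3, so the alert fires.
print(should_alert(300, 1_000))
```

An absolute-value alert would treat both cases identically, since the error count is the same; the ratio form only fires when errors make up a meaningful share of traffic.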