insights: retry queries that encounter `shard-timeout` events
Created by: leonore
closes #36226 (closed)
also adds logging for skipped reasons in the JIT path as we didn't have it there
Test plan
Artificially added a shard timeout event in search's progress handler (couldn’t figure out how to hit it locally).
skipped := []Skipped{
{
Reason: ShardTimeout,
Title: "shard-timeout-leo",
Message: "i'm fake!",
},
}
And then created a backend insight, saw the error in the insights_query_runner_job
queue, which eventually succeeded after retrying once the artificial event was removed and the retry period had passed.
Once this is landed there will be a post-merge test on k8s.sgdev to see if backfill works better