Search Performance Data
Created by: poojaj-tech
Measure additional data for search queries for internal search performance dashboards to quantify indicators of poor search performance. This helps us monitor baseline performance for customers and dig into specifics for slow searches.

We should analyze this on Sourcegraph Server first because we have latencies for every query.

I've bolded the data points that have surfaced as most important to measure so far, based on customer insights and conversations.
For common search types (all/aggregated view, literal, regexp, symbol, file, repo):
- Default search experience is slow
  - # of searches resulting in errors
- Comprehensive (expensive) search query is run
  - # of system-thrown and user-specified timeouts
  - # of system-thrown and user-specified high counts (count:9999999 #14316)
- Known sources of relatively slower search queries
  - API
  - Unindexed
  - Saved Searches (automated)
  - src-cli
- Where time is spent in the system
  - # of repos we attempted to search
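As a rough illustration, the indicators above could be derived per search event roughly like this. This is a hedged sketch only: the field names (`query`, `source`, `indexed`, `error`, `timed_out`), the set of slower sources, and the high-count threshold are assumptions for illustration, not actual Sourcegraph event-log fields.

```python
import re

# Assumed threshold for flagging a user-specified high count: value.
HIGH_COUNT = 10_000

def slow_search_indicators(event: dict) -> set:
    """Tag one search-event record with the poor-performance indicators above.

    All field names here are hypothetical placeholders for whatever the
    real event logs expose.
    """
    indicators = set()
    if event.get("error"):
        indicators.add("error")
    if event.get("timed_out"):
        indicators.add("timeout")
    # Detect user-specified high counts like count:9999999 in the query text.
    m = re.search(r"\bcount:(\d+)", event.get("query", ""))
    if m and int(m.group(1)) >= HIGH_COUNT:
        indicators.add("high_count")
    # Known relatively slower sources of search queries.
    if event.get("source") in {"API", "saved_search", "src-cli"}:
        indicators.add("slower_source:" + event["source"])
    if not event.get("indexed", True):
        indicators.add("unindexed")
    return indicators
```

Each event would then contribute to the per-dimension counts (errors, timeouts, high counts, and so on) rolled up on the dashboard.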
To understand the impact of this data, we will additionally need to connect it to search latency. One way to do this is to tie it back to P99/P90/P50 daily/weekly/monthly data, but it is more actionable to see this breakdown as the % of searches that respond in 0-1 sec, 1-10 sec, and 10+ sec.

Sample solution:
- customer name -> 1-10 seconds -> Literal -> Dimensions (e.g. # of errors) outlined above
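The 0-1 / 1-10 / 10+ second breakdown could be computed with something like the following minimal sketch, assuming we have per-query latencies in milliseconds (the input shape is an assumption; the real data would come from the event logs or traces):

```python
def latency_buckets(latencies_ms):
    """Bucket query latencies into 0-1 sec, 1-10 sec, 10+ sec,
    returning the percentage of searches in each bucket."""
    buckets = {"0-1 sec": 0, "1-10 sec": 0, "10+ sec": 0}
    for ms in latencies_ms:
        if ms < 1_000:
            buckets["0-1 sec"] += 1
        elif ms < 10_000:
            buckets["1-10 sec"] += 1
        else:
            buckets["10+ sec"] += 1
    total = len(latencies_ms) or 1  # avoid division by zero on empty input
    return {k: round(100 * v / total, 1) for k, v in buckets.items()}
```

Sliced per customer and per search type, this gives the actionable "% of slow searches" view rather than a single percentile number.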
Note on tools for exploration:
- Honeycomb - useful for running SQL-like queries to dive deeper
- Looker - better for building dashboards and analyzing customer data