Search Performance Data
Created by: poojaj-tech
Measure additional data for search queries for internal search performance dashboards to quantify indicators of poor search performance. This helps us monitor baseline performance for customers and dig into specifics for slow searches.

We should analyze this on Sourcegraph Server first because we have latencies for every query.

I've bolded the data points that have surfaced as most important to measure so far, based on customer insights and conversations.
For common search types (all/aggregated view, literal, regexp, symbol, file, repo):
- Default search experience is slow
  - # of searches resulting in errors
- Comprehensive (expensive) search query is run
  - # of system-thrown and user-specified timeouts
  - # of system-thrown and user-specified high counts (count:9999999 #14316)
- Known sources of relatively slower search queries
  - API
  - Unindexed
  - Saved Searches (automated)
  - src-cli
- Where time is spent in the system
  - # of repos we attempted to search
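As a rough illustration, the indicators above could be derived per search event roughly like this. This is a hedged sketch only: the field names (`query`, `source`, `indexed`, `error`, `timed_out`), the set of slower sources, and the high-count threshold are assumptions for illustration, not actual Sourcegraph event-log fields.

```python
import re

# Assumed threshold for flagging a user-specified high count: value.
HIGH_COUNT = 10_000

def slow_search_indicators(event: dict) -> set:
    """Tag one search-event record with the poor-performance indicators above.

    All field names here are hypothetical placeholders for whatever the
    real event logs expose.
    """
    indicators = set()
    if event.get("error"):
        indicators.add("error")
    if event.get("timed_out"):
        indicators.add("timeout")
    # Detect user-specified high counts like count:9999999 in the query text.
    m = re.search(r"\bcount:(\d+)", event.get("query", ""))
    if m and int(m.group(1)) >= HIGH_COUNT:
        indicators.add("high_count")
    # Known relatively slower sources of search queries.
    if event.get("source") in {"API", "saved_search", "src-cli"}:
        indicators.add("slower_source:" + event["source"])
    if not event.get("indexed", True):
        indicators.add("unindexed")
    return indicators
```

Each event would then contribute to the per-dimension counts (errors, timeouts, high counts, and so on) rolled up on the dashboard.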
To understand the impact of this data, we will additionally need to connect it to search latency. One way to do this is to tie it back to P99/P90/P50 daily/weekly/monthly data, but it is more actionable to see this breakdown as the % of searches that respond in 0-1 sec, 1-10 sec, and 10+ sec.

Sample solution:
- customer name -> 1-10 seconds -> Literal -> Dimensions (e.g. # of errors) outlined above
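The 0-1 / 1-10 / 10+ second breakdown could be computed with something like the following minimal sketch, assuming we have per-query latencies in milliseconds (the input shape is an assumption; the real data would come from the event logs or traces):

```python
def latency_buckets(latencies_ms):
    """Bucket query latencies into 0-1 sec, 1-10 sec, 10+ sec,
    returning the percentage of searches in each bucket."""
    buckets = {"0-1 sec": 0, "1-10 sec": 0, "10+ sec": 0}
    for ms in latencies_ms:
        if ms < 1_000:
            buckets["0-1 sec"] += 1
        elif ms < 10_000:
            buckets["1-10 sec"] += 1
        else:
            buckets["10+ sec"] += 1
    total = len(latencies_ms) or 1  # avoid division by zero on empty input
    return {k: round(100 * v / total, 1) for k, v in buckets.items()}
```

Sliced per customer and per search type, this gives the actionable "% of slow searches" view rather than a single percentile number.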
Note on tools for exploration:
- Honeycomb - useful for running SQL-like queries to dive deeper
- Looker - better for building dashboards and analyzing customer data