[WIP] search: further reduce redundant conversions and iteration over large repo lists
Created by: tsenart
This PR further reduces redundant conversions and iterations over large repo lists on the critical search path. This is done by:
- No longer converting `[]types.RepoName` lists to `[]*search.RepositoryRevisions`. Instead, we pass around a larger shared `*search.Repos` that contains revisions in a `RepoRevs map[api.RepoName]search.RevSpecs` field.
- We also no longer add `HEAD` revisions to `RevRepos` unless explicitly specified by the user. Search backends must check if revs for a repo are empty and, if so, use `search.DefaultRevSpecs`. This should shave off 1s of latency on dot-com today.
- Avoiding copying cached repos in `ListSearchable` and `ListIndexable`. This is done by:
  - Never mutating these lists, and instead populating an `Excluded *types.RepoSet` field in `*search.Repos` when we need to exclude repos from the final result set. This field is checked by `search.Repos.ForEach`, which is the canonical way to iterate over this type.
  - Separating private and public repos returned by these methods, so that the dynamic private repos, which are dependent on the signed-in user, don't have to be merged with public repos, which don't change per-request.
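For reviewers who want a mental model before reading the diff, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: `RepoName`, `RevSpecs`, the `Public`/`Private` fields, `revsOrDefault`, and `collect` are simplified stand-ins invented for illustration, `Excluded` is modelled as a plain map rather than `*types.RepoSet`, and the default revision is assumed to be `HEAD`.

```go
package main

import "fmt"

// RepoName and RevSpecs are simplified stand-ins (assumptions) for
// api.RepoName and search.RevSpecs.
type RepoName string
type RevSpecs []string

// DefaultRevSpecs is what a backend falls back to when the user did not
// ask for specific revisions. Assumed here to be just HEAD.
var DefaultRevSpecs = RevSpecs{"HEAD"}

// Repos is a hypothetical, trimmed-down version of *search.Repos.
type Repos struct {
	Public   []RepoName            // shared cached list, never mutated
	Private  []RepoName            // depends on the signed-in user
	RepoRevs map[RepoName]RevSpecs // only explicitly requested revisions
	Excluded map[RepoName]struct{} // stand-in for *types.RepoSet
}

// ForEach calls fn for every repo that is not excluded, public repos
// first. It stops at the first error returned by fn and returns it.
func (r *Repos) ForEach(fn func(name RepoName, revs RevSpecs) error) error {
	for _, list := range [][]RepoName{r.Public, r.Private} {
		for _, name := range list {
			if _, skip := r.Excluded[name]; skip {
				continue
			}
			if err := fn(name, r.RepoRevs[name]); err != nil {
				return err
			}
		}
	}
	return nil
}

// revsOrDefault is the check each search backend is expected to make.
func revsOrDefault(revs RevSpecs) RevSpecs {
	if len(revs) == 0 {
		return DefaultRevSpecs
	}
	return revs
}

// collect flattens the result set into "repo@rev" strings.
func collect(r *Repos) []string {
	var out []string
	_ = r.ForEach(func(name RepoName, revs RevSpecs) error {
		for _, rev := range revsOrDefault(revs) {
			out = append(out, string(name)+"@"+rev)
		}
		return nil
	})
	return out
}

func main() {
	repos := &Repos{
		Public:   []RepoName{"github.com/a/a", "github.com/b/b", "github.com/c/c"},
		Private:  []RepoName{"github.com/me/secret"},
		RepoRevs: map[RepoName]RevSpecs{"github.com/b/b": {"v1.0", "dev"}},
		Excluded: map[RepoName]struct{}{"github.com/c/c": {}},
	}
	for _, s := range collect(repos) {
		fmt.Println(s)
	}
}
```

The point of the sketch is that excluding a repo only touches the small per-request `Excluded` set, so the cached list can be shared across requests without copying, and repos without explicit revisions cost nothing in `RepoRevs`.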
Additionally, this PR:
- Moves ref glob expansion to repo resolution. Since we were already calling `git.ResolveRevision` for each non-`HEAD` RevSpec, is it that bad to call out to gitserver for globs at this stage too? It really simplifies the code.
- Introduces `StreamingListRepoNames` and `StreamingListIndexableRepos`, which take a callback function that gets called as soon as a record has been read from the DB. This keeps the API of the non-streaming versions of these methods intact and paves the way to making repo resolution a streaming operation instead of the batch operation it is today.
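To illustrate the callback shape described above, here is a rough sketch of how a streaming lister and its batch wrapper can relate. The signatures are assumptions and not the ones in this PR: `rows` is a hypothetical in-memory stand-in for a DB cursor, and `RepoName`, `ListRepoNames`, and `errStop` are simplified for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// RepoName is a simplified stand-in (assumption) for types.RepoName.
type RepoName struct {
	ID   int
	Name string
}

// rows is a hypothetical in-memory replacement for a database cursor.
type rows struct {
	data []RepoName
	pos  int
}

// Next advances the cursor and reports whether a record is available.
func (r *rows) Next() bool { r.pos++; return r.pos <= len(r.data) }

// Scan returns the record at the current cursor position.
func (r *rows) Scan() RepoName { return r.data[r.pos-1] }

// errStop lets a callback end the stream early.
var errStop = errors.New("stop streaming")

// StreamingListRepoNames invokes cb as soon as each record has been read,
// without buffering the full result set. A non-nil error from cb stops
// the iteration and is returned to the caller.
func StreamingListRepoNames(rs *rows, cb func(*RepoName) error) error {
	for rs.Next() {
		repo := rs.Scan()
		if err := cb(&repo); err != nil {
			return err
		}
	}
	return nil
}

// ListRepoNames keeps the batch API by collecting the streamed records.
func ListRepoNames(rs *rows) ([]RepoName, error) {
	var out []RepoName
	err := StreamingListRepoNames(rs, func(r *RepoName) error {
		out = append(out, *r)
		return nil
	})
	return out, err
}

func main() {
	db := &rows{data: []RepoName{{1, "github.com/a/a"}, {2, "github.com/b/b"}}}
	names, err := ListRepoNames(db)
	if err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Println(n.ID, n.Name)
	}
}
```

Building the batch version on top of the streaming one means existing callers stay unchanged, while new callers can start work on the first record instead of waiting for the whole list.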
I recognize this is a large change. I'm happy to either pair review live or split it up into smaller independent PRs.