[WIP] search: further reduce redundant conversions and iteration over large repo lists

Warren Gifford requested to merge optimize-associate-validate-revs into main

Created by: tsenart

This PR further reduces redundant conversions and iterations over large repo lists on the critical search path. This is done by:

  • No longer converting []types.RepoName lists to []*search.RepositoryRevisions. Instead, we pass around a larger shared *search.Repos that contains revisions in a RepoRevs map[api.RepoName]search.RevSpecs field.

  • We also no longer add HEAD revisions to RepoRevs unless the user explicitly specifies them. Search backends must check whether the revs for a repo are empty and, if so, use search.DefaultRevSpecs. This should shave off 1s of latency on dot-com today.

  • Avoiding copying cached repos in ListSearchable and ListIndexable. This is done by:

    • Never mutating these lists, and instead populating an Excluded *types.RepoSet field in *search.Repos when we need to exclude repos from the final result set. This field is checked by search.Repos.ForEach which is the canonical way to iterate over this type.
    • Separating private and public repos returned by these methods, so that the dynamic private repos which are dependent on the signed-in user don't have to be merged with public repos which don't change per-request.
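
To make the shape of this concrete, here is a minimal sketch of the idea, not the code in this PR. RepoName, RevSpecs, and Repos below are simplified stand-ins for api.RepoName, search.RevSpecs, and *search.Repos; the Public and Private field names, the map-based Excluded set, and the revsOrDefault and searchTargets helpers are assumptions made for illustration.

```go
package main

// RepoName and RevSpecs are simplified stand-ins for api.RepoName and
// search.RevSpecs. The real types carry more information.
type RepoName string

type RevSpecs []string

// DefaultRevSpecs is what a backend falls back to when no revs were requested.
var DefaultRevSpecs = RevSpecs{"HEAD"}

// Repos is a simplified stand-in for search.Repos. The Public and Private
// field names are assumptions for this sketch.
type Repos struct {
	Public   []RepoName            // shared cached list, never mutated
	Private  []RepoName            // depends on the signed-in user
	RepoRevs map[RepoName]RevSpecs // only revisions the user asked for
	Excluded map[RepoName]struct{} // stand-in for *types.RepoSet
}

// ForEach visits every repo that is not excluded, without copying or
// mutating the underlying lists. Revs are passed through as-is, so they
// may be empty.
func (r *Repos) ForEach(fn func(RepoName, RevSpecs) error) error {
	for _, list := range [][]RepoName{r.Public, r.Private} {
		for _, name := range list {
			if _, skip := r.Excluded[name]; skip {
				continue
			}
			if err := fn(name, r.RepoRevs[name]); err != nil {
				return err
			}
		}
	}
	return nil
}

// revsOrDefault is the check each search backend is expected to perform.
func revsOrDefault(revs RevSpecs) RevSpecs {
	if len(revs) == 0 {
		return DefaultRevSpecs
	}
	return revs
}

// searchTargets is a hypothetical backend that lists "repo@rev" pairs.
func searchTargets(r *Repos) []string {
	var out []string
	_ = r.ForEach(func(name RepoName, revs RevSpecs) error {
		for _, rev := range revsOrDefault(revs) {
			out = append(out, string(name)+"@"+rev)
		}
		return nil
	})
	return out
}
```

Excluding a repo only adds an entry to Excluded, so the cached public list can be shared across requests without a defensive copy.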

Additionally, this PR:

  • Moves ref glob expansion to repo resolution. Since we were already calling git.ResolveRevision for each non-HEAD RevSpec, is it that bad to call out to gitserver for globs at this stage too? It really simplifies the code.
  • Introduces StreamingListRepoNames and StreamingListIndexableRepos, which take a callback that is invoked as soon as each record has been read from the DB. This keeps the API of the non-streaming versions of these methods intact and paves the way for making repo resolution a streaming operation instead of the batch operation it is today.
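
As a rough illustration of the streaming pattern, the sketch below shows a callback-based lister with the batch version layered on top. It is not the signature used in this PR: repoRecord, rowSource, sliceRows, errStopIteration, and the lower-case function names are all hypothetical, and whether the real callback returns an error is an assumption.

```go
package main

import "errors"

// repoRecord is a hypothetical stand-in for types.RepoName.
type repoRecord struct {
	ID   int
	Name string
}

// rowSource is a hypothetical abstraction over a DB cursor.
type rowSource interface {
	Next() bool
	Scan() (repoRecord, error)
	Err() error
}

// errStopIteration is an assumed sentinel a callback can return to stop early.
var errStopIteration = errors.New("stop iteration")

// streamingListRepoNames calls cb as soon as each record has been read,
// instead of buffering the full result set first.
func streamingListRepoNames(rows rowSource, cb func(repoRecord) error) error {
	for rows.Next() {
		rec, err := rows.Scan()
		if err != nil {
			return err
		}
		if err := cb(rec); err != nil {
			return err
		}
	}
	return rows.Err()
}

// listRepoNames keeps the batch API by collecting the streamed records.
func listRepoNames(rows rowSource) ([]repoRecord, error) {
	var out []repoRecord
	err := streamingListRepoNames(rows, func(rec repoRecord) error {
		out = append(out, rec)
		return nil
	})
	return out, err
}

// sliceRows is an in-memory rowSource used only to exercise the sketch.
type sliceRows struct {
	recs []repoRecord
	pos  int
}

func (s *sliceRows) Next() bool {
	if s.pos >= len(s.recs) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceRows) Scan() (repoRecord, error) { return s.recs[s.pos-1], nil }

func (s *sliceRows) Err() error { return nil }
```

Because the batch function is a thin wrapper over the streaming one, existing callers do not need to change, while new callers can start work on the first record before the query finishes.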

I recognize this is a large change. I'm happy to either pair review live or split it up into smaller independent PRs.
