perf: Load minimal repos from DB & hydrate on demand in GraphQL layer (!4749) · Merge requests · Warren Gifford / sourcegraph

Warren Gifford requested to merge core/repo-hydration into master Jul 01, 2019

Created by: mrnugget

This is currently still a draft, but it works: instead of a complete *types.Repo we only load a *MinimalRepo from the database. When a GraphQL user requests the Language, Description or URI for a Repository we load the rest. Per default, we never request those fields anyway.

The current approach has the big drawback of resulting in N+1 queries: when you get 50 repositories in your results and ask for the Description of all of them, we issue 50 more queries to the database.

The advantage of this dumb approach is that it's easier to build than a solution that hydrates all repositories contained in a result set at once (lazily or eagerly).

TODOs and ideas/suggestions welcome!:

Possibly find better names for *MinimalRepo, RepoIdentifier and its methods
Better error handling in (r *repositoryResolver) hydrate
Get rid of the duplication between minimalAuthzFilter and authzFilter (see comment here)

Benchmarks:

There is no noticeable difference, but I suspect that's due to the database not being involved

name             old time/op    new time/op    delta
SearchResults-8    4.69ms ± 5%    5.61ms ±34%    ~     (p=0.052 n=10+10)

name             old alloc/op   new alloc/op   delta
SearchResults-8    2.32MB ± 0%    2.32MB ± 0%  +0.10%  (p=0.001 n=9+10)

name             old allocs/op  new allocs/op  delta
SearchResults-8     20.7k ± 0%     20.6k ± 0%  -0.18%  (p=0.000 n=10+10)

Update after discussion

TODOs:

Add RepoIDs and *RepoFields structs to types.Repo
Get rid of db.MinimalRepo
Change "hydration" code to hydrate the types.Repo
Change types.Repo.RepoIDs.ExternalRepo to not be a pointer
Get rid of types.RepoIdentifier interface now that we have lightweight types.Repo

perf: Load minimal repos from DB & hydrate on demand in GraphQL layer

Update after discussion

Merge request reports