Skip to content

perf: Load minimal repos from DB & hydrate on demand in GraphQL layer

Warren Gifford requested to merge core/repo-hydration into master

Created by: mrnugget

This is currently still a draft, but it works: instead of a complete *types.Repo we only load a *MinimalRepo from the database. When a GraphQL user requests the Language, Description or URI for a Repository we load the rest. Per default, we never request those fields anyway.

The current approach has the big drawback of resulting in N+1 queries: when you get 50 repositories in your results and ask for the Description of all of them, we issue 50 more queries to the database.

The advantage of this dumb approach is that it's easier to build than a solution that hydrates all repositories contained in a result set at once (lazily or eagerly).

TODOs and ideas/suggestions welcome!:

  • Possibly find better names for *MinimalRepo, RepoIdentifier and its methods
  • Better error handling in (r *repositoryResolver) hydrate
  • Get rid of the duplication between minimalAuthzFilter and authzFilter (see comment here)

Benchmarks:

There is no noticeable difference, but I suspect that's due to the database not being involved

name             old time/op    new time/op    delta
SearchResults-8    4.69ms ± 5%    5.61ms ±34%    ~     (p=0.052 n=10+10)

name             old alloc/op   new alloc/op   delta
SearchResults-8    2.32MB ± 0%    2.32MB ± 0%  +0.10%  (p=0.001 n=9+10)

name             old allocs/op  new allocs/op  delta
SearchResults-8     20.7k ± 0%     20.6k ± 0%  -0.18%  (p=0.000 n=10+10)

Update after discussion

TODOs:

  • Add RepoIDs and *RepoFields structs to types.Repo
  • Get rid of db.MinimalRepo
  • Change "hydration" code to hydrate the types.Repo
  • Change types.Repo.RepoIDs.ExternalRepo to not be a pointer
  • Get rid of types.RepoIdentifier interface now that we have lightweight types.Repo

Merge request reports

Loading