perf: Load minimal repos from DB & hydrate on demand in GraphQL layer
Created by: mrnugget
This is currently still a draft, but it works: instead of a complete *types.Repo
we only load a *MinimalRepo
from the database. When a GraphQL user requests the Language
, Description
or URI
for a Repository
we load the rest. Per default, we never request those fields anyway.
The current approach has the big drawback of resulting in N+1 queries: when you get 50 repositories in your results and ask for the Description
of all of them, we issue 50 more queries to the database.
The advantage of this dumb approach is that it's easier to build than a solution that hydrates all repositories contained in a result set at once (lazily or eagerly).
TODOs and ideas/suggestions welcome!:
-
Possibly find better names for *MinimalRepo
,RepoIdentifier
and its methods -
Better error handling in (r *repositoryResolver) hydrate
-
Get rid of the duplication between minimalAuthzFilter
andauthzFilter
(see comment here)
Benchmarks:
There is no noticeable difference, but I suspect that's due to the database not being involved
name old time/op new time/op delta
SearchResults-8 4.69ms ± 5% 5.61ms ±34% ~ (p=0.052 n=10+10)
name old alloc/op new alloc/op delta
SearchResults-8 2.32MB ± 0% 2.32MB ± 0% +0.10% (p=0.001 n=9+10)
name old allocs/op new allocs/op delta
SearchResults-8 20.7k ± 0% 20.6k ± 0% -0.18% (p=0.000 n=10+10)
Update after discussion
TODOs:
-
Add RepoIDs
and*RepoFields
structs totypes.Repo
-
Get rid of db.MinimalRepo
-
Change "hydration" code to hydrate the types.Repo
-
Change types.Repo.RepoIDs.ExternalRepo
to not be a pointer -
Get rid of types.RepoIdentifier
interface now that we have lightweighttypes.Repo