Search results load quickly, but the code does not render quickly
Created by: slimsag
Issue
Sometimes search results themselves load very quickly, but the code in the search results UX does not load quickly. When using our Bitbucket Server integration, results can sometimes take 10+ seconds to load.
Root cause TL;DR
Effectively, the bottleneck is that for every search result viewed in our UX, we trigger two repo-updater /repo-lookup requests. Each of these /repo-lookup requests in turn makes two code host API requests.
- The root cause of this bug affects ALL code hosts. For every search result viewed in our UX, we consume 4 code host API requests.
- In the case of Bitbucket Server, we impose strict self-enforced API rate limiting (7200 req/hr). When we run out of our API quota (which happens rather quickly), the code in search results begins to load very slowly as we wait for more API quota to serve the request.
- In the case of other code hosts, the effects seem to be less pronounced (or else we would've caught this sooner). This is likely to be the largest cause of us running out of API rate limit quota on any code host.
Root cause detailed explanation
Every time we render the code for a search result, we perform a GraphQL request.
`/.api/graphql?HighlightedFile`
query HighlightedFile(
  $repoName: String!
  $commitID: String!
  $filePath: String!
  $disableTimeout: Boolean!
  $isLightTheme: Boolean!
) {
  repository(name: $repoName) {
    commit(rev: $commitID) {
      file(path: $filePath) {
        isDirectory
        richHTML
        highlight(disableTimeout: $disableTimeout, isLightTheme: $isLightTheme) {
          aborted
          html
        }
      }
    }
  }
}
On the backend, this request fans out into several complex codepaths. We can break them down into four main groups:
(1) repository(name: $repoName)
- Invokes backend.Repos.GetByName
  - Invokes db.Repos.GetByName
    - Invokes SQL, Repository ACL validation (all relatively cheap)
(2) commit(rev: $commitID)
- Invokes backend.Repos.ResolveRev
  - Invokes GitRepo
    - Does not hit the quickGitserverRepo codepath, because that only works on github.com and gitlab.com specifically (i.e. for sourcegraph.com).
    - Invokes repoupdater.DefaultClient.RepoLookup
  - Invokes git.ResolveRevision
- Invokes backend.Repos.GetCommit
  - Invokes GitRepo
    - (same as above section)
(3) file(path: $filePath)
- Invokes Blob
  - Invokes backend.CachedGitRepo
(4) highlight(...)
- Invokes backend.CachedGitRepo
What you can glean from the above is that:
- We sometimes use backend.CachedGitRepo (twice per one of these GraphQL requests).
- Other times, we use backend.GitRepo (twice per one of these GraphQL requests).
- When we use a CachedGitRepo, no request is made to repo-updater.
- When we use a GitRepo, a /repo-lookup request is made to repo-updater.
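The tally above can be sketched as a toy Go model. This is not the real backend code: repoUpdater, gitRepo, cachedGitRepo and highlightedFile are hypothetical stand-ins that only reproduce the behavior described in this section (GitRepo issues one /repo-lookup, CachedGitRepo issues none).

```go
package main

import "fmt"

// repoUpdater is a hypothetical stand-in that only counts /repo-lookup requests.
type repoUpdater struct{ lookups int }

func (r *repoUpdater) repoLookup() { r.lookups++ }

// gitRepo models backend.GitRepo as described above:
// every call issues one /repo-lookup request.
func gitRepo(r *repoUpdater) { r.repoLookup() }

// cachedGitRepo models backend.CachedGitRepo as described above:
// no request is made to repo-updater.
func cachedGitRepo(r *repoUpdater) {}

// highlightedFile walks the four resolver groups of one HighlightedFile request.
func highlightedFile(r *repoUpdater) {
	// (1) repository: database and ACL checks only, no repo-updater traffic.
	// (2) commit: ResolveRev and GetCommit each go through GitRepo.
	gitRepo(r)
	gitRepo(r)
	// (3) file: Blob uses CachedGitRepo.
	cachedGitRepo(r)
	// (4) highlight: uses CachedGitRepo.
	cachedGitRepo(r)
}

// lookupsForOneResult counts the /repo-lookup requests for one rendered result.
func lookupsForOneResult() int {
	r := &repoUpdater{}
	highlightedFile(r)
	return r.lookups
}

func main() {
	fmt.Println(lookupsForOneResult()) // 2
}
```

In this model only the commit resolver contributes repo-updater traffic, which is where the two /repo-lookup requests per result come from.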
So now the question is: how expensive is a /repo-lookup request? What cost does it incur?
- The code which serves /repo-lookup requests attempts to locate an authoritative source for each repository, by calling out to each code host.
- In the case of Bitbucket Server, repos.GetBitbucketServerRepository is called.
  - (cheap) Invokes getBitbucketServerConnection.
  - COSTLY: Invokes conn.client.Repo
  - COSTLY: Invokes conn.client.Repo again.
All in all, we can conclude that for each search result viewed, we make two /repo-lookup requests, each of which consumes two code host API requests, for a total of four. This is true for Bitbucket Server, and also appears to be true for all other code hosts (but I have not verified this for each one).
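Combining the two /repo-lookup requests per rendered result with the two conn.client.Repo calls per lookup on Bitbucket Server, the per-result cost can be checked with a small counting fake. This is a sketch only: codeHostClient and repoLookup are hypothetical names, and the real handler does more than this.

```go
package main

import "fmt"

// codeHostClient is a hypothetical stand-in for the Bitbucket Server API
// client. It only counts how many API requests were made.
type codeHostClient struct{ apiCalls int }

// Repo models one (costly) code host API request.
func (c *codeHostClient) Repo(name string) string {
	c.apiCalls++
	return name
}

// repoLookup models serving one /repo-lookup request for Bitbucket Server:
// the client's Repo method is invoked twice for the same repository.
func repoLookup(c *codeHostClient, name string) string {
	c.Repo(name)        // first costly call
	return c.Repo(name) // second costly call
}

// apiCallsForOneResult counts the code host API requests consumed when one
// search result is rendered, i.e. when two /repo-lookup requests are made.
func apiCallsForOneResult() int {
	c := &codeHostClient{}
	for i := 0; i < 2; i++ {
		repoLookup(c, "PROJ/repo") // placeholder repository name
	}
	return c.apiCalls
}

func main() {
	fmt.Println(apiCallsForOneResult()) // 4
}
```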
SOLUTION: TBD.