frontend: Increase batch size from 500 to 1250
Created by: mrnugget
This is a follow-up to #4279 and adjusts the batch size to a value backed by benchmarks.
## Local benchmark setup
- Disabled cloning in `gitserver`
- Added 4230 repos
- Started the instance with `NO_KEYCLOAK=1 ./enterprise/dev/start.sh`
## Test script
Since none of the repos is cloned, we ask for the first 20 cloned repositories, because to answer that query we need to traverse all 4230 repositories in search of the first 20.
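To make the worst case concrete, here is a toy model of the traversal (a sketch for illustration only, not the server's actual code): since nothing is cloned, every batch has to be fetched and filtered without ever producing a match.

```bash
# Toy model: 4230 uncloned repos, scanned in batches for "cloned" ones.
# Nothing passes the filter, so every single batch is fetched and scanned.
total=4230
batch=1250 # the new batch size
wanted=20
found=0
roundtrips=0
for ((offset = 0; offset < total && found < wanted; offset += batch)); do
  roundtrips=$((roundtrips + 1))
  # In the real server each iteration is one roundtrip plus a filter pass;
  # here no repo is cloned, so found never increases.
done
echo "batches fetched: $roundtrips" # => 4 with batch=1250, 9 with batch=500
```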
```bash
$ cat get_cloned_repos.sh
#!/usr/bin/env bash

time curl 'http://localhost:3080/.api/graphql?Repositories' \
  -H "Authorization: token $SRC_TOKEN" \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -s \
  --data-binary '{
    "query": "query Repositories( $first: Int $query: String $cloned: Boolean $cloneInProgress: Boolean $notCloned: Boolean $indexed: Boolean $notIndexed: Boolean ) { repositories( first: $first query: $query cloned: $cloned cloneInProgress: $cloneInProgress notCloned: $notCloned indexed: $indexed notIndexed: $notIndexed ) { nodes { id name createdAt viewerCanAdminister url mirrorInfo { cloned cloneInProgress updatedAt } } totalCount(precise: true) pageInfo { hasNextPage } } }",
    "variables": {
      "cloned": true,
      "cloneInProgress": false,
      "notCloned": false,
      "indexed": true,
      "notIndexed": true,
      "first": 20,
      "query": ""
    }
  }' >/dev/null
```
For each batch size, I then ran the script 10 times in a loop:
```bash
for i in $(seq 1 10); do echo -e "---- Run ${i} ----\n"; ./get_cloned_repos.sh; done
```
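The averages below are presumably taken from the `time` output of those runs. A small helper like the following (hypothetical, not part of the PR; it assumes the JSON body from the script above has been saved to `query.json`) could compute the average directly from curl's built-in timing instead:

```bash
# Run the request 10 times and average curl's total request time.
for i in $(seq 1 10); do
  curl -s -o /dev/null -w '%{time_total}\n' \
    -H "Authorization: token $SRC_TOKEN" \
    -H 'Content-Type: application/json' \
    --data-binary @query.json \
    'http://localhost:3080/.api/graphql?Repositories'
done | awk '{ sum += $1 } END { printf "avg: %.3fs\n", sum / NR }'
```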
## Results
| Batch size | Avg. req. duration after 10 requests |
|-----------:|-------------------------------------:|
|        500 |                                 1.4s |
|        750 |                               0.984s |
|       1000 |                               0.848s |
|       1250 |                               0.663s |
|       1500 |                               0.686s |
|       1750 |                               0.702s |
With batch sizes greater than 1250, diminishing returns kick in (and memory consumption increases), so I chose 1250.
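The shape of these numbers matches simple roundtrip arithmetic: with 4230 repos, the number of batches is ceil(4230 / batch size), so the savings shrink quickly as the batch grows (this back-of-the-envelope calculation is mine, not part of the original benchmarks):

```bash
# Roundtrips needed to traverse 4230 repos at each benchmarked batch size.
for b in 500 750 1000 1250 1500 1750; do
  printf '%4d -> %d roundtrips\n' "$b" $(( (4230 + b - 1) / b ))
done
# 500 -> 9, 750 -> 6, 1000 -> 5, 1250 -> 4, 1500 -> 3, 1750 -> 3
```

Going from 500 to 1250 cuts the traversal from 9 roundtrips to 4; larger batches save at most one more roundtrip while each batch holds more rows in memory, which lines up with the flat numbers for 1500 and 1750 above.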
## Caveats
- Of course, with a larger `repo` table the performance gains might continue past 1250, because more roundtrips are saved. But since we're already talking about ~600ms, which is quite tolerable, we can put off further tuning.
- Benchmarking this locally in a reliable way is quite hard, since the benchmark runs are quite short and, in my case, the filter operations seemingly caused macOS's `opendirectoryd` to go crazy in terms of CPU usage. In short: I can't guarantee that there was no interference while running these tests.
Test plan: `go test` & manually checking the repositories list in the browser.