Approach 2: add repo_statistics and gitserver_repos_statistics tables
Created by: mrnugget
This is the second approach to implementing these statistics tables, after we discovered that the original approach in https://github.com/sourcegraph/sourcegraph/pull/39660 led to contention around the repo_statistics table.
90% of the code in here is the same as in #39660. What changes is that we now have multiple rows in the repo_statistics table:
- Every time a repo row (and in certain cases: a gitserver_repo row) is updated/inserted/deleted, we append (!) a row to repo_statistics with a diff of the total counts before/after the row change. Example: if a repo is deleted, we append a row with total = -1 to the repo_statistics table.
- At query time we use SELECT SUM(total), SUM(cloned), SUM(deleted), ... to get the current total counts.
- A worker periodically (right now: every 30 minutes) compacts the table by (1) getting the current counts, (2) updating the first row's columns to reflect the total counts, and (3) deleting all other rows.
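The append, sum, and compact cycle described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: it uses an in-memory SQLite database as a stand-in for Postgres, keeps only the three columns named in the query above, and the helper names (open_db, append_diff, current_counts, compact) are hypothetical. How the diff rows get appended in the real implementation is not shown here.

```python
import sqlite3


def open_db():
    """Create an in-memory stand-in for the repo_statistics table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repo_statistics ("
        " total INTEGER NOT NULL DEFAULT 0,"
        " cloned INTEGER NOT NULL DEFAULT 0,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def append_diff(conn, total=0, cloned=0, deleted=0):
    """Append one diff row instead of updating a single shared counter row."""
    conn.execute(
        "INSERT INTO repo_statistics (total, cloned, deleted) VALUES (?, ?, ?)",
        (total, cloned, deleted),
    )


def current_counts(conn):
    """Sum all diff rows to get the current counts; an empty table yields zeros."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0), COALESCE(SUM(cloned), 0),"
        " COALESCE(SUM(deleted), 0) FROM repo_statistics"
    ).fetchone()
    return tuple(row)


def compact(conn):
    """Fold all rows into the first row, then delete the rest."""
    # The with-block commits the UPDATE and DELETE together, or rolls both back.
    with conn:
        total, cloned, deleted = current_counts(conn)
        first = conn.execute("SELECT MIN(rowid) FROM repo_statistics").fetchone()[0]
        if first is None:
            return  # nothing to compact
        conn.execute(
            "UPDATE repo_statistics SET total = ?, cloned = ?, deleted = ? WHERE rowid = ?",
            (total, cloned, deleted, first),
        )
        conn.execute("DELETE FROM repo_statistics WHERE rowid <> ?", (first,))


conn = open_db()
append_diff(conn, total=1)            # a repo was inserted
append_diff(conn, total=1, cloned=1)  # another repo was inserted and is cloned
append_diff(conn, total=-1)           # a repo was deleted
print(current_counts(conn))           # (1, 1, 0)
compact(conn)
print(current_counts(conn))           # still (1, 1, 0), now stored in a single row
```

Because writers only ever insert new rows, they do not compete for a lock on one counter row, which is the contention this approach is meant to avoid. Compaction leaves the sums unchanged, so readers see the same counts before and after it runs.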
Demo video
https://user-images.githubusercontent.com/1185253/185577375-d6d2d7da-f6a4-4aad-b940-3927a4d7dd6b.mp4
Test plan
- Existing and new unit tests
- Manual testing