repo-updater: Add feature-flagged `StreamingSync` method to `Syncer`
Created by: mrnugget
(See bottom for context on this PR)
This PR adds a `StreamingSync` method to the `Syncer` in repo-updater that inserts/updates repositories one-by-one, while they're being fetched from the code hosts.
As you can see, there's not that much code here: a helper method on the repos store called `DeleteReposExcept` and the rather (too?) long method called `StreamingSync`. I've annotated it with comments for my own understanding, but that might also give you context when reading through it.
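To make the overall shape easier to picture, here is a minimal sketch of the idea, not the code in this PR. `Repo`, `memStore`, `Upsert`, `streamingSync` and `names` are hypothetical stand-ins, and the signature given to `DeleteReposExcept` is an assumption. The sketch also assumes that stale repos are deleted only after all sourced repos have been written, which may differ from what the real method does.

```go
package main

import "sort"

// Repo is a hypothetical, stripped-down stand-in for a repository record.
type Repo struct {
	ID   int
	Name string
}

// memStore is a hypothetical in-memory stand-in for the repos store.
type memStore struct {
	repos map[int]Repo
}

// Upsert inserts or updates a single repository.
func (s *memStore) Upsert(r Repo) { s.repos[r.ID] = r }

// DeleteReposExcept removes every repository whose ID is not in keep.
// The signature is an assumption made for this sketch.
func (s *memStore) DeleteReposExcept(keep map[int]bool) {
	for id := range s.repos {
		if !keep[id] {
			delete(s.repos, id)
		}
	}
}

// streamingSync writes each repo as soon as it arrives on sourced,
// then deletes everything that was not seen during this sync.
func streamingSync(s *memStore, sourced <-chan Repo) {
	seen := make(map[int]bool)
	for r := range sourced {
		s.Upsert(r)
		seen[r.ID] = true
	}
	s.DeleteReposExcept(seen)
}

// names returns the stored repository names in sorted order.
func names(s *memStore) []string {
	out := make([]string, 0, len(s.repos))
	for _, r := range s.repos {
		out = append(out, r.Name)
	}
	sort.Strings(out)
	return out
}
```

In this sketch the store is already up to date for every repo that has been received, even while the code host is still being paged through; only the deletions wait until the end.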
I've also changed the existing Syncer tests to also run with this new `StreamingSync` method. The main difference is that the tests don't check the "complete diff" when we do a streaming sync. (I don't think that's required, but if we do want it for debugging purposes, I think that's also possible.)
See #5145 and previous PR #5366 for more information on the goals/ideas behind this.
Since I've now switched to working on the higher-priority A8N RFC20 and RFC28, I thought I could already gather feedback on this.
The PR is complete in the sense that the tests pass and it works; the "only" thing missing is extensive testing/analysis:
- Test this on k8s.sgdev.org
- Analyze the "intermediate phases" of a sync (i.e. do we delete and insert the same repos in a single sync? If so, how many useless writes do we do?)
- It's probably wise to port the naming conflict tests from the `TestNewDiff` and `TestSyncSubset` tests to the streaming-sync tests
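
For the "intermediate phases" item above, one possible way to quantify useless writes is to record every store write during a sync and count the repos that were both deleted and inserted. The sketch below assumes such a write log can be collected; `writeOp` and `countChurn` are hypothetical names and nothing like them exists in this PR.

```go
package main

// writeOp is a hypothetical record of one store write during a sync.
type writeOp struct {
	Kind string // "insert", "update" or "delete"
	Name string // repository name
}

// countChurn returns how many repository names were both deleted
// and inserted within the same write log.
func countChurn(log []writeOp) int {
	deleted := make(map[string]bool)
	inserted := make(map[string]bool)
	for _, op := range log {
		switch op.Kind {
		case "delete":
			deleted[op.Name] = true
		case "insert":
			inserted[op.Name] = true
		}
	}
	n := 0
	for name := range deleted {
		if inserted[name] {
			n++
		}
	}
	return n
}
```

A non-zero count on a real sync would suggest that some deletes and inserts cancel each other out and could be avoided.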