api: Add status-messages to repo-updater and GraphQL API
Created by: mrnugget
This is the first part of the implementation of #4120 as described in https://github.com/sourcegraph/sourcegraph/issues/4120#issuecomment-499911568. (The second part will include the web
component, UI, and changelog entry)
- Adds a
/clone-queue-status
endpoint togitserver
that returns the current and maximum number of repositories currently being cloned in parallel - Adds a
/status-messages
endpoint torepo-updater
that returns an array of status messages. Currently only one type exists,CURRENTLYCLONING
, and that is only returned if the number of currently cloning repositories, as returned bygitserver
, is non-zero. - Adds a
statusMessages
query to thegraphqlbackend
that allows querying for status messages for site admins
The second part of this implementation will be a second PR that includes the web
component of this feature.
Note: this is the minimal-viable implementation of this feature, as described in the ticket, and returns only the number of currently-being-cloned repositories.
The big caveat is that this number is not usable to indicate progress, it indicates activity. We have to evaluate/test/discuss whether that is "minimal-viable" for site-admins. Since the number could potentially be stuck at 5
for hours at a time if the Sourcegraph instance has been configured to only clone 5 concurrently (that's the default). But that would at least be an indicator that something happen, it's just not a progress-indicator and we have to be clear about this.
(cc @christinaforney)
Test plan: go test
Merge request reports
Activity
Created by: codecov[bot]
Codecov Report
Merging #4440 into master will increase coverage by
0.02%
. The diff coverage is76.47%
.@@ Coverage Diff @@ ## master #4440 +/- ## ========================================== + Coverage 47.09% 47.12% +0.02% ========================================== Files 714 715 +1 Lines 43168 43200 +32 Branches 1741 1741 ========================================== + Hits 20332 20357 +25 - Misses 20928 20934 +6 - Partials 1908 1909 +1
Impacted Files Coverage Δ cmd/repo-updater/repos/scheduler.go 54.23% <0%> (-0.73%)
cmd/frontend/graphqlbackend/status_messages.go 86.66% <86.66%> (ø)
cmd/repo-updater/repoupdater/server.go 62.45% <92.85%> (+1.6%)
web/src/components/shared.tsx 78.94% <0%> (ø)
shared/src/commandPalette/CommandList.tsx 33.33% <0%> (ø)
Created by: mrnugget
The "currently being cloned" number comes from
gitserver
'scloneLimiter
, a*mutablelimiter.Limiter
that limits the number of concurrent clones.Interestingly: we not only use (and thus increment/decrement the number of "currently being cloned" repos) when doing a "git clone"-clone, but also when we do a
repoUpdate
: https://sourcegraph.com/github.com/sourcegraph/sourcegraph@e3c1a4a0b66bea5252ef973c38f2c4f975aea062/-/blob/cmd/gitserver/server/server.go#L1149That gets also triggered when we do an
ensureRevision
or update a repos information (when syncing metadata for example). The result is that we say "currently cloning" far more often than one would suspect when one had "initial clone" in mind.I think we should either
- use a different limiter for
repoUpdate
(gitserver
has acloneableLimiter
that's used forls-remote
calls that could be used) to reduce number of occasions when "currently cloning" number is increased - change wording in message from "Currently cloning %d repositories" to "Currently updating %d repositories"
Option 1 makes the indicator happen far less and option 2 makes clear that it's a "git fetch/clone" indicator instead of just a clone indicator
Opinions?
- use a different limiter for
Created by: keegancsmith
We should probably separate the limiters. However, the limiter being shared is intentional, since cloning and fetching both use the same amount of resources on the code host. But, I don't mind ignoring that since the implication is not that bad given our steady state is no repos cloning and just fetching.
Created by: christinaforney
Options for clone vs. fetch
Option 1 makes the indicator happen far less and option 2 makes clear that it's a "git fetch/clone" indicator instead of just a clone indicator
I read this issue after the other one (#4441), and mentioned changing the wording in the other as well as concern around how often the active state is shown - really, I think we actually want to do both. Using separate limiters will improve the usability around how often the icon is "active", and changing the wording will properly indicate what is actually going on.
Number shown
I am a little worried about showing the number of currently cloning repos not changing - this feels misleading, especially when there could be thousands queued. Is there any way to identify the total number of repos that were queued to clone (shows progress)? I'm assuming not since you didn't do that - alternatively, what about how many remain in the queue? (e.g. "Currently cloning 5 repositories (1234 remaining)")
I am comfortable with getting this out with the current state (though if we can improve before the release, that's the ideal), and then talking with admins to see if this works for them (my hunch is this won't be sufficient but is a step in the right direction).
Created by: mrnugget
@christinaforney I talked this through with @tsenart and @keegancsmith again. Turns out there is an easy way to get to the "number of repositories enqueued for clone/update". In fact, I already switched to using this number in the latest commit: 57db965
Now, with this change, we now only have a single number in the message: the number of repositories that have been enqueued to be cloned/updated as soon as possible. In practical terms:
- If you add 500 repositories to your Sourcegraph instance, you will see this number starting at
~495
(since 5 are probably already cloning) and then decrease with time until it reaches zero - Since the scheduler/syncer run in the background and enqueue repositories for updates on a regular basis, you will see this number in the UI when we update repositories.
I'm afraid that separating the second from the first behavior to only show "enqueued for initial/first-time cloning" would be quite a bit more effort (since we would need to track this somewhere outside of the scheduler, in the database probably)
- If you add 500 repositories to your Sourcegraph instance, you will see this number starting at