Sync & store code host versions and send in ping data
Created by: mrnugget
This fixes https://github.com/sourcegraph/sourcegraph/issues/20033.
Approved RFC on metrics collection: RFC 417 Review: Add code host versions to pings
PR to update the BigQuery schema: https://github.com/sourcegraph/analytics/pull/211
What we have here works roughly as follows:
- Every 24 hours we iterate over the list of external services
- We group those external services by "unique code host identifier" (see comments in code) so that we never send more than one request to the same code host instance
- We send a request to each* unique code host instance to get its version (*currently only GitHub, GitLab, and Bitbucket Server: the versions of other code hosts are irrelevant to us in Batch Changes, and I don't think querying the versions of Gitolite, Phabricator, etc. makes a lot of sense right now)
- We store the versions in Redis
- Next to each version we store only the external service kind (GitHub, GitLab, etc.)
- Whenever ping data is sent to Sourcegraph.com, these versions are included as a JSON array that looks roughly like this:
[
  {
    "external_service_kind": "GITHUB",
    "version": "unknown"
  },
  {
    "external_service_kind": "GITHUB",
    "version": "2.22.6"
  },
  {
    "external_service_kind": "BITBUCKETSERVER",
    "version": "7.11.2"
  },
  {
    "external_service_kind": "GITLAB",
    "version": "12.7.2-ee"
  }
]
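
For reviewers who want the flow at a glance, here is a minimal, self-contained sketch of the steps above. It is not the code in this PR: `ExternalService`, `uniqueCodeHostID`, `collectVersions`, and the `fetch` callback are hypothetical names, the identifier (kind plus lowercased host) is an assumption about what "unique code host identifier" could mean, and recording `"unknown"` when a request fails is likewise an assumption. Redis storage and the 24-hour schedule are left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ExternalService is a hypothetical, simplified stand-in for a configured
// external service.
type ExternalService struct {
	Kind string // e.g. "GITHUB", "GITLAB", "BITBUCKETSERVER"
	URL  string // base URL of the code host
}

// Version mirrors the JSON shape shown above.
type Version struct {
	ExternalServiceKind string `json:"external_service_kind"`
	Version             string `json:"version"`
}

// errUnreachable is used by the example fetcher to simulate a failed request.
var errUnreachable = errors.New("code host unreachable")

// supportedKinds lists the code hosts whose versions are queried.
var supportedKinds = map[string]bool{
	"GITHUB": true, "GITLAB": true, "BITBUCKETSERVER": true,
}

// uniqueCodeHostID is an assumed identifier: kind plus lowercased host.
func uniqueCodeHostID(svc ExternalService) (string, error) {
	u, err := url.Parse(svc.URL)
	if err != nil {
		return "", err
	}
	return svc.Kind + ":" + strings.ToLower(u.Host), nil
}

// collectVersions calls fetch once per unique, supported code host and
// records "unknown" (an assumption) when fetch returns an error.
func collectVersions(svcs []ExternalService, fetch func(ExternalService) (string, error)) []Version {
	seen := map[string]bool{}
	versions := []Version{}
	for _, svc := range svcs {
		if !supportedKinds[svc.Kind] {
			continue // e.g. Gitolite, Phabricator
		}
		id, err := uniqueCodeHostID(svc)
		if err != nil || seen[id] {
			continue // unparsable URL, or this instance was already queried
		}
		seen[id] = true
		v, err := fetch(svc)
		if err != nil {
			v = "unknown"
		}
		versions = append(versions, Version{ExternalServiceKind: svc.Kind, Version: v})
	}
	return versions
}

func main() {
	svcs := []ExternalService{
		{Kind: "GITHUB", URL: "https://github.example.com/"},
		{Kind: "GITHUB", URL: "https://GitHub.example.com/api"}, // same instance
		{Kind: "GITLAB", URL: "https://gitlab.example.com"},
		{Kind: "PHABRICATOR", URL: "https://phab.example.com"}, // not queried
	}
	// Fake fetcher standing in for the real HTTP request to the code host.
	fetch := func(svc ExternalService) (string, error) {
		if svc.Kind == "GITLAB" {
			return "", errUnreachable
		}
		return "2.22.6", nil
	}
	out, err := json.MarshalIndent(collectVersions(svcs, fetch), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the fake fetcher above, the two GitHub entries collapse into a single request, the Phabricator entry is skipped, and the GitLab entry is reported with version `unknown`.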