fetch SRC_GIT_SERVERS from frontend if not configured
Created by: slimsag
Problem
Many of our services need to talk to the gitserver service: specifically the frontend, indexer, query-runner, repo-updater, searcher, and symbols services. More services will likely need to in the future.
When updating a Kubernetes cluster deployment to add more gitserver instances, you currently have to update the SRC_GIT_SERVERS environment variable of every single one of these services to point to the new gitserver instance. This is very tedious, complex, and error-prone.
For Kubernetes, we might be able to do what we already do for a few other services and use our pkg/endpoint package. That would theoretically let us use Kubernetes' built-in service discovery, so that e.g. SRC_GIT_SERVERS=k8s+http://gitserver would always be distributed across all running gitserver instances (but I have not confirmed this is possible, and I suspect there is a reason we do not do it today).
However, that would make our Kubernetes deployment nice while non-Kubernetes cluster deployments suffer. The problem is made worse because it is generally very easy to update every service with Kubernetes, but this may not be the case in a different container deployment system (like a pure-Docker one), where each service would need to be individually updated and restarted manually.
Proposed solution
This PR proposes a solution where any service that does not specify SRC_GIT_SERVERS automatically fetches its value from the frontend's internal HTTP API periodically (every 5s).
This means that:
- Services no longer need to be updated individually to add a new gitserver instance; only the frontend needs to be updated. All other services will pick up the frontend's config within a few seconds.
- We are tied to the frontend for service discovery; this means:
  a. If the frontend goes down, other services may not be able to talk to gitserver (unless they've already queried the list via frontend in the past successfully). This seems acceptable, given that the frontend going down would also mean Sourcegraph is down from a user POV.
  b. We do not need to implement complex service discovery for every deployment option (Kubernetes, Titus, Mesosphere, etc.) natively; we can keep it simple.
- We inherently retain backwards compatibility for services that do specify SRC_GIT_SERVERS, such as go-langserver, which is in an odd limbo state here (this same logic is what allows the frontend to talk to gitserver, and is used for tests, so it is not additional complexity).
This PR does not need to update the CHANGELOG because it will be in deploy-sourcegraph history instead.