searcher and symbols deployments should switch to statefulsets
Created by: ggilmore
Note: all of the arguments below also apply to the symbols service.
Right now, searcher is implemented in Kubernetes as a plain deployment. I believe that this is largely due to historical reasons (the implementation predates the existence of stateful sets). searcher also uses a local cache to store zips of repo@commit that it fetches from gitserver. However, searcher’s current implementation as a deployment doesn’t play nicely with this caching scheme:
- Because searcher is a deployment, we can’t cleanly assign a persistent disk to store each replica’s cache. This forces us to rely on the node’s ephemeral storage instead: https://github.com/sourcegraph/deploy-sourcegraph/blob/81cd19ef60d5d236d193ffeb2c28500fb58d703f/base/searcher/searcher.Deployment.yaml#L67 . This has two problems:
  - Kubernetes will forcefully evict the searcher pod once it uses up all of its ephemeral storage. See https://github.com/sourcegraph/customer/issues/740 for an example of this affecting a customer.
  - The cache is wiped out whenever the pod is restarted (either through rescheduling or through a normal deployment, which happens frequently on sourcegraph.com). This forces the cache to be repopulated.
- Because searcher is a deployment, each pod doesn’t get a stable network identifier. This identifier can change on every re-deployment. This means that a repository can bounce around between replicas whenever that happens, which also harms cache performance.
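For reference, the ephemeral-storage pattern described in the first bullet looks roughly like the fragment below. This is an illustrative sketch, not a copy of the linked manifest: the volume name, mount path, and storage limit are all assumptions.

```yaml
# Fragment of a Deployment's pod template spec (illustrative only).
spec:
  containers:
  - name: searcher
    resources:
      limits:
        ephemeral-storage: 10Gi   # hypothetical limit; exceeding it gets the pod evicted
    volumeMounts:
    - name: cache                 # hypothetical volume name
      mountPath: /mnt/cache       # hypothetical mount path
  volumes:
  - name: cache
    emptyDir: {}                  # node-local scratch space, deleted when the pod leaves the node
```

Because an emptyDir volume lives and dies with the pod, anything cached in it is lost each time the pod is replaced.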
Both of these issues can be fixed by switching searcher over to a StatefulSet. A StatefulSet allows us to easily assign a persistent volume to each replica for the cache and gives each replica a stable network identifier (searcher-0, ...) to get rid of the repo bouncing problem.
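A minimal sketch of what the proposed StatefulSet could look like, assuming a headless Service named searcher in front of it. The image, port, replica count, mount path, and volume size are placeholders chosen for illustration, not values taken from deploy-sourcegraph.

```yaml
# Sketch only: every concrete value below is an assumption.
apiVersion: v1
kind: Service
metadata:
  name: searcher
spec:
  clusterIP: None                 # headless Service: each pod gets a stable DNS name
  selector:
    app: searcher
  ports:
  - name: http
    port: 3181                    # placeholder port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: searcher
spec:
  serviceName: searcher           # must match the headless Service above
  replicas: 2                     # pods are named searcher-0, searcher-1
  selector:
    matchLabels:
      app: searcher
  template:
    metadata:
      labels:
        app: searcher
    spec:
      containers:
      - name: searcher
        image: example/searcher:latest   # placeholder image
        ports:
        - name: http
          containerPort: 3181
        volumeMounts:
        - name: cache
          mountPath: /mnt/cache          # placeholder cache directory
  volumeClaimTemplates:           # one PersistentVolumeClaim per replica
  - metadata:
      name: cache
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi                 # placeholder size
```

With this layout, each replica keeps its own claim (cache-searcher-0, cache-searcher-1, ...) across restarts and redeployments, and stays reachable under the same name (searcher-0.searcher, ...).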