zoekt-webserver: Optimize startup
Created by: keegancsmith
When zoekt-webserver starts up on a large instance, it can take several minutes to load all the shards. While it is loading the shards, zoekt-webserver is unavailable. Nothing is logged during this time either.
Context
@tsenart and I adjusted the resource allocation of zoekt-webserver on k8s.sgdev.org. It took several minutes until the zoekt-webserver pod was marked as "ready" and the port (6070) was listening. We had no indication of what was happening (no logs or debug port available yet). We exec'd into the pod and inspected CPU usage (low, with only a single core partially used) and vmstat (high memory usage, but expected). Unfortunately we did not inspect IO, so we are not sure whether startup is fully IO bound. I suspected the webserver was loading shards, which involves mmap plus parsing some contents of each shard into in-memory structures (the indexData struct). lsof failed to respond (likely due to the amount of data). We ended up confirming it was indeed mmap by monitoring the output of
grep /data/index /proc/9/maps | wc -l
and seeing that number increase. Note 9 is the PID of zoekt-webserver in this case.
Proposal
- Concurrently load shards.
- Make webserver available before all shards are loaded.
Note we intentionally made the webserver available only after all shards were loaded. However, for large instances I believe it is better to make a partial index available than no index at all, given how long startup can take. In either case, for repositories that are not yet available we fall back to searcher. The only downside is that the admin UI will say a repository is not indexed. But I think this is fair, since the index is not yet available.