zoekt: Configure and instrument zoekt-webserver watchdog
Created by: keegancsmith
We add configuration to tune how often the watchdog runs and its error threshold, as well as the ability to turn it off.
- ZOEKT_WATCHDOG_TICK :: How often the watchdog runs, as a duration. (default 30s)
- ZOEKT_WATCHDOG_ERRORS :: Number of consecutive errors before exit. (default 3)
If either is 0 the watchdog is disabled.
Additionally, we add logging for when the state of the watchdog changes, along with metrics for that state. In practice we have noticed the watchdog failing, likely due to an overloaded system. The extra metrics and logging will help us see how overloaded the system is.
See the commits in the Zoekt repo: https://github.com/sourcegraph/zoekt/compare/58ac958bfd1d...48642cac2d97
- https://github.com/sourcegraph/zoekt/commit/48642ca zoekt-webserver: Configure watchdog via environment variables
- https://github.com/sourcegraph/zoekt/commit/c984eb3 zoekt-webserver: metrics and logs for watchdog