docker: use a short timeout when inspecting images
Created by: LawnGnome
Docker Desktop seems to be prone to occasionally entering states where it is nominally available (in the sense that it listens for connections), but can't actually do anything useful. Memory exhaustion events from previous src batch runs tend to be the culprit here.
This seems to be most easily detectable by looking at the behaviour of docker image inspect. This command doesn't have to hit the network, nor does it perform any real work, so it should always be "quick". If it's not, that implies that Docker is not in a happy place.
This commit adds a five-second timeout to docker image inspect invocations, along with a secret environment variable to adjust this behaviour. If docker times out, then a (hopefully) helpful error message is generated to hint towards the things that the user should investigate next (basically, "is Docker really working?").
The only real concern I have with this is that I've basically picked five seconds out of thin air: practically, Docker for Mac seems to be able to do this in tens of milliseconds, but Sourcegraph has bought me a rather nice laptop. I have trouble imagining a scenario where multi-second delays are common without there also being other serious issues, but the world is a large, weird place.
Test plan
There's decent test coverage here, and I've added a new test case for this specific change.
I also tested this manually in these scenarios:
- Regular operation (to confirm nothing actually changed)
- docker fails to respond (by killall -STOP dockerd in the Docker Desktop VM)
- Set the timeout to 0 via the environment variable so src never actually waits and immediately fails