
Services: Add a generic form of network throttling

Created by: slimsag

When any of our services transfers even a semi-large request or response to another service on the same machine, the data never crosses the physical network, so it can move at incredible speeds (80 Gbps+). This saturates the local network and effectively DoSes it temporarily.

This was first observed with our Jaeger integration, then with zoekt-indexserver, and I have since also observed it with zoekt-webserver when users run heavy search queries pulling around 1 GiB of search results.

Thus, we need a form of generic network throttling that can apply to all containers.

  • This has the worst outcomes in deploy-sourcegraph-docker, which is sometimes run on a single host machine, so the network goes down for all services temporarily.
  • But it can happen just as easily with sourcegraph/server (though these deployments usually aren't large enough for us to notice).
  • This can also occur with Kubernetes deployments in which two pods are scheduled on the same machine. For example, this could've been an underlying cause in #4101, which we patched at the application level.

So we need to add network "throttling" (in practice only a cap at a very high rate, for example the fastest network speed GCE offers) to all services running in:

  • deploy-sourcegraph-docker
  • sourcegraph/server (single-container deployments)
  • deploy-sourcegraph (Kubernetes)
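
To make the idea of a high-rate cap concrete, here is a minimal token-bucket sketch in Python. It is an illustration only and not Sourcegraph code: `TokenBucket`, `reserve`, and `throttled_copy` are hypothetical names, and the rate used in the usage note is an arbitrary placeholder rather than a measured or recommended value. It also throttles at the application level, whereas the generic fix requested here would more likely sit at the container or host network layer.

```python
import time
from typing import BinaryIO, Callable


class TokenBucket:
    """Byte-rate limiter: one token per byte, refilled at `rate` bytes/second.

    Hypothetical helper for illustration; not part of any Sourcegraph service.
    """

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or burst <= 0:
            raise ValueError("rate and burst must be positive")
        self.rate = rate        # sustained bytes per second
        self.burst = burst      # maximum bytes that may be sent without waiting
        self._tokens = burst    # start with a full bucket
        self._clock = clock
        self._last = clock()

    def reserve(self, n: int) -> float:
        """Reserve n bytes and return how many seconds the caller should wait.

        The balance may go negative; the debt is repaid by later refills.
        """
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the burst size.
        self._tokens = min(self.burst, self._tokens + elapsed * self.rate)
        self._tokens -= n
        if self._tokens >= 0:
            return 0.0
        return -self._tokens / self.rate


def throttled_copy(src: BinaryIO, dst: BinaryIO, bucket: TokenBucket,
                   chunk_size: int = 64 * 1024,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Copy src to dst in chunks, pausing so the average rate stays under the cap."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        delay = bucket.reserve(len(chunk))
        if delay > 0:
            sleep(delay)
        dst.write(chunk)
        total += len(chunk)
```

A caller would build one bucket per link, for example `TokenBucket(rate=1e9, burst=8 * 1024 * 1024)` as a placeholder cap, and pass it to every `throttled_copy` sharing that link. Because the cap is set very high, ordinary traffic never waits; only transfers that would otherwise saturate the host are slowed.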

Observed at https://app.hubspot.com/contacts/2762526/company/407948923