extsvc: Reduce load on dependency code hosts
Created by: tsenart
Context
The current implementations of dependency code host integrations (i.e. JVM, npm, Go) make too many unnecessary requests to those hosts. Existing rate-limiting attenuates that impact, but we can do better.
Notes:
- We have HTTP caching enabled in httpcli.ExternalDoer. This already helps reduce load, despite the inefficient request pattern. It definitely doesn't apply to JVM, since we invoke coursier, which then issues the requests itself.
- We probably don't want to rate limit requests that are served from Redis / cache. The rate limit package we currently use doesn't seem to allow "giving back" tokens when a request was served from cache. Let's figure out how to do this properly (cc @ryanslade).
- Gitserver also issues requests to the dependency hosts, but doesn't need to, since we have already sent those same requests in repo-updater. Can we store the "validity" of each version's metadata in the repo.metadata column and reuse that in gitserver's vcsDependenciesSyncer?
- Idea: store a map of version to download URL in metadata. This will make sure we have deterministic packages for each version. Fall back to a fresh request if the stored URL returns a 404.
- Similarly to #34289, whatever solution we land on in repo-updater, we'd benefit from factoring out a repo.DependenciesSource that would abstract the common bits between NPMPackagesSource, JVMPackagesSource and GoModulesSource.