Use TTL for HTTP cache to stop Redis running OOM
Created by: mrnugget
(See this Slack message to get the full context)
This changes the httputil.Cache
that's used by repo-updater
when doing HTTP requests to set an expiration date on the cache entries it adds to Redis.
Why? Read on...
We discovered that on Sourcegraph.com the redis-cache
POD would crash because it ran out of memory.
Redis' INFO
command reported that only 212 out of 650k keys have an expiration date:
db0:keys=656993,expires=212,avg_ttl=2489492
That's interesting for a Redis instance that's used for caching purposed. And as it turns out, there are a few places in the codebase that use our internal rcache
package to use Redis as a cache. Most of them use rcache.NewWithTTL
which sets a TTL on every cache entry.
But the httputil
package uses rcache.New
— without a TTL. Everything that's added through that caching layer is persisted forever in Redis if not deleted manually.
Where is it used? repo-updater
uses it in its standard HTTP client factory:
The resulting client is then used when talking to external services (GitHub, Amazon CodeCommit, etc.).
That explains where the majority of the keys in Sourcegraph.com instance comes from. They all have the v1:http:
prefix used by the httputil.Cache
.
That also means that the memory usage of Redis is basically unbounded. Granted, there are a few places where the used httpcache
deletes something from the cache:
https://sourcegraph.com/search?q=repo:%5Egithub%5C.com/gregjones/httpcache%24+t.Cache.Delete
But that depends on the actual cache headers sent in the HTTP responses that are being cached and it only happens for requests that have been done twice. So one-off requests stay in Redis forever. Others potentially forever.
Two questions to reviewers:
- Could this explain why customers were seeing memory usage grow unbounded on their instances?
- Why did this cache not use a TTL even though the comment above it mentions a TTL of a week?
(cc @nicksnyder @beyang since they might have historical context, since the commit that added this got lost)