authz: Bitbucket Server ACLs caching
Created by: tsenart
Summary
This change set introduces per-user Bitbucket ACLs caching that respects a user configured TTL. It leverages Postgres row locking for concurrency control of cache fill events, so that many concurrent requests during an expiration don't overload the Bitbucket Server API.
It features a second in-memory read-through cache to avoid the marshaling and network round trip costs during steady state.
We cache the complete set of Sourcegraph repository ids a user has access to. This set is represented as a highly compressed bitmap to save on storage, memory and compute resources. For a hypothetical upper bound of 100k repositories, and 10k users, the aggregate memory usage would be rounding 329,36 MB per frontend API instance. For 30k repositories and 1k users, that number would be 16,472MB. This is computed with https://godoc.org/github.com/RoaringBitmap/roaring#BoundSerializedSizeInBytes
Closes #1108
Benchmarks
benchmark iter time/iter bytes alloc allocs
--------- ---- --------- ----------- ------
BenchmarkStore/ttl=0-12 1 2010936381.00 ns/op 121320 B/op 450 allocs/op
BenchmarkStore/ttl=60s/no-in-memory-cache-12 3000 448181.00 ns/op 30705 B/op 78 allocs/op
BenchmarkStore/ttl=60s/in-memory-cache-12 10000000 134.00 ns/op 0 B/op 0 allocs/op
Load tests
Before this PR
Requests [total, rate] 240, 100.42
Duration [total, attack, wait] 12.750183628s, 2.389969s, 10.360214628s
Latencies [mean, 50, 95, 99, max] 257.711098ms, 21.508257ms, 45.953569ms, 8.382180325s, 10.360214628s
Bytes In [total, mean] 49490, 206.21
Bytes Out [total, mean] 16560, 69.00
Success [ratio] 100.00%
Status Codes [code:count] 200:240
Error Set:
After this PR
Requests [total, rate] 500, 100.16
Duration [total, attack, wait] 5.000159564s, 4.991773s, 8.386564ms
Latencies [mean, 50, 95, 99, max] 6.555395ms, 6.271367ms, 8.724946ms, 13.90822ms, 31.487489ms
Bytes In [total, mean] 105000, 210.00
Bytes Out [total, mean] 34500, 69.00
Success [ratio] 100.00%
Status Codes [code:count] 200:500
Error Set: