authz: Bitbucket Server ACLs caching

Warren Gifford requested to merge core/bitbucket-server-authz-caching into master

Created by: tsenart

Summary

This change set introduces per-user caching of Bitbucket Server ACLs that respects a user-configured TTL. It uses Postgres row locking for concurrency control of cache fills, so that many concurrent requests arriving when an entry expires don't overload the Bitbucket Server API.

It also features a second, in-memory read-through cache to avoid the marshaling and network round-trip costs in steady state.

We cache the complete set of Sourcegraph repository IDs a user has access to. This set is represented as a highly compressed bitmap to save on storage, memory and compute resources. For a hypothetical upper bound of 100k repositories and 10k users, the aggregate memory usage would be around 329.36 MB per frontend API instance. For 30k repositories and 1k users, that number would be 16.472 MB. These figures are computed with https://godoc.org/github.com/RoaringBitmap/roaring#BoundSerializedSizeInBytes
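As a rough sanity check, the aggregate figures can be turned back into a per-user bound with simple arithmetic. The sketch below reads the figures with a decimal comma and assumes 1 MB = 10^6 bytes, which the description does not state. The helper name is hypothetical.

```go
package main

import "fmt"

// bytesPerMB is an assumption; the description does not say whether MB
// means 10^6 or 2^20 bytes.
const bytesPerMB = 1e6

// perUserBytes divides an aggregate size in MB by the number of users.
func perUserBytes(aggregateMB float64, users int) float64 {
	return aggregateMB * bytesPerMB / float64(users)
}

// report formats the per-user bound for one scenario.
func report(repos string, aggregateMB float64, users int) string {
	return fmt.Sprintf("%s repos, %d users: ~%.0f bytes per user",
		repos, users, perUserBytes(aggregateMB, users))
}
```

Under these assumptions the two scenarios work out to roughly 33 KB and 16.5 KB per user respectively.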

Closes #1108 (closed)

Benchmarks

benchmark                                          iter             time/iter   bytes alloc          allocs
---------                                          ----             ---------   -----------          ------
BenchmarkStore/ttl=0-12                               1   2010936381.00 ns/op   121320 B/op   450 allocs/op
BenchmarkStore/ttl=60s/no-in-memory-cache-12       3000       448181.00 ns/op    30705 B/op    78 allocs/op
BenchmarkStore/ttl=60s/in-memory-cache-12      10000000          134.00 ns/op        0 B/op     0 allocs/op

Load tests

Before this PR

Requests      [total, rate]            240, 100.42
Duration      [total, attack, wait]    12.750183628s, 2.389969s, 10.360214628s
Latencies     [mean, 50, 95, 99, max]  257.711098ms, 21.508257ms, 45.953569ms, 8.382180325s, 10.360214628s
Bytes In      [total, mean]            49490, 206.21
Bytes Out     [total, mean]            16560, 69.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:240  
Error Set:

After this PR

Requests      [total, rate]            500, 100.16
Duration      [total, attack, wait]    5.000159564s, 4.991773s, 8.386564ms
Latencies     [mean, 50, 95, 99, max]  6.555395ms, 6.271367ms, 8.724946ms, 13.90822ms, 31.487489ms
Bytes In      [total, mean]            105000, 210.00
Bytes Out     [total, mean]            34500, 69.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:
