Improve access token lookup speed (!9567) · Merge requests · Warren Gifford / sourcegraph

Warren Gifford requested to merge perf/access-token-lookup into master Apr 04, 2020

Created by: eseliger

This query was the slowest by far (cumulated over the # of invocations):

This is invoked every time an incoming request is authenticated using an authorization token. That means every request to such a route took 117ms on average just looking up that token. With the improved query, I got it down to much less than 1 ms.

Query plans

Before

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Update on access_tokens t  (cost=1504.17..2477.71 rows=15565 width=142) (actual time=17.219..46.637 rows=1 loops=1)
   ->  Nested Loop  (cost=1504.17..2477.71 rows=15565 width=142) (actual time=17.172..36.142 rows=14075 loops=1)
         ->  Index Scan using access_tokens_lookup on access_tokens t  (cost=0.00..8.03 rows=1 width=116) (actual time=0.016..0.020 rows=1 loops=1)
               Index Cond: (value_sha256 = 'XXXXXXXX'::bytea)
               Filter: ((deleted_at IS NULL) AND ('user:all'::text = ANY (scopes)))
         ->  Hash Join  (cost=1504.17..2275.12 rows=15565 width=18) (actual time=17.152..34.053 rows=14075 loops=1)
               Hash Cond: (t2.creator_user_id = creator_user.id)
               ->  Hash Join  (cost=752.09..1482.11 rows=15585 width=16) (actual time=8.872..19.621 rows=14075 loops=1)
                     Hash Cond: (t2.subject_user_id = subject_user.id)
                     ->  Seq Scan on access_tokens t2  (cost=0.00..689.05 rows=15605 width=14) (actual time=0.011..4.459 rows=15605 loops=1)
                     ->  Hash  (cost=542.71..542.71 rows=16750 width=10) (actual time=8.786..8.786 rows=16853 loops=1)
                           Buckets: 32768  Batches: 1  Memory Usage: 981kB
                           ->  Seq Scan on users subject_user  (cost=0.00..542.71 rows=16750 width=10) (actual time=0.006..4.693 rows=16853 loops=1)
                                 Filter: (deleted_at IS NULL)
                                 Rows Removed by Filter: 21
               ->  Hash  (cost=542.71..542.71 rows=16750 width=10) (actual time=8.219..8.219 rows=16853 loops=1)
                     Buckets: 32768  Batches: 1  Memory Usage: 981kB
                     ->  Seq Scan on users creator_user  (cost=0.00..542.71 rows=16750 width=10) (actual time=0.006..4.832 rows=16853 loops=1)
                           Filter: (deleted_at IS NULL)
                           Rows Removed by Filter: 21
 Planning Time: 0.490 ms
 Execution Time: 46.714 ms
(22 rows)

After

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on access_tokens t  (cost=24.93..32.96 rows=1 width=142) (actual time=0.146..0.148 rows=1 loops=1)
   ->  Nested Loop  (cost=24.93..32.96 rows=1 width=142) (actual time=0.046..0.048 rows=1 loops=1)
         ->  HashAggregate  (cost=24.64..24.65 rows=1 width=26) (actual time=0.039..0.040 rows=1 loops=1)
               Group Key: t2.id
               ->  Nested Loop  (cost=0.57..24.64 rows=1 width=26) (actual time=0.033..0.034 rows=1 loops=1)
                     ->  Nested Loop  (cost=0.29..16.34 rows=1 width=24) (actual time=0.027..0.028 rows=1 loops=1)
                           ->  Index Scan using access_tokens_lookup on access_tokens t2  (cost=0.00..8.03 rows=1 width=22) (actual time=0.014..0.015 rows=1 loops=1)
                                 Index Cond: (value_sha256 = 'XXXXXXXX'::bytea)
                                 Filter: ((deleted_at IS NULL) AND ('user:all'::text = ANY (scopes)))
                           ->  Index Scan using users_pkey on users subject_user  (cost=0.29..8.30 rows=1 width=10) (actual time=0.011..0.011 rows=1 loops=1)
                                 Index Cond: (id = t2.subject_user_id)
                                 Filter: (deleted_at IS NULL)
                     ->  Index Scan using users_pkey on users creator_user  (cost=0.29..8.30 rows=1 width=10) (actual time=0.005..0.005 rows=1 loops=1)
                           Index Cond: (id = t2.creator_user_id)
                           Filter: (deleted_at IS NULL)
         ->  Index Scan using access_tokens_pkey on access_tokens t  (cost=0.29..8.30 rows=1 width=116) (actual time=0.006..0.006 rows=1 loops=1)
               Index Cond: (id = t2.id)
 Planning Time: 0.672 ms
 Execution Time: 0.249 ms
(19 rows)

Also tested this locally with a DB of 25k tokens, old logic in lookup function takes 150ms, new takes only 7ms.

Improve access token lookup speed

Query plans

Merge request reports