Skip to content

db: Add index to speed up extension list query

Administrator requested to merge ef/extensions-list-missing-index into main

Created by: efritz

This index speeds up the list extensions queryl. This is called ~7k/h and takes ~260ms. It now takes ~20ms. This is currently the sixth heaviest query by load according to query insights results from the last hour (and the second heaviest query over the last day).

Representative query:

This query is a slight reduction of the query given in this trace.

SELECT
    x.id,
    x.uuid,
    x.publisher_user_id,
    x.publisher_org_id,
    x.name,
    x.created_at,
    x.updated_at,
    users.username,
    orgs.name,
    rer.manifest
FROM registry_extensions x
LEFT JOIN users ON users.id=publisher_user_id AND users.deleted_at IS NULL
LEFT JOIN orgs ON orgs.id=publisher_org_id AND orgs.deleted_at IS NULL
LEFT JOIN registry_extension_releases rer ON rer.registry_extension_id=x.id AND rer.deleted_at IS NULL
WHERE
    NOT EXISTS (
        SELECT 1 
        FROM registry_extension_releases rer2 
        WHERE 
            rer.registry_extension_id=rer2.registry_extension_id AND 
            rer2.deleted_at IS NULL AND 
            rer2.created_at > rer.created_at 
    )
    AND x.deleted_at IS NULL
;

Previous query plan:

                                                                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=1150.97..1979.84 rows=5699 width=238) (actual time=6.392..266.323 rows=298 loops=1)
   Hash Cond: (rer.registry_extension_id = rer2.registry_extension_id)
   Join Filter: (rer2.created_at > rer.created_at)
   Rows Removed by Join Filter: 1600190
   ->  Hash Right Join  (cost=674.16..1395.46 rows=8549 width=250) (actual time=1.330..11.451 rows=11621 loops=1)
         Hash Cond: (rer.registry_extension_id = x.id)
         ->  Seq Scan on registry_extension_releases rer  (cost=0.00..587.04 rows=13003 width=175) (actual time=0.004..4.195 rows=13003 loops=1)
               Filter: (deleted_at IS NULL)
               Rows Removed by Filter: 1
         ->  Hash  (cost=669.99..669.99 rows=334 width=75) (actual time=1.289..1.293 rows=298 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 39kB
               ->  Hash Left Join  (cost=23.32..669.99 rows=334 width=75) (actual time=0.389..1.192 rows=298 loops=1)
                     Hash Cond: (x.publisher_org_id = orgs.id)
                     ->  Nested Loop Left Join  (cost=0.29..646.09 rows=334 width=66) (actual time=0.017..0.742 rows=298 loops=1)
                           ->  Seq Scan on registry_extensions x  (cost=0.00..12.08 rows=334 width=55) (actual time=0.010..0.124 rows=298 loops=1)
                                 Filter: (deleted_at IS NULL)
                                 Rows Removed by Filter: 144
                           ->  Index Scan using users_pkey on users  (cost=0.29..1.90 rows=1 width=15) (actual time=0.002..0.002 rows=1 loops=298)
                                 Index Cond: (id = x.publisher_user_id)
                                 Filter: (deleted_at IS NULL)
                     ->  Hash  (cost=14.19..14.19 rows=707 width=13) (actual time=0.353..0.353 rows=707 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 41kB
                           ->  Seq Scan on orgs  (cost=0.00..14.19 rows=707 width=13) (actual time=0.004..0.176 rows=707 loops=1)
                                 Filter: (deleted_at IS NULL)
                                 Rows Removed by Filter: 12
   ->  Hash  (cost=314.27..314.27 rows=13003 width=12) (actual time=4.930..4.931 rows=13003 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 738kB
         ->  Index Only Scan using registry_extension_releases_registry_extension_id on registry_extension_releases rer2  (cost=0.29..314.27 rows=13003 width=12) (actual time=0.022..2.377 rows=13003 loops=1)
               Heap Fetches: 1119
 Planning Time: 1.751 ms
 Execution Time: 266.545 ms
(31 rows)

Current query plan:

                                                                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=1123.01..1951.87 rows=5699 width=238) (actual time=5.420..18.935 rows=298 loops=1)
   Hash Cond: (rer.registry_extension_id = rer2.registry_extension_id)
   Join Filter: (rer2.created_at > rer.created_at)
   Rows Removed by Join Filter: 11478
   ->  Hash Right Join  (cost=674.16..1395.46 rows=8549 width=250) (actual time=1.295..10.531 rows=11621 loops=1)
         Hash Cond: (rer.registry_extension_id = x.id)
         ->  Seq Scan on registry_extension_releases rer  (cost=0.00..587.04 rows=13003 width=175) (actual time=0.004..3.627 rows=13003 loops=1)
               Filter: (deleted_at IS NULL)
               Rows Removed by Filter: 1
         ->  Hash  (cost=669.99..669.99 rows=334 width=75) (actual time=1.283..1.294 rows=298 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 39kB
               ->  Hash Left Join  (cost=23.32..669.99 rows=334 width=75) (actual time=0.311..1.196 rows=298 loops=1)
                     Hash Cond: (x.publisher_org_id = orgs.id)
                     ->  Nested Loop Left Join  (cost=0.29..646.09 rows=334 width=66) (actual time=0.016..0.794 rows=298 loops=1)
                           ->  Seq Scan on registry_extensions x  (cost=0.00..12.08 rows=334 width=55) (actual time=0.009..0.142 rows=298 loops=1)
                                 Filter: (deleted_at IS NULL)
                                 Rows Removed by Filter: 144
                           ->  Index Scan using users_pkey on users  (cost=0.29..1.90 rows=1 width=15) (actual time=0.002..0.002 rows=1 loops=298)
                                 Index Cond: (id = x.publisher_user_id)
                                 Filter: (deleted_at IS NULL)
                     ->  Hash  (cost=14.19..14.19 rows=707 width=13) (actual time=0.288..0.289 rows=707 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 41kB
                           ->  Seq Scan on orgs  (cost=0.00..14.19 rows=707 width=13) (actual time=0.004..0.154 rows=707 loops=1)
                                 Filter: (deleted_at IS NULL)
                                 Rows Removed by Filter: 12
   ->  Hash  (cost=286.31..286.31 rows=13003 width=12) (actual time=4.087..4.087 rows=13003 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 738kB
         ->  Index Only Scan using registry_extension_releases_registry_extension_id_created_at on registry_extension_releases rer2  (cost=0.29..286.31 rows=13003 width=12) (actual time=0.038..1.884 rows=13003 loops=1)
               Heap Fetches: 1119
 Planning Time: 0.726 ms
 Execution Time: 19.019 ms
(31 rows)

The top Hash Anti-Join node is the heaviest and occurs due to the NOT EXISTS correlated subquery. Adding this index allows fewer rows to be fetched from the table so that the join condition does not need to filter them - the index scan filters them by virtue of an ordered index.

Merge request reports

Loading