Skip to content

database: Optimise CountUserAddedRepos

Warren Gifford requested to merge core/optimise-CountUserAddedRepos into main

Created by: ryanslade

We can use the denormalised user_id column.

Old query plan:

Finalize Aggregate  (cost=886887.47..886887.48 rows=1 width=8) (actual time=3042.739..3202.279 rows=1 loops=1)
  ->  Gather  (cost=886887.26..886887.47 rows=2 width=8) (actual time=3040.260..3202.271 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate  (cost=885887.26..885887.27 rows=1 width=8) (actual time=3035.477..3035.483 rows=1 loops=3)
              ->  Parallel Hash Semi Join  (cost=107730.21..882462.24 rows=1370005 width=0) (actual time=2721.411..3035.227 rows=1329 loops=3)
                    Hash Cond: (r.id = sr.repo_id)
                    ->  Parallel Seq Scan on repo r  (cost=0.00..729194.32 rows=2352534 width=4) (actual time=16.145..1698.554 rows=1657234 loops=3)
                         Filter: (deleted_at IS NULL)
                          Rows Removed by Filter: 9189
                    ->  Parallel Hash  (cost=83619.86..83619.86 rows=1469548 width=4) (actual time=664.685..664.688 rows=1403 loops=3)
                          Buckets: 1048576  Batches: 8  Memory Usage: 8288kB
                          ->  Hash Join  (cost=141.18..83619.86 rows=1469548 width=4) (actual time=659.161..661.040 rows=1403 loops=3)
                                Hash Cond: (sr.external_service_id = s.id)
                                ->  Parallel Index Only Scan using external_service_repos_idx on external_service_repos sr  (cost=0.43..76813.73 rows=2523466 width=12) (actual time=0.052..485.313 rows=1657641 loops=3)
                                      Heap Fetches: 1183526
                                ->  Hash  (cost=135.80..135.80 rows=396 width=8) (actual time=0.776..0.777 rows=417 loops=3)
                                      Buckets: 1024  Batches: 1  Memory Usage: 25kB
                                      ->  Seq Scan on external_services s  (cost=0.00..135.80 rows=396 width=8) (actual time=0.031..0.681 rows=417 loops=3)
                                            Filter: ((namespace_user_id IS NOT NULL) AND (deleted_at IS NULL))
                                            Rows Removed by Filter: 264
Planning Time: 2.110 ms
Execution Time: 3202.481 ms

New query plan:

Aggregate  (cost=4633.49..4633.50 rows=1 width=8) (actual time=33.722..33.726 rows=1 loops=1)
  ->  Nested Loop  (cost=1366.71..4630.59 rows=1160 width=0) (actual time=8.757..33.116 rows=3986 loops=1)
        ->  HashAggregate  (cost=1366.28..1378.72 rows=1244 width=4) (actual time=8.736..10.329 rows=3986 loops=1)
              Group Key: sr.repo_id
              ->  Hash Join  (cost=141.32..1363.17 rows=1244 width=4) (actual time=0.612..6.422 rows=4208 loops=1)
                    Hash Cond: (sr.external_service_id = s.id)
                    ->  Index Scan using external_service_user_repos_idx on external_service_repos sr  (cost=0.28..1216.80 rows=2019 width=12) (actual time=0.029..3.867 rows=4208 loops=1)
                    ->  Hash  (cost=135.80..135.80 rows=419 width=8) (actual time=0.575..0.576 rows=420 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 25kB
                          ->  Seq Scan on external_services s  (cost=0.00..135.80 rows=419 width=8) (actual time=0.011..0.467 rows=420 loops=1)
                                Filter: (deleted_at IS NULL)
                                Rows Removed by Filter: 261
        ->  Index Scan using repo_pkey on repo r  (cost=0.43..2.61 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=3986)
              Index Cond: (id = sr.repo_id)
              Filter: (deleted_at IS NULL)
Planning Time: 0.639 ms
Execution Time: 33.876 ms

Merge request reports

Loading