db: Improve defaultRepos.List query
Created by: tsenart
This commit improves the defaultRepos.List
query to use a better plan — it results in rougly 480ms runtime in production currently, vs the previous 1.4s. It also fixes a bug with the old query (we weren't filtering out deleted repos from the default_repos table).
Thanks @felixge for the SQL hacking session :)
Before
Unique (cost=1325604.75..1351922.72 rows=3509063 width=36) (actual time=1135.181..1410.477 rows=206416 loops=1)
-> Sort (cost=1325604.75..1334377.41 rows=3509063 width=36) (actual time=1135.179..1157.578 rows=206416 loops=1)
Sort Key: r.id, r.name
Sort Method: external merge Disk: 9984kB
-> Append (cost=168704.21..831721.53 rows=3509063 width=36) (actual time=760.768..1046.177 rows=206416 loops=1)
-> Hash Semi Join (cost=168704.21..639660.16 rows=3302671 width=39) (actual time=760.395..760.395 rows=0 loops=1)
Hash Cond: (r.id = sr.repo_id)
-> Seq Scan on repo r (cost=0.00..347065.21 rows=3302671 width=39) (actual time=0.006..0.006 rows=1 loops=1)
Filter: (deleted_at IS NULL)
-> Hash (cost=107855.17..107855.17 rows=3708883 width=4) (actual time=760.255..760.255 rows=0 loops=1)
Buckets: 131072 Batches: 64 Memory Usage: 1024kB
-> Hash Join (cost=1.38..107855.17 rows=3708883 width=4) (actual time=760.254..760.254 rows=0 loops=1)
Hash Cond: (sr.external_service_id = s.id)
-> Seq Scan on external_service_repos sr (cost=0.00..95690.83 rows=3708883 width=12) (actual time=0.008..443.226 rows=3643696 loops=1)
-> Hash (cost=1.17..1.17 rows=17 width=8) (actual time=0.013..0.013 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on external_services s (cost=0.00..1.17 rows=17 width=8) (actual time=0.008..0.010 rows=3 loops=1)
Filter: (namespace_user_id IS NOT NULL)
Rows Removed by Filter: 16
-> Gather (cost=1000.85..139425.42 rows=206392 width=39) (actual time=0.371..269.532 rows=206416 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.85..117786.22 rows=85997 width=39) (actual time=0.096..279.144 rows=68805 loops=3)
-> Parallel Index Only Scan using default_repos_pkey on default_repos (cost=0.42..3522.35 rows=85997 width=4) (actual time=0.047..37.204 rows=68805 loops=3)
Heap Fetches: 206416
-> Index Scan using repo_pkey on repo (cost=0.43..1.33 rows=1 width=39) (actual time=0.003..0.003 rows=1 loops=206416)
Index Cond: (id = default_repos.repo_id)
Planning Time: 0.547 ms
Execution Time: 1425.366 ms
After (attempt 1)
Unique (cost=493745.14..495867.59 rows=424491 width=39) (actual time=421.170..476.224 rows=206400 loops=1)
-> Sort (cost=493745.14..494806.37 rows=424491 width=39) (actual time=421.169..443.556 rows=206400 loops=1)
Sort Key: repo.id
Sort Method: external merge Disk: 9976kB
-> Gather (cost=1000.43..447265.20 rows=424491 width=39) (actual time=0.369..321.361 rows=206400 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.43..403816.10 rows=176871 width=39) (actual time=0.079..322.571 rows=68800 loops=3)
-> Parallel Append (cost=0.00..60190.47 rows=304167 width=4) (actual time=0.033..13.560 rows=68805 loops=3)
-> Nested Loop (cost=0.43..54996.19 rows=218170 width=4) (actual time=0.040..0.040 rows=0 loops=1)
-> Seq Scan on external_services es (cost=0.00..1.17 rows=1 width=8) (actual time=0.040..0.040 rows=0 loops=1)
Filter: ((namespace_user_id IS NOT NULL) AND (deleted_at IS NULL))
Rows Removed by Filter: 19
-> Index Scan using external_service_repos_external_service_id on external_service_repos esr (cost=0.43..45722.81 rows=927221 width=12) (never executed)
Index Cond: (external_service_id = es.id)
-> Parallel Seq Scan on default_repos (cost=0.00..2128.07 rows=121407 width=4) (actual time=0.018..7.518 rows=68805 loops=3)
-> Index Scan using repo_pkey on repo (cost=0.43..1.13 rows=1 width=39) (actual time=0.004..0.004 rows=1 loops=206416)
Index Cond: (id = esr.repo_id)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 0
Planning Time: 0.333 ms
Execution Time: 487.791 ms
After (attempt 2)
Gather (cost=59179.51..199339.22 rows=206358 width=39) (actual time=1.589..437.872 rows=206400 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=58179.51..177703.42 rows=85982 width=39) (actual time=1.025..370.151 rows=68800 loops=3)
-> Hash Left Join (cost=58179.08..63431.76 rows=85997 width=8) (actual time=0.968..67.309 rows=68805 loops=3)
Hash Cond: (dr.repo_id = esr.repo_id)
-> Parallel Index Only Scan using default_repos_pkey on default_repos dr (cost=0.42..3522.35 rows=85997 width=4) (actual time=0.040..37.065 rows=68805 loops=3)
Heap Fetches: 206416
-> Hash (cost=55265.88..55265.88 rows=177503 width=4) (actual time=0.058..0.058 rows=0 loops=3)
Buckets: 131072 Batches: 4 Memory Usage: 1024kB
-> Nested Loop (cost=0.43..55265.88 rows=177503 width=4) (actual time=0.057..0.057 rows=0 loops=3)
-> Seq Scan on external_services es (cost=0.00..2.21 rows=1 width=8) (actual time=0.057..0.057 rows=0 loops=3)
Filter: ((namespace_user_id IS NOT NULL) AND (deleted_at IS NULL))
Rows Removed by Filter: 21
-> Index Scan using external_service_repos_external_service_id on external_service_repos esr (cost=0.43..45944.79 rows=931888 width=12) (never executed)
Index Cond: (external_service_id = es.id)
-> Index Scan using repo_pkey on repo (cost=0.43..1.33 rows=1 width=39) (actual time=0.004..0.004 rows=1 loops=206416)
Index Cond: (id = COALESCE(esr.repo_id, dr.repo_id))
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 0
Planning Time: 2.565 ms
Execution Time: 447.685 ms