Improve performance of changeset spec expirer
Created by: eseliger
Noticed this query was very slow in prod. Made it somewhat better, from 27s to 3.5s. Since slow queries can cause major instability to instances, I think this is a good fix.
Notebook:
Query (modified to not actually delete):
explain analyze SELECT id FROM
changeset_specs cspecs
WHERE
(
created_at < NOW() - interval '2 day'
AND
batch_spec_id IS NULL
)
OR
(
created_at < NOW() - interval '7 day'
AND
NOT EXISTS (SELECT 1 FROM batch_changes WHERE batch_spec_id = cspecs.batch_spec_id)
AND
NOT EXISTS (SELECT 1 FROM changesets WHERE current_spec_id = cspecs.id OR previous_spec_id = cspecs.id)
AND
NOT (SELECT created_from_raw FROM batch_specs WHERE id = cspecs.batch_spec_id)
);
Current query plan:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on changeset_specs cspecs (cost=0.00..16720972.07 rows=24960 width=8) (actual time=27301.444..27301.448 rows=0 loops=1)
Filter: (((created_at < (now() - '2 days'::interval)) AND (batch_spec_id IS NULL)) OR ((created_at < (now() - '7 days'::interval)) AND (NOT (alternatives: SubPlan 1 or hashed SubPlan 2)) AND (NOT (SubPlan 3)) AND (NOT (SubPlan 4))))
Rows Removed by Filter: 232317
SubPlan 1
-> Seq Scan on batch_changes (cost=0.00..6.78 rows=1 width=0) (never executed)
Filter: (batch_spec_id = cspecs.batch_spec_id)
SubPlan 2
-> Seq Scan on batch_changes batch_changes_1 (cost=0.00..6.22 rows=222 width=8) (actual time=26.121..26.167 rows=228 loops=1)
SubPlan 3
-> Seq Scan on changesets (cost=0.00..125.10 rows=2 width=0) (actual time=0.143..0.143 rows=0 loops=182428)
Filter: ((current_spec_id = cspecs.id) OR (previous_spec_id = cspecs.id))
Rows Removed by Filter: 1139
SubPlan 4
-> Index Scan using batch_specs_pkey on batch_specs (cost=0.28..2.49 rows=1 width=1) (actual time=0.001..0.001 rows=1 loops=181961)
Index Cond: (id = cspecs.batch_spec_id)
Planning Time: 0.317 ms
JIT:
Functions: 33
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 9.374 ms, Inlining 77.607 ms, Optimization 213.145 ms, Emission 138.279 ms, Total 438.405 ms
Execution Time: 27310.578 ms
(21 rows)
Proposed indexes:
create index on changesets (current_spec_id, previous_spec_id);
create index erik_test2 on changeset_specs (created_at);
New result:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on changeset_specs cspecs (cost=2118.15..5685332.74 rows=24960 width=8) (actual time=3529.156..3529.160 rows=0 loops=1)
Recheck Cond: ((batch_spec_id IS NULL) OR (created_at < (now() - '7 days'::interval)))
Filter: (((created_at < (now() - '2 days'::interval)) AND (batch_spec_id IS NULL)) OR ((created_at < (now() - '7 days'::interval)) AND (NOT (alternatives: SubPlan 1 or hashed SubPlan 2)) AND (NOT (SubPlan 3)) AND (NOT (SubPlan 4))))
Rows Removed by Filter: 183141
Heap Blocks: exact=26966
-> BitmapOr (cost=2118.15..2118.15 rows=199682 width=0) (actual time=393.099..393.101 rows=0 loops=1)
-> Bitmap Index Scan on changeset_specs_batch_spec_id (cost=0.00..1.53 rows=1 width=0) (actual time=0.009..0.009 rows=0 loops=1)
Index Cond: (batch_spec_id IS NULL)
-> Bitmap Index Scan on erik_test2 (cost=0.00..2104.14 rows=199682 width=0) (actual time=393.089..393.089 rows=183141 loops=1)
Index Cond: (created_at < (now() - '7 days'::interval))
SubPlan 1
-> Seq Scan on batch_changes (cost=0.00..6.78 rows=1 width=0) (never executed)
Filter: (batch_spec_id = cspecs.batch_spec_id)
SubPlan 2
-> Seq Scan on batch_changes batch_changes_1 (cost=0.00..6.22 rows=222 width=8) (actual time=27.133..27.174 rows=228 loops=1)
SubPlan 3
-> Bitmap Heap Scan on changesets (cost=17.91..20.12 rows=2 width=0) (actual time=0.014..0.014 rows=0 loops=182428)
Recheck Cond: ((current_spec_id = cspecs.id) OR (previous_spec_id = cspecs.id))
Heap Blocks: exact=467
-> BitmapOr (cost=17.91..17.91 rows=2 width=0) (actual time=0.014..0.014 rows=0 loops=182428)
-> Bitmap Index Scan on changesets_current_spec_id_previous_spec_id_idx (cost=0.00..1.39 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=182428)
Index Cond: (current_spec_id = cspecs.id)
-> Bitmap Index Scan on changesets_current_spec_id_previous_spec_id_idx (cost=0.00..16.53 rows=1 width=0) (actual time=0.013..0.013 rows=0 loops=182428)
Index Cond: (previous_spec_id = cspecs.id)
SubPlan 4
-> Index Scan using batch_specs_pkey on batch_specs (cost=0.28..2.49 rows=1 width=1) (actual time=0.001..0.001 rows=1 loops=181961)
Index Cond: (id = cspecs.batch_spec_id)
Planning Time: 0.318 ms
JIT:
Functions: 38
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 6.551 ms, Inlining 16.569 ms, Optimization 227.673 ms, Emission 163.927 ms, Total 414.720 ms
Execution Time: 3535.557 ms
(33 rows)
Closes https://github.com/sourcegraph/sourcegraph/issues/37072
Test plan
See description, tested in dogfood.