Skip to content

repos: remove NULL check on `gitserver_repos.clone_status`

Warren Gifford requested to merge jc/gr-notcloned into main

Created by: unknwon

The gitserver_repos.clone_status column is actually not nullable and remove this condition alone reduces query time from ~40s to <10s on production for fetching status messages.

From @ryanslade:

It could be null because we were doing a left join. BUT, it should be safe to assume it’s there now as we’ve deprecated repo.cloned and are relying on gitserver_repos now

Before
Aggregate  (cost=23717790.45..23717790.46 rows=1 width=8) (actual time=38596.262..38596.268 rows=1 loops=1)
  ->  Merge Left Join  (cost=10.87..23716484.88 rows=522228 width=0) (actual time=242.606..38575.548 rows=130783 loops=1)
        Merge Cond: (repo.id = gr.repo_id)
-        Filter: ((gr.clone_status = 'not_cloned'::text) OR (gr.clone_status IS NULL))
-        Rows Removed by Filter: 2213924
        ->  Nested Loop  (cost=0.86..23408423.49 rows=1119787 width=4) (actual time=0.027..27034.931 rows=2344707 loops=1)
              ->  Index Only Scan using external_service_repos_repo_id_external_service_id_unique on external_service_repos esr  (cost=0.43..130103.93 rows=2078558 width=4) (actual time=0.012..3321.124 rows=2344776 loops=1)
                    Filter: (external_service_id = ANY ('{1839,1868,2934}'::integer[]))
                    Rows Removed by Filter: 1487148
                    Heap Fetches: 1012432
              ->  Index Scan using repo_pkey on repo  (cost=0.43..11.20 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=2344776)
                    Index Cond: (id = esr.repo_id)
                    Filter: ((deleted_at IS NULL) AND (blocked IS NULL) AND ((NOT private) OR (SubPlan 1) OR (alternatives: SubPlan 2 or hashed SubPlan 3) OR (SubPlan 4)))
                    Rows Removed by Filter: 0
                    SubPlan 1
                      ->  Nested Loop  (cost=0.71..5.33 rows=1 width=0) (never executed)
                            ->  Index Only Scan using external_service_repos_repo_id_external_service_id_unique on external_service_repos esr_1  (cost=0.43..2.65 rows=1 width=8) (never executed)
                                  Index Cond: (repo_id = repo.id)
                                  Heap Fetches: 0
                            ->  Index Scan using external_services_pkey on external_services es  (cost=0.28..2.50 rows=1 width=8) (never executed)
                                  Index Cond: (id = esr_1.external_service_id)
                                  Filter: (unrestricted AND (deleted_at IS NULL))
                    SubPlan 2
                      ->  Index Only Scan using external_service_user_repos_idx on external_service_repos  (cost=0.29..2.51 rows=1 width=0) (never executed)
                            Index Cond: ((user_id = 7313) AND (repo_id = repo.id))
                            Heap Fetches: 0
                    SubPlan 3
                      ->  Index Only Scan using external_service_user_repos_idx on external_service_repos external_service_repos_1  (cost=0.29..9.03 rows=138 width=4) (never executed)
                            Index Cond: (user_id = 7313)
                            Heap Fetches: 0
                    SubPlan 4
                      ->  Index Scan using user_permissions_perm_object_unique on user_permissions  (cost=0.29..2.52 rows=1 width=1) (never executed)
                            Index Cond: ((user_id = 7313) AND (permission = 'read'::text) AND (object_type = 'repos'::text))
        ->  Index Scan using gitserver_repos_pkey on gitserver_repos gr  (cost=0.43..275984.03 rows=7213505 width=12) (actual time=0.008..9019.306 rows=6760426 loops=1)
Planning Time: 1.204 ms
Execution Time: 38596.400 ms
After
Aggregate  (cost=10117353.28..10117353.29 rows=1 width=8) (actual time=7387.122..7387.130 rows=1 loops=1)
  ->  Nested Loop  (cost=7.74..10116047.71 rows=522228 width=0) (actual time=29.880..7368.252 rows=130781 loops=1)
        Join Filter: (esr.repo_id = repo.id)
        ->  Merge Join  (cost=7.30..219685.85 rows=896482 width=8) (actual time=29.854..6242.368 rows=130781 loops=1)
              Merge Cond: (esr.repo_id = gr.repo_id)
              ->  Index Only Scan using external_service_repos_repo_id_external_service_id_unique on external_service_repos esr  (cost=0.43..130105.03 rows=2078558 width=4) (actual time=0.012..2657.856 rows=2344776 loops=1)
                    Filter: (external_service_id = ANY ('{1839,1868,2934}'::integer[]))
                    Rows Removed by Filter: 1487407
                    Heap Fetches: 1012683
+              ->  Index Only Scan using gitserver_repos_not_cloned_status_idx on gitserver_repos gr  (cost=0.43..69702.98 rows=3111185 width=4) (actual time=0.029..2713.267 rows=2902003 loops=1)
                    Heap Fetches: 2243026
        ->  Index Scan using repo_pkey on repo  (cost=0.43..11.03 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=130781)
              Index Cond: (id = gr.repo_id)
              Filter: ((deleted_at IS NULL) AND (blocked IS NULL) AND ((NOT private) OR (SubPlan 1) OR (alternatives: SubPlan 2 or hashed SubPlan 3) OR (SubPlan 4)))
              SubPlan 1
                ->  Nested Loop  (cost=0.71..5.29 rows=1 width=0) (never executed)
                      ->  Index Only Scan using external_service_repos_repo_id_external_service_id_unique on external_service_repos esr_1  (cost=0.43..2.65 rows=1 width=8) (never executed)
                            Index Cond: (repo_id = repo.id)
                            Heap Fetches: 0
                      ->  Index Scan using external_services_pkey on external_services es  (cost=0.28..2.50 rows=1 width=8) (never executed)
                            Index Cond: (id = esr_1.external_service_id)
                            Filter: (unrestricted AND (deleted_at IS NULL))
              SubPlan 2
                ->  Index Only Scan using external_service_user_repos_idx on external_service_repos  (cost=0.29..2.51 rows=1 width=0) (never executed)
                      Index Cond: ((user_id = 7313) AND (repo_id = repo.id))
                      Heap Fetches: 0
              SubPlan 3
                ->  Index Only Scan using external_service_user_repos_idx on external_service_repos external_service_repos_1  (cost=0.29..9.03 rows=138 width=4) (never executed)
                      Index Cond: (user_id = 7313)
                      Heap Fetches: 0
              SubPlan 4
                ->  Index Scan using user_permissions_perm_object_unique on user_permissions  (cost=0.29..2.52 rows=1 width=1) (never executed)
                      Index Cond: ((user_id = 7313) AND (permission = 'read'::text) AND (object_type = 'repos'::text))
Planning Time: 1.467 ms
Execution Time: 7387.264 ms

Merge request reports

Loading