Skip to content

codeintel: Add RepoUsageStatistics to db

Administrator requested to merge pci-repo-usage-stats into master

Created by: efritz

Part of RFC 153.

Add a method to the codeintel db layer to determine the search and lsif-based code intel event counts in the past week for each unique repo known to the instance.

This will be used by the auto indexer to determine which repos should be eligible for automatic LSIF indexing. The basic idea is that things with a code intel event count over a threshold and a high enough search/(precise+search) ratio would benefit greatly from precise indexes.

Query plan of new query:

                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=428850.23..639314.06 rows=92757 width=20) (actual time=10122.697..11508.229 rows=4614 loops=1)
   ->  GroupAggregate  (cost=428849.67..443148.53 rows=92757 width=48) (actual time=10121.537..10874.261 rows=12988 loops=1)
         Group Key: ("substring"(event_logs.url, '//[^/]+/(.+)/-/'::text))
         ->  Sort  (cost=428849.67..431039.57 rows=875960 width=52) (actual time=10121.520..10670.309 rows=1014416 loops=1)
               Sort Key: ("substring"(event_logs.url, '//[^/]+/(.+)/-/'::text))
               Sort Method: external merge  Disk: 63240kB
               ->  Index Scan using event_logs_timestamp on event_logs  (cost=0.44..307314.71 rows=875960 width=52) (actual time=0.075..7587.788 rows=1014416 loops=1)
                     Index Cond: ("timestamp" >= (now() - '7 days'::interval))
   ->  Index Scan using repo_uri_idx on repo r  (cost=0.56..2.09 rows=1 width=39) (actual time=0.048..0.048 rows=0 loops=12988)
         Index Cond: (uri = (("substring"(event_logs.url, '//[^/]+/(.+)/-/'::text)))::citext)
 Planning Time: 0.447 ms
 Execution Time: 11536.736 ms

This query will only be called periodically (probably on the order of a handful of times an hour) from the auto indexer service (coming soon) so doesn't need to be terribly efficient (just not waste resources of other processes).

Merge request reports

Loading