insights: add historical data enqueuer

Warren Gifford requested to merge sg/insights-historical-enqueuer into main

Created by: slimsag

Here's the deal: This PR makes backend insights useful. Without it, insights take months to produce any data, which effectively makes them useless.

The first thing you should do is read this high-level explanation of what this does:

https://github.com/sourcegraph/sourcegraph/blob/6b3438f5574391f8bbedde6b9708d59eca827209/enterprise/internal/insights/background/historical_enqueuer.go#L29-L52

The second thing you should do is read this high-level explanation of what the code is doing:

https://github.com/sourcegraph/sourcegraph/blob/6b3438f5574391f8bbedde6b9708d59eca827209/enterprise/internal/insights/background/historical_enqueuer.go#L113-L145

Additional context: I've spent almost 2 weeks now on this code/problem, and this is the best solution I could come up with. I have already done a lot (see these ~6 PRs) to reduce the problem to its individual components as much as possible, but it is still complex in nature. The code is good and the tests are good, but it is more complex than I'd like. I would love to reduce the complexity further, but I don't have more time to spend on this, so I will have to leave that to the next person. I'm open to quick ways I can improve clarity.

Note: On a dev server, backfilling data for two search insights takes around 10-20 minutes. On larger instances with many more repositories, it is expected to take hours. After merging, I will begin testing throughput and load on gitserver/search/repo-updater in order to tune this further. Backfilling benefits from a scaled-up gitserver, but we do not want to force anyone to scale up.

Fixes #18398 (closed)

Signed-off-by: Stephen Gutekanst [email protected]
