Code Insights load 3-10x faster and always provide ongoing data snapshots (unify the frontend-powered and backend-powered insight types)
Created by: Joelkw
Problem to solve
Right now, Code Insights data is generated either just-in-time in the frontend via the search API, or on the backend via a worker and queued jobs. The many subtle differences between these two generation methods confuse customers, the support team trying to help those customers, and CE/AEs. Not all features are available on both types. This also makes it harder to build new features: everything we build has to be supported by both, or intentionally omitted from one type (for example, we would have to omit monitoring/reporting support for just-in-time insights until we do this work).
Measure of success
This effort is successful if we retain the benefits of both types, but have only a single unified flow to generate data. The benefits we must retain are:
- Ability to live preview insights
- Ability to run insights over just a few repositories
- Ability to filter insights
- Ability to save future snapshots of insights rather than re-running all data every time
- Ability to load cached insights rather than regenerate live data (10x+ faster load times)
And the additional benefits we need to obtain are:
- A single deterministic flow from which all insights data is generated, so debugging and stability work benefits all insight types
- New drilldown and code monitor features in Q2/Q3 that immediately benefit all search-powered insight types
Solution summary
This will be a large engineering effort that will likely involve some combination of: enabling streaming search, caching frontend-powered results, combining our two generation engines, and forking generation based on how many repos are involved.
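As a rough illustration only, the "single flow that forks on repo count and caches results" idea could look something like the sketch below. Everything in it is an assumption made for discussion, not the planned design: the names `choose_mode`, `InsightGenerator`, and `LIVE_REPO_LIMIT`, the threshold value, and the in-memory cache and queue are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical cutoff: at or below this many repos, generate data on demand.
LIVE_REPO_LIMIT = 20

SeriesPoints = List[Tuple[str, int]]  # (snapshot date, match count)
CacheKey = Tuple[str, Tuple[str, ...]]  # (query, sorted repo names)


def choose_mode(repo_count: int, limit: int = LIVE_REPO_LIMIT) -> str:
    """Pick how data is produced; both modes sit behind the same entry point."""
    return "live" if repo_count <= limit else "queued"


@dataclass
class InsightGenerator:
    # Stand-in for a call to the search API (assumed signature).
    run_search: Callable[[str, List[str]], SeriesPoints]
    cache: Dict[CacheKey, SeriesPoints] = field(default_factory=dict)
    queue: List[CacheKey] = field(default_factory=list)

    def load(self, query: str, repos: List[str]) -> SeriesPoints:
        key = (query, tuple(sorted(repos)))
        # Cached data is always preferred over regenerating it.
        if key in self.cache:
            return self.cache[key]
        if choose_mode(len(repos)) == "live":
            # Small scope: run the search now and store the result.
            points = self.run_search(query, repos)
            self.cache[key] = points
            return points
        # Large scope: hand off to background processing, return nothing yet.
        if key not in self.queue:
            self.queue.append(key)
        return []
```

In this sketch a small-scope insight is searched once and then served from the cache, while a large-scope insight is queued for a worker that would later populate the same cache, so both kinds are read through one code path.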
Artifacts:
What specific customers are we iterating on the problem and solution with?
- https://github.com/sourcegraph/accounts/issues/537
- https://github.com/sourcegraph/accounts/issues/280
- https://github.com/sourcegraph/accounts/issues/542
Impact on use cases
This work improves our ability to show customers more data, faster, so it benefits all of the use cases.
Delivery plan
- Product-started RFC https://docs.google.com/document/d/1ZBwHw8HqKVcOt_IOsSesC7lTadb5iJ_IY_RMg7kYq7M/edit
- RFC open for engineering review
- Tracking issue of ongoing work https://github.com/sourcegraph/sourcegraph/issues/35387