insights: test monorepo improvements and confirm success
Created by: Joelkw
We've been making improvements to code insights running over large monorepos. This issue covers testing those improvements and confirming that we are "done" with our monorepo support work, meaning that monorepo codebases (repo size 4GB+) are supported as well as any other codebase.
To test this, we should create common persisted (backend) insights over the megarepo and gigarepo for some common flows:
- A single-series insight with a literal search
- A single-series insight with a regexp search
- A 5-series insight with literal searches, at least one file: or lang: filter on a series
- A 5-series insight with regexp searches (using OR modifiers as well, like `|`), at least one file: or lang: filter on a series
- A capture group series over just the monorepo for something with 20+ match types (think licenses, Go versions, Docker versions)
- A capture group series over all repos including the monorepo for something with 20+ match types
- A language pie chart insight over the monorepo
We should then test that those insights (where applicable) handle:
- Filtering to only include the monorepo
- Filtering to exclude the monorepo
- A context that includes the monorepo
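
To keep track of which combinations have been tested, the two lists above can be expanded into a checklist. The following is a minimal sketch: the labels only paraphrase the bullets in this issue and are not identifiers from any real API, and rows that are not applicable to a given insight would still need to be struck out by hand.

```python
from itertools import product

# Labels paraphrase the bullets in this issue; they are illustrative only.
INSIGHTS = [
    "single-series literal",
    "single-series regexp",
    "5-series literal with file:/lang: filter",
    "5-series regexp with OR and file:/lang: filter",
    "capture group, monorepo only",
    "capture group, all repos",
    "language pie chart, monorepo",
]
SCENARIOS = [
    "filter: include only the monorepo",
    "filter: exclude the monorepo",
    "context that includes the monorepo",
]


def test_matrix(insights=INSIGHTS, scenarios=SCENARIOS):
    """Return every (insight, scenario) pair as a checklist row."""
    return list(product(insights, scenarios))


if __name__ == "__main__":
    for insight, scenario in test_matrix():
        print(f"[ ] {insight} | {scenario}")
```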
When we test this, we should track (roughly) the time to completion of each insight to see whether performance is an order of magnitude worse than for other codebases. As a heuristic, an individual test insight that takes more than 24 hours to run is approaching the upper bound of acceptable performance for a first-class-supported codebase.
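
The timing check could be recorded along these lines. This is a sketch under stated assumptions: the baseline is taken to be the same insight run without the monorepo, "order of magnitude" is read as a 10x slowdown, and the function name `assess` is hypothetical. None of these are specified in this issue beyond the 24-hour heuristic.

```python
from datetime import timedelta

MAX_ACCEPTABLE = timedelta(hours=24)  # heuristic upper bound from this issue
ORDER_OF_MAGNITUDE = 10  # assumption: "order of magnitude worse" means >= 10x


def assess(monorepo_run: timedelta, baseline_run: timedelta) -> dict:
    """Compare one insight's monorepo run time against an assumed baseline run."""
    if baseline_run <= timedelta(0):
        raise ValueError("baseline_run must be a positive duration")
    slowdown = monorepo_run / baseline_run  # timedelta / timedelta gives a float
    return {
        "slowdown": slowdown,
        "order_of_magnitude_worse": slowdown >= ORDER_OF_MAGNITUDE,
        "exceeds_24h": monorepo_run > MAX_ACCEPTABLE,
    }


if __name__ == "__main__":
    # Made-up durations, for illustration only.
    print(assess(timedelta(hours=30), timedelta(hours=2)))
```

Rough timings are enough here, since the goal is only to spot order-of-magnitude differences rather than to benchmark precisely.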