insights: [spike] investigate how to record a series non-globally
Created by: coury-clark
Currently, all recorded series are global. This was a mechanism to simplify the logic: the process simply iterates over the entire set of repositories. We have a strong signal and desire to make backfilling these series faster by reducing the set of repositories they execute against, likely using Search Contexts.
This spike is to investigate and propose the requisite architectural changes in the backend to support this behavior.
Some questions we may answer:
- Do we retain a global iterator for series that are actually executed globally?
- Can we cache any data to reduce network calls if we separate series out individually?
- Do we run these concurrently?
- What happens if the set of repositories changes? Currently we would identify it as a new data series. Would changing the set of repos precipitate a new backfill? How does that work if the set of repositories is dynamic (resolved from a context query)?
- How do we observe / get visibility into these executions - does anything change here?
- Which assumptions in the API need to be revisited, and which API changes are needed, to support this? (Many things currently assume global series.)