
insights: [spike] historical backfill compression may misbehave for time frames prior to the earliest index

Created by: coury-clark

The historical backfiller uses a compression mechanism to reduce the number of frames we need to sample based on the commit history of a repository.
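As a rough illustration of the idea (not the actual backfiller code), compression can be thought of as skipping any frame for which no new commits landed since the previous frame. The function name `compress_frames` and its inputs are hypothetical, and the rule "sample only when the commit count changes" is an assumption about how such a mechanism could work:

```python
from bisect import bisect_right
from datetime import datetime


def compress_frames(frame_starts, commit_times):
    """Return the frames that still need sampling (hypothetical sketch).

    A frame is kept if it is the first frame, or if at least one commit
    landed since the previous frame. Other frames are assumed to hold the
    same value as their predecessor and are skipped.
    """
    commits = sorted(commit_times)
    sampled = []
    prev_count = None
    for start in sorted(frame_starts):
        # Number of known commits at or before the start of this frame.
        count = bisect_right(commits, start)
        if prev_count is None or count != prev_count:
            sampled.append(start)
        prev_count = count
    return sampled


frames = [datetime(2022, m, 1) for m in (1, 2, 3, 4)]
commits = [datetime(2022, 1, 15), datetime(2022, 3, 20)]
sampled = compress_frames(frames, commits)
```

Here the March frame would be skipped, because no commit landed between February 1 and March 1.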

Prior to 3.35 we used a fixed historical window of 12 months for all series, matching the guarantee that we have indexed at least that much commit history. As of 3.35, however, the time range can be edited to exceed 12 months of history, which raises some questions about the compression mechanism:

  1. Are frames that predate the oldest index evaluated correctly (i.e. not compressed) in our compression logic?
  2. Do we have a reliable way to determine the oldest frame that has been compressed, and does this differ in any meaningful way from the oldest commit in the repository?
  3. Should we consider indexing the entire repository history, and if so, what would the performance and storage impact be? I believe search-product did some early investigations into this space that we might be able to learn from.
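To make question 1 concrete, here is a minimal sketch of the guard it implies: a frame should only be compressed when the whole interval since the previous frame is covered by the commit index, because the absence of indexed commits before that point proves nothing. The name `can_compress` and the `oldest_indexed` parameter are hypothetical and do not refer to existing code:

```python
from datetime import datetime


def can_compress(prev_start, start, commit_times, oldest_indexed):
    """Decide whether the frame at `start` may be skipped (hypothetical sketch).

    Compression is refused when there is no previous frame, or when the
    interval (prev_start, start] begins before the oldest indexed point,
    since commits may exist there that the index does not know about.
    """
    if prev_start is None or prev_start < oldest_indexed:
        return False
    # Inside the indexed range: compress only if no commit landed in the interval.
    return not any(prev_start < c <= start for c in commit_times)


oldest_indexed = datetime(2021, 6, 1)
commits = [datetime(2021, 8, 10)]

before_index = can_compress(datetime(2021, 1, 1), datetime(2021, 2, 1), commits, oldest_indexed)
quiet_period = can_compress(datetime(2021, 6, 1), datetime(2021, 7, 1), commits, oldest_indexed)
with_commit = can_compress(datetime(2021, 8, 1), datetime(2021, 9, 1), commits, oldest_indexed)
```

Under these assumptions, the frame before the oldest index is always sampled, even though the index shows no commits for it.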