Skip to content

web: search: major code excerpt refactor (better testing, code structure, etc.)

Administrator requested to merge sg/code-excerpt-simplify into main

Created by: slimsag

Overview

This is a prerequisite requirement for implementing optimizations like #6992 and #16467 in a clean fashion.

It refactors how CodeExcerpt, FileMatchChildren, and FileMatch work together, giving us:

  • Better, more isolated unit-testing of the functionality which breaks line matches up into groups.
  • More tests (added in this PR.)
  • A more clear separation of concerns.

The problem

The code structure behind CodeExcerpt is a bit spaghetti today.

  1. (ok) FileMatch takes the actual individual-line matches returned from the search GraphQL API and passes those to FileMatchChildren.
  2. (ok) FileMatch tells FileMatchChildren exactly how many matches (FileMatchProps.subsetMatches) to show when the file is collapsed (user has not clicked "Show N more matches").
  3. (ok) FileMatchChildren groups the individual line matches into groups based on:
    • The number of desired matches to show (FileMatchProps.subsetMatches)
    • Whether or not the matches are contiguous regions or not.
  4. (ok) FileMatchChildren renders a CodeExcerpt for each group of matches it wants rendered.
  5. (?) FileMatchChildren gives the CodeExcerpt a function to fetch the entire highlighted file instead of giving it the subset of matches and syntax highlighted lines it is responsible for.
  6. (??) CodeExcerpt must calculate how many lines it should display which is very complex.
  7. (???) CodeExcerpt must account for the user-configurable number of context lines, instead of rendering the section of code the parent told it. 8 (???!) CodeExcerpt must account for adjacent CodeExcerpts

In short, the actual logic for how we go from "individual line matches -> groups of matches" is today split across multiple locations:

  • FileMatchChildren
  • CodeExcerpt
  • FileMatchContext

A change in any of the three can break the behavior in totality, and it is quite brittle / hard to test.

Solution

This change lifts the logic for "individual line matches -> groups of matches" out of the three components and into a single function:

/**
 * Calculates how to group together matches for display. Takes into account:
 *
 * - Whether or not displaying a subset or all matches is desired
 * - A surrounding number of context lines to display
 * - Whether or not context lines have (or do not have) matches within them
 * - Grouping based on whether or not there is overlapping or adjacent context.
 *
 * @param matches The matches to split into groups.
 * @param maxMatches The maximum number of matches to show, or 0 for all.
 * @param context The number of surrounding context lines to show for each match.
 */
export const calculateMatchGroups = (matches: IMatchItem[], maxMatches: number, context: number): IGroupedMatches

Thus making the behavior of the components here very straightforward:

  • FileMatch renders a file match by creating FileMatchChildren.
  • FileMatchChildren takes the search results and sends them to calculateMatchGroups, rendering a CodeExcerpt for each group.
  • Each CodeExcerpt is now a simple, straightforward component only responsible for the excerpt of code it is rendering.
    • No longer responsible for splitting the excerpt of code it is rendering out of the entire file.
    • No longer responsible for caring about adjacent excerpts of code and performing complex end-of-excerpt calculations.
    • No longer is the parent partially responsible for splitting, and passing "fun" parameters like this to the CodeExcerpt:
          /** The highest line number among the subset matches. */
          lastSubsetMatchLineNumber: number

Future improvements & confidence in this change

I think calculateMatchGroups can be made a lot more straightforward in how it operates. The logic is sound, but the way it is written is convoluted. I initially tried to tackle this, but due to the lack of tests and isolation I ended up wasting ~7d here attempting to rewrite this logic and discovering numerous edge cases that needed to be handled when doing side-by-side comparisons.

This change is mostly just moving the logic that previously existed into another location, so there are zero behavior changes here and I have done extensive automated, and manual, testing to ensure that is the case.

Now that this code is isolated and heavily tested, we should be able to do some more fearless refactoring of it to make it more legible.

Helps #6992 and #16467

Signed-off-by: Stephen Gutekanst [email protected]

Merge request reports

Loading