Skip to content

Search backend: introduce `query.Flat`

Administrator requested to merge cc/query-flat into main

Created by: camdencheek

Stacked on https://github.com/sourcegraph/sourcegraph/pull/34771

Currently, we have two query types in the backend:

  • query.Q represents the full tree structure of a query
  • query.Basic restricts tree structure to only the pattern nodes

We split a query.Q into a set of query.Basic so that we can query backends that don't support full tree-shaped queries.

However, we also have an implicit third type, which I'm calling query.Flat. A query.Flat has a flat list of parameters and zero or one pattern nodes. Stated differently, query.Flat is the subset of query.Basic that has no and or or expressions in the pattern node.

There is a certain point in our job planning where the meaning of query.Basic shifts from "the only and/or expressions are in the pattern node" to "there are no and/or expressions anywhere in the query." In particular, ToTextPatternInfo requires that the basic query passed to it is a query.Flat. If this is not the case, it will just silently fail to generate a valid PatternInfo. However, during the optimize step of our job generation, we violate this expectation by passing in a tree-shaped query.Basic in ToSearchJob. This turns out okay because we end up throwing away any jobs that depend on the broken parts of TextPatternInfo, but it leads to a very confusing flow of information.

This PR introduces query.Flat to enforce the flat query shape with our type system. It does this in four steps (represented by the four commits in this PR).

  • Reorganize methods on query.Basic so that all the methods that depend on just the parameters are grouped together
  • Create a new Parameters type that is embedded in query.Basic
  • Move relevant methods from query.Basic to query.Parameters
  • Mint query.Flat, which embeds query.Parameters, allowing it to expose all those methods without duplication.

This allows us to continue using query.Basic as we were before (all the methods moved to Parameters are available through query.Basic because Parameters is embedded).

Proposed next steps: The first thing I'd like to do with this is to make the contract of `toTextParameters` and `ToSearchJob` more explicit. This is not currently possible because, even though parts of those methods depend on the query being a `query.Flat`, we use them with tree-structured `query.Basic` to generate optimized queries as well, and just throw away the non-optimized parts.

So, in order to actually make that possible, I think we also need to shift how we're doing query optimization a bit. I think it would probably be more clear if we generate jobs in a two-step process:

  1. For each basic query in Plan, generate jobs for backends that can use a query.Basic directly. This will be commit search and zoekt jobs.
  2. Decompose query.Basic into a set of query.Flat wrapped in AndJob and OrJob, then generate jobs for backends that require a flat query.

I think this is considerably easier to follow than the current paradigm of:

  1. generate all jobs from query.Flat composed with AndJob and OrJob
  2. generate optimized jobs and replace the unoptimized jobs with NoopJob.

Test plan

Should be semantics-preserving. Depending on tests to catch any typos.

Merge request reports

Loading