Search backend: introduce `query.Flat`
Created by: camdencheek
Stacked on https://github.com/sourcegraph/sourcegraph/pull/34771
Currently, we have two query types in the backend:
-
query.Qrepresents the full tree structure of a query -
query.Basicrestricts tree structure to only the pattern nodes
We split a query.Q into a set of query.Basic so that we can query
backends that don't support full tree-shaped queries.
However, we also have an implicit third type, which I'm calling query.Flat.
A query.Flat has a flat list of parameters and zero or one pattern nodes.
Stated differently, query.Flat is the subset of query.Basic that has no
and or or expressions in the pattern node.
There is a certain point in our job planning where the meaning of query.Basic
shifts from "the only and/or expressions are in the pattern node" to "there
are no and/or expressions anywhere in the query." In particular, ToTextPatternInfo
requires that the basic query passed to it is a query.Flat. If this is not the
case, it will just silently fail to generate a valid PatternInfo. However, during
the optimize step of our job generation, we violate this expectation by passing
in a tree-shaped query.Basic in ToSearchJob. This turns out okay because we
end up throwing away any jobs that depend on the broken parts of TextPatternInfo,
but it leads to a very confusing flow of information.
This PR introduces query.Flat to enforce the flat query shape with our type
system. It does this in four steps (represented by the four commits in this PR).
- Reorganize methods on
query.Basicso that all the methods that depend on just the parameters are grouped together - Create a new
Parameterstype that is embedded inquery.Basic - Move relevant methods from
query.Basictoquery.Parameters - Mint
query.Flat, which embedsquery.Parameters, allowing it to expose all those methods without duplication.
This allows us to continue using query.Basic as we were before (all the methods
moved to Parameters are available through query.Basic because Parameters is embedded).
Proposed next steps:
The first thing I'd like to do with this is to make the contract of `toTextParameters` and `ToSearchJob` more explicit. This is not currently possible because, even though parts of those methods depend on the query being a `query.Flat`, we use them with tree-structured `query.Basic` to generate optimized queries as well, and just throw away the non-optimized parts.So, in order to actually make that possible, I think we also need to shift how we're doing query optimization a bit. I think it would probably be more clear if we generate jobs in a two-step process:
- For each basic query in
Plan, generate jobs for backends that can use aquery.Basicdirectly. This will be commit search and zoekt jobs. - Decompose
query.Basicinto a set ofquery.Flatwrapped inAndJobandOrJob, then generate jobs for backends that require a flat query.
I think this is considerably easier to follow than the current paradigm of:
- generate all jobs from
query.Flatcomposed withAndJobandOrJob - generate optimized jobs and replace the unoptimized jobs with
NoopJob.
Test plan
Should be semantics-preserving. Depending on tests to catch any typos.
Merge request reports
Activity
Created by: rvantonder
Review: Approved
Nit: this PR doesn't need/use
query.Flatright?, the changes make sense to just change the methods to work on Parameters since that's the real change (nice btw!).About
query.Flat, makes sense for next steps andif we generate jobs in a two-step process: ... I think this is considerably easier to follow than the current paradigm of:
My take is that, this is actually going to be quite a bit of effort to do correctly. I encourage it, and agree it's going to be easier/more desirable than the current paradigm. But the current paradigm exists for a reason, because the former is... IMO, going to be lots of effort/difficult. I would say go for it, and be OK with the fact that it may not be a straight path to get there (lots of shuffling/concerns). The main thing I see as troublesome is effectively identifying the type of job/backend based on query structure, which is encoded in
ToSearchJob, and needs to get lifted.Please also consider structuring the code in such a way, that it would be easy to migrate the generic structure (expressions over
query.Flat) to something specific (likequery.Basic+ backendQueryToFooQuery). For example, right now structural search relies on the generic structure, but will ideally rely on its ownQueryToStructuralSearchQuery. I'd like the code path to allow doing that easily. Right now it's more "convoluted", in terms of tree replacements, but it's also "generically" supported, so I'd opt that the two-step process have generic support for casing out on optimized/tree type. I think that's what you'd shoot for anyway, but just want to give a concrete need/example to make it explicit.