search: basic and/or search pattern evaluation (!9381)

Created by: rvantonder

This is the barebones evaluator code for and/or expressions on content search (example screenshot below). It is barebones because it implements only the minimum, in the most naive way, needed to run and/or queries. There are many (many!) opportunities to optimize this functionality, and we should consider them all. But first we should add the essential logic that can always serve as a naive fallback guaranteeing a correct evaluation. Here are some details to make that concrete:

The evaluator works by breaking the search pattern down into leaf nodes, running a search for each leaf, and then merging the results (intersection for and, union for or). For each leaf node search, we tack on the scopeParameters, which are currently global with respect to searching over file content. In future the evaluator will take expressions in scope into account too.
One part still to refine and test is the merge operation for things other than file matches (i.e., common search results, result counts, timing info and so on). Most of this is taken care of naturally by the update function, but there are other parts that I'm not so sure of (e.g., what is the "merge" operation for two search pagination cursors?). This is not immediately important to solve; just pointing it out.
Things that are done naively and can be optimized in future:
- Expressions like (foo or bar) map directly to regex expressions like foo|bar. This is a simple rewrite that we can do on Parameter nodes as a preprocess step.
- Translate our and/or more directly into a Zoekt expression tree on the indexed code path.
- Reorder the tree using heuristics so that we evaluate patterns that are likelier to short-circuit early (applies to both and and or branches).
- Goroutines for evaluating separate branches, maybe.
- And other things I probably am not thinking of right now.
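
As a rough illustration of the first optimization in the list (rewriting `foo or bar` to `foo|bar`), the sketch below collapses the operands of an `or` into a single regular expression. The function name `orToRegexp` is hypothetical, and the sketch assumes every operand is already a valid regex pattern; each operand is wrapped in a non-capturing group so that alternation inside an operand does not change precedence.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// orToRegexp joins the operands of an `or` expression into one alternation.
// Assumption: each pattern is already a valid Go (RE2) regular expression.
func orToRegexp(patterns []string) (*regexp.Regexp, error) {
	if len(patterns) == 0 {
		// An empty alternation would match everything, so reject it.
		return nil, fmt.Errorf("no patterns to combine")
	}
	parts := make([]string, 0, len(patterns))
	for _, p := range patterns {
		parts = append(parts, "(?:"+p+")")
	}
	return regexp.Compile(strings.Join(parts, "|"))
}

func main() {
	re, err := orToRegexp([]string{"foo", "bar"})
	if err != nil {
		panic(err)
	}
	fmt.Println(re.String())                // (?:foo)|(?:bar)
	fmt.Println(re.MatchString("a bar b")) // true
	fmt.Println(re.MatchString("baz"))     // false
}
```

With a rewrite like this, an `or` over leaf patterns needs one search instead of one search per operand followed by a union.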

I'll add tests, either in this PR or in a follow-up, depending on what I want to tackle. The PR is good for review; think of it more as scaffolding than the final product.

Screenie: with this PR, the following kinds of queries mostly work:

It also carries over to structural search, which is pretty neat. There are still a ton of gaps to fill in around things like quoting, parentheses interpretation, and other stuff. But this is the basics.
