search: introduce concat operator
Created by: rvantonder
This PR introduces a `concat` operator in the parse tree. It is not an operator that a user can specify. Rather, it delineates which part of the query contains search patterns (or trees containing search patterns) versus other parameters (typically filters, like `repo:foo`). The main semantic distinction is that order is significant for search patterns, whereas other parameters can be unordered. It's important to delineate ordered (or concatenated) search patterns, because we have established use cases for interpreting a query like `foo bar baz` as any of `foo(.*)?bar(.*)?baz`, `foo bar baz`, or `foo\s+bar\s+baz`, and in the future perhaps `foo and bar and baz`, depending on search mode or user preference.
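To make the interpretations above concrete, here is a minimal sketch of how an ordered list of patterns could be turned into each form. The `Mode` type, its constants, and `interpretConcat` are hypothetical names used only for illustration; they are not part of this PR.

```go
package main

import (
	"regexp"
	"strings"
)

// Mode is a hypothetical selector for how concatenated patterns are interpreted.
type Mode int

const (
	ModeLiteral    Mode = iota // foo bar baz
	ModeFuzzy                  // foo(.*)?bar(.*)?baz
	ModeWhitespace             // foo\s+bar\s+baz
)

// interpretConcat joins ordered search patterns into one regular expression
// string. Each pattern is escaped first, so it matches literally.
func interpretConcat(patterns []string, mode Mode) string {
	quoted := make([]string, len(patterns))
	for i, p := range patterns {
		quoted[i] = regexp.QuoteMeta(p)
	}
	switch mode {
	case ModeFuzzy:
		return strings.Join(quoted, `(.*)?`)
	case ModeWhitespace:
		return strings.Join(quoted, `\s+`)
	default:
		return strings.Join(quoted, " ")
	}
}
```

For example, `interpretConcat([]string{"foo", "bar", "baz"}, ModeFuzzy)` yields `foo(.*)?bar(.*)?baz`. All three forms depend on the order of the patterns, which is the information the `concat` node preserves.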
We currently treat search patterns as ordered by default (unlike other code search engines, which typically interpret all whitespace as `and` unless quotes imply an ordering, as in `"search pattern"`). We would like to preserve this ordering without quotation (it's more natural), but that means we must partition whitespace-separated terms into those that imply an ordering and those that do not. Parameters that do not imply an ordering instead imply `and` in the usual sense, as we currently do for queries like `repo:foo file:bar`. This PR takes care of the conversion and simplification of concatenated terms versus other terms. See the `partitionParameters` docstring for more.
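The partitioning described above can be sketched roughly as follows. This is an illustration under simplified assumptions, not the actual `partitionParameters` implementation: the `Node`, `Pattern`, `Parameter`, and `Operator` types and the `partition` function are hypothetical stand-ins for the real parse tree.

```go
package main

import "strings"

// Node is a hypothetical, simplified parse tree node.
type Node interface{ String() string }

// Pattern is a search pattern; its position relative to other patterns matters.
type Pattern struct{ Value string }

// Parameter is a field:value filter such as repo:foo; order does not matter.
type Parameter struct{ Field, Value string }

// Operator groups operands under "and" or "concat".
type Operator struct {
	Kind     string
	Operands []Node
}

func (p Pattern) String() string   { return p.Value }
func (p Parameter) String() string { return p.Field + ":" + p.Value }
func (o Operator) String() string {
	parts := make([]string, len(o.Operands))
	for i, n := range o.Operands {
		parts[i] = n.String()
	}
	return "(" + o.Kind + " " + strings.Join(parts, " ") + ")"
}

// partition separates whitespace-separated terms into unordered parameters
// and ordered patterns. Two or more patterns are grouped under one concat
// node, and the result is combined with the parameters under and.
func partition(nodes []Node) Node {
	var params, patterns []Node
	for _, n := range nodes {
		if _, ok := n.(Parameter); ok {
			params = append(params, n)
		} else {
			patterns = append(patterns, n)
		}
	}
	switch len(patterns) {
	case 0:
		// Nothing to concatenate.
	case 1:
		// Simplify: a single pattern needs no concat wrapper.
		params = append(params, patterns[0])
	default:
		params = append(params, Operator{Kind: "concat", Operands: patterns})
	}
	if len(params) == 1 {
		return params[0]
	}
	return Operator{Kind: "and", Operands: params}
}
```

Under these assumptions, the terms of `repo:foo foo bar file:bar` would print as `(and repo:foo file:bar (concat foo bar))`: the filters are combined with `and`, while `foo` and `bar` keep their relative order inside `concat`.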
Resulting queries are not necessarily valid or supported (in fact, the concat operator makes it easier to validate and alert on unexpected or nonsensical queries). Refer to RFC 94 to better understand what will be supported initially.