search: add rule transformation and unquoting to lucky search

Warren Gifford requested to merge rvt/lucky-backend-5x into main

Created by: rvantonder

Stacked on https://github.com/sourcegraph/sourcegraph/pull/36518.

This adds basic scaffolding for applying rules and generating queries under a "lucky" search. See doc comments for vocabulary (what are "rules", etc).

I added a single rule, unquote, in this PR as a first step (there are more, but let's keep this PR focused). The unquote rule attempts to unquote a pattern like "foo", so we search for the unquoted pattern foo after searching for "foo".

In general, how this lucky search behavior works:

  • The input query is always treated "as usual" first: all syntax filters are honored, and the pattern is searched literally (Sourcegraph's current default). You can of course force the pattern to be searched as a regexp with patterntype:regexp. None of the initial behavior changes: we run that query and return its results.

  • If, after running the above query, we've returned fewer than 500 results, things get interesting: we apply a list of rules that transform the query and interpret it in different ways. As a first step, I added an unquote rule, which fires when we detect quoted values in a literal search, like "foo", and then searches for them unquoted, as foo. The user will get results for foo, if they exist. The idea is that these rules surface results that are better than "no results". We will discover and think of more useful rules as we go along.
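The flow in the two bullets above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the names `rule`, `searchFunc`, `luckySearch`, and `resultLimit` are hypothetical, the whole query is treated as a single pattern (no filters), and how generated results are combined with the original ones is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// resultLimit mirrors the 500-result threshold from the description.
const resultLimit = 500

// rule is a hypothetical type: a described transformation that either
// produces a new query or reports that it does not apply.
type rule struct {
	description string
	transform   func(query string) (string, bool)
}

// searchFunc stands in for the real search backend (an assumption).
type searchFunc func(query string) []string

// unquote strips one pair of surrounding double quotes, so "foo" becomes foo.
// It reports false when the query is not quoted.
func unquote(query string) (string, bool) {
	if len(query) >= 2 && strings.HasPrefix(query, `"`) && strings.HasSuffix(query, `"`) {
		return query[1 : len(query)-1], true
	}
	return "", false
}

// luckySearch runs the query as usual first, then applies rules to generate
// further queries while fewer than resultLimit results have been returned.
func luckySearch(query string, rules []rule, search searchFunc) []string {
	results := search(query)
	for _, r := range rules {
		if len(results) >= resultLimit {
			break
		}
		generated, ok := r.transform(query)
		if !ok {
			continue // the rule does not fire for this query
		}
		results = append(results, search(generated)...)
	}
	return results
}

func main() {
	// A toy backend: the quoted pattern matches nothing, the unquoted one does.
	corpus := map[string][]string{`foo`: {"a.go", "b.go"}}
	search := func(q string) []string { return corpus[q] }

	rules := []rule{{description: "unquote patterns", transform: unquote}}
	fmt.Println(luckySearch(`"foo"`, rules, search))
}
```

Keeping each rule as a small function that may decline to fire makes it cheap to add more rules later without touching the driver loop.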

Next rules and directions:

(1) I have rules in the works that will treat all patterns as unordered (so we will automatically apply x AND y after searching for x y). The need to manually unquote, toggle regexp, or use explicit operators should then disappear, hopefully giving you a much better experience. In fact, running the unquote rule has a side effect: it also treats patterns as regexp, which is probably quite OK right now :-).

(2) An important piece of this behavior is informing the user of what was done when we return results for generated queries. For example, if they entered x y and we don't find results for x y, but do for x AND y, then we return an alert that says "your query for x y didn't find any results, but we found results for x AND y". I will add this alerting to this code later.
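To make directions (1) and (2) concrete, here is a rough sketch of what an unordered-patterns rule and the accompanying alert could look like. Neither exists in this PR: `unorderedPatterns` and `alertFor` are hypothetical names, splitting on whitespace is a simplification of real query parsing, and the alert wording is taken from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// unorderedPatterns is a hypothetical rule: it rewrites a space-separated
// pattern like "x y" into "x AND y". It does not fire for single terms or
// for patterns that already contain explicit operators.
func unorderedPatterns(pattern string) (string, bool) {
	terms := strings.Fields(pattern)
	if len(terms) < 2 {
		return "", false
	}
	for _, t := range terms {
		if t == "AND" || t == "OR" || t == "NOT" {
			return "", false
		}
	}
	return strings.Join(terms, " AND "), true
}

// alertFor is a hypothetical helper: it returns an alert only when the
// original query found nothing and the generated query found something.
func alertFor(original, generated string, originalCount, generatedCount int) string {
	if originalCount > 0 || generatedCount == 0 {
		return ""
	}
	return fmt.Sprintf(
		"your query for %s didn't find any results, but we found results for %s",
		original, generated,
	)
}

func main() {
	generated, ok := unorderedPatterns("x y")
	if !ok {
		return
	}
	// Result counts are made up for illustration.
	fmt.Println(alertFor("x y", generated, 0, 3))
}
```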

Test plan

Added a test covering the basic end-to-end flow for rule application and unquote transformation.
