search: Implement `repo:description(...)` predicate
Created by: tbliu98
Implements https://github.com/sourcegraph/sourcegraph/issues/30733
This PR introduces the `repo:description(...)` predicate. This predicate accepts a valid regular expression as an argument and filters the list of repos returned from the database to only those that match the given regular expression.
The regex passed to the predicate is transformed into a "fuzzy" regex during job creation. For example, in `repo:description(go package)`, `go package` is transformed to `(?:go).*?(?:package)` before being added to the database query. I anticipate that in practice, a user searching for `repo:description(go package)` probably wants a result set broader than only the repos that contain the exact string `go package` in their description. For example, the repo `github.com/hashicorp/go-multierror` has the description "A Go (golang) package for representing a list of errors as a single error." The regex `go package` will not match this description, but I imagine a user would expect `repo:description(go package)` to match that repo.
The added filter on the database query uses either the case-insensitive regex operator `~*` or the `LIKE` operator to filter on the `repo.description` column, both of which can make use of the trigram index on that column. The index also supports a similarity operator, `%`, which only returns rows with a similarity score higher than a given threshold (I believe the default is `0.3`). However, when running test queries against the cloud database, I found that `%` was far slower than `~*` or `LIKE`, and (subjectively) the results were not as close to what I think real-life users would expect. I'm happy to share the outputs of those test queries if people are curious.
I mainly used https://github.com/sourcegraph/sourcegraph/pull/31577 and https://github.com/sourcegraph/sourcegraph/pull/35374 as references for implementing the front end client changes. Would appreciate a close look at those files!
Test plan
Unit tests, integration tests, manual end-to-end tests.