Search backend: rewrite `file:contains.content()` predicate as a job filter

Created by: camdencheek

This rewrites the `file:contains.content()` predicate as a job. It is the last predicate that relies on the predicate expansion step, so that step can be removed in a follow-up.

Previously, we expanded `file:contains.content()` in the query before running it. A query like `file:contains.content(test) abc` would expand to something like `((repo:a and file:b) or (repo:c and file:d)) abc`. The expanded query can grow very large, and we ran into stack overflows and OOMs when trying to process it.

Now, we take a query like `file:contains.content(test) abc` and convert it to `test and abc`, which can be evaluated efficiently by zoekt (searcher still does not support conjunctions). This doesn't return exactly the results we want, because the results also include matches for `test` itself. So we add a filter job to the pipeline that post-processes the returned results and removes any ranges that match the `file:contains.content()` argument.
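To illustrate the filtering step, here is a minimal sketch of the idea, not the code in this PR. The `Range` and `FileMatch` types and the `filterContainsRanges` function are hypothetical stand-ins for the real result types and filter job. Dropping files that end up with no ranges is also an assumption of the sketch.

```go
package main

import "regexp"

// Range is a hypothetical match range: byte offsets [Start, End) into the file content.
type Range struct {
	Start, End int
}

// FileMatch is a hypothetical, simplified search result for one file.
type FileMatch struct {
	Path    string
	Content string
	Ranges  []Range
}

// filterContainsRanges removes every range whose matched text is matched by
// the predicate pattern (the argument of file:contains.content()).
// Assumptions of this sketch: ranges are valid offsets into Content, and
// files left with no ranges are dropped from the results.
func filterContainsRanges(matches []FileMatch, predicatePattern string) ([]FileMatch, error) {
	predicate, err := regexp.Compile(predicatePattern)
	if err != nil {
		return nil, err
	}

	var out []FileMatch
	for _, m := range matches {
		var kept []Range
		for _, r := range m.Ranges {
			// Keep only ranges that were NOT produced by the predicate argument.
			if !predicate.MatchString(m.Content[r.Start:r.End]) {
				kept = append(kept, r)
			}
		}
		if len(kept) > 0 {
			out = append(out, FileMatch{Path: m.Path, Content: m.Content, Ranges: kept})
		}
	}
	return out, nil
}
```

For a file with content `test abc` and ranges covering both `test` and `abc`, calling `filterContainsRanges(matches, "test")` in this sketch keeps only the range covering `abc`.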

Stacked on https://github.com/sourcegraph/sourcegraph/pull/38988

Test plan

Added unit tests, existing backend integration tests, and manual testing.
