search: differential parser test utility

Administrator requested to merge rvt/diff-test-parser into main

Created by: rvantonder

Putting up for visibility for @keegancsmith and @stefanhengl. I don't think it's high enough value to merge, but it lets me document a couple of inputs to handle and might be of interest.

Run go build in the directory of this file, then pass the string representation of a query as the first argument. The program reports any difference in whether the two parsers consider the query valid (some differences here are to be expected, since the new parser fixes a bunch of issues as per https://github.com/sourcegraph/sourcegraph/issues/8780). For example,

./parser-testing ':'
New parser has weaker validation: old parser reports parse error at character 0: got TokenColon, want expr

These are "soft" cases that are good to know about, but not fundamentally problematic.

The structural equivalence comes down to just comparing a String representation of the field/values map produced by the old parser. In the utility, I have made this condition panic if there's a difference, because this way we can feed inputs from a fuzzer as well. I will first be doing manual queries based on input I collected, and then some fuzz inputs with a corpus based on the collection and test suite inputs.
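To make the two checks concrete, here is a minimal sketch of the comparison logic, not the code in this branch. The `parseFunc` type, `compare`, and the `toyOld`/`toyNew` stand-ins are hypothetical names for illustration only; the real utility calls the actual old and new parsers. The "stronger validation" message is also an assumption, since only the "weaker validation" case is shown above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseFunc is a hypothetical stand-in for either parser: it returns a
// canonical string form of the parsed query, or an error if the query is invalid.
type parseFunc func(query string) (string, error)

// compare runs both parsers on the same input. A validity mismatch is a
// "soft" case and is returned as a report string. A mismatch between the
// string forms panics, so a fuzzer would register the input as a crash.
func compare(oldParse, newParse parseFunc, query string) string {
	oldOut, oldErr := oldParse(query)
	newOut, newErr := newParse(query)
	switch {
	case oldErr != nil && newErr == nil:
		return fmt.Sprintf("New parser has weaker validation: old parser reports %s", oldErr)
	case oldErr == nil && newErr != nil:
		// Assumed wording; this case is not shown in the description.
		return fmt.Sprintf("New parser has stronger validation: new parser reports %s", newErr)
	case oldErr != nil && newErr != nil:
		return "" // both parsers reject the query, nothing to compare
	}
	if oldOut != newOut {
		panic(fmt.Sprintf("-old, +new:\n- %s\n+ %s", oldOut, newOut))
	}
	return ""
}

// toyOld is a toy stand-in (not the real old parser): it rejects a lone
// colon and strips /.../ delimiters from the pattern.
func toyOld(q string) (string, error) {
	if q == ":" {
		return "", errors.New("parse error at character 0: got TokenColon, want expr")
	}
	return fmt.Sprintf("~%q", strings.Trim(q, "/")), nil
}

// toyNew is a toy stand-in (not the real new parser): it accepts
// everything and keeps the pattern verbatim.
func toyNew(q string) (string, error) {
	return fmt.Sprintf("~%q", q), nil
}
```

With these stand-ins, `compare(toyOld, toyNew, ":")` returns the weaker-validation report, while `compare(toyOld, toyNew, "/derp/")` panics because the two string forms differ. Panicking rather than returning an error is what lets a fuzzer treat a structural difference as a crash without extra plumbing.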

Some current, known examples for triggering a difference are:

./parser-testing '/derp/'
panic: -old, +new:   string(
- 	`~"derp"`,
+ 	`~"/derp/"`,
  )
  • Since we don't support the /.../ regex syntax in the new parser yet.
./parser-testing 'asdf.*asdf('
panic: -old, +new:   string(
- 	`~"asdf.*asdf\\("`,
+ 	`:"asdf.*asdf("`,
  )
  • Edit edit: This one's a bug, more info in https://github.com/sourcegraph/sourcegraph/issues/12733. The new parser interprets this as a string based on the : preceding the value, likely because it doesn't perform the autofix escaping of trailing parentheses, but I'm surprised it doesn't see a regex value anyway.
./parser-testing 'content:"yo"'
panic: -old, +new:   string(
- 	`content:"yo"`,
+ 	`:"yo"`,
  )
  • The old parser processes content in a different location in search_results, compared to the new parser that handles it at parse/validation time.
