search: differential parser test utility
Created by: rvantonder
Putting up for visibility for @keegancsmith and @stefanhengl. I don't think it's high enough value to merge, but it lets me document a couple of inputs to handle and might be of interest.
Run `go build` in the directory of this file, then pass the string representation of a query as the first argument. The program reports differences in which queries each parser considers valid or invalid. Some differences here are to be expected, since the new parser fixes a bunch of issues as per https://github.com/sourcegraph/sourcegraph/issues/8780. For example:
./parser-testing ':'
New parser has weaker validation: old parser reports parse error at character 0: got TokenColon, want expr
These are "soft" cases that are good to know about, but not fundamentally problematic.
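The validity comparison described above can be sketched roughly as follows. This is a minimal illustration, not the utility's actual code: `parseFunc`, `compareValidity`, `oldStub`, and `newStub` are hypothetical names, the stubs only imitate the `':'` example, and the "stronger validation" message is an assumed counterpart to the one shown in the output above.

```go
package main

import (
	"errors"
	"fmt"
)

// parseFunc is a hypothetical stand-in for a parser entry point.
// It returns nil when the query is considered valid.
type parseFunc func(query string) error

// compareValidity reports a "soft" difference when exactly one of the two
// parsers rejects the query, and an empty string when they agree.
func compareValidity(query string, oldParse, newParse parseFunc) string {
	oldErr, newErr := oldParse(query), newParse(query)
	switch {
	case oldErr != nil && newErr == nil:
		return fmt.Sprintf("New parser has weaker validation: old parser reports %s", oldErr)
	case oldErr == nil && newErr != nil:
		// Assumed wording; the original output only shows the "weaker" case.
		return fmt.Sprintf("New parser has stronger validation: new parser reports %s", newErr)
	default:
		return ""
	}
}

// oldStub imitates the old parser rejecting a lone colon.
func oldStub(query string) error {
	if query == ":" {
		return errors.New("parse error at character 0: got TokenColon, want expr")
	}
	return nil
}

// newStub imitates a more permissive parser that accepts every input.
func newStub(query string) error { return nil }

func main() {
	if msg := compareValidity(":", oldStub, newStub); msg != "" {
		fmt.Println(msg)
	}
}
```

Reporting these cases as messages rather than failures keeps them visible without stopping a run over many inputs.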
The structural equivalence comes down to just comparing a String representation of the field/values map produced by the old parser. In the utility, I have made this condition panic if there's a difference, because this way we can feed inputs from a fuzzer as well. I will first be doing manual queries based on input I collected, and then some fuzz inputs with a corpus based on the collection and test suite inputs.
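As a rough sketch of the structural check, assuming each parser's result can be reduced to a field/values map: `render` and `mustMatch` are hypothetical names, the map type is an assumption, and the rendering always uses `:` as the separator, whereas the real output also distinguishes operators such as `~`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// render builds a deterministic string from a field/values map by
// sorting the field names, so equal maps always render identically.
func render(fields map[string][]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var parts []string
	for _, k := range keys {
		for _, v := range fields[k] {
			parts = append(parts, fmt.Sprintf("%s:%q", k, v))
		}
	}
	return strings.Join(parts, " ")
}

// mustMatch panics when the two renderings differ. Panicking (rather than
// returning an error) lets a fuzzer treat any difference as a crash.
func mustMatch(oldFields, newFields map[string][]string) {
	oldStr, newStr := render(oldFields), render(newFields)
	if oldStr != newStr {
		panic(fmt.Sprintf("-old, +new:\n- %s\n+ %s", oldStr, newStr))
	}
}

func main() {
	// Identical maps: no panic.
	mustMatch(
		map[string][]string{"repo": {"foo"}},
		map[string][]string{"repo": {"foo"}},
	)

	// Differing maps: recover only to show the panic message here.
	defer func() { fmt.Println("recovered:", recover()) }()
	mustMatch(
		map[string][]string{"content": {"yo"}},
		map[string][]string{"": {"yo"}},
	)
}
```

Comparing rendered strings is coarse, but it needs no knowledge of either parser's internal types, which is enough for a throwaway differential check.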
Some current, known examples for triggering a difference are:
./parser-testing '/derp/'
panic: -old, +new: string(
- `~"derp"`,
+ `~"/derp/"`,
)
- Since we don't support the `/.../` regex syntax in the new parser yet.
./parser-testing 'asdf.*asdf('
panic: -old, +new: string(
- `~"asdf.*asdf\\("`,
+ `:"asdf.*asdf("`,
)
- Edit edit: This one's a bug, more info in https://github.com/sourcegraph/sourcegraph/issues/12733. The new parser interprets this as a string based on the `:` preceding the value, likely because it doesn't perform the autofix escaping of trailing parentheses, but I'm surprised it doesn't see a regex value anyway.
./parser-testing 'content:"yo"'
panic: -old, +new: string(
- `content:"yo"`,
+ `:"yo"`,
)
- The old parser processes `content` in a different location in search_results, compared to the new parser that handles it at parse/validation time.