search: add delimiter scanner
Created by: rvantonder
This PR implements a scanner for delimited values (like quoted strings), for arbitrary delimiters. The use case is scanning strings, but also the /.../ regex pattern syntax we currently support (where the delimiter is /).
I thought hard about (re)using the existing scanner code, but I think it would bite me in the long run. The main reason is that I want standalone functions for scanning, so that I can reuse the units in different contexts, e.g., for regex vs. literal search, or to parameterize the parser with different scanners based on heuristics. It's useful if the standalone functions work like DecodeRune, where I get back the 'decoded' value and the amount 'decoded', which is how I've written this. Our existing scanner doesn't expose standalone functions and assumes a tight relationship between different tokens in the input (since each token scanner returns a continuation), which is a fine and elegant approach if we're only dealing with one way to scan the input. But we support different search input interpretations, and standalone functions will help with that complexity.
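To illustrate the DecodeRune-style shape described above, here is a minimal sketch. It is not the code in this PR: the name ScanDelimited, its signature, and the backslash-escape rules are assumptions made for the example. It returns the scanned value together with the number of bytes consumed, so a caller can advance its own position and choose which scanner to try next.

```go
package main

import (
	"errors"
	"unicode/utf8"
)

// ScanDelimited is a hypothetical standalone scanner. It expects buf to
// start with delim and returns the value between the opening and closing
// delimiter, plus the number of bytes consumed (both delimiters included),
// in the same spirit as utf8.DecodeRune returning a rune and its width.
//
// Assumed escape rules (for illustration only): a backslash followed by
// the delimiter or by another backslash yields that character literally;
// any other backslash sequence is kept as written.
func ScanDelimited(buf []byte, delim rune) (string, int, error) {
	r, size := utf8.DecodeRune(buf)
	if size == 0 || r != delim {
		return "", 0, errors.New("input does not start with the delimiter")
	}
	count := size // bytes consumed so far
	var result []rune

	for count < len(buf) {
		r, size = utf8.DecodeRune(buf[count:])
		count += size
		switch {
		case r == delim:
			// Closing delimiter: report the value and how much we consumed.
			return string(result), count, nil
		case r == '\\' && count < len(buf):
			// Look at the escaped character and consume it as well.
			next, nextSize := utf8.DecodeRune(buf[count:])
			count += nextSize
			if next != delim && next != '\\' {
				// Not an escape we interpret: keep the backslash.
				result = append(result, '\\')
			}
			result = append(result, next)
		default:
			result = append(result, r)
		}
	}
	return "", count, errors.New("unterminated delimited value")
}
```

Under these assumptions, scanning `/foo\/bar/ x` with the delimiter / yields the value foo/bar and a count of 10, leaving the caller positioned at the remaining input. Because the function holds no state and returns no continuation, the same unit can be called from a regex-oriented parser or a literal-search parser without either knowing about the other.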
You'll find I use the same structure in #9729 to implement a scanner for patterns. There's a higher-level scanner abstraction here that can lift the duplicated parts out (e.g., the next() helper function, and so on) with higher-order functions. I will be doing that in a later PR: keeping the scanners separate for now means I don't have to overengineer a common abstraction first, and it helps me refine what that abstraction would look like.
Note: this function is not used in this PR, but will be in later PRs.