search: Introduce pattern matcher for more complex query analysis
Created by: fkling
This PR introduces a new way to define search diagnostics via patter matching. For now I just moved the existing query diagnostics over, and even for that I only did minimal changes to make it work. Over time we would use this for different kinds of "query intelligence" and probably integrate at a different place (e.g. global state).
The big comment in patternMatcher.ts
and the tests should provide enough information for what's going on there and why, but here are also the notes I took locally, explaining why the type definitions are the way they are (click the on the summary line to reveal the content). I went with approach 1.
Notes about pattern matching and type representation of union types
Pattern matching
The idea is to implement a system to test whether a given value matches/looks like a given pattern:
matches(obj, {foo: 42})
In this example the result would be true if obj was an object that has a property
foo
with value 42
.
The remainder of this section explains the "structure" of patterns.
Primitives
In its simplest form a pattern is a value of a primitive data type (boolean,number,string,..):
matches(value, 42)
matches(value, true)
matches(value,"foo")
Regular expressions
String input values can also be matched against regular expressions
matches(value, /^ba/)
Objects
Object input values can be matched against object patterns, whereas an object pattern is a subset of the properties of the input object:
matches({foo: 42, bar: 21}, {foo: 42}) // true
matches({foo: 42, bar: 21}, {foo: 42, bar: 21}) // true
matches({foo: 42, bar: 21}, {bar: 42}) // false
matches({foo: 42, bar: 21}, {baz: 21}) // false, but should probably already be caught by Typescript; more below
The property value of an object pattern can be any other valid at that position.
Functions
Any input value can be matched with a function pattern that takes the value as input and
returns a boolean value ((input: any) => boolean
):
matches(42, x => x > 0) // true
As mentioned above for object patterns, object pattern property values can be any valid pattern at that position, so the following is also valid:
matches({foo: 42}, {foo: x => x > 0) // true
Allowing function makes it possible to create "pattern combinators". Imagine a function. For example:
matches("foo bar", every(str => str.length < 10, /^foo/))
matches(value, {type: oneOf("Foo", "Bar")})
Typescript
We could implement such a system without giving too much thoughts about type safty
and simply use any
everywhere. But trying to add type safety will help us avoid
mistakes, can improve developer experience in the editor and lets us avoid to repeat
ourselves.
Creating a generic Pattern
type definition would be relatively straightforward
(using advances features such as type mapping and conditional types), if it weren't
for union types. Union types introduce a certain amount of complexity and
ambiguity, which is why the rest of document present various possible approaches.
Consider the following input type:
type Token =
| {type: 'filter', kind: FilterType, value: string}
| {type: 'pattern', value: string}
We could now implement Pattern<Token>
in different ways:
NOTE: F<X>
stands for the function pattern (v: X) => boolean
1. Single object with union properties
type Pattern<Token> = {
type?: F<'filter'|'pattern'> | 'filter' | 'pattern',
kind?: F<FilterType|undefined> | FilterType
value?: F<string> | string
}
Advantage: Function pattern input is typed with the union of the property values Disadvantage: No discriminating union, i.e. the following is a valid pattern:
{
type: 'pattern',
kind: ...,
}
Even worse if two members do have the same property but with different (union) types. It's easy to create patterns that will never match.
2. Discriminating union
type Pattern<Token> =
| {
type?: 'filter'|F<'filter'>
kind?: FilterType|F<FilterType>
value?: string | F<string>
}
| {
type?: F<'pattern'> | 'pattern'
value?: F<string> | string
}
Advantage: Type refinement possible
Disadvantage: When using a function without using a discriminating type, the
function input value is typed any
so explicit types are necessary:
{
// F<'filter'>|F<'pattern'>
type: x => true, // x has implicit any
}
edit: This seems to be the case even with a discriminating property as long as the type of the discriminating property includes a function.
3. Single object with unionized functions + discriminating union
type Pattern<Token> =
| {
type?: F<'filter'|'pattern'>
kind?: F<FilterType|undefined>
value?: F<string>
}
| {
type?: 'filter'
kind?: FilterType
value?: string
}
| {
type?: 'pattern'
value?: string
}
Advantage: Type refinement possible; typed functions possible Disadvantage: Cannot use functions when one of the union member is used. If functions where allowed we would have the same problem again as in 2.
4. Single object with unionized properties + discriminating union + explicit property name
If the discriminating property is specified, we can create a single object that contains the union of all properties + individual pattern objects for every member of the union, but making the discriminating property required.
type Pattern<Token> =
| {
type?: F<'filter'|'pattern'>
kind?: F<FilterType|undefined> | FilterType
value?: F<string> | string
}
| {
type: 'filter'
kind?: FilterType | F<FilterType>
value?: string | F<string>
}
| {
type: 'pattern'
value?: string | F<string>
}
Advantage: type
can be used to discriminate members; all possible patterns can
be used for each union member.
Disadvantage: The discriminating property has to be specified, i.e. it has to be
known upfront whether a type is a union. (or could that be determined automatically?)
5. Others
There are other possible combinations, also using intersection (&
) but all
other solutions seems to have similar drawbacks.