Skip to content

search: Introduce pattern matcher for more complex query analysis

Warren Gifford requested to merge fkling/query-diagnostics into main

Created by: fkling

This PR introduces a new way to define search diagnostics via patter matching. For now I just moved the existing query diagnostics over, and even for that I only did minimal changes to make it work. Over time we would use this for different kinds of "query intelligence" and probably integrate at a different place (e.g. global state).

The big comment in patternMatcher.ts and the tests should provide enough information for what's going on there and why, but here are also the notes I took locally, explaining why the type definitions are the way they are (click the on the summary line to reveal the content). I went with approach 1.

Notes about pattern matching and type representation of union types

Pattern matching

The idea is to implement a system to test whether a given value matches/looks like a given pattern:

matches(obj, {foo: 42})

In this example the result would be true if obj was an object that has a property foo with value 42.

The remainder of this section explains the "structure" of patterns.

Primitives

In its simplest form a pattern is a value of a primitive data type (boolean,number,string,..):

matches(value, 42)
matches(value, true)
matches(value,"foo")

Regular expressions

String input values can also be matched against regular expressions

matches(value, /^ba/)

Objects

Object input values can be matched against object patterns, whereas an object pattern is a subset of the properties of the input object:

matches({foo: 42, bar: 21}, {foo: 42})              // true
matches({foo: 42, bar: 21}, {foo: 42, bar: 21})     // true
matches({foo: 42, bar: 21}, {bar: 42})              // false
matches({foo: 42, bar: 21}, {baz: 21})              // false, but should probably already be caught by Typescript; more below

The property value of an object pattern can be any other valid at that position.

Functions

Any input value can be matched with a function pattern that takes the value as input and returns a boolean value ((input: any) => boolean):

matches(42, x => x > 0)                   // true

As mentioned above for object patterns, object pattern property values can be any valid pattern at that position, so the following is also valid:

matches({foo: 42}, {foo: x => x > 0)       // true

Allowing function makes it possible to create "pattern combinators". Imagine a function. For example:

matches("foo bar", every(str => str.length < 10, /^foo/))
matches(value, {type: oneOf("Foo", "Bar")})

Typescript

We could implement such a system without giving too much thoughts about type safty and simply use any everywhere. But trying to add type safety will help us avoid mistakes, can improve developer experience in the editor and lets us avoid to repeat ourselves.

Creating a generic Pattern type definition would be relatively straightforward (using advances features such as type mapping and conditional types), if it weren't for union types. Union types introduce a certain amount of complexity and ambiguity, which is why the rest of document present various possible approaches.

Consider the following input type:

type Token =
  | {type: 'filter', kind: FilterType, value: string}
  | {type: 'pattern', value: string}

We could now implement Pattern<Token> in different ways:

NOTE: F<X> stands for the function pattern (v: X) => boolean

1. Single object with union properties

type Pattern<Token> = {
  type?: F<'filter'|'pattern'> | 'filter' | 'pattern',
  kind?: F<FilterType|undefined> | FilterType
  value?: F<string> | string
}

Advantage: Function pattern input is typed with the union of the property values Disadvantage: No discriminating union, i.e. the following is a valid pattern:

{
  type: 'pattern',
  kind: ...,
}

Even worse if two members do have the same property but with different (union) types. It's easy to create patterns that will never match.

2. Discriminating union

type Pattern<Token> =
| {
    type?: 'filter'|F<'filter'>
    kind?: FilterType|F<FilterType>
    value?:  string | F<string>
  }
| {
    type?: F<'pattern'> | 'pattern'
    value?: F<string> | string
  }

Advantage: Type refinement possible Disadvantage: When using a function without using a discriminating type, the function input value is typed any so explicit types are necessary:

{
  // F<'filter'>|F<'pattern'>
  type: x => true, // x has implicit any
}

edit: This seems to be the case even with a discriminating property as long as the type of the discriminating property includes a function.

3. Single object with unionized functions + discriminating union

type Pattern<Token> =
| {
    type?: F<'filter'|'pattern'>
    kind?: F<FilterType|undefined>
    value?: F<string>
  }
| {
    type?: 'filter'
    kind?: FilterType
    value?: string
  }
| {
    type?: 'pattern'
    value?: string
  }

Advantage: Type refinement possible; typed functions possible Disadvantage: Cannot use functions when one of the union member is used. If functions where allowed we would have the same problem again as in 2.

4. Single object with unionized properties + discriminating union + explicit property name

If the discriminating property is specified, we can create a single object that contains the union of all properties + individual pattern objects for every member of the union, but making the discriminating property required.

type Pattern<Token> =
| {
    type?: F<'filter'|'pattern'>
    kind?: F<FilterType|undefined> | FilterType
    value?: F<string> | string
  }
| {
    type: 'filter'
    kind?: FilterType | F<FilterType>
    value?: string | F<string>
  }
| {
    type: 'pattern'
    value?: string | F<string>
  }

Advantage: type can be used to discriminate members; all possible patterns can be used for each union member. Disadvantage: The discriminating property has to be specified, i.e. it has to be known upfront whether a type is a union. (or could that be determined automatically?)

5. Others

There are other possible combinations, also using intersection (&) but all other solutions seems to have similar drawbacks.

Merge request reports

Loading