insights: improve whitespace hints for capture groups

Created by: Joelkw

It's easy to miss whitespace side effects when creating regex capture groups. Some examples:

  • Using a literal space instead of \s: our regex interpreter handles a literal space a bit oddly (sometimes treating it as a wildcard character).
  • Using (.*) instead of ([\d\.]+) for a version string can also capture trailing whitespace, so "1.1 " (with a trailing space) is counted as a separate group from "1.1" (no spaces).
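
The second case can be reproduced with a small sketch. It uses Python's `re` module rather than our own regex interpreter, and the sample lines and the `count_groups` helper are made up for illustration, so treat it as a demonstration of the general effect only:

```python
import re
from collections import Counter

# Hypothetical input: the second line has a trailing space after the version.
lines = ["version = 1.1", "version = 1.1 ", "version = 1.1"]

greedy = re.compile(r"version = (.*)")        # captures everything to end of line
strict = re.compile(r"version = ([\d\.]+)")   # captures only digits and dots


def count_groups(pattern, lines):
    """Count how often each captured value appears, as an aggregation would."""
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


greedy_counts = count_groups(greedy, lines)   # {"1.1": 2, "1.1 ": 1}
strict_counts = count_groups(strict, lines)   # {"1.1": 3}
```

With (.*) the same version shows up as two series; with ([\d\.]+) it collapses into one.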

We should assist users in avoiding these mistakes.

Some concepts to explore:

  1. Warning users if they have either a .* or a literal space present in the query
  2. Automatically stripping trailing (and leading?) spaces from capture group results before aggregation
  3. We'd need to make sure that stripping whitespace doesn't limit the use cases: Python, for example, uses leading whitespace as part of its syntax
  4. The ability to somehow preview the capture groups and possible space collisions before running
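
As a rough sketch of how concepts 1, 2 and 4 might fit together, the snippet below checks a query for the two risky constructs and previews which captured values would collapse into one another if whitespace were stripped. The function names `whitespace_warnings` and `preview_collisions` are hypothetical, the substring checks are a naive heuristic (they do not parse the regex, so escaped spaces or spaces inside character classes are flagged too), and leading whitespace is kept by default because of the Python concern in item 3:

```python
from collections import defaultdict


def whitespace_warnings(query: str) -> list[str]:
    """Return human-readable warnings for whitespace-prone constructs (heuristic)."""
    warnings = []
    if ".*" in query:
        warnings.append("'.*' may capture leading or trailing whitespace")
    if " " in query:
        warnings.append("literal space found; consider '\\s' instead")
    return warnings


def preview_collisions(captured, strip_leading=False):
    """Group captured values by their stripped form.

    Returns only the stripped keys that more than one distinct raw value
    maps to, i.e. the values that differ by whitespace alone.
    """
    variants = defaultdict(set)
    for value in captured:
        # Trailing whitespace is always stripped; leading only on request.
        key = value.strip() if strip_leading else value.rstrip()
        variants[key].add(value)
    return {key: sorted(raw) for key, raw in variants.items() if len(raw) > 1}


warnings = whitespace_warnings(r"version = (.*)")
collisions = preview_collisions(["1.1", "1.1 ", "1.2"])
# collisions == {"1.1": ["1.1", "1.1 "]}
```

A preview like `collisions` could be shown before the insight runs, so the user can decide whether to tighten the capture group or accept automatic stripping.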

One customer's example with this situation can be found here.