
Tool idea: Fetch files and benchmark time for highlighting

Created by: varungandhi-src

Honeycomb lets you download data as CSVs (limited to 1000 rows; I don't know whether the API has a similar limit). We currently record the repo, the revision, and the filepath, which is enough information to fetch the exact file that was highlighted.

It would be helpful to have a tool which parses the CSV, fetches the files, runs the highlighter on the files locally, and adds those timings (maybe as mean + stdev?) as columns to a new CSV.
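A minimal sketch of what such a tool could look like, assuming the exported CSV has columns named `repo`, `revision`, and `filepath` (hypothetical names; the real Honeycomb export may differ). The `fetch_file` and `highlight` callables are placeholders supplied by the caller, not existing APIs: one would fetch the file contents at the given revision, the other would invoke the local highlighter.

```python
import csv
import statistics
import time
from typing import Callable, Tuple


def time_highlight(highlight: Callable[[str], object], source: str,
                   runs: int = 5) -> Tuple[float, float]:
    """Return (mean, stdev) of wall-clock milliseconds over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        highlight(source)
        samples.append((time.perf_counter() - start) * 1000.0)
    # statistics.stdev needs at least two samples.
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


def add_local_timings(in_path: str, out_path: str,
                      fetch_file: Callable[[str, str, str], str],
                      highlight: Callable[[str], object],
                      runs: int = 5) -> int:
    """Copy the CSV at in_path to out_path with two extra timing columns.

    Assumes (hypothetically) that the input has `repo`, `revision`, and
    `filepath` columns. Returns the number of rows written.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["local_mean_ms", "local_stdev_ms"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        rows = 0
        for row in reader:
            source = fetch_file(row["repo"], row["revision"], row["filepath"])
            mean_ms, stdev_ms = time_highlight(highlight, source, runs)
            row["local_mean_ms"] = f"{mean_ms:.3f}"
            row["local_stdev_ms"] = f"{stdev_ms:.3f}"
            writer.writerow(row)
            rows += 1
    return rows
```

Passing `fetch_file` and `highlight` in as arguments keeps the timing loop independent of how files are actually fetched or which highlighter is being measured, so the same loop could wrap whatever benchmarking code ends up being reused.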

We already have a bunch of code for benchmarking highlighting in https://github.com/sourcegraph/sourcegraph/tree/vg/syntect-perf; maybe we can reuse some of that.

A data analysis script can then consume the new CSV to create plots, such as comparisons of prod vs. dev timings. This would help us investigate sluggish highlighting in prod.
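As a rough sketch of the analysis side, the snippet below computes the per-file ratio of prod time to locally measured time, which is the kind of number a prod vs. dev plot would show. It assumes the CSV has one column of prod timings and one of local timings; both default column names (`prod_duration_ms`, `local_mean_ms`) are hypothetical, and plotting itself is left out.

```python
import csv
import statistics
from typing import Dict, Optional


def summarize_slowdown(csv_path: str,
                       prod_col: str = "prod_duration_ms",
                       local_col: str = "local_mean_ms") -> Optional[Dict[str, float]]:
    """Summarize prod/local timing ratios; column names are assumptions."""
    ratios = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                prod = float(row[prod_col])
                local = float(row[local_col])
            except (ValueError, TypeError):
                continue  # skip rows with missing or non-numeric timings
            if local > 0:
                ratios.append(prod / local)
    if not ratios:
        return None
    return {
        "count": len(ratios),
        "median_ratio": statistics.median(ratios),
        "max_ratio": max(ratios),
    }
```

A median ratio well above 1 would suggest that prod is slower than a local run on the same files, which points at the environment rather than the inputs.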