LSIF: fix quadratic SQL query
Created by: chrismwendt
Prior to this change
The nearest-commit CTE was very slow on repos with lots of merge commits. That was because the distance
column is different for the same commit on different paths through the commit graph, so the union didn't filter them out of the working table. That means the working table kept increasing in size, resulting in roughly quadratic performance. There was a limit on the distance
, but no limit on the total number of commits visited. The behavior of CTEs is described in https://www.postgresql.org/docs/9.1/queries-with.html
This was causing lsif-server to take upwards of 75s to run the nearest-commit query (I have since temporarily disabled LSIF on Sourcegraph.com).
After this change
The nearest-commit CTE is linear in the number of visited commits, and there's a limit on it.
Performance analysis
Here's a scatter plot of the query time by the distance from the target commit on moby/moby, starting from the tip of master and walking back in history. At each commit, both the old and new queries are run and the response times are recorded.
Script and data: chart.tar.gz
Legend:
-
🔵 old query (left y-axis) -
🔴 new query (left y-axis) -
🔶 ratio of old query ms / new query ms (right y-axis)
The sudden drops occur when the walk reaches a merge-base where the other branch has a much shorter distance to the target commit.
See how the ratio of old/new increases as the distance from the target commit increases? That indicates the old query's response time increases more than a constant factor faster than the new query (I'm guessing the old query is roughly quadratic in the distance from the target commit).
Previous perf issue about this query: https://github.com/sourcegraph/sourcegraph/issues/5939