batches: improve performance of viewerBatchChangesCodeHosts (!37305) · Merge requests · Warren Gifford / sourcegraph

Warren Gifford requested to merge aharvey/code-host-resolver into main Jun 16, 2022

Created by: LawnGnome

This addresses the other significant hotspot in #37246 (closed) — namely, that retrieving the code hosts for a batch spec (and, in turn, a batch change) was slow.

This turns out to be because we hydrated every changeset spec attached to the batch spec. The actual database query isn't so bad there, but the real cost is in unmarshalling the JSON blob out of the spec field into the ChangesetSpec. It turns out we don't actually need that, since the only reason we ever hydrated the changeset specs was to access their repo IDs to pass into another query. 🤦

Anyway, this PR adds a new store method that, given a batch spec ID, returns the repo IDs that the user has access to. (Added fun: we probably leaked code hosts the user didn't have access to before this. Cool.) We still feed this into another query, so in theory we could actually make this faster still by merging these into one big CTE thing, but I looked at the code host query this feeds into and remembered how awful it was, so we're going with this for now.

Performance on my 195 changeset batch change that adds just under a million lines to 195 different repos is significantly improved — the (every five second) calls when previewing the batch spec or viewing the actual batch change went from ~4 seconds per call to 30-60 ms. Not a typo.

I also fixed up a typo-ed type name along the way, and opportunistically renamed ct to bt in the store test file I was already in.

Since #37250 has landed, merging this fixes #37246 (closed).

Possible future work

Merge those two queries into one big CTE thing, which might actually give you CTE as you bang your head against the screens of SQL involved.
Modify ChangesetSpec to only unmarshal Spec on the first request, which I started doing, but then decided I didn't care enough about right now. We should keep an eye on this in case it's a theme that recurs again later, though.

Test plan

We have decent coverage of this resolver method, and none of those tests had to change, which is usually the sign of a good performance improvement. Beyond that, store tests were added, and some manual testing was performed both in browser and testing the performance of the GraphQL resolvers.

batches: improve performance of viewerBatchChangesCodeHosts

Possible future work

Test plan

Merge request reports