ssbc: Fix duplicate changeset specs
Created by: eseliger
Okay, this has been quite a journey to discover. It happened occasionally on k8s when updates were rolling out. Here's what happens:
- `A` has job `Z`
- `A` cannot send heartbeats anymore
- `A` doesn't know about this yet, as it hasn't had a successful heartbeat again so far
- `A` continues to work as frontend is down
- `B` gets assigned job `Z`, overwriting the `worker_hostname` column and thus now owning the job
- `A` is done, and has not yet learned that it doesn't own the record anymore (no successful heartbeat yet)
- `A` calls `MarkComplete({workerHostname: A})`
- That call reaches `dbworkerstore.MarkComplete`, which returns `false, nil` (not updated, but no DB err)
- The `ok` value is not checked, so the transaction is committed since `err` is `nil`, and the changeset spec is committed to the DB
- `A` gives up
- `B` completes work and calls `MarkComplete({workerHostname: B})`, this time returning `true, nil`, since the hostname matches
- The `workspace.changeset_specs` field is overwritten and includes the new ID, but the old stray changeset spec remains

We now properly revert the transactions, test this in code, and also do one query less, which was not needed.
Closes https://github.com/sourcegraph/sourcegraph/issues/36918
Extended test suite.