Backend: fix instances of `tx.Done(err)` referencing wrong `err` (!35656) · Merge requests · Administrator / sourcegraph

Administrator requested to merge cc/fix-transaction-errors into main May 18, 2022

Created by: camdencheek

Hi @sourcegraph/backend-devs! This is a PSA PR from your friendly neighborhood me.

I ran into a bug caused by a tx.Done(err) that was referencing the wrong error. It was subtle and something super easy to miss, so I did a quick audit of our codebase to find other instances where this might be happening. The audit turned up a handful of places that could be susceptible to this same issue, all fixed by this PR.

I don't think we can create a lint check for this, so instead I'm pinging you all to raise awareness so we can hopefully avoid introducing more of these in the future.

In case you want to learn more about the problem (I think it's an interesting exploration of go's variable scoping rules)

In our codebase, database transactions must be explicitly closed, otherwise we will leak transactions, which is not good. A very common pattern to avoid that is defer-ing tx.Done(err) so we can guarantee that tx.Done(err) will always be called.

For those unfamiliar with the semantics of tx.Done(err), it will either commit or roll back the transaction depending on whether err == nil. It returns the error that was passed to it if non-nil or any error returned from calling commit/rollback.

The problem comes along when the err variable referenced by tx.Done(err) is not set. It's really easy to do this by re-declaring err and updating the new err rather than the old err referenced by tx.Done(err).

func doTheThing(ctx context.Context, db database.DB) error {
	tx, err := db.Transact(ctx)
	if err != nil {
		return err
	}
	defer func() { err = tx.Done(err) }()

	err2 := tx.DoTheThing(ctx)
	if err2 != nil {
		// Even though we errored, this will not roll the
		// transaction back because `err` is still `nil`.
		return err2
	}
	
	return nil
}

You might be thinking "But Camden, I always use err as my variable name, so I'm fine!" Consider the following:

func getTheThingAndMaybeDo(ctx context.Context, db database.DB) error {
	tx, err := db.Transact(ctx)
	if err != nil {
		return err
	}
	defer func() { err = tx.Done(err) }()

	// Believe it or not, the following line does not redeclare `err`,
	// so this will work as expected.
	// https://go.dev/ref/spec#Short_variable_declarations
	got, err := tx.GetTheThing(ctx)
	if err != nil {
		return err
	}

	if got.IsTruthy() {
		// However, since we're in a new scope here, this `err` is a new
		// variable. Confusing, right? If we get a non-nil error here,
		// the transaction will still be committed because the `err` 
		// on the first line of this function has not been changed.
		err := tx.DoTheThing(ctx)
		if err != nil {
			return err
		}
	}
	
	return nil
}

Okay, so how can we commit/rollback transactions correctly? Try the following:

func getTheThingAndMaybeDo(ctx context.Context, db database.DB) (err error) {
	// declare err in the function signature ----------------^^^

	tx, err := db.Transact(ctx)
	if err != nil {
		return err
	}
	// This `err` refers to the `err` declared in the function signature. 
	// As a bonus, setting `err` in this line actually overwrites the 
	// returned error, so we aren't silently ignoring any errors returned
	// from the database like in the other examples.
	defer func() { err = tx.Done(err) }()

	got, err := tx.GetTheThing(ctx)
	if err != nil {
		return err
	}

	if got.IsTruthy() {
		err := tx.DoTheThing(ctx)
		if err != nil {
			// Even though the `err` here is not referring to the same 
			// variable as the `err` in the function signature, by returning
			// an error, you are implicitly setting the `err` in the function
			// signature, which means that `tx.Done` will see the updated
			// error value.
			return err
		}
	}

	return nil
}

Test plan

Depending on existing tests.

Backend: fix instances of `tx.Done(err)` referencing wrong `err`

Test plan

Merge request reports