migrations: Increase LockTimeout to wait 5min for migrations
Created by: mrnugget
This fixes #4208 by increasing the LockTimeout from the default 15s to 5min.
As explained in the comment, when two processes (`frontend` and `management-console`) race each other to run the migrations, only the first one gets the database lock. The other processes are blocked, waiting to get the lock.
The "waiting for lock" can happen in two places:
1. when instantiating `*migrate.Migrate` (because that checks that the version table exists atomically with the same database lock)
2. when running the migrations
The problem in #4208 was that `management-console` was blocked at (2), while `frontend` was running the migrations. Since the customer in question had a larger instance, it was not unlikely that the migration ran for longer than 15s, the default timeout. Once the 15s were over, `management-console` produced the error reported in the ticket.
This change increases the timeout to 5min, which seems like a reasonable upper bound for migration times, even for our larger instances.
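To illustrate the failure mode and the effect of a longer timeout, here is a minimal, standard-library-only sketch. It does not use the real migration library: `dbLock`, `raceForLock`, and `ErrLockTimeout` are hypothetical names, the lock is simulated with a channel, and the durations are scaled down to milliseconds.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrLockTimeout is a hypothetical stand-in for the lock-timeout error
// reported in the ticket.
var ErrLockTimeout = errors.New("timeout: can't acquire database lock")

// dbLock simulates an exclusive database lock with a channel of capacity 1.
type dbLock chan struct{}

func newDBLock() dbLock { return make(dbLock, 1) }

// acquire waits up to timeout for the lock and returns ErrLockTimeout
// if it could not be taken in time.
func (l dbLock) acquire(timeout time.Duration) error {
	select {
	case l <- struct{}{}:
		return nil
	case <-time.After(timeout):
		return ErrLockTimeout
	}
}

// release frees the lock for the next waiter.
func (l dbLock) release() { <-l }

// raceForLock simulates two processes: the first holds the lock for
// migrationMs (it is "running the migrations"), the second waits for the
// lock for at most lockTimeoutMs. It returns the second process's result.
func raceForLock(migrationMs, lockTimeoutMs int) error {
	lock := newDBLock()
	timeout := time.Duration(lockTimeoutMs) * time.Millisecond

	// First process: the lock is free, so this succeeds immediately.
	if err := lock.acquire(timeout); err != nil {
		return err
	}
	go func() {
		time.Sleep(time.Duration(migrationMs) * time.Millisecond)
		lock.release()
	}()

	// Second process: blocked until the first releases or the timeout fires.
	err := lock.acquire(timeout)
	if err == nil {
		lock.release()
	}
	return err
}

func main() {
	// Migration outlasts the timeout: the waiting process gives up.
	fmt.Println("short timeout:", raceForLock(150, 50))
	// Timeout comfortably exceeds the migration: the waiting process succeeds.
	fmt.Println("long timeout: ", raceForLock(150, 1000))
}
```

The first call corresponds to the situation in #4208 (migration longer than the default timeout), the second to the behaviour after this change.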
What this does not do is prevent multiple processes from racing each other to run the migrations, because I think that racing is actually a good thing. Instead of only one process running the migrations while the others boot up and start reading from and writing to the DB (leading to race conditions with the migrations), the existing behaviour ensures that all processes that directly access the DB have a consistent view of its schema.
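The reasoning above can be sketched roughly as follows. This is an illustrative simulation, not the actual code: `database`, `migrate`, `bootAll`, and `latestVersion` are hypothetical, and a `sync.Mutex` stands in for the database lock. Every process goes through the lock before doing anything else, only the first one actually changes the schema, and all of them proceed with the same schema version.

```go
package main

import (
	"fmt"
	"sync"
)

// latestVersion is a hypothetical target schema version.
const latestVersion = 3

// database simulates the shared DB: a migration lock, the recorded schema
// version, and a counter of runs that actually changed the schema.
type database struct {
	lock    sync.Mutex
	version int
	applied int
}

// migrate takes the lock, brings the schema up to latestVersion if needed,
// and returns the version the caller will see once it proceeds.
func (db *database) migrate() int {
	db.lock.Lock()
	defer db.lock.Unlock()
	if db.version < latestVersion {
		db.version = latestVersion
		db.applied++
	}
	return db.version
}

// bootAll starts n processes that all call migrate before doing any other
// work. It returns the schema version each process saw and the number of
// runs that actually applied migrations.
func bootAll(n int) ([]int, int) {
	db := &database{}
	seen := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			seen[i] = db.migrate()
		}(i)
	}
	wg.Wait()
	return seen, db.applied
}

func main() {
	seen, applied := bootAll(2)
	fmt.Println("versions seen:", seen)
	fmt.Println("runs that changed the schema:", applied)
}
```

Whichever process loses the race simply finds the schema already at the latest version once it gets the lock, which is why waiting longer is preferable to skipping the lock.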