Skip to content

postgres-alpine: reindex all databases on startup

Warren Gifford requested to merge alpine into main

Created by: ggilmore

Fixes https://github.com/sourcegraph/sourcegraph/issues/23310

(albeit not ideally - see caveats)

script outline

  1. check for a marker file in PGDATADIR (3.31-reindexing.completed) - its presence indicates that re-indexing process has already been performed. (the following steps assume that the marker file is missing)
  2. prepare the postgres environment variables (in the same way as initdb.sh does)
  3. start up the postgres server with the following options
    • the listen_address is set to empty ('') - this prevents all external TCP connections to the DB, and allows only local connections via a unix socket
    • -P prevents postgres from using system catalog indexes for lookups (since these will need to be re-created for the same reasons that we are re-indexing the other databases). This follows the advice provided by https://www.postgresql.org/docs/12/sql-reindex.html
  4. re-index the system catalogs (--system), then the rest of the databases (--all)
  5. write the marker file to signal that the process has completed
  6. shut down postgres

The liveness script has been altered to check for the presence of the marker file. If it's absent, that means that the re-indexing process is still underway - so it'll return a 0 exit code so that k8s doesn't kill the pod.

(The readiness script doesn't need to be altered, it still works in the same way that it always did since postgres doesn't listen to external TCP connections during the re-indexing process)

still to be done:

  • need to verify that all permutations of specifying postgres credentials work with this script (PGPASSWORD, etc.). I suspect this will work since it uses the same machinery that initdb.sh uses

caveats

Reindexing the entire database can take a long time if you have a lot of data. k8s.sdev.org's codeintel-db instance takes 1-2 hours to re-index. I've been experimenting with using the CONCURRENTLY option that's described on https://www.postgresql.org/docs/12/app-reindexdb.html. If we can pull it off, it's possible that we can do this without incurring downtime. I haven't been able to figure out a clean way around the caveats described in the above document for that option though (a uniqueness violation means that the process will exit early, and we'll have to manually clean that up). I put this PR up in the meantime for discussion, even though I know it's not ideal.

Merge request reports

Loading