Created by: efritz
There are several places where we are doing bulk uploads, all inconsistently:

- lsif_packages data
- lsif_references data
- lsif_nearest_uploads(_links) data
- lsif_data_* data (in the codeintel-db)

This PR ensures that we have a unified technique for adding large amounts of data into Postgres: create a temporary table, bulk insert the rows into it, then transfer the data into the target table with a single follow-up statement.
This last step allows us, in different circumstances, to:

- attach values that are identical for every row (such as the dump id and schema version) once during the transfer, rather than sending them over the wire with each row, and
- update a large batch of existing rows by primary key with N/((2^16-1) / # cols) bulk inserts plus one additional update, rather than issuing one update per row.
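For reviewers unfamiliar with the pattern, here is a minimal sketch of what it looks like, using database/sql directly; the table and column names are illustrative stand-ins, not our actual batch-inserter code:

```go
package example

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// Postgres allows at most 2^16-1 bind parameters per statement, which is where
// the N/((2^16-1) / # cols) figure above comes from.
const maxParameters = 1<<16 - 1

// bulkUpsert stages rows in a temporary table in large batches, then moves the
// whole batch into the target table with a single follow-up statement.
func bulkUpsert(ctx context.Context, tx *sql.Tx, rows [][]interface{}) error {
	// 1. Create a temporary table shaped like the target table.
	if _, err := tx.ExecContext(ctx,
		`CREATE TEMPORARY TABLE t_target (scheme text, name text, version text) ON COMMIT DROP`,
	); err != nil {
		return err
	}

	// 2. Bulk insert into the temporary table, staying under the parameter cap.
	const numCols = 3
	batchSize := maxParameters / numCols
	for start := 0; start < len(rows); start += batchSize {
		end := start + batchSize
		if end > len(rows) {
			end = len(rows)
		}

		placeholders := make([]string, 0, end-start)
		args := make([]interface{}, 0, (end-start)*numCols)
		for i, row := range rows[start:end] {
			placeholders = append(placeholders,
				fmt.Sprintf("($%d, $%d, $%d)", i*numCols+1, i*numCols+2, i*numCols+3))
			args = append(args, row...)
		}

		query := `INSERT INTO t_target (scheme, name, version) VALUES ` + strings.Join(placeholders, ", ")
		if _, err := tx.ExecContext(ctx, query, args...); err != nil {
			return err
		}
	}

	// 3. Transfer the batch into the real table in one statement; this is also
	// where deduplication, conflict handling, or constant columns can be applied.
	_, err := tx.ExecContext(ctx, `
		INSERT INTO target (scheme, name, version)
		SELECT scheme, name, version FROM t_target
		ON CONFLICT DO NOTHING
	`)
	return err
}
```

The batch size above falls directly out of the same parameter cap as the N/((2^16-1) / # cols) figure, and the final statement is the step that differs per use case.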
Reviewers: Please review by commit. The following commits have non-trivial changes:
d5a4ce7: Update dbstore.UpdatePackages and dbstore.UpdatePackageReferences. Previously these used a bulk inserter but did not use a temporary table. We now insert rows (minus the dump ID) into a temporary table, then transfer them over to the target table once the bulk inserter has flushed.
4be02ff: Update lsifstore.WriteDocuments. This also did not use a temporary table previously. We no longer need to send the dump id and the current schema version with every row of the bulk insert statement; they are attached once during the transfer (sketched below).
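As a rough illustration of what that transfer looks like when the constants are attached (placeholder table and column names, not the actual lsifstore schema):

```go
package example

import (
	"context"
	"database/sql"
)

// transferDocuments moves rows staged in a temporary table into the target table,
// attaching the dump id and the schema version once for the whole batch instead of
// once per row. Table and column names here are placeholders.
func transferDocuments(ctx context.Context, tx *sql.Tx, dumpID, schemaVersion int) error {
	_, err := tx.ExecContext(ctx, `
		INSERT INTO target_documents (dump_id, path, data, schema_version)
		SELECT $1, path, data, $2
		FROM t_documents
	`, dumpID, schemaVersion)
	return err
}
```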
ec097e5: Same thing, but for result chunks.
b90c575: Same thing, but for definitions and references.
ccaf455: Update insertion technique for out-of-band migrations. We used to issue an update query for every row in the batch, which is incredibly sluggish. We now insert into a temporary table and mass update based on the primary key values. This required a slight refactor in each migration implementation, where we now supply a list of fieldSpecs for each column being read or updated by the migration.
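Roughly, the resulting mass update looks like the sketch below; the fieldSpec shape is a simplified stand-in for the real type, and the table and primary-key names are illustrative:

```go
package example

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// fieldSpec is a stand-in describing one column that a migration reads and/or rewrites.
type fieldSpec struct {
	name      string // column name
	updatable bool   // true if the migration writes this column back
}

// massUpdate applies a batch of migrated values in bulk: the rewritten rows are
// assumed to be staged in a temporary table named t_<table>, and are applied with a
// single UPDATE joined on the primary key instead of one UPDATE per row.
func massUpdate(ctx context.Context, tx *sql.Tx, table string, fields []fieldSpec) error {
	assignments := make([]string, 0, len(fields))
	for _, f := range fields {
		if f.updatable {
			assignments = append(assignments, fmt.Sprintf("%s = src.%s", f.name, f.name))
		}
	}

	query := fmt.Sprintf(
		`UPDATE %s dst SET %s FROM t_%s src WHERE dst.id = src.id`,
		table, strings.Join(assignments, ", "), table,
	)
	_, err := tx.ExecContext(ctx, query)
	return err
}
```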
00ac185: Update insertion technique for nearest commit data. We used to create three batch inserters by hand, insert into a temporary table, then insert/update/delete based on the difference between the target and temporary tables. This is all the same, but we now load all of the insertion data into channels so that we can feed all three inserters concurrently (a sketch of this pattern appears at the end of this description).

The remaining commits are fixup efforts and should be self-explanatory.
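For reference, the concurrent feeding in 00ac185 looks roughly like the sketch below; rowInserter is a stand-in interface rather than our actual batch inserter, and the row types are simplified:

```go
package example

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// rowInserter is a stand-in for a batch inserter bound to one temporary table.
type rowInserter interface {
	Insert(ctx context.Context, values ...interface{}) error
	Flush(ctx context.Context) error
}

// feedConcurrently pushes each row set onto its own channel and drains every channel
// into the matching inserter in parallel, so all of the temporary tables fill at the
// same time. rowSets[i] is fed to inserters[i].
func feedConcurrently(ctx context.Context, rowSets [][][]interface{}, inserters []rowInserter) error {
	g, ctx := errgroup.WithContext(ctx)

	for i := range rowSets {
		rows, inserter := rowSets[i], inserters[i]
		ch := make(chan []interface{})

		// Producer: stream this inserter's rows through the channel.
		g.Go(func() error {
			defer close(ch)
			for _, row := range rows {
				select {
				case ch <- row:
				case <-ctx.Done():
					return ctx.Err()
				}
			}
			return nil
		})

		// Consumer: drain the channel into the inserter, then flush it.
		g.Go(func() error {
			for row := range ch {
				if err := inserter.Insert(ctx, row...); err != nil {
					return err
				}
			}
			return inserter.Flush(ctx)
		})
	}

	return g.Wait()
}
```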