insights: partial failures are possible while writing results

Created by: coury-clark

We ended up in a partial failure state on k8s-dogfood after running a series of indexed searches for insight snapshots. After retrying the queries, the results were consistent with search.

[Screenshot: partial_failure]

After checking the underlying tables, there were feature/server-side-on-cloud repositories in the vector comprising the most recent data point, and only ~170 in the snapshot. Clearly there was a partial failure somewhere during the write.

The write is intended to happen within a single DB transaction and to roll back if any error occurs during processing.
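To make the expected behavior concrete, here is a minimal sketch of an all-or-nothing write. It is not the actual insights code: `store`, `tx`, `point`, and `writeAll` are hypothetical names, and the in-memory store only stands in for a real database transaction. The point is that a failure on any record should leave zero rows committed, not a subset.

```go
package main

import (
	"errors"
	"fmt"
)

// point is a hypothetical stand-in for one recorded value of a series.
type point struct {
	Repo  string
	Value float64
}

// store is an in-memory stand-in for the results table.
type store struct {
	committed []point
}

// tx buffers writes until Commit; Rollback discards them.
type tx struct {
	s       *store
	pending []point
}

func (s *store) Begin() *tx { return &tx{s: s} }

// Insert stages a point; an empty repo name simulates a processing error.
func (t *tx) Insert(p point) error {
	if p.Repo == "" {
		return errors.New("empty repo name")
	}
	t.pending = append(t.pending, p)
	return nil
}

func (t *tx) Commit() {
	t.s.committed = append(t.s.committed, t.pending...)
	t.pending = nil
}

func (t *tx) Rollback() { t.pending = nil }

// writeAll writes every point or none of them.
func writeAll(s *store, points []point) (err error) {
	t := s.Begin()
	// The deferred function inspects the named return value, so any
	// error path rolls back and only a clean run commits.
	defer func() {
		if err != nil {
			t.Rollback()
			return
		}
		t.Commit()
	}()
	for _, p := range points {
		if err = t.Insert(p); err != nil {
			return fmt.Errorf("insert %q: %w", p.Repo, err)
		}
	}
	return nil
}

func main() {
	s := &store{}
	err := writeAll(s, []point{{"a", 1}, {"", 2}, {"c", 3}})
	// The second point fails, so nothing should be committed.
	fmt.Println(err != nil, len(s.committed)) // prints "true 0"
}
```

If the real write path can commit some rows while a later step fails, for example because part of the work runs outside the transaction handle, that would produce exactly the kind of partial snapshot described above.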

Additionally, it seems any errors handed back to the queue manager are not logged if the record is marked for retry.