workerutil: experimenting with batch dequeue to reduce queue overhead
Created by: coury-clark
This PR is not meant to be merged; I'm just looking for feedback here.
So, what is this?
While debugging some slowness in the code insights worker, I noticed a few things:
- The dequeue mechanism can be a bottleneck for short-lived, high-concurrency handlers (dequeue operations can have non-trivial overhead).
- Other queuing systems (Kafka, SQS, etc.) allow consumers to poll a batch of records, with the semantics that records not successfully marked as completed within some time interval may be dequeued again.
Now, I'm not interested in rebuilding Kafka here, but it would be useful for code insights to have some batching. We are currently spending roughly 400ms on each dequeue operation (this could probably be a bit better, but regardless) and only 100-200ms on average on the actual handler invocation. We have millions of invocations to process, so if we could batch something like 20-50 records we would nearly eliminate the queue overhead and save a lot of time.
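To make the back-of-envelope math concrete, here is a quick sketch of the amortized per-record cost at different batch sizes, using the ~400ms dequeue overhead and the midpoint of the 100-200ms handler time from above (purely illustrative numbers):

```go
package main

import "fmt"

func main() {
	const dequeueMs = 400.0 // observed per-dequeue overhead
	const handlerMs = 150.0 // midpoint of the 100-200ms handler time

	for _, batchSize := range []float64{1, 20, 50} {
		// The fixed dequeue cost is amortized across the batch.
		perRecord := dequeueMs/batchSize + handlerMs
		overhead := 100 * (dequeueMs / batchSize) / perRecord
		fmt.Printf("batch=%2.0f: ~%.0fms/record (%.0f%% queue overhead)\n",
			batchSize, perRecord, overhead)
	}
}
```

With batch sizes of 20-50, queue overhead drops from roughly 73% of wall time per record to about 5-12%, which is where the "nearly eliminate" claim comes from.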
Here are some of the implementation problems I'm looking for feedback on:
- The `RecordScanFn` can currently brick the entire set of records because it eats the entire `rows` and gives back just one `Record`. I could update this to return a `[]workerutil.Record`, or I could add a `BatchRecordScanFn` with different semantics. Ideally, the operations on a single `Record` would live in a different store than the operations to fetch a `Record`, but I think that would be much more work.
- Should the `Store` interface have both a `Dequeue` and a `BatchDequeue` operation? I'm leaning towards yes, to reduce the random impact of changing the `Dequeue` semantics. For example, this PR contains an executor that uses the same `Store` interface and would have to be updated.
- What should we attach to the trace? Currently, it associates the single record `id` on dequeue. Is this even relevant?
- Should batching respect `NumTotalJobs` exactly, or execute within a single `BatchSize`?
So basically, this PR is a rough sketch of what this could look like. It doesn't work correctly yet because the mapping function eats all of the records, but it should serve to illustrate the idea.
Looking for feedback, thanks!