Structural search: stream to and from comby
Created by: camdencheek
We recently upgraded comby to the latest version (1.7.1). One of the features added since the version we were running is the ability to stream input to comby with the -tar flag.
Currently, we have two codepaths for comby: indexed and unindexed. For both paths, we generate a list of candidate matches and write them to a zip file, then run comby over that zip file.
There are a few problems with this batch approach:
- We cannot start searching with comby until the entire result set is downloaded and written to disk (a zip archive's central directory sits at the end of the file, so a zip file can't be read as a stream).
- We cannot stream matches out of comby because we might need to retry a batch request with a higher limit, and any results already streamed from the earlier attempt would then be repeated.
- We have to request far more candidate results than necessary because we don't know how many of the candidate matches will match, and we don't want to have to retry too many times.
So, we'd like to implement streaming into and out of comby to fix these issues and make structural search a more pleasant experience.
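To illustrate why a tar stream avoids the first problem, here is a minimal sketch in Go of writing candidate files into a tar archive while a consumer reads it at the same time. It is not the actual implementation: the candidate type and the writeTar, readNames, and streamNames functions are hypothetical names, the candidates are hard-coded, and an io.Pipe with a tar reader stands in for the stdin of a comby process started with -tar. The comby invocation itself is deliberately left out.

```go
package main

import (
	"archive/tar"
	"fmt"
	"io"
)

// candidate is a hypothetical stand-in for one candidate file to search.
type candidate struct {
	Path    string
	Content []byte
}

// writeTar appends each candidate to the tar stream as soon as it arrives,
// so the consumer of w can start work before the full set is known.
func writeTar(w io.Writer, candidates <-chan candidate) error {
	tw := tar.NewWriter(w)
	for c := range candidates {
		hdr := &tar.Header{Name: c.Path, Mode: 0600, Size: int64(len(c.Content))}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if _, err := tw.Write(c.Content); err != nil {
			return err
		}
	}
	// Close writes the tar footer that marks the end of the archive.
	return tw.Close()
}

// readNames plays the role of the consumer (assumed here to be comby's
// stdin): it reads entries sequentially, with no seeking required.
func readNames(r io.Reader) ([]string, error) {
	tr := tar.NewReader(r)
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return names, err
		}
		names = append(names, hdr.Name)
	}
}

// streamNames connects producer and consumer through an in-memory pipe.
func streamNames(cands []candidate) ([]string, error) {
	ch := make(chan candidate)
	go func() {
		defer close(ch)
		for _, c := range cands {
			ch <- c
		}
	}()

	pr, pw := io.Pipe()
	go func() {
		// Propagate any write error to the reader; nil closes with EOF.
		pw.CloseWithError(writeTar(pw, ch))
	}()

	names, err := readNames(pr)
	pr.CloseWithError(err) // unblock the writer if the reader stopped early
	return names, err
}

func main() {
	names, err := streamNames([]candidate{
		{Path: "a.go", Content: []byte("package a\n")},
		{Path: "b.go", Content: []byte("package b\n")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints [a.go b.go]
}
```

Because each tar entry is preceded by its own header, the reader can process the first file while later files are still being fetched, which is what the zip-based batch approach cannot do.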