symbols: Ignore contents of big files from git archive
Created by: chrismwendt
Prior to this change, the symbols service read the full contents of every file from git archive into memory (one file at a time, not all at once). I'm guessing some files were huge (1GB+) and caused OOMs.
After this change, the symbols service completely ignores the contents of big files, and the threshold is configurable via MAX_FILE_SIZE_KB.
See https://github.com/sourcegraph/sourcegraph/issues/36107#issuecomment-1140011081
Test plan
Manually ran with env MAX_FILE_SIZE_KB=10 sg start and verified the SQLite DB was smaller and no symbols were produced for a 19KB file.