Index lockfiles and possibly persist as tree
Created by: mrnugget
This is an extraction from #36481 and includes the backend/API changes necessary to switch to a model for dependency search where we actively index lockfiles in the background using code intel's policy mechanism.
It's the first of multiple PRs to implement what I demoed here.
(Edit: here are the draft PRs that add the remaining functionality: https://github.com/sourcegraph/sourcegraph/pull/37544)
In short, what this PR does:
- IMPORTANT: Change `dependencies.Service` to NOT parse lockfiles on demand anymore.
- Show search alert if a repository/commit hasn't been lockfile indexed.
- Change `dependencies.Service` to only query the previously persisted dependencies.
- Migrate the database to change `codeintel_lockfiles` and `codeintel_lockfile_references` so we can persist dependency trees (!), one per lockfile per repo/commit.
- Change `dependencies/internal/store` to work with the new database schema, adding the ability to query dependencies per lockfile, transitive only, etc.
- Add `lockfile_indexing_enabled` to `lsif_configuration_policies`. @efritz: this is different from the tags-based approach we talked about, because I found it much easier to add a boolean now than to make sure I get all of the stored procedures right when migrating the existing data. I think we can still easily change this and migrate it.
- Change lockfile indexer to check for `lockfile_indexing_enabled` in the policies (vs. indexing).
- Change the code intel frontend to allow users to create lockfile-indexing policies.
- Hide everything behind the `codeIntelLockfileIndexingEnabled` setting flag.
- Introduce a `DependencyGraph` type to `lockfiles` and the `shared` package. This will be returned by different parsers, for example the `yarn.lock` parser I've built (but that's not included in this PR, see below).
- Change the `service/store` layers to persist the graph.
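
To make the graph idea concrete, here is a minimal sketch of what a dependency graph with a "transitive only" query could look like. It is not the `DependencyGraph` type from this PR: the `PackageDependency` struct, the `Roots`/`Edges` fields, and the `Transitive` method are assumptions made up for illustration, and the package names in `main` are example data only.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageDependency is a hypothetical stand-in for a resolved package
// (name plus version) found in a lockfile.
type PackageDependency struct {
	Name    string
	Version string
}

// DependencyGraph is an illustrative sketch, not the type added in this PR.
// Roots are the direct dependencies; Edges maps a package to the packages
// it depends on.
type DependencyGraph struct {
	Roots []PackageDependency
	Edges map[PackageDependency][]PackageDependency
}

// Transitive returns every package reachable from the roots that is not
// itself a root, sorted by name and then version for stable output.
func (g *DependencyGraph) Transitive() []PackageDependency {
	seen := make(map[PackageDependency]bool, len(g.Roots))
	for _, r := range g.Roots {
		seen[r] = true
	}

	var out []PackageDependency
	stack := append([]PackageDependency(nil), g.Roots...)
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		// Reading from a nil Edges map is safe and yields no children.
		for _, dep := range g.Edges[n] {
			if seen[dep] {
				continue
			}
			seen[dep] = true
			out = append(out, dep)
			stack = append(stack, dep)
		}
	}

	sort.Slice(out, func(i, j int) bool {
		if out[i].Name != out[j].Name {
			return out[i].Name < out[j].Name
		}
		return out[i].Version < out[j].Version
	})
	return out
}

func main() {
	// Example data only.
	react := PackageDependency{Name: "react", Version: "18.2.0"}
	loose := PackageDependency{Name: "loose-envify", Version: "1.4.0"}
	tokens := PackageDependency{Name: "js-tokens", Version: "4.0.0"}

	g := &DependencyGraph{
		Roots: []PackageDependency{react},
		Edges: map[PackageDependency][]PackageDependency{
			react: {loose},
			loose: {tokens},
		},
	}
	for _, dep := range g.Transitive() {
		fmt.Printf("%s@%s\n", dep.Name, dep.Version)
	}
}
```

In this sketch a graph without edges has no transitive dependencies, which is the shape a parser would produce if it only knows the flat list of packages.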
What's NOT included:
- The `transitive:yes` predicate for search. I still need to clean this up and ask Search Product how to implement this properly.
- The actual `yarn.lock` parser that builds a full dependency tree. I found a bug in my current implementation and it's non-trivial to fix (but fixable), so I want to get a review on this PR before diving further into the parser.
That means there are only two things from the user's perspective that change with this PR:
- IMPORTANT: Repositories need to be lockfile-indexed before `repo:deps()` will return something.
- They can create a separate lockfile-indexing policy to enable that.
Since the yarn.lock-parser that produces a tree is not in this PR, no trees will be persisted yet. That will only happen with parsers supporting that. Until then, all dependencies are persisted as direct dependencies, just like before.
TODOs/trade-offs:
This is just the first iteration. I haven't done any performance optimizations. I haven't cleaned up every TODO. I plan to do that in follow-up PRs.
But since dependency search is still behind a feature flag, I think that's fine in order to make continuous progress and have reviewable PRs.
There's one big thing that we should tackle (that was already a problem before this PR): it takes a while (1m?) until a new policy is picked up by the lockfile indexer.
Test plan
- New and existing tests, manual testing, CI