Index lockfiles and possibly persist as tree
Created by: mrnugget
This is an extraction from #36481 and includes the backend/API changes necessary to switch to a model for dependency search where we actively index lockfiles in the background using code intel's policy mechanism.
It's the first of multiple PRs to implement what I demoed here.
(Edit: here are the draft PRs that add the remaining functionality: https://github.com/sourcegraph/sourcegraph/pull/37544)
In short, what this PR does:
- IMPORTANT: Change `dependencies.Service` to NOT parse lockfiles on demand anymore.
- Show a search alert if a repository/commit hasn't been lockfile-indexed.
- Change `dependencies.Service` to only query the previously persisted dependencies.
- Migrate the database to change `codeintel_lockfiles` and `codeintel_lockfile_references` so we can persist dependency trees (!), one per lockfile per repo/commit.
- Change `dependencies/internal/store` to work with the new database schema, adding the ability to query dependencies per lockfile, transitive only, etc.
- Add `lockfile_indexing_enabled` to `lsif_configuration_policies`. @efritz: this is different from the tags-based approach we talked about, because I found it much easier to add a boolean now than to make sure I get all of the stored procedures right when migrating the existing data. I think we can still easily change this and migrate it.
- Change the lockfile indexer to check for `lockfile_indexing_enabled` in the policies (vs. indexing).
- Change the code intel frontend to allow users to create lockfile-indexing policies.
- Hide everything behind the `codeIntelLockfileIndexingEnabled` setting flag.
- Introduce a `DependencyGraph` type to `lockfiles` and the `shared` package. This will be returned by different parsers, for example the `yarn.lock` parser I've built (but that's not included in this PR, see below).
- Change the `service`/`store` layers to persist the graph.
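
To make the graph idea concrete, here is a minimal sketch of what a dependency graph with direct and transitive lookups could look like. It is not the `DependencyGraph` type from this PR: `PackageDependency`, `NewDependencyGraph`, `AddPackage`, `AddDependency`, `Roots`, and `Transitive` are hypothetical names chosen for illustration. It assumes a package is "direct" when no other package in the lockfile depends on it.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageDependency is a hypothetical stand-in for one resolved lockfile entry.
type PackageDependency struct {
	Name    string
	Version string
}

// DependencyGraph is an illustrative sketch, not the type added in this PR.
type DependencyGraph struct {
	packages  map[PackageDependency]struct{}
	edges     map[PackageDependency][]PackageDependency
	hasParent map[PackageDependency]bool
}

// NewDependencyGraph returns an empty graph.
func NewDependencyGraph() *DependencyGraph {
	return &DependencyGraph{
		packages:  map[PackageDependency]struct{}{},
		edges:     map[PackageDependency][]PackageDependency{},
		hasParent: map[PackageDependency]bool{},
	}
}

// AddPackage registers a package without any edges.
func (g *DependencyGraph) AddPackage(p PackageDependency) {
	g.packages[p] = struct{}{}
}

// AddDependency records that `from` depends on `to`.
func (g *DependencyGraph) AddDependency(from, to PackageDependency) {
	g.AddPackage(from)
	g.AddPackage(to)
	g.edges[from] = append(g.edges[from], to)
	g.hasParent[to] = true
}

// Roots returns packages that nothing else depends on (treated as direct here).
func (g *DependencyGraph) Roots() []PackageDependency {
	return g.filter(func(p PackageDependency) bool { return !g.hasParent[p] })
}

// Transitive returns packages that at least one other package depends on.
func (g *DependencyGraph) Transitive() []PackageDependency {
	return g.filter(func(p PackageDependency) bool { return g.hasParent[p] })
}

// filter collects matching packages in a stable, sorted order.
func (g *DependencyGraph) filter(keep func(PackageDependency) bool) []PackageDependency {
	var out []PackageDependency
	for p := range g.packages {
		if keep(p) {
			out = append(out, p)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Name != out[j].Name {
			return out[i].Name < out[j].Name
		}
		return out[i].Version < out[j].Version
	})
	return out
}

func main() {
	g := NewDependencyGraph()
	a := PackageDependency{Name: "a", Version: "1.0.0"}
	b := PackageDependency{Name: "b", Version: "2.0.0"}
	g.AddDependency(a, b)
	fmt.Println("direct:", g.Roots())
	fmt.Println("transitive:", g.Transitive())
}
```

In this sketch, a parser that emits packages without edges yields a graph in which every package is a root, which mirrors the "all dependencies are direct" behaviour of parsers that cannot build a tree.
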
What's NOT included:
- The `transitive:yes` predicate for search. I still need to clean this up and ask Search Product how to implement this properly.
- The actual `yarn.lock` parser that builds a full dependency tree. I found a bug in my current implementation and it's non-trivial to fix (but fixable), so I want to get a review on this PR before diving further into the parser.
That means only two things change from the user's perspective with this PR:
- IMPORTANT: Repositories need to be lockfile-indexed before `repo:deps()` will return something.
- They can create a separate lockfile-indexing policy to enable that.
Since the `yarn.lock` parser that produces a tree is not in this PR, no trees will be persisted yet. That will only happen with parsers supporting that. Until then, all dependencies are persisted as direct dependencies, just like before.
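
The user-facing change can be sketched roughly as follows: a lookup that only reads previously persisted results and turns "never indexed" into an alert rather than parsing lockfiles on the spot. All names here (`ErrNotIndexed`, `RepoCommit`, `memoryStore`, `resolveDeps`) are hypothetical and the store is an in-memory stand-in, not the actual `dependencies.Service` or database layer.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotIndexed is a hypothetical sentinel for a repo/commit with no persisted lockfile results.
var ErrNotIndexed = errors.New("not lockfile-indexed")

// RepoCommit identifies one revision of a repository.
type RepoCommit struct {
	Repo   string
	Commit string
}

// memoryStore is an in-memory stand-in for the persisted lockfile index.
type memoryStore struct {
	indexed map[RepoCommit][]string
}

// Dependencies returns only what was persisted earlier; it never parses lockfiles on demand.
func (s *memoryStore) Dependencies(rc RepoCommit) ([]string, error) {
	deps, ok := s.indexed[rc]
	if !ok {
		return nil, fmt.Errorf("%s@%s: %w", rc.Repo, rc.Commit, ErrNotIndexed)
	}
	return deps, nil
}

// resolveDeps converts a missing index into an alert message instead of an error.
func resolveDeps(s *memoryStore, rc RepoCommit) (deps []string, alert string, err error) {
	deps, err = s.Dependencies(rc)
	if errors.Is(err, ErrNotIndexed) {
		msg := fmt.Sprintf("%s@%s has not been lockfile-indexed yet", rc.Repo, rc.Commit)
		return nil, msg, nil
	}
	return deps, "", err
}

func main() {
	indexedRev := RepoCommit{Repo: "github.com/example/app", Commit: "abc123"}
	s := &memoryStore{indexed: map[RepoCommit][]string{
		indexedRev: {"left-pad@1.3.0"},
	}}

	deps, alert, _ := resolveDeps(s, indexedRev)
	fmt.Println(deps, alert)

	deps, alert, _ = resolveDeps(s, RepoCommit{Repo: "github.com/example/app", Commit: "def456"})
	fmt.Println(deps, alert)
}
```

Note that an indexed revision with zero dependencies is distinct from an unindexed one in this sketch: the former returns an empty result with no alert.
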
TODOs/trade-offs:
This is just the first iteration. I haven't done any performance optimizations. I haven't cleaned up every TODO. I plan to do that in follow-up PRs.
But since dependency search is still behind a feature flag, I think that's fine in order to make continuous progress and have reviewable PRs.
There's one big thing that we should tackle (that was already a problem before this PR): it takes a while (1m?) until a new policy is picked up by the lockfile indexer.
Test plan
- New and existing tests, manual testing, CI