Add LSIF support

Warren Gifford requested to merge lsif into master

Created by: chrismwendt

Goals

The goal of this PR is to enable us in the Sourcegraph organization to upload LSIF data for the sourcegraph/sourcegraph repository to Sourcegraph.com.

The next goal (not for this PR) is to support other users uploading LSIF for their own repositories by GopherCon (July 24).

The most helpful feedback would point out problems with the architecture or usability. Here are some areas where I'd like feedback:

  • Auth: I plan to restrict the /upload endpoint to admin access tokens, and keep the /request endpoint open to normal users (or anonymous users, if the instance is public) cc @sourcegraph/core-services
  • API: this currently resides at .api/lsif, but I plan to move it under GraphQL cc @sourcegraph/core-services
  • The LSIF HTTP server is Node.js: are we OK with running a Node.js process on the backend? I'm reusing a few files from https://github.com/microsoft/vscode-lsif-extension/tree/master/server (and asked about an npm package https://github.com/microsoft/vscode-lsif-extension/issues/13), but they could probably get rewritten in Go in ~1 week cc @beyang @felixfbecker @lguychard
  • General architecture and anything I'm overlooking (file storage, caching, API, etc.)
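
To make the auth plan above concrete for reviewers, here is a minimal sketch of the access rule as a pure function. It is not code from this PR: `Caller`, `InstanceConfig`, and `isAllowed` are hypothetical names, and the real check would presumably live wherever access tokens are validated.

```typescript
// Hypothetical sketch of the planned access rule; none of these names exist in the PR.
type Endpoint = 'upload' | 'request'

/** An authenticated caller. Anonymous callers are represented by `null`. */
interface Caller {
    isSiteAdmin: boolean
}

interface InstanceConfig {
    /** Whether the instance allows anonymous access (assumed flag). */
    isPublic: boolean
}

/** Returns true if the caller may use the given LSIF endpoint. */
function isAllowed(endpoint: Endpoint, caller: Caller | null, config: InstanceConfig): boolean {
    if (endpoint === 'upload') {
        // Uploads are restricted to admin access tokens.
        return caller !== null && caller.isSiteAdmin
    }
    // Requests are open to any authenticated user, and to anonymous
    // users only when the instance is public.
    return caller !== null || config.isPublic
}
```

Keeping the rule in one function would make it straightforward to swap the admin check for a dedicated LSIF scope later.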

Background

https://github.com/sourcegraph/sourcegraph/issues/4692

Overview of implementation

See lsif/README.md and lsif/server/README.md.

Changelog

This PR doesn't update the CHANGELOG because there's no LSIF scope for access tokens yet. That's OK for an unannounced MVP that we try out ourselves on the sourcegraph/sourcegraph repository: https://sourcegraph.slack.com/archives/C0C324C91/p1562109046016800?thread_ts=1562105156.005000&cid=C0C324C91

Test plan

  • cd into sourcegraph/codeintellify
  • lsif-tsc -p tsconfig.json --noContents --out data.lsif
  • env SRC_ENDPOINT=http://localhost:3080 SRC_ACCESS_TOKEN=$DEV_SRC_ACCESS_TOKEN REPOSITORY=github.com/sourcegraph/codeintellify COMMIT=$(git rev-parse HEAD) bash ~/path/to/sourcegraph/sourcegraph/lsif/upload.sh data.lsif (currently fails at the upload step because I haven't figured out how to implement auth yet)
  • Make sure it says "Upload successful."
  • TODO add LSIF to each language extension (basic-code-intel can be a vehicle for this)
  • Open http://localhost:3080/github.com/sourcegraph/codeintellify/-/blob/src/hoverifier.ts
  • You should see hovers and defs working

TODO

  • Check if LSIF data for a given repo@commit /exists before sending hover/def/ref requests
  • Check the version number during upload, return 400 unless it matches the version that the LSIF DB code expects
  • Restrict the /upload endpoint to admins (in the future, we might create a new LSIF scope for access tokens)
  • Write a Dockerfile, add it to gen-pipeline.go
  • Address TODOs in code, refactor
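
The version check in the TODO list could look roughly like the sketch below. It assumes the dump is newline-delimited JSON whose first line is the `metaData` vertex carrying a `version` field. `EXPECTED_LSIF_VERSION` and `checkUploadVersion` are hypothetical names, and the value `'0.4.0'` is only a placeholder for whatever version the LSIF DB code actually expects.

```typescript
// Placeholder: the version the LSIF DB code expects (assumed value).
const EXPECTED_LSIF_VERSION = '0.4.0'

interface VersionCheck {
    status: 200 | 400
    message: string
}

/** Inspects the first line of an uploaded dump and decides whether to accept it. */
function checkUploadVersion(dump: string, expected: string = EXPECTED_LSIF_VERSION): VersionCheck {
    const firstLine = (dump.split('\n', 1)[0] ?? '').trim()

    let element: unknown
    try {
        element = JSON.parse(firstLine)
    } catch {
        return { status: 400, message: 'first line of the dump is not valid JSON' }
    }

    if (typeof element !== 'object' || element === null) {
        return { status: 400, message: 'first line of the dump is not a JSON object' }
    }

    const meta = element as { label?: unknown; version?: unknown }
    if (meta.label !== 'metaData' || typeof meta.version !== 'string') {
        return { status: 400, message: 'first line of the dump is not a metaData vertex with a version' }
    }

    if (meta.version !== expected) {
        return { status: 400, message: `unsupported LSIF version ${meta.version}, expected ${expected}` }
    }

    return { status: 200, message: 'ok' }
}
```

Checking only the first line means the server can reject a mismatched dump before reading the rest of the upload.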

Potential future ideas (we can move select tasks into the TODOs above)

  • Translate repository name to repository ID
  • Add metrics to track upload rate, upload size, request rate, size of Database cache, etc.
  • Add throttling for uploads and requests
  • Restrict uploads per-repository (see discussion about this https://sourcegraph.slack.com/archives/C0C324C91/p1562105156005000)
  • Add a new kind of token (different from user access tokens): LSIF upload token
  • Support fetching LSIF data from external storage (e.g. Amazon S3, like Codecov does)
  • Support multiple replicas of this LSIF server and figure out storage and load balancing
  • Fall back to the most recent commit that does have LSIF data if the current one doesn't
  • Infer repository, commit, etc. from build environment (Travis, CircleCI, Buildkite, etc.) like Codecov does
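
As an illustration of the fallback idea in the list above, the sketch below picks the most recent commit that has LSIF data. It assumes the caller already has the commit's history ordered newest first (for example from `git rev-list`) and some way to ask whether data exists for a commit. `nearestCommitWithData`, `hasLsifData`, and the `maxDepth` default are hypothetical.

```typescript
/**
 * Returns the first commit in `commitsNewestFirst` that has LSIF data,
 * looking at most `maxDepth` commits back, or undefined if none is found.
 */
function nearestCommitWithData(
    commitsNewestFirst: readonly string[],
    hasLsifData: (commit: string) => boolean,
    maxDepth: number = 50
): string | undefined {
    for (const commit of commitsNewestFirst.slice(0, maxDepth)) {
        if (hasLsifData(commit)) {
            return commit
        }
    }
    return undefined
}

// Usage: the current commit "c3" has no data, so its parent "c2" is used.
const uploaded = new Set(['c2', 'c0'])
const fallback = nearestCommitWithData(['c3', 'c2', 'c1', 'c0'], commit => uploaded.has(commit))
```

Results served from an older commit can point at stale line numbers, so bounding the search depth keeps the fallback from drifting too far from the requested commit.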
