worries about diff highlighting
Created by: slimsag
I have a number of concerns with the new diff highlighting implementation:
- This file having no test coverage is concerning. We should be adding test coverage not regressing on it, especially with changes as complex as this.
-
preview_file_diff_highlighter.goshould not exist. The fact that we needed to introduce a new file for this logic inside the GraphQL layer is a strong indication that this code carries a lot of complexity which would be better situated in theinternal/highlightpackage. -
preview_file_diff_highlighter.gois entirely untested. - In
preview_file_diff_highlighter.go, we call back into the resolver itself to implement the resolver itself. This is incredibly confusing and is going to be brittle in the future: https://github.com/sourcegraph/sourcegraph/pull/10437/files#diff-363d9f0b68e9ec7dd8cd75a3fd5a22c9R31 - Binary-file checks like this, this, and this appear to be sprinkled throughout the code. This smells to me of defensive coding, especially because
highlight.Codealready handles binary files appropriately. These checks should not be needed anywhere I believe, because if the file is binaryhighlight.Codeis going to give an error which should be handled without a prior "is binary" check. -
highlight.ParseLinesFromHighlightis exposing concerns that package consumers should not be concerned about. This introduces the idea that one should parse the highlighted HTML outside of the highlight package and manipulate it, which opens a whole can of worms for potential regressions in the future. I can test thatinternal/highlightdoes not regress when we update syntect, for example, but I cannot easily test all downstream consumers of it. This is risky AND makes the graphql resolvers much more lengthy which is bad. Instead, there should be something likehighlight.Diffas I mentioned in an earlier comment on the PR.
This is a great change and feature to have at the end of the day, but I am worried this change was merged prematurely and will cause us complexity / pain in the future. Can someone commit to addressing these concerns (no need to revert)?
My largest worry here is that we're reducing separation of concerns in the backend code and have further made the graphqlbackend package an even bigger codebase handling even more things, which is a huge problem for us. @mrnugget what are your thoughts on this (since you reviewed/approved)?