Add a logger sink to export errors to Sentry transparently
Created by: jhchabran
This PR introduces the concept of sinks that can be plugged into the Logger to extend its capabilities. We're providing here a Sentry sink, which captures errors passed to the logger with the `log.Error` function, if and only if the log level is Warn or above.

As a result, we will get error reports in places that weren't previously covered, because there was no `sentry.SentryCapture` call to catch the errors bubbling up.
Fixes #35844 and #35845
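For illustration, here is a hedged sketch of what this enables at a call site. Only `log.Error` comes from the description above; the `log.Scoped` constructor, `log.Int` field, and the surrounding code are assumptions made for this example.

```go
// Hypothetical call site; identifiers other than log.Error are assumptions
// made for this example.
logger := log.Scoped("worker", "background job worker")

if err := doSomething(); err != nil {
	// Reported to Sentry: the level is Error (>= Warn) and an error field is attached.
	logger.Error("failed to do something", log.Error(err))
}

// Not reported: the level is below Warn.
logger.Info("done", log.Int("attempts", 1))
```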
## Configuring a service with the new sink
To ease the configuration, helpers are provided:
```go
syncLogs, updateSinks := sglog.InitWithSinks(sglog.Resource{
	Name:       env.MyName,
	Version:    version.Version(),
	InstanceID: hostname.Get(),
}, sglog.NewSentrySink())
defer syncLogs()
```
One minor caveat: the `conf.Watch` call must be made after the `conf.Init()` call, which usually sits further down in the main function and is rather easy to forget:
```go
conf.Init()
conf.MustValidateDefaults()
conf.Watch(updateSinks(conf.GetSinks))
```
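Putting the two snippets together, here is a sketch of the intended ordering in a service's main function. Only the calls shown above come from this PR; the surrounding structure is illustrative.

```go
func main() {
	// Initialize the logger first, with the Sentry sink plugged in.
	syncLogs, updateSinks := sglog.InitWithSinks(sglog.Resource{
		Name:       env.MyName,
		Version:    version.Version(),
		InstanceID: hostname.Get(),
	}, sglog.NewSentrySink())
	// Final flush of pending log events (and Sentry reports) on shutdown.
	defer syncLogs()

	// conf.Watch must be called after conf.Init(), which is easy to forget
	// when conf.Init() sits further down in main.
	conf.Init()
	conf.MustValidateDefaults()
	conf.Watch(updateSinks(conf.GetSinks))

	// ... the rest of the service setup ...
}
```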
## Performance considerations
To avoid slowing down logging when it's not necessary:

- the underlying `zapcore.Core` only processes log events at the Warn level and above.
- consuming log events and processing them for Sentry reporting are both asynchronous. This deflects most of the work onto the processing goroutine and leverages the Sentry client's ability to send reports in batches.
In the eventuality of saturating the events buffer by producing errors faster than we can consume them, events are simply dropped. See the guarantees section below for more details about this.
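To make the two points above concrete, here is a minimal sketch of the pattern, not the actual sink from this PR; all names are assumptions and the Sentry reporting itself is elided.

```go
// A minimal sketch of a level-gated, asynchronous zapcore.Core; not the
// implementation shipped in this PR.
package sketch

import "go.uber.org/zap/zapcore"

type errorEvent struct {
	entry  zapcore.Entry
	fields []zapcore.Field
}

// asyncSentryCore only accepts Warn-and-above entries and hands them to a
// background goroutine through a buffered channel.
type asyncSentryCore struct {
	events chan errorEvent
}

func newAsyncSentryCore(bufferSize int) *asyncSentryCore {
	c := &asyncSentryCore{events: make(chan errorEvent, bufferSize)}
	go c.consume() // reporting happens off the hot logging path
	return c
}

// Enabled implements zapcore.LevelEnabler: anything below Warn is ignored.
func (c *asyncSentryCore) Enabled(lvl zapcore.Level) bool {
	return lvl >= zapcore.WarnLevel
}

// Check only registers this core for enabled levels, so Debug/Info calls
// never pay for the Sentry path.
func (c *asyncSentryCore) Check(ent zapcore.Entry, ce *zapcore.CheckedEntry) *zapcore.CheckedEntry {
	if c.Enabled(ent.Level) {
		return ce.AddCore(ent, c)
	}
	return ce
}

// With would clone the core and retain the fields in a real implementation.
func (c *asyncSentryCore) With(fields []zapcore.Field) zapcore.Core { return c }

// Write only enqueues the event; a blocking send is shown here for brevity,
// the non-blocking variant is sketched in the "Too many errors" section.
func (c *asyncSentryCore) Write(ent zapcore.Entry, fields []zapcore.Field) error {
	c.events <- errorEvent{entry: ent, fields: fields}
	return nil
}

// Sync is where flushing would happen; see the flushing sketch below.
func (c *asyncSentryCore) Sync() error { return nil }

// consume turns buffered events into Sentry reports, which the Sentry client
// can batch before sending.
func (c *asyncSentryCore) consume() {
	for ev := range c.events {
		_ = ev // build and capture a Sentry report here
	}
}
```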
I have included two simple benchmarks so we can observe the Δ when Sentry reporting is enabled and we're logging errors.
```
BenchmarkWithoutSentry-10    303235    4394 ns/op
BenchmarkWithSentry-10       221163    5354 ns/op
```
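The benchmark names come from the output above; everything else in the following sketch (the helper building the logger, the field constructors) is a hypothetical shape rather than the actual benchmark code from this PR.

```go
func BenchmarkWithSentry(b *testing.B) {
	// newBenchLogger is a hypothetical helper that builds a logger with the
	// Sentry sink enabled, pointed at a no-op transport.
	logger := newBenchLogger(b, true /* enable Sentry sink */)
	err := errors.New("benchmark error")

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		logger.Error("boom", log.Error(err))
	}
}
```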
## Guarantees regarding losing events
### Flushing
To avoid losing events, events are continuously sent to Sentry and don't need to be explicitly flushed. If an explicit flush is requested through the `zapcore.Core` interface, the Sentry sink will try to consume all pending log events within a reasonable time before shutting down the consumer side, and will then submit them to Sentry.

Flushing is only called in the final `defer` function coming from our logging API (the `defer syncLogs()` shown above), meaning it will only happen when a service is shutting down.
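Extending the earlier sketch, and under the same assumed names, a bounded flush could look like the following; a real implementation would also coordinate with the consumer goroutine and call the Sentry client's own flush.

```go
// drainWithTimeout consumes whatever is still buffered and submits it to
// Sentry, but gives up after a reasonable time so shutdown never hangs.
func (c *asyncSentryCore) drainWithTimeout(timeout time.Duration) {
	deadline := time.After(timeout)
	for {
		select {
		case ev := <-c.events:
			_ = ev // capture the remaining event to Sentry here
		case <-deadline:
			return // reasonable time elapsed; stop waiting
		default:
			return // buffer drained
		}
	}
}
```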
### Too many errors
In the eventuality where we submit events faster than we can consume them, the upper bound is a large buffered channel, which should be enough to accumulate errors while we asynchronously report them to Sentry.
It would be nice to be able to know if we're dropping errors, but that would create a circular dependency from the sink toward the logger, so for now, they're just silently discarded.
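Again extending the sketch above (names assumed), the silent-drop behavior boils down to a non-blocking send on the buffered channel.

```go
// enqueue never blocks the caller: if the buffer is full because errors are
// produced faster than they are consumed, the event is silently dropped.
func (c *asyncSentryCore) enqueue(ev errorEvent) {
	select {
	case c.events <- ev:
		// Buffered; the consumer goroutine will report it to Sentry.
	default:
		// Buffer full: drop the event rather than slow down logging.
	}
}
```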
## Test plan
Added a series of new tests covering the new package and its internals.