Add a logger sink to export errors to Sentry transparently
Created by: jhchabran
This PR introduces the concept of sinks that can be plugged into the Logger to extend its capabilities. We're providing here a Sentry sink, which captures errors passed to the logger with the `log.Error` function, if and only if the log level is Warn or above.

As a result, we will get error reports in places that weren't previously covered, because there was no `sentry.SentryCapture` call to catch the errors bubbling up.
Fixes #35844 and #35845
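For illustration, here is a hedged sketch of what this enables at a call site. Only `log.Error` comes from the description above; the `log.Scoped` constructor, `log.Int` field, and the surrounding code are assumptions made for this example.

```go
// Hypothetical call site; identifiers other than log.Error are assumptions
// made for this example.
logger := log.Scoped("worker", "background job worker")

if err := doSomething(); err != nil {
	// Reported to Sentry: the level is Error (>= Warn) and an error field is attached.
	logger.Error("failed to do something", log.Error(err))
}

// Not reported: the level is below Warn.
logger.Info("done", log.Int("attempts", 1))
```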
## Configuring a service with the new sink
To ease the configuration, helpers are provided:
```go
syncLogs, updateSinks := sglog.InitWithSinks(sglog.Resource{
	Name:       env.MyName,
	Version:    version.Version(),
	InstanceID: hostname.Get(),
}, sglog.NewSentrySink())
defer syncLogs()
```
One minor caveat: the `conf.Watch` call must be made after the `conf.Init()` call, which usually sits further down in the main function and is rather easy to forget:
```go
conf.Init()
conf.MustValidateDefaults()
conf.Watch(updateSinks(conf.GetSinks))
```
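Putting the two snippets together, here is a sketch of the intended ordering in a service's main function. Only the calls shown above come from this PR; the surrounding structure is illustrative.

```go
func main() {
	// Initialize the logger first, with the Sentry sink plugged in.
	syncLogs, updateSinks := sglog.InitWithSinks(sglog.Resource{
		Name:       env.MyName,
		Version:    version.Version(),
		InstanceID: hostname.Get(),
	}, sglog.NewSentrySink())
	// Final flush of pending log events (and Sentry reports) on shutdown.
	defer syncLogs()

	// conf.Watch must be called after conf.Init(), which is easy to forget
	// when conf.Init() sits further down in main.
	conf.Init()
	conf.MustValidateDefaults()
	conf.Watch(updateSinks(conf.GetSinks))

	// ... the rest of the service setup ...
}
```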
## Performance considerations
To avoid slowing down logging when it's not necessary:

- the underlying `zapcore.Core` only processes log events at the Warn level and above.
- consuming log events and processing them for Sentry reporting are both asynchronous. This deflects most of the work onto the processing goroutine and leverages the Sentry client's ability to send reports in batches.
In the eventuality of saturating the events buffer by producing errors faster than we can consume them, events are simply dropped. See the guarantees section below for more details about this.
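To make the two points above concrete, here is a minimal sketch of the pattern, not the actual sink from this PR; all names are assumptions and the Sentry reporting itself is elided.

```go
// A minimal sketch of a level-gated, asynchronous zapcore.Core; not the
// implementation shipped in this PR.
package sketch

import "go.uber.org/zap/zapcore"

type errorEvent struct {
	entry  zapcore.Entry
	fields []zapcore.Field
}

// asyncSentryCore only accepts Warn-and-above entries and hands them to a
// background goroutine through a buffered channel.
type asyncSentryCore struct {
	events chan errorEvent
}

func newAsyncSentryCore(bufferSize int) *asyncSentryCore {
	c := &asyncSentryCore{events: make(chan errorEvent, bufferSize)}
	go c.consume() // reporting happens off the hot logging path
	return c
}

// Enabled implements zapcore.LevelEnabler: anything below Warn is ignored.
func (c *asyncSentryCore) Enabled(lvl zapcore.Level) bool {
	return lvl >= zapcore.WarnLevel
}

// Check only registers this core for enabled levels, so Debug/Info calls
// never pay for the Sentry path.
func (c *asyncSentryCore) Check(ent zapcore.Entry, ce *zapcore.CheckedEntry) *zapcore.CheckedEntry {
	if c.Enabled(ent.Level) {
		return ce.AddCore(ent, c)
	}
	return ce
}

// With would clone the core and retain the fields in a real implementation.
func (c *asyncSentryCore) With(fields []zapcore.Field) zapcore.Core { return c }

// Write only enqueues the event; a blocking send is shown here for brevity,
// the non-blocking variant is sketched in the "Too many errors" section.
func (c *asyncSentryCore) Write(ent zapcore.Entry, fields []zapcore.Field) error {
	c.events <- errorEvent{entry: ent, fields: fields}
	return nil
}

// Sync is where flushing would happen; see the flushing sketch below.
func (c *asyncSentryCore) Sync() error { return nil }

// consume turns buffered events into Sentry reports, which the Sentry client
// can batch before sending.
func (c *asyncSentryCore) consume() {
	for ev := range c.events {
		_ = ev // build and capture a Sentry report here
	}
}
```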
I have included two simple benchmarks so we can observe the Δ when Sentry reporting is enabled and we're logging errors.
```
BenchmarkWithoutSentry-10    303235    4394 ns/op
BenchmarkWithSentry-10       221163    5354 ns/op
```
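The benchmark names come from the output above; everything else in the following sketch (the helper building the logger, the field constructors) is a hypothetical shape rather than the actual benchmark code from this PR.

```go
func BenchmarkWithSentry(b *testing.B) {
	// newBenchLogger is a hypothetical helper that builds a logger with the
	// Sentry sink enabled, pointed at a no-op transport.
	logger := newBenchLogger(b, true /* enable Sentry sink */)
	err := errors.New("benchmark error")

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		logger.Error("boom", log.Error(err))
	}
}
```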
## Guarantees regarding losing events
### Flushing
To avoid losing events, events are continuously sent to Sentry and don't need to be explicitly flushed. If an explicit flush is requested through the `zapcore.Core` interface, the Sentry sink will try to consume all pending log events within a reasonable time before shutting down the consumer side, and will then submit them to Sentry.

Flushing is only called in the final `defer` function coming from our logging API (the `defer syncLogs()` shown above), meaning it will only happen when a service is shutting down.
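Extending the earlier sketch, and under the same assumed names, a bounded flush could look like the following; a real implementation would also coordinate with the consumer goroutine and call the Sentry client's own flush.

```go
// drainWithTimeout consumes whatever is still buffered and submits it to
// Sentry, but gives up after a reasonable time so shutdown never hangs.
func (c *asyncSentryCore) drainWithTimeout(timeout time.Duration) {
	deadline := time.After(timeout)
	for {
		select {
		case ev := <-c.events:
			_ = ev // capture the remaining event to Sentry here
		case <-deadline:
			return // reasonable time elapsed; stop waiting
		default:
			return // buffer drained
		}
	}
}
```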
### Too many errors
In the eventuality where we submit events faster than we can consume them, the upper bound is a large buffered channel, which should be enough to accumulate errors while we asynchronously report them to Sentry.
It would be nice to be able to know if we're dropping errors, but that would create a circular dependency from the sink toward the logger, so for now, they're just silently discarded.
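Again extending the sketch above (names assumed), the silent-drop behavior boils down to a non-blocking send on the buffered channel.

```go
// enqueue never blocks the caller: if the buffer is full because errors are
// produced faster than they are consumed, the event is silently dropped.
func (c *asyncSentryCore) enqueue(ev errorEvent) {
	select {
	case c.events <- ev:
		// Buffered; the consumer goroutine will report it to Sentry.
	default:
		// Buffer full: drop the event rather than slow down logging.
	}
}
```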
## Test plan
Added a series of new tests covering the new package and its internals.