-
Notifications
You must be signed in to change notification settings - Fork 816
Handle the error in SpanLogger.Log if log level is Error (#3016) #3995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8aab361
to
358183f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I think we should also look for all places where we call SpanLogger.Error()
and remove that call if the error is also logged via SpanLogger.Log()
, otherwise this change would introduce double calls to SpanLogger.Error()
.
pkg/util/spanlogger/spanlogger.go
Outdated
@@ -94,3 +95,12 @@ func (s *SpanLogger) Error(err error) error { | |||
s.Span.LogFields(otlog.Error(err)) | |||
return err | |||
} | |||
|
|||
// triggerError is used to detect error in kvps, and calls s.Error if there's any. | |||
func (s *SpanLogger) triggerError(kvps ...interface{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would set the error flag for any error logged, regardless of the log level. I think the idea in #3016 was to detect the log level and set the error flag only if the level
was error
. We have may have cases where we log an error even if it's not critical and way may not want to "red flag" it in the span, so filtering out any non error level may help.
But you did! :) Key-value pairs are passed to the |
^ This is exactly what I meant by "I didn't find a way to retrieve keyvalue pairs in a logger". The keyvalue pair |
I'm confused... If I add following lines to
The way
It's this new logger's cortex/vendor/github.com/go-kit/kit/log/log.go Lines 111 to 120 in d714b62
What it does is that it appends supplied parameters to cortex/vendor/github.com/go-kit/kit/log/log.go Lines 5 to 12 in d714b62
|
I've looked through the code base but didn't find any Edit: HI @pracucci, just a ping, to see if you have any comments regarding the question above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked through the code base but didn't find any SpanLogger.Error(). All I see is level.Error() and http.Error() (some require.Error() in the test files). I wonder if I missed something?
Uhm, there are many (eg. if you use Goland you can just use the "Find usage" feature). I checked them for you and places where we both call Error()
and Log()
with error level are:
- pkg/chunk/cache/memcached.go line 161
- pkg/querier/queryrange/results_cache.go line 560
- pkg/testexporter/correctness/simple.go within the
verifySamples()
function
pkg/util/spanlogger/spanlogger.go
Outdated
if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { | ||
logAsError = true | ||
} else if kvps[i] == "err" { | ||
if err, ok := kvps[i+1].(error); ok { | ||
loggedError = err | ||
} else if errMsg, ok := kvps[i+1].(string); ok { | ||
loggedError = errors.New(errMsg) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would simplify this, supporting only error
even if it's not preceeded by the field name err
(eg. may also be called differently, not sure, too many places to check);
if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { | |
logAsError = true | |
} else if kvps[i] == "err" { | |
if err, ok := kvps[i+1].(error); ok { | |
loggedError = err | |
} else if errMsg, ok := kvps[i+1].(string); ok { | |
loggedError = errors.New(errMsg) | |
} | |
} | |
if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { | |
logAsError = true | |
continue | |
} | |
if err, ok := kvps[i+1].(error); ok { | |
loggedError = err | |
} |
pkg/util/spanlogger/spanlogger.go
Outdated
// triggerError calls s.Error if SpanLogger's log level is set to level.ErrorValue(). | ||
func (s *SpanLogger) triggerError(kvps ...interface{}) { | ||
var logAsError = false | ||
loggedError := errors.New("unknown error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to allocate it each time. You can initialize it to nil, and then set it to an "unknown error" inside:
if logAsError {
_ = s.Error(loggedError)
}
pkg/util/spanlogger/spanlogger.go
Outdated
var logAsError = false | ||
loggedError := errors.New("unknown error") | ||
for i := 0; i < len(kvps) - 1; i += 2 { | ||
if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to continue the iteration if the level is not error. I think for the purpose of this triggerError()
function you could just return
if level != error.
pkg/util/spanlogger/spanlogger.go
Outdated
@@ -94,3 +95,22 @@ func (s *SpanLogger) Error(err error) error { | |||
s.Span.LogFields(otlog.Error(err)) | |||
return err | |||
} | |||
|
|||
// triggerError calls s.Error if SpanLogger's log level is set to level.ErrorValue(). | |||
func (s *SpanLogger) triggerError(kvps ...interface{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function is not quite correct. Behaviour I would expect to see is:
-
if log level is error, set span to error (
ext.Error.Set(s.Span, true)
). That is all thattriggerError
should do imho. -
if there is also error object logged, it should be converted to
otlog.Error(err)
field. But I would expect to see this happen inLog
method, specifically inInterleavedKVToFields
mapping, which returns fields from values. Note that presence of error object should not mark span itself as errorneous, as errors an be logged at debug level too.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pstibrany I see what you mean:
Right now triggerError
calls SpanLogger.Error
, which calls ext.Error.Set(s.Span, true)
and s.Span.LogFields(otlog.Error(err))
. err
is being logged twice, once in the triggered SpanLogger.Error
and once in Log
itself.
#3016 wants to save users one line of code (calling SpanLogger.Error
themselves) if level.Error
is called before SpanLogger.Log
- triggerError
simply allows user to achieve what the two lines of code below do in one line.
cortex/pkg/testexporter/correctness/simple.go
Lines 179 to 180 in 8fe0bb1
level.Error(log).Log("msg", "wrong number of samples", "expected", expectedNumSamples, "actual", len(pairs)) | |
log.Error(fmt.Errorf("wrong number of samples")) |
Please let me know if anything above is unclear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pstibrany Just a friendly ping: I wonder if you have any further thoughts or comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now
triggerError
callsSpanLogger.Error
, which callsext.Error.Set(s.Span, true)
ands.Span.LogFields(otlog.Error(err))
.err
is being logged twice, once in the triggeredSpanLogger.Error
and once inLog
itself.
You are right, that's not good :)
I see also that we cannot easily modify InterleavedKVToFields
. What about following approach:
Log
will first calls.Logger.Log(kvps...)
as it does now- it would then have logic from
triggerError
, specifically- lookup of
level.Key()=level.ErrorValue()
pair (this causesext.Error.Set(s.Span, true)
to be called) - if we have found
level.Key()=level.ErrorValue()
, then we would also search for non-nil objects implementingerror
interface. If found witherr
orerror
key, we can remove it fromkvps
slice, and keep as extra field. (Only first found error is handled this way).
- lookup of
Log
would then callotlog.InterleavedKVToFields
to convert remaining fields, and if we kept extra error field in previous step, we add it to these fieldsLog
then callss.Span.LogFields
with all fields (incl. error one, if any).
We would not reuse (*SpanLogger).Error
method, to avoid logging of error twice.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask the motivation to take out the first error
at step 2, and putting it back at step 3? Do you mean that the first error should be kept as an extra field, and the rest of the error
should be removed from the kvps
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask the motivation to take out the first
error
at step 2, and putting it back at step 3?
Motivation is to avoid double-logging of the error, and to store it wrapped into otlog.Error(err)
instead. Calling otlog.InterleavedKVToFields
with error
instance would convert it to string
.
Do you mean that the first error should be kept as an extra field,
Yes.
and the rest of the
error
should be removed from thekvps
?
No, I would keep the rest as-is. If we removed them, they would disappear from the span-log. That would be bad.
…t#3995) Signed-off-by: ChinYing-Li <[email protected]>
Can you please rebase against master to pull in #4137 and move the changelog entry to the top? |
Signed-off-by: ChinYing-Li <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ChinYing-Li for your patience! What did look like an easy change is turning out not that easy. I think we're on the good track tho!
} | ||
} | ||
if logAsError && errorIndex != -1 { | ||
s.Span.LogFields(otlog.Error(kvps[i+1].(error))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to use errorIndex
instead of i
, otherwise you're assuming that the level is always before the error in kvps
ordering.
@@ -77,6 +77,31 @@ func FromContextWithFallback(ctx context.Context, fallback log.Logger) *SpanLogg | |||
// also puts the on the spans. | |||
func (s *SpanLogger) Log(kvps ...interface{}) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it doable adding some valuable unit tests on the new logic you've added? Otherwise we'll need to manually test it before merging, cause the logic is a bit convoluted and there may be subtle bugs slipping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I could not see a way to retrieve the field of SpanLogger, it's unlikely to test by comparing the value of the fields. If we refactor the new logic into a private function, I can write unit tests for it. However unit testing a private function doesn't sound like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pracucci Do you have any further suggestion based on my previous comment? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using https://github.com/opentracing/opentracing-go/blob/master/mocktracer/mocktracer.go ?
(We unit-test private functions all the time, but I agree it's better if you can avoid)
What is the performance impact of doing these checks on every field of every log? |
Sorry for getting back late! func TestSpanLogger_Timing(t *testing.T) {
_, ctx := New(context.Background(), "test", "bar")
newSpan := FromContext(ctx)
startTime := time.Now()
for i := 0; i < 10000; i++ {
_ = newSpan.Log("metric1", 1, "err", errors.New("err"), "metric2", 2)
}
totalTime := time.Since(startTime)
print(totalTime, "\n ")
} Certainly there's an impact on the run time for doing the O(n) check on logged values. Result on
|
Could you re-do that using the Go benchmark framework? You had another question about how to do testing - in |
Thanks for the pointer. I have yet to incorporate mocktracer into the benchmarking code, and here's the benchmark functions (Fixed the bug and updated the benchmarking results on 3 Aug): func BenchmarkSpanLogger100(b *testing.B) {
benchmarkSpanLogger(b, 100)
}
func BenchmarkSpanLogger1000(b *testing.B) {
benchmarkSpanLogger(b, 1000)
}
func benchmarkSpanLogger(b *testing.B, numPair int) {
_, ctx := New(context.Background(), "test", "bar")
newSpan := FromContext(ctx)
err := "err"
arr := make([]interface{}, numPair*2)
for i := 0; i < numPair; i++ {
arr[2*i] = time.Now().String()
arr[2*i + 1] = err
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
newSpan.Log(arr...)
}
} No actual Results on branch
|
This should make you suspicious: 2ns is enough to execute maybe 10 instructions, yet you expect it to process 1000 values.
This passes a single value, which fails immediately because Log is expecting pairs of values.
(and you have to make |
I was also a bit confused about the time/op but not sure what went wrong. Thank you for the pointers, and I've updated the code and the results in my previous comment! |
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions. |
Hi there, |
Signed-off-by: ChinYing-Li [email protected]
What this PR does:
Detect
error
inSpanLogger.Log
and triggersSpanLogger.Error
if there's any.#3016 wants to trigger
SpanLogger.Error
by detecting thelevel=Error
keyvalue pair added bylevel.Error(logger)
. Upon investigation, I didn't find a way to retrieve keyvalue pairs in a logger. This PR, as a workaround, isn't ideal: if user only passes in strings tolevel.Error(logger).Log
, thenSpanLogger.Error
won't be called.Not sure whether this falls under
CHANGE
orENHANCEMENT
in theCHANGELOG.md
.Any suggestion is appreciated!
Which issue(s) this PR fixes:
Fixes #3016
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]