Handle the error in SpanLogger.Log if log level is Error (#3016)#3995
Handle the error in SpanLogger.Log if log level is Error (#3016)#3995ChinYing-Li wants to merge 3 commits intocortexproject:masterfrom
Conversation
8aab361 to
358183f
Compare
pracucci
left a comment
There was a problem hiding this comment.
Thanks for working on this! I think we should also look for all places where we call SpanLogger.Error() and remove that call if the error is also logged via SpanLogger.Log(), otherwise this change would introduce double calls to SpanLogger.Error().
pkg/util/spanlogger/spanlogger.go
Outdated
There was a problem hiding this comment.
This would set the error flag for any error logged, regardless of the log level. I think the idea in #3016 was to detect the log level and set the error flag only if the level was error. We have may have cases where we log an error even if it's not critical and way may not want to "red flag" it in the span, so filtering out any non error level may help.
But you did! :) Key-value pairs are passed to the |
^ This is exactly what I meant by "I didn't find a way to retrieve keyvalue pairs in a logger". The keyvalue pair |
I'm confused... If I add following lines to The way It's this new logger's cortex/vendor/github.com/go-kit/kit/log/log.go Lines 111 to 120 in d714b62 What it does is that it appends supplied parameters to cortex/vendor/github.com/go-kit/kit/log/log.go Lines 5 to 12 in d714b62 |
I've looked through the code base but didn't find any Edit: HI @pracucci, just a ping, to see if you have any comments regarding the question above. |
pracucci
left a comment
There was a problem hiding this comment.
I've looked through the code base but didn't find any SpanLogger.Error(). All I see is level.Error() and http.Error() (some require.Error() in the test files). I wonder if I missed something?
Uhm, there are many (eg. if you use Goland you can just use the "Find usage" feature). I checked them for you and places where we both call Error() and Log() with error level are:
- pkg/chunk/cache/memcached.go line 161
- pkg/querier/queryrange/results_cache.go line 560
- pkg/testexporter/correctness/simple.go within the
verifySamples()function
pkg/util/spanlogger/spanlogger.go
Outdated
There was a problem hiding this comment.
I would simplify this, supporting only error even if it's not preceeded by the field name err (eg. may also be called differently, not sure, too many places to check);
| if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { | |
| logAsError = true | |
| } else if kvps[i] == "err" { | |
| if err, ok := kvps[i+1].(error); ok { | |
| loggedError = err | |
| } else if errMsg, ok := kvps[i+1].(string); ok { | |
| loggedError = errors.New(errMsg) | |
| } | |
| } | |
| if kvps[i] == level.Key() && kvps[i+1] == level.ErrorValue() { | |
| logAsError = true | |
| continue | |
| } | |
| if err, ok := kvps[i+1].(error); ok { | |
| loggedError = err | |
| } |
pkg/util/spanlogger/spanlogger.go
Outdated
There was a problem hiding this comment.
No need to allocate it each time. You can initialize it to nil, and then set it to an "unknown error" inside:
if logAsError {
_ = s.Error(loggedError)
}
pkg/util/spanlogger/spanlogger.go
Outdated
There was a problem hiding this comment.
No need to continue the iteration if the level is not error. I think for the purpose of this triggerError() function you could just return if level != error.
pkg/util/spanlogger/spanlogger.go
Outdated
There was a problem hiding this comment.
I think this function is not quite correct. Behaviour I would expect to see is:
-
if log level is error, set span to error (
ext.Error.Set(s.Span, true)). That is all thattriggerErrorshould do imho. -
if there is also error object logged, it should be converted to
otlog.Error(err)field. But I would expect to see this happen inLogmethod, specifically inInterleavedKVToFieldsmapping, which returns fields from values. Note that presence of error object should not mark span itself as errorneous, as errors an be logged at debug level too.
WDYT?
There was a problem hiding this comment.
@pstibrany I see what you mean:
Right now triggerError calls SpanLogger.Error, which calls ext.Error.Set(s.Span, true) and s.Span.LogFields(otlog.Error(err)). err is being logged twice, once in the triggered SpanLogger.Error and once in Log itself.
#3016 wants to save users one line of code (calling SpanLogger.Error themselves) if level.Error is called before SpanLogger.Log - triggerError simply allows user to achieve what the two lines of code below do in one line.
cortex/pkg/testexporter/correctness/simple.go
Lines 179 to 180 in 8fe0bb1
Please let me know if anything above is unclear.
There was a problem hiding this comment.
@pstibrany Just a friendly ping: I wonder if you have any further thoughts or comment?
There was a problem hiding this comment.
Right now
triggerErrorcallsSpanLogger.Error, which callsext.Error.Set(s.Span, true)ands.Span.LogFields(otlog.Error(err)).erris being logged twice, once in the triggeredSpanLogger.Errorand once inLogitself.
You are right, that's not good :)
I see also that we cannot easily modify InterleavedKVToFields. What about following approach:
Logwill first calls.Logger.Log(kvps...)as it does now- it would then have logic from
triggerError, specifically- lookup of
level.Key()=level.ErrorValue()pair (this causesext.Error.Set(s.Span, true)to be called) - if we have found
level.Key()=level.ErrorValue(), then we would also search for non-nil objects implementingerrorinterface. If found witherrorerrorkey, we can remove it fromkvpsslice, and keep as extra field. (Only first found error is handled this way).
- lookup of
Logwould then callotlog.InterleavedKVToFieldsto convert remaining fields, and if we kept extra error field in previous step, we add it to these fieldsLogthen callss.Span.LogFieldswith all fields (incl. error one, if any).
We would not reuse (*SpanLogger).Error method, to avoid logging of error twice.
WDYT?
There was a problem hiding this comment.
May I ask the motivation to take out the first errorat step 2, and putting it back at step 3? Do you mean that the first error should be kept as an extra field, and the rest of the error should be removed from the kvps?
There was a problem hiding this comment.
May I ask the motivation to take out the first
errorat step 2, and putting it back at step 3?
Motivation is to avoid double-logging of the error, and to store it wrapped into otlog.Error(err) instead. Calling otlog.InterleavedKVToFields with error instance would convert it to string.
Do you mean that the first error should be kept as an extra field,
Yes.
and the rest of the
errorshould be removed from thekvps?
No, I would keep the rest as-is. If we removed them, they would disappear from the span-log. That would be bad.
…t#3995) Signed-off-by: ChinYing-Li <chinying.li@mail.utoronto.ca>
|
Can you please rebase against master to pull in #4137 and move the changelog entry to the top? |
Signed-off-by: ChinYing-Li <chinying.li@mail.utoronto.ca>
pracucci
left a comment
There was a problem hiding this comment.
Thanks @ChinYing-Li for your patience! What did look like an easy change is turning out not that easy. I think we're on the good track tho!
| } | ||
| } | ||
| if logAsError && errorIndex != -1 { | ||
| s.Span.LogFields(otlog.Error(kvps[i+1].(error))) |
There was a problem hiding this comment.
I think you need to use errorIndex instead of i, otherwise you're assuming that the level is always before the error in kvps ordering.
| @@ -77,6 +77,31 @@ func FromContextWithFallback(ctx context.Context, fallback log.Logger) *SpanLogg | |||
| // also puts the on the spans. | |||
| func (s *SpanLogger) Log(kvps ...interface{}) error { | |||
There was a problem hiding this comment.
Is it doable adding some valuable unit tests on the new logic you've added? Otherwise we'll need to manually test it before merging, cause the logic is a bit convoluted and there may be subtle bugs slipping it.
There was a problem hiding this comment.
Because I could not see a way to retrieve the field of SpanLogger, it's unlikely to test by comparing the value of the fields. If we refactor the new logic into a private function, I can write unit tests for it. However unit testing a private function doesn't sound like a good idea.
There was a problem hiding this comment.
@pracucci Do you have any further suggestion based on my previous comment? Thanks!
There was a problem hiding this comment.
How about using https://github.com/opentracing/opentracing-go/blob/master/mocktracer/mocktracer.go ?
(We unit-test private functions all the time, but I agree it's better if you can avoid)
|
What is the performance impact of doing these checks on every field of every log? |
|
Sorry for getting back late! func TestSpanLogger_Timing(t *testing.T) {
_, ctx := New(context.Background(), "test", "bar")
newSpan := FromContext(ctx)
startTime := time.Now()
for i := 0; i < 10000; i++ {
_ = newSpan.Log("metric1", 1, "err", errors.New("err"), "metric2", 2)
}
totalTime := time.Since(startTime)
print(totalTime, "\n ")
}Certainly there's an impact on the run time for doing the O(n) check on logged values. Result on
|
|
Could you re-do that using the Go benchmark framework? You had another question about how to do testing - in |
|
Thanks for the pointer. I have yet to incorporate mocktracer into the benchmarking code, and here's the benchmark functions (Fixed the bug and updated the benchmarking results on 3 Aug): func BenchmarkSpanLogger100(b *testing.B) {
benchmarkSpanLogger(b, 100)
}
func BenchmarkSpanLogger1000(b *testing.B) {
benchmarkSpanLogger(b, 1000)
}
func benchmarkSpanLogger(b *testing.B, numPair int) {
_, ctx := New(context.Background(), "test", "bar")
newSpan := FromContext(ctx)
err := "err"
arr := make([]interface{}, numPair*2)
for i := 0; i < numPair; i++ {
arr[2*i] = time.Now().String()
arr[2*i + 1] = err
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
newSpan.Log(arr...)
}
}No actual Results on branch
|
This should make you suspicious: 2ns is enough to execute maybe 10 instructions, yet you expect it to process 1000 values.
This passes a single value, which fails immediately because Log is expecting pairs of values. (and you have to make |
|
I was also a bit confused about the time/op but not sure what went wrong. Thank you for the pointers, and I've updated the code and the results in my previous comment! |
|
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions. |
|
Hi there, |
Signed-off-by: ChinYing-Li chinying.li@mail.utoronto.ca
What this PR does:
Detect
errorinSpanLogger.Logand triggersSpanLogger.Errorif there's any.#3016 wants to trigger
SpanLogger.Errorby detecting thelevel=Errorkeyvalue pair added bylevel.Error(logger). Upon investigation, I didn't find a way to retrieve keyvalue pairs in a logger. This PR, as a workaround, isn't ideal: if user only passes in strings tolevel.Error(logger).Log, thenSpanLogger.Errorwon't be called.Not sure whether this falls under
CHANGEorENHANCEMENTin theCHANGELOG.md.Any suggestion is appreciated!
Which issue(s) this PR fixes:
Fixes #3016
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]