fix parser failed when binlog_row_value_options=partial_json #5018
Merged
tks
zoemak pushed a commit to zoemak/canal that referenced this pull request Jan 30, 2024
…#5018) * Fix the JSON partial-update problem * Ensure the position is fully rewound in the after-image case
Fixes the parsing problem that occurs when binlog_row_value_options=partial_json.
#5017: binlog parsing fails when MySQL's binlog_row_value_options=partial_json
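Cause
In the MySQL source, sql\log_event.cc\print_json_diff and canal's com.taobao.tddl.dbsync.binlog.JsonDiffConversion#print_json_diff(com.taobao.tddl.dbsync.binlog.LogBuffer, long, java.lang.String, int, java.nio.charset.Charset) disagree on where parsing should stop.
In the MySQL source, length is only the length of the JSON partial-update content, but canal uses buffer.hasRemaining as the loop condition. As a result, canal parses past that region into bytes that do not belong to the JSON partial update: the loop does not stop where it should, and the buffer is over-consumed. Parsing should stop once len bytes of the buffer have been parsed.
Fix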
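A consumed-length check is added at the end of the loop: if the consumed length exceeds the limit, the loop exits immediately.
The condition is also caught at the entry point.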
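The consumed-length check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses java.nio.ByteBuffer in place of canal's LogBuffer, and the record format (one length byte followed by the payload) and the name readEntries are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BoundedDiffLoop {

    // Toy record format (an assumption for illustration, not MySQL's JSON diff
    // encoding): one length byte followed by that many payload bytes.
    static List<String> readEntries(ByteBuffer buffer, int len) {
        int start = buffer.position();
        List<String> entries = new ArrayList<>();
        while (buffer.hasRemaining()) {
            int size = buffer.get() & 0xff;
            byte[] payload = new byte[size];
            buffer.get(payload);
            entries.add(new String(payload, StandardCharsets.UTF_8));
            // Consumed-length check: stop once `len` bytes have been read,
            // even though the buffer still holds bytes of the following fields.
            if (buffer.position() - start >= len) {
                break;
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // The first 5 bytes are the bounded region; the last 4 belong to other data.
        byte[] data = {1, 'a', 2, 'b', 'c', 3, 'x', 'y', 'z'};
        ByteBuffer buffer = ByteBuffer.wrap(data);
        System.out.println(readEntries(buffer, 5) + " position=" + buffer.position());
    }
}
```

With hasRemaining as the only condition, the loop would also consume the trailing four bytes and return a third entry. With the check, the position stops at 5 and the remaining bytes are left for whatever parses the next field.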
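In some cases, even though binlog_row_value_options=partial_json is enabled, the binlog still records the complete JSON column in the after-image (mainly for full replacements and similar cases) instead of expressing it with JSON functions. The parser therefore has to fall back to the original full-parse logic. Note that the position must be rewound before re-parsing.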
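A rough sketch of the fallback with position rewind is shown below. It is illustrative only: it uses java.nio.ByteBuffer rather than canal's LogBuffer, and parsePartial, parseFull, parseJsonColumn, and the toy rule for recognizing a diff are all hypothetical names and assumptions, not canal's API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RewindFallback {

    // Hypothetical partial parser: treats the first byte as an operation code
    // and rejects anything above 2. This is a toy rule for the example only.
    static String parsePartial(ByteBuffer buffer, int len) {
        int op = buffer.get() & 0xff; // consumes one byte before it can fail
        if (op > 2) {
            throw new IllegalArgumentException("not a JSON diff, op=" + op);
        }
        byte[] rest = new byte[len - 1];
        buffer.get(rest);
        return "DIFF(" + op + "," + new String(rest, StandardCharsets.UTF_8) + ")";
    }

    // Hypothetical full parser: reads the whole value as text.
    static String parseFull(ByteBuffer buffer, int len) {
        byte[] all = new byte[len];
        buffer.get(all);
        return new String(all, StandardCharsets.UTF_8);
    }

    static String parseJsonColumn(ByteBuffer buffer, int len) {
        int start = buffer.position(); // remember where the value begins
        try {
            return parsePartial(buffer, len);
        } catch (RuntimeException e) {
            // The failed attempt already advanced the position, so rewind
            // before parsing the same bytes as a complete value.
            buffer.position(start);
            return parseFull(buffer, len);
        }
    }

    public static void main(String[] args) {
        byte[] full = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseJsonColumn(ByteBuffer.wrap(full), full.length));
    }
}
```

Without the rewind, the full parse would start one byte late and read past the end of the value, which is the kind of incomplete position rollback the second commit bullet refers to.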
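Normal parsing results
The cases are as follows.
SQL executed
Parse result
SQL executed
Parse result
SQL executed
Result
The above is the case where the after-image is complete even though partial_json is enabled.
SQL executed
Result
SQL executed
Result
Nested parsing works correctly: multiple levels of nested functions are extracted successfully.
SQL executed
Result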
The test cases above cover JSON_INSERT, JSON_REPLACE, JSON_ARRAY_INSERT, JSON_REMOVE, and a complete after-image under partial_json, that is, every situation that can occur.
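To make the nested case concrete, here is a small sketch of how an ordered list of partial updates could be rendered as nested function calls. It assumes the earliest diff ends up innermost. The Diff type, the render method, and the exact quoting are hypothetical and are not taken from canal's JsonDiffConversion.

```java
import java.util.Arrays;
import java.util.List;

final class NestedDiffRender {

    // Hypothetical value type: function name, JSON path, and an optional
    // value literal (null for JSON_REMOVE, which takes no value).
    static final class Diff {
        final String function;
        final String path;
        final String value;

        Diff(String function, String path, String value) {
            this.function = function;
            this.path = path;
            this.value = value;
        }
    }

    // Wraps the column in one function call per diff, in order, so the
    // first diff is the innermost call and the last diff is the outermost.
    static String render(String column, List<Diff> diffs) {
        String expr = column;
        for (Diff d : diffs) {
            StringBuilder sb = new StringBuilder(d.function)
                    .append('(').append(expr)
                    .append(", '").append(d.path).append('\'');
            if (d.value != null) {
                sb.append(", ").append(d.value);
            }
            expr = sb.append(')').toString();
        }
        return expr;
    }

    public static void main(String[] args) {
        List<Diff> diffs = Arrays.asList(
                new Diff("JSON_INSERT", "$.a", "1"),
                new Diff("JSON_REMOVE", "$.b", null));
        // prints JSON_REMOVE(JSON_INSERT(col, '$.a', 1), '$.b')
        System.out.println(render("col", diffs));
    }
}
```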