Output text formatting seriously affects readability #101
Description
What version of Kimi CLI is running?
0.41
Which open platform/subscription were you using?
kimi for coding
Which model were you using?
kimi for coding
What platform is your computer?
macOS
What issue are you seeing?
When running ./infra/kg-initial-build, we designed a two-step LLM call to perform high-quality extraction of knowledge graph entities and relations. The first step performs the extraction; after appropriate evaluation and trimming, the results are submitted to the LLM again so that it can judge which items to keep and which to discard. This second step is the validation. To get the most out of this two-step design for improving knowledge graph quality, please compare two validation approaches: submit the results together with the original text and let the LLM judge, or submit the results alone, with each result carrying a small sniptext, the "evidence" excerpted from the original text that produced it. Which option is more appropriate?
Think("This is a question about designing the validation step in knowledge graph construction. Two validation schemes need to be compared:\n\nOption A: submit the extraction results + the original text and let the LLM judge\nOption B: submit the extraction results + a snipetext (evidence fragment) attached to each result, and let the LLM judge\n\nCore considerations:\n1. Context length limits\n2. Judgment accuracy\n3. Efficiency\n4. Interpretability\n\nAdvantages of Option A:\n- The LLM has the full context, so in theory its judgment is more accurate\n- It can catch important information missed from the original text\n\nDisadvantages of Option A:\n- The context can get too long, especially for large documents\n- The LLM must re-read and re-understand the full text, which is inefficient\n- It may hallucinate claims with no support in the original text\n\nAdvantages of Option B:\n- Compact context, high efficiency\n- Concrete supporting evidence makes judgments more precise\n- Strong interpretability, making errors easy to trace\n- Reduced hallucination risk\n\nDisadvantages of Option B:\n- The evidence fragments may not be comprehensive enough\n- An extra evidence-extraction step is required\n\nConclusion: Option B is more suitable, because:\n1. It is practical and avoids long-context problems\n2. An evidence-driven approach better matches the nature of validation\n3. It is easier to debug and optimize\n4. It can be combined with a quality-scoring mechanism")
hai✨ The source text processed each time is about 1200 characters. Does this affect your judgment?
Think("The fact that the original text is 1200 characters is important; let me re-evaluate:\n\nCharacteristics of 1200 characters:\n- Roughly 200-300 Chinese words\n- Firmly in short-text territory\n- Easily fits in a modern LLM context window (typically 4K-128K)\n- Relatively low cost to read and understand\n\nRe-comparing the two options:\n\nOption A (original text + results):\nAdvantages:\n- 1200 characters is very short, so the LLM's reading cost is very low\n- Full context ensures no important information is missed\n- Simple to implement, no extra evidence extraction needed\n\nDisadvantages:\n- The LLM still has to reprocess the full text\n- It may over-judge, factoring in irrelevant parts of the original text\n\nOption B (results + evidence fragments):\n- Evidence extraction must be more precise (the original text is short, so the fragments must be exact)\n- It may over-rely on the fragments and ignore context outside them\n\nUpdated judgment:\nAt 1200 characters, Option A becomes far more feasible, but I still lean toward Option B, because:\n1. Even at 1200 characters, Option B keeps its efficiency advantage\n2. Evidence-driven validation is more precise and reduces misjudgments\n3. It facilitates later quality analysis and tuning")
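For reference, the Option B design weighed in the transcript above could be sketched roughly like this (all names are hypothetical; this is not part of Kimi CLI or the kg-initial-build script):

```python
# Hypothetical sketch of an Option B validation payload: each extracted
# triple carries a short evidence snippet ("sniptext") excerpted from the
# ~1200-character source text, and only these rows are sent to the judge.
def build_validation_prompt(triples):
    """Render (subject, relation, object, sniptext) rows for the LLM judge."""
    lines = ["Keep or drop each extracted fact, judging only by its evidence:"]
    for i, t in enumerate(triples, 1):
        lines.append(
            f'{i}. ({t["subject"]}, {t["relation"]}, {t["object"]}) '
            f'-- evidence: "{t["sniptext"]}"'
        )
    return "\n".join(lines)

triples = [
    {"subject": "Acme", "relation": "founded_in", "object": "1999",
     "sniptext": "Acme was founded in 1999"},
]
print(build_validation_prompt(triples))
```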
What steps can reproduce the bug?
Ask any question that triggers a Think() call.
What is the expected behavior?
A new line should be printed for each \n.
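The expected rendering can be sketched minimally as follows, assuming the Think() text reaches the renderer as a raw string in which each newline is escaped as the two characters backslash + "n" (the string contents here are hypothetical):

```python
# Raw string as it appears in the transcript above: escapes are literal.
raw = "Option A:\\n- full original text\\n\\nOption B:\\n- evidence snippets"

# Current (buggy) behavior: the \n sequences are printed verbatim.
print(raw)

# Expected behavior: decode "\n" into real line breaks before printing.
rendered = raw.replace("\\n", "\n")
print(rendered)
```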
Additional information
No response