@Magicbook1108
Contributor

What problem does this PR solve?

Fix: Correct Markdown chunk positioning (#12686)

Markdown chunking now aligns with the DOCX merge behavior introduced in #12455. Content is produced as ordered (text, image, table) tuples and passed through the downstream chunking pipeline with positional information preserved end-to-end.
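To illustrate the idea, here is a minimal sketch of merging ordered (text, image, table) tuples into chunks while recording which source sections each chunk covers. It is not the code from this PR: `Chunk`, `merge_sections`, and `max_chars` are hypothetical names, and using section indices as the positional information is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# One parsed section: (text, image, table). Image and table may be absent.
Section = Tuple[str, Optional[bytes], Optional[str]]


@dataclass
class Chunk:
    """Hypothetical chunk record; not a type from the actual code base."""
    text: str
    start: int  # index of the first source section in this chunk
    end: int    # index of the last source section (inclusive)
    images: List[bytes] = field(default_factory=list)
    tables: List[str] = field(default_factory=list)


def merge_sections(sections: List[Section], max_chars: int = 200) -> List[Chunk]:
    """Greedily merge sections in document order, without overlap.

    A new chunk is opened whenever appending the next section would
    push the current chunk past max_chars. Because sections are never
    reordered or duplicated, start/end always map back to the source.
    """
    chunks: List[Chunk] = []
    current: Optional[Chunk] = None
    for idx, (text, image, table) in enumerate(sections):
        # +1 accounts for the newline used to join sections.
        if current is None or len(current.text) + len(text) + 1 > max_chars:
            current = Chunk(text=text, start=idx, end=idx)
            chunks.append(current)
        else:
            current.text += "\n" + text
            current.end = idx
        # Attach media to whichever chunk holds the section it came from.
        if image is not None:
            current.images.append(image)
        if table is not None:
            current.tables.append(table)
    return chunks
```

Because the sketch never copies text between neighbouring chunks, each chunk's `start` and `end` identify an unambiguous range in the source, which is the property an overlap setting would make harder to guarantee.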

As a temporary measure, the overlapped parameter is disabled, while the context window size remains enabled, so that continuity between chunks is maintained without compromising offset accuracy.

Fix: DOCX ingestion pipeline return type error
Fix: Markdown ingestion pipeline return type error
Refactor: Rename shared functions to reflect usage across both DOCX and Markdown pipelines

Type of change

  • Bug Fix (non-breaking change which fixes an issue)

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 20, 2026
@Magicbook1108 Magicbook1108 added ci Continue Integration and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Jan 20, 2026
@dosubot dosubot bot added the 🐞 bug Something isn't working, pull request that fix bug. label Jan 20, 2026
@Magicbook1108 Magicbook1108 marked this pull request as draft January 20, 2026 11:31
@Magicbook1108 Magicbook1108 marked this pull request as ready for review January 20, 2026 11:31
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 20, 2026
Labels

🐞 bug Something isn't working, pull request that fix bug. ci Continue Integration size:XL This PR changes 500-999 lines, ignoring generated files.
