User Story
As a developer using mdextractor,
I want code blocks containing inner backticks to be extracted as single units
so that nested or complex markdown structures are parsed accurately.
Background
The current regex pattern in mdextractor/__init__.py uses non-greedy matching (.*?), which splits a code block whenever its content contains an inner ``` sequence. This fails the test_nested_code_blocks unit test, which expects `["Outer inner end"]` but currently gets `["Outer", "end"]`. The pattern closes the block at the first closing backticks it encounters instead of at the outermost pair.
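The split can be reproduced with a minimal sketch. The pattern below is a hypothetical stand-in for the one in mdextractor/__init__.py, which may differ in its details, and `extract_non_greedy` is an illustrative name, not part of the library.

```python
import re

# Hypothetical stand-in for the current pattern; the real expression in
# mdextractor/__init__.py may differ in its details.
NON_GREEDY = re.compile(r"```(?:\w+\n)?(.*?)```", re.DOTALL)


def extract_non_greedy(text):
    """Return the stripped captures of every non-greedy fence match."""
    return [chunk.strip() for chunk in NON_GREEDY.findall(text)]


sample = "```Outer ```inner``` end```"

# The lazy group stops at the first inner fence, so the block is split in two
# and the text between the inner fences is lost.
print(extract_non_greedy(sample))  # ['Outer', 'end']
```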
Acceptance Criteria
- Update mdextractor/__init__.py to use a regex pattern that greedily matches entire fenced blocks, ignoring inner backticks.
- Update test_nested_code_blocks in tests/test_mdextractor.py to validate blocks like: ["Outer ```inner``` end"].
- Ensure existing tests (test_multiple_blocks, test_with_language_specifier) still pass after the regex update.
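One possible way to meet these criteria together is sketched below. It is an assumption, not the project's actual fix. A plain greedy `.*` would also span from the first opening fence to the last closing fence and merge separate blocks, so this sketch swaps in a different technique: it recognises a fence only when the ``` starts a line, which lets inline backticks pass through as content. The name `extract_blocks` and the sample inputs are hypothetical, and the sketch assumes each fence sits on its own line.

```python
import re

# Line-anchored sketch (an assumption, not the library's pattern):
# an opening fence must start a line, and a closing fence must be alone
# on its line. Backticks elsewhere in a line are treated as content.
LINE_ANCHORED = re.compile(
    r"^```\w*[ \t]*\n(.*?)\n```[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(text):
    """Hypothetical helper: return the body of each fenced block."""
    return LINE_ANCHORED.findall(text)


nested = "```\nOuter ```inner``` end\n```\n"
multiple = "```python\nprint(1)\n```\ntext\n```\nplain\n```\n"

print(extract_blocks(nested))    # ['Outer ```inner``` end']
print(extract_blocks(multiple))  # ['print(1)', 'plain']
```

The trade-off is that a block whose fences share a line with its content, such as a single-line input, would not match this pattern, so the test inputs would need to follow the one-fence-per-line layout.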