Convert Markdown inside Office Word documents
pip install markdowntodocx
To convert an existing docx file (see examples/example.py):
from markdowntodocx.markdownconverter import convertMarkdownInFile
convertMarkdownInFile("/mypath/to/document.docx", "output_path.docx", {"Code Car":"CodeStyle"})
You have to define styles in your Word document in order to use Markdown **Headers/titles**, **Hyperlinks**, **Code formatting**, **Arrays**, **Unordered List**.
The Markdown syntaxes listed below are either standard Markdown or come from extended Markdown: https://www.markdownguide.org/extended-syntax/
- Emphasis (italic) `*Text*` or `_Text_`: converts to Word italic
- Strong Emphasis (bold) `**Text**` or `__Text__`: converts to Word bold
- Strikethrough `~~Strike~~`: converts to Word strikethrough style
- Highlight `==Highlight==`: converts to Word yellow highlight
- Header `# MarkdownHeader1` to `###### MarkdownHeader6`:
  - Must be alone in a paragraph. If not, the rest of the paragraph will be erased.
  - It will use the document style named "Header" by default.
  - You can specify another style by passing the style dictionary as the last argument of both functions.
  - E.g.: `res, msg = convertMarkdownInFile("examples/in_document.docx", "examples/out_document.docx", {"Header":"style_name"})`
- (EXTENDED SYNTAX FOR WORD) Change font color: `<color:FF0000> this text will be very red because the color is in RGB format</color>` or `<span style="color: rgb(230, 0, 0);"> Red colored text </span>`
- Inline Code `` `Text` ``:
  - It will use the document style named "Code" (character format) by default.
  - You can specify another style by passing the style dictionary as the last argument of both functions.
  - E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", {"Code Car":"my_inline_code_style"})`
- Code Block ` ` `Text` ` `:
  - It will use the document style named "Code" by default.
  - You can specify another style by passing the style dictionary as the last argument of both functions.
  - E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", {"Code":"my_block_code_style"})`
- Mermaid support ` ` `mermaid ... ` ` ` (see Mermaid: https://mermaid.js.org):
  - CAUTION: By default, Mermaid graphs are generated using https://mermaid.ink/img, which means the diagram source is sent to that third-party service. You can install a mermaid.ink server locally with https://github.com/jihchi/mermaid.ink and then specify the Mermaid server when running the Markdown converter. Alternatively, you can specify a Mermaid CLI binary with mermaid_cli, typically mmdc.
  - E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", mermaid_server_link="https://localhost:3000/img/")`
  - E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", mermaid_cli="mmdc")`
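- Insert Remote Image (Markdown image syntax with an http link):
  - It will download the image from the hyperlink and insert the picture with a width of 18 cm.
- Insert Local Image (Markdown image syntax with a local file path):
  - It will read files with an image extension and insert them the same way as remote images.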
- Hyperlink `[google](https://www.google.fr)`: makes it a Word hyperlink.
  - Will also attempt to convert any valid http hyperlink to a Word hyperlink: `http://www.google.fr` -> http://www.google.fr
  - If the link does not start with http, it will be treated as an internal link to a bookmark.
- (EXTENDED SYNTAX FOR WORD) Bookmark `this will be bookmarked with name bookmark1{#bookmark1}`
  - You may hyperlink to it: `[url text to display]{bookmark1}`
- Footnotes (BETA):
  - Inline footnotes: `this is a conundrum^[https://fr.wiktionary.org/wiki/conundrum]`
  - External footnotes: `This paragraph will have a footnote[^1]` and `And this paragraph will have another[^2]`
- Array to Word table (must be alone in a paragraph, otherwise the rest of the paragraph is deleted):
|Column1|column2|Column3|
|-------|-------|-------|
|line|line|line|
-->
| Column1 | column2 | Column3 |
|---|---|---|
| line | line | line |
* Cells created will use the document style named "Cell" by default.
* You can specify another style by passing the style dictionary as the last argument of both functions.
* E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", {"Cell":"my_cell_style"})`
- Unordered List (`- my list` or `* my list` or `+ my list`):
  - Must be alone in a paragraph. If not, the rest of the paragraph will be erased.
  - It will use the document style named "BulletList" by default.
  - You can specify another style by passing the style dictionary as the last argument of both functions.
  - E.g.: `markdownToWordInFile("/mypath/to/document.docx", "output_path.docx", {"BulletList":"my_bullet_style"})`
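The style keys used in the examples above ("Header", "Code Car", "Code", "Cell", "BulletList") can presumably be combined in one dictionary. The following is a minimal sketch under two assumptions: that the converter accepts several keys in the same dictionary, and that `res` is falsy when the conversion fails. The helper names `build_styles` and `convert` are hypothetical, and the style names on the right-hand side are placeholders for styles that must already exist in your Word document.

```python
def build_styles(overrides=None):
    """Return a mapping: converter key -> style name defined in the Word document."""
    styles = {
        "Header": "my_header_style",
        "Code Car": "my_inline_code_style",
        "Code": "my_block_code_style",
        "Cell": "my_cell_style",
        "BulletList": "my_bullet_style",
    }
    # Caller-supplied entries replace the placeholder names above.
    styles.update(overrides or {})
    return styles


def convert(in_path, out_path, overrides=None):
    # Imported lazily so build_styles() works without the package installed.
    from markdowntodocx.markdownconverter import convertMarkdownInFile

    # The style dictionary is passed as the last positional argument,
    # as in the examples above.
    res, msg = convertMarkdownInFile(in_path, out_path, build_styles(overrides))
    if not res:  # Assumption: res is falsy when the conversion fails.
        raise RuntimeError(msg)
    return msg
```

Keeping the mapping in one place makes it easier to check that every style it names is actually defined in the template document before converting.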
The docx format supports image formatting, but there is no easy way to save it in docx styles. As a workaround, the function convertMarkdownInFile accepts a keyword argument named image_modifier. It can be used like this to add a centered black shadow to all images in paragraphs with a custom style named "ImageModifier":
res, msg = convertMarkdownInFile("examples/in_document.docx", "examples/out_document.docx", {"Header":"Header"},
    image_modifier=['''<a:outerShdw blurRad="63500" sx="102000" sy="102000"
        algn="ctr" rotWithShape="0">
        <a:prstClr val="black">
        <a:alpha val="40000" />
        </a:prstClr>
        </a:outerShdw>'''])
It will add the XML to all effectLst elements of pictures in the Word document that are in paragraphs with a custom style named "ImageModifier".
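Since the fragment is raw XML, a typo in it is easy to miss. The sketch below checks that a fragment is well-formed before it is handed to image_modifier. It is not part of markdowntodocx: `is_well_formed_fragment` is a hypothetical helper, and it only checks XML syntax, not whether Word accepts the effect. The fragment uses the `a:` prefix without declaring it, so the helper wraps it in a root element that binds `a:` to the DrawingML main namespace.

```python
import xml.etree.ElementTree as ET

# DrawingML main namespace, which the "a:" prefix refers to in word/document.xml.
DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def is_well_formed_fragment(fragment):
    """Return True if the fragment parses as XML once the a: prefix is declared."""
    wrapped = f'<root xmlns:a="{DRAWINGML_NS}">{fragment}</root>'
    try:
        ET.fromstring(wrapped)
    except ET.ParseError:
        return False
    return True


shadow = '''<a:outerShdw blurRad="63500" sx="102000" sy="102000"
    algn="ctr" rotWithShape="0">
  <a:prstClr val="black">
    <a:alpha val="40000" />
  </a:prstClr>
</a:outerShdw>'''

print(is_well_formed_fragment(shadow))  # True
# The inner <a:prstClr> is never closed, so parsing fails.
print(is_well_formed_fragment("<a:outerShdw><a:prstClr></a:outerShdw>"))  # False
```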
To get your target style, create an empty docx document, add a single image with the desired formatting, and save the document.
Then unzip the docx document as a ZIP archive. Open the extracted folder; word/document.xml contains the XML. Look for your picture's formatting and copy it.
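The unzip step can also be done from Python with the standard library. The following is a rough sketch, not part of markdowntodocx: `extract_effect_lists` is a hypothetical helper, and it assumes the picture effects appear as non-nested `<a:effectLst>...</a:effectLst>` elements using the `a:` prefix (an empty, self-closing effectLst is not matched).

```python
import re
import zipfile


def extract_effect_lists(docx_path):
    """Return the raw XML of every <a:effectLst> element in word/document.xml."""
    # A docx file is a ZIP archive; the document body lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    # Non-greedy match across newlines; assumes effectLst elements do not nest.
    return re.findall(r"<a:effectLst>.*?</a:effectLst>", xml, flags=re.DOTALL)
```

Calling it on the document that contains your single formatted image returns the candidate fragments; the children of the effectLst element (for example an outerShdw element) are what the image_modifier example above passes in.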