hunyadi/md2conf

Publish Markdown files to Confluence wiki

Contributors to software projects typically write documentation in Markdown format and host Markdown files in collaborative version control systems (VCS) such as GitHub or GitLab to track changes and facilitate the review process. However, not everyone at a company has access to VCS, and documents are often circulated in Confluence wiki instead.

Replicating documentation to Confluence by hand is tedious, and a lack of automated synchronization with the project repositories where the documents live leads to outdated documentation.

This Python package

  • parses Markdown files,
  • converts Markdown content into the Confluence Storage Format (XHTML),
  • invokes Confluence API endpoints to upload images and content.

Features

Whenever possible, the implementation uses Confluence REST API v2 to fetch space properties, and get, create or update page content.

Installation

Required. Install the core package from PyPI:

pip install markdown-to-confluence

Command-line utilities

Optional. Converting *.drawio diagrams to PNG or SVG images before uploading to Confluence as attachments requires installing draw.io. (Refer to --render-drawio.)

Optional. Converting code blocks of Mermaid diagrams to PNG or SVG images before uploading to Confluence as attachments requires mermaid-cli. (Refer to --render-mermaid.)

npm install -g @mermaid-js/mermaid-cli

Marketplace apps

As authors of md2conf, we don't endorse or support any particular Confluence marketplace apps.

Optional. Editable draw.io diagrams require the draw.io Diagrams marketplace app. (Refer to --no-render-drawio.)

Optional. Displaying Mermaid diagrams in Confluence without pre-rendering in the synchronization phase requires a marketplace app. (Refer to --no-render-mermaid.)

Optional. Displaying formulas and equations in Confluence requires a marketplace app; refer to LaTeX Math for Confluence - Math Formula & Equations.

Getting started

In order to get started, you will need

  • your organization domain name (e.g. example.atlassian.net),
  • the base path for Confluence wiki (typically /wiki/ for managed Confluence, / for on-premise),
  • your Confluence username (e.g. [email protected]) (only if required by your deployment),
  • a Confluence API token (a string of alphanumeric characters), and
  • the space key in Confluence (e.g. SPACE) you are publishing content to.

Obtaining an API token

  1. Log in to https://id.atlassian.com/manage/api-tokens.
  2. Click Create API token.
  3. From the dialog that appears, enter a memorable and concise Label for your token and click Create.
  4. Click Copy to clipboard, then paste the token to your script, or elsewhere to save.

Setting up the environment

The Confluence organization domain, base path, username, API token and space key can be specified at runtime or set as Confluence environment variables (e.g. add to your ~/.profile on Linux, or ~/.bash_profile or ~/.zshenv on macOS):

export CONFLUENCE_DOMAIN='example.atlassian.net'
export CONFLUENCE_PATH='/wiki/'
export CONFLUENCE_USER_NAME='[email protected]'
export CONFLUENCE_API_KEY='0123456789abcdef'
export CONFLUENCE_SPACE_KEY='SPACE'

On Windows, these can be set via system properties.

If you use Atlassian scoped API tokens, set the API URL as well, substituting CLOUD_ID with your own Cloud ID:

export CONFLUENCE_API_URL='https://api.atlassian.com/ex/confluence/CLOUD_ID/'

In this case, md2conf can automatically determine CONFLUENCE_DOMAIN and CONFLUENCE_PATH.
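As a minimal sketch of how a script of your own might consume these variables, the following helper builds the wiki base URL from the domain and base path. The function name confluence_base_url and the HTTPS scheme are assumptions made for illustration; this is not md2conf's own API.

```python
import os


def confluence_base_url(env=os.environ):
    """Build the wiki base URL from the environment variables described above."""
    domain = env["CONFLUENCE_DOMAIN"]
    # Fall back to the base path that is typical for managed Confluence.
    path = env.get("CONFLUENCE_PATH", "/wiki/")
    # Normalize so the path starts and ends with exactly one slash.
    inner = path.strip("/")
    path = f"/{inner}/" if inner else "/"
    # Assumption: the deployment is reachable over HTTPS.
    return f"https://{domain}{path}"
```

Passing a plain dictionary instead of os.environ makes the helper easy to try out without touching your shell profile.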

Permissions

The tool requires appropriate permissions in Confluence in order to invoke endpoints.

Required scopes for scoped API tokens are as follows:

  • read:page:confluence
  • write:page:confluence
  • read:space:confluence
  • write:space:confluence
  • read:attachment:confluence
  • write:attachment:confluence
  • read:label:confluence
  • write:label:confluence

If a Confluence username is set, the tool uses HTTP Basic authentication to pass the username and the API key to Confluence REST API endpoints. If no username is provided, the tool authenticates with HTTP Bearer, and passes the API key as the bearer token.
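The choice between the two authentication schemes can be sketched as follows. The function name authorization_header is hypothetical, and the code only illustrates how the two standard HTTP header values are formed; it is not taken from the md2conf source.

```python
import base64


def authorization_header(api_key, username=None):
    """Return the Authorization header value for the given credentials."""
    if username:
        # HTTP Basic: base64-encode "username:api_key".
        raw = f"{username}:{api_key}".encode("utf-8")
        return "Basic " + base64.b64encode(raw).decode("ascii")
    # No user name: pass the API key as the bearer token.
    return f"Bearer {api_key}"
```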

If you lack appropriate permissions, you will get an Unauthorized response from Confluence. The tool will emit a message that looks as follows:

2023-06-30 23:59:59,000 - ERROR - <module> [80] - 401 Client Error: Unauthorized for url: ...

Associating a Markdown file with a wiki page

Each Markdown file is associated with a Confluence wiki page with a Markdown comment:

<!-- confluence-page-id: 20250001023 -->

The above tells the tool to synchronize the Markdown file with the given Confluence page ID. This implies that the Confluence wiki page must already exist so that it has an ID. The comment can be placed anywhere in the source file.
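To illustrate how such a comment can be located in a document, here is a small sketch based on a regular expression. The pattern and the function name find_page_id are assumptions for illustration and may differ from what md2conf does internally.

```python
import re

# Matches comments such as <!-- confluence-page-id: 20250001023 -->
PAGE_ID_PATTERN = re.compile(r"<!--\s*confluence-page-id:\s*(\d+)\s*-->")


def find_page_id(markdown_text):
    """Return the page ID declared in a Markdown comment, or None if absent."""
    match = PAGE_ID_PATTERN.search(markdown_text)
    return match.group(1) if match else None
```

Because the pattern is searched rather than anchored, the comment is found wherever it appears in the file.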

Setting the Confluence space

If you work in an environment where there are multiple Confluence spaces, and some Markdown pages may go into one space, whereas other pages may go into another, you can set the target space on a per-document basis:

<!-- confluence-space-key: SPACE -->

This overrides the default space set via command-line arguments or environment variables.

Setting generated-by prompt text for wiki pages

In order to ensure readers are not editing a generated document, the tool adds a warning message at the top of the Confluence page as an info panel. You can customize the text that appears. The text can contain markup as per the Confluence Storage Format, and is emitted directly into the info panel macro.

Provide generated-by prompt text in the Markdown file with a tag:

<!-- generated-by: Do not edit! Check out the <a href="https://example.com/project">original source</a>. -->

Alternatively, use the --generated-by GENERATED_BY option. The tag takes precedence.

Publishing a single page

md2conf has two modes of operation: single-page mode and directory mode.

In single-page mode, you specify a single Markdown file as the source, which can contain absolute links to external locations (e.g. https://example.com) but not relative links to other pages (e.g. local.md). In other words, the page must be stand-alone.

Publishing a directory

In directory mode, md2conf converts and publishes a directory of Markdown files rather than a single Markdown file. To use this mode, pass a directory as the source. md2conf traverses the specified directory recursively, and synchronizes each Markdown file.

First, md2conf builds an index of pages in the directory hierarchy. The index maps each Markdown file path to a Confluence page ID. Whenever a relative link is encountered in a Markdown file, the relative link is replaced with a Confluence URL to the referenced page with the help of the index. All relative links must point to Markdown files that are located in the directory hierarchy.

If a Markdown file doesn't yet pair up with a Confluence page, md2conf creates a new page and assigns a parent. Parent-child relationships are reflected in the navigation panel in Confluence. You can set a root page ID with the command-line option -r, which constitutes the topmost parent. (This could correspond to the landing page of your Confluence space. The Confluence page ID is always revealed when you edit a page.) Whenever a directory contains the file index.md or README.md, this page becomes the future parent page, and all Markdown files in this directory (and possibly nested directories) become its child pages (unless they already have a page ID). However, if an index.md or README.md file is subsequently found in one of the nested directories, it becomes the parent page of that directory, and any of its subdirectories.

The top-level directory to be synchronized must always have an index.md or README.md, which maps to the root of the corresponding sub-tree in Confluence (specified with -r).

The concepts above are illustrated in the following sections.

File-system directory hierarchy

The title of each Markdown file (either the text of the topmost unique heading (#), or the title specified in front-matter) is shown next to the file name. docs is the top-level directory to be synchronized.

docs
├── index.md: Root page
├── computer-science
│   ├── index.md: Introduction to computer science
│   ├── algebra.md: Linear algebra
│   └── algorithms.md: Theory of algorithms
├── machine-learning
│   ├── README.md: AI and ML
│   ├── awareness.md: Consciousness and intelligence
│   └── statistics
│       ├── index.md: Introduction to statistics
│       └── median.md: Mean vs. median
└── ethics.md: Ethical considerations

Page hierarchy in Confluence

Observe how index.md and README.md files have assumed parent (or ancestor) role for any Markdown files in the same directory (or below).

Root page
├── Introduction to computer science
│   ├── Linear algebra
│   └── Theory of algorithms
├── AI and ML
│   ├── Consciousness and intelligence
│   └── Introduction to statistics
│       └── Mean vs. median
└── Ethical considerations
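The mapping from the directory tree to the page tree can be sketched as below. This is a simplified illustration, not md2conf's implementation: the function name assign_parents is hypothetical, files in a directory without an index file are attached to the nearest ancestor's index page, and a directory is assumed to contain at most one of index.md and README.md.

```python
from pathlib import PurePosixPath

INDEX_NAMES = ("index.md", "README.md")


def assign_parents(paths):
    """Map each Markdown file to the index file acting as its parent (None for the root)."""
    files = sorted(PurePosixPath(p) for p in paths)
    # Directory -> its index.md or README.md, if one exists.
    index_of = {f.parent: f for f in files if f.name in INDEX_NAMES}

    def nearest_index(directory):
        # Walk upwards until a directory with an index file is found.
        while True:
            if directory in index_of:
                return index_of[directory]
            if directory == directory.parent:
                return None
            directory = directory.parent

    parents = {}
    for f in files:
        if f.name in INDEX_NAMES:
            # An index page hangs below the index of the closest ancestor directory.
            above = f.parent.parent
            parent = None if above == f.parent else nearest_index(above)
        else:
            parent = nearest_index(f.parent)
        parents[str(f)] = str(parent) if parent is not None else None
    return parents
```

Applied to the docs tree above, docs/ethics.md maps to docs/index.md, and docs/machine-learning/statistics/index.md maps to docs/machine-learning/README.md.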

Emoji

The short name notation :smile: in a Markdown document is converted into the corresponding emoji 😄 when publishing to Confluence.

md2conf relies on the Emoji extension of PyMdown Extensions to parse the short name notation with colons, and generate Confluence Storage Format output such as

<ac:emoticon ac:name="smile" ac:emoji-shortname=":smile:" ac:emoji-id="1f604" ac:emoji-fallback="&#128516;"/>
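The attributes in this element are related to the emoji's Unicode code point: ac:emoji-id is the code point in hexadecimal, and ac:emoji-fallback is the same code point as a decimal character reference. The sketch below shows that relationship for a single-code-point emoji. The function name emoticon_element is hypothetical, and the real conversion is performed by PyMdown Extensions rather than by code like this.

```python
def emoticon_element(shortname, char):
    """Render an <ac:emoticon> element for a short name (without colons) and its emoji."""
    # Assumption: the emoji consists of a single Unicode code point.
    codepoint = ord(char)
    return (
        f'<ac:emoticon ac:name="{shortname}" '
        f'ac:emoji-shortname=":{shortname}:" '
        f'ac:emoji-id="{codepoint:x}" '
        f'ac:emoji-fallback="&#{codepoint};"/>'
    )
```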

Colors

Confluence allows setting text color and highlight color. Even though Markdown doesn't directly support colors, it is possible to set text and highlight color via the HTML element <span> and the CSS attributes color and background-color, respectively:

Text in red, green and blue:

Text in <span style="color: rgb(255,86,48);">red</span>, <span style="color: rgb(54,179,126);">green</span> and <span style="color: rgb(76,154,255);">blue</span>.

Highlight in teal, lime and yellow:

Highlight in <span style="background-color: rgb(198,237,251);">teal</span>, <span style="background-color: rgb(211,241,167);">lime</span> and <span style="background-color: rgb(254,222,200);">yellow</span>.

The following table shows standard text colors (CSS color) that are available via Confluence UI:

Color name       CSS attribute value
bold blue        rgb(7,71,166)
blue             rgb(76,154,255)
subtle blue      rgb(179,212,255)
bold teal        rgb(0,141,166)
teal             rgb(0,184,217)
subtle teal      rgb(179,245,255)
bold green       rgb(0,102,68)
green            rgb(54,179,126)
subtle green     rgb(171,245,209)
bold orange      rgb(255,153,31)
yellow           rgb(255,196,0)
subtle yellow    rgb(255,240,179)
bold red         rgb(191,38,0)
red              rgb(255,86,48)
subtle red       rgb(255,189,173)
bold purple      rgb(64,50,148)
purple           rgb(101,84,192)
subtle purple    rgb(234,230,255)

The following table shows standard highlight colors (CSS background-color) that are available via Confluence UI:

Color name       CSS attribute value
teal             rgb(198,237,251)
lime             rgb(211,241,167)
yellow           rgb(254,222,200)
magenta          rgb(253,208,236)
purple           rgb(223,216,253)
Lists and tables

If your Markdown lists or tables don't appear in Confluence as expected, verify that the list or table is delimited by a blank line both before and after, as per strict Markdown syntax. While some previewers accept a more lenient syntax (e.g. an itemized list immediately following a paragraph), md2conf uses Python-Markdown internally to convert Markdown into XHTML, which expects the Markdown document to adhere to the stricter syntax.

Publishing images

Local images referenced in a Markdown file are automatically published to Confluence as attachments to the page.

Unfortunately, Confluence struggles with SVG images, e.g. they may only show in edit mode, display in a wrong size or text labels in the image may be truncated. (This seems to be a known issue in Confluence.) In order to mitigate the issue, whenever md2conf encounters a reference to an SVG image in a Markdown file, it checks whether a corresponding PNG image also exists in the same directory, and if a PNG image is found, it is published instead.
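The fallback from SVG to PNG can be sketched as follows. This is an illustration of the rule as described, assuming the PNG shares the SVG's file name apart from the extension; the function name preferred_image is hypothetical.

```python
from pathlib import Path


def preferred_image(path):
    """Return the sibling PNG for an SVG reference if one exists, else the path itself."""
    path = Path(path)
    if path.suffix.lower() == ".svg":
        # Look for a file with the same name but a .png extension in the same directory.
        png = path.with_suffix(".png")
        if png.exists():
            return png
    return path
```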

External images referenced with an absolute URL retain the original URL.

LaTeX math formulas

Inline formulas can be enclosed with $ signs, or delimited with \( and \), i.e.

  • the code $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ is shown as $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$,
  • and \(\lim _{x\rightarrow \infty }\frac{1}{x}=0\) is shown as $\lim _{x\rightarrow \infty }\frac{1}{x}=0$.

Block formulas can be enclosed with $$, or wrapped in code blocks specifying the language math:

$$\int _{a}^{b}f(x)dx=F(b)-F(a)$$

is shown as

$$\int _{a}^{b}f(x)dx=F(b)-F(a)$$

Displaying math formulas in Confluence requires the extension LaTeX Math for Confluence - Math Formula & Equations.

Ignoring files

Skip files in a directory with rules defined in .mdignore. Each rule should occupy a single line. Rules follow the syntax (and constraints) of fnmatch. Specifically, ? matches any single character, and * matches zero or more characters. For example, use up-*.md to exclude Markdown files that start with up-. Lines that start with # are treated as comments.

Files that don't have the extension *.md are skipped automatically. Hidden directories (whose name starts with .) are not recursed into.

Relative paths to items in a nested directory are not supported. You must put .mdignore in the same directory where the items to be skipped reside.
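The rule syntax can be tried out with Python's standard fnmatch module. The sketch below parses the content of an .mdignore file and tests file names against the rules; the function names parse_rules and is_ignored are hypothetical and do not reflect md2conf's internal code.

```python
from fnmatch import fnmatch


def parse_rules(text):
    """Return the patterns from .mdignore content, skipping blank lines and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(line)
    return rules


def is_ignored(filename, rules):
    """Check a bare file name (no directory part) against the rules."""
    return any(fnmatch(filename, rule) for rule in rules)
```

For example, with the single rule up-*.md, the file up-date.md is skipped whereas setup.md is not.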

If you add the synchronized attribute to JSON or YAML front-matter with the value false, the document content (including attachments) and metadata (e.g. tags) will not be synchronized with Confluence:

---
title: "Collaborating with other teams"
page_id: "19830101"
synchronized: false
---

This Markdown document is neither parsed, nor synchronized with Confluence.

This is useful if you have a page in a hierarchy that participates in parent-child relationships but whose content is edited directly in Confluence. Specifically, these documents can be referenced with relative links from other Markdown documents in the file system tree.

Page title

md2conf makes a best-effort attempt at setting the Confluence wiki page title when it publishes a Markdown document the first time. The following are probed in this order:

  1. The title attribute set in the front-matter. Front-matter is a block delimited by --- at the beginning of a Markdown document. Both JSON and YAML syntax are supported.
  2. The text of the topmost unique Markdown heading (#). For example, if a document has a single first-level heading (e.g. # My document), its text is used. However, if there are multiple first-level headings, this step is skipped.
  3. The file name (without the extension .md).

If a matching Confluence page already exists for a Markdown file, the page title in Confluence is left unchanged.
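The probing order can be sketched as below, assuming the front-matter has already been parsed into a dictionary. The function name resolve_title is hypothetical, and the heading detection is deliberately simplified: it does not skip lines that start with # inside fenced code blocks.

```python
import re
from pathlib import PurePath


def resolve_title(front_matter, markdown_text, filename):
    """Pick a page title following the three-step order described above."""
    # 1. Explicit title in the (already parsed) front-matter.
    title = front_matter.get("title")
    if title:
        return title
    # 2. A first-level heading, but only if there is exactly one.
    headings = re.findall(r"^# +(.+?)[ \t]*$", markdown_text, flags=re.MULTILINE)
    if len(headings) == 1:
        return headings[0]
    # 3. File name without the extension.
    return PurePath(filename).stem
```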

Labels

If a Markdown document has the front-matter attribute tags, md2conf assigns the specified tags to the Confluence page as labels.

---
title: "Example document"
tags: ["markdown", "md", "wiki"]
---

Any previously assigned labels are discarded. As per Confluence terminology, new labels have the prefix of global.

If a document has no tags attribute, existing Confluence labels are left intact.

Content properties

The front-matter attribute properties in a Markdown document allows setting Confluence content properties on a page. Confluence content properties are a way to store structured metadata in the form of key-value pairs directly on Confluence content. The values in content properties are represented as JSON objects.

Some content properties have special meaning to Confluence. For example, the following properties cause Confluence to display a wiki page with content confined to a fixed width in regular view mode, and taking the full page width in draft mode:

---
properties:
  content-appearance-published: fixed-width
  content-appearance-draft: full-width
---

The attribute properties is parsed as a dictionary with keys of type string and values of type JSON. md2conf passes JSON values to Confluence REST API unchanged.

draw.io diagrams

With the command-line option --no-render-drawio (default), editable diagram data is extracted from images with embedded draw.io diagrams (*.drawio.png and *.drawio.svg), and uploaded to Confluence as attachments. *.drawio and *.drawio.xml files are uploaded as-is. You need a marketplace app to view and edit these diagrams on a Confluence page.

With the command-line option --render-drawio, images with embedded draw.io diagrams (*.drawio.png and *.drawio.svg) are uploaded unchanged, and shown on the Confluence page as images. These diagrams are not editable in Confluence. When both an SVG and a PNG image are available, PNG is preferred. *.drawio and *.drawio.xml files are converted into PNG or SVG images by invoking draw.io as a command-line utility, and the generated images are uploaded to Confluence as attachments, and shown as images.

Mermaid diagrams

You can include Mermaid diagrams in your Markdown documents to create visual representations of systems, processes, and relationships. When a Markdown document contains a code block with the language specifier mermaid, md2conf offers two options to publish the diagram:

  1. Pre-render into an image (command-line option --render-mermaid). The code block is converted into a PNG or SVG image by the Mermaid diagram utility mermaid-cli. The generated image is then uploaded to Confluence as an attachment to the page. This is the approach we use and support.
  2. Display on demand (command-line option --no-render-mermaid). The code block is transformed into a diagram macro, which is processed by Confluence. You need a marketplace app to turn macro definitions into images when a Confluence page is visited.

If you are running into issues with the pre-rendering approach (e.g. misaligned labels in the generated image), verify whether mermaid-cli can process the Mermaid source:

mmdc -i sample.mmd -o sample.png -b transparent --scale 2

Ensure that mermaid-cli is set up; refer to Installation for instructions.

Local output

md2conf supports local output, in which the tool doesn't communicate with the Confluence REST API. Instead, it reads a single Markdown file or a directory of Markdown files, and writes Confluence Storage Format (*.csf) output for each document. (Confluence Storage Format is a derivative of XHTML with Confluence-specific tags for complex elements such as images with captions, code blocks, info panels, collapsed sections, etc.) You can push the generated output to Confluence by invoking the API (e.g. with curl).
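As an alternative to curl, the generated *.csf content could be pushed with a short Python script. The sketch below only prepares the request and does not send it. It assumes the page update endpoint of Confluence REST API v2 (PUT to api/v2/pages/{id} under the wiki base path) and its JSON payload layout; the function name build_update_request and all parameter names are hypothetical, so check the request shape against the API reference for your Confluence version before use.

```python
import json
import urllib.request


def build_update_request(base_url, page_id, title, csf, version, auth_header):
    """Prepare (but do not send) a PUT request that replaces a page body."""
    payload = {
        "id": page_id,
        "status": "current",
        "title": title,
        # The *.csf output is passed as the "storage" representation.
        "body": {"representation": "storage", "value": csf},
        # Assumption: the caller supplies the next version number of the page.
        "version": {"number": version},
    }
    return urllib.request.Request(
        url=f"{base_url}api/v2/pages/{page_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
        method="PUT",
    )
```

The prepared request can then be sent with urllib.request.urlopen once you have verified the payload.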

Running the tool

Execute the command-line tool md2conf to synchronize a Markdown file with Confluence:

$ python3 -m md2conf sample/index.md

Use the --help switch to get a full list of supported command-line options:

$ python3 -m md2conf --help
usage: md2conf [-h] [--version] [-d DOMAIN] [-p PATH] [--api-url API_URL] [-u USERNAME] [-a API_KEY] [-s SPACE] [-l {debug,info,warning,error,critical}] [-r ROOT_PAGE] [--keep-hierarchy] [--flatten-hierarchy]
               [--generated-by GENERATED_BY] [--no-generated-by] [--render-drawio] [--no-render-drawio] [--render-mermaid] [--no-render-mermaid] [--render-mermaid-format {png,svg}] [--heading-anchors]
               [--no-heading-anchors] [--ignore-invalid-url] [--local] [--headers [KEY=VALUE ...]] [--webui-links]
               mdpath

positional arguments:
  mdpath                Path to Markdown file or directory to convert and publish.

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -d, --domain DOMAIN   Confluence organization domain.
  -p, --path PATH       Base path for Confluence (default: '/wiki/').
  --api-url API_URL     Confluence API URL. Required for scoped tokens. Refer to documentation how to obtain one.
  -u, --username USERNAME
                        Confluence user name.
  -a, --apikey, --api-key API_KEY
                        Confluence API key. Refer to documentation how to obtain one.
  -s, --space SPACE     Confluence space key for pages to be published. If omitted, will default to user space.
  -l, --loglevel {debug,info,warning,error,critical}
                        Use this option to set the log verbosity.
  -r ROOT_PAGE          Root Confluence page to create new pages. If omitted, will raise exception when creating new pages.
  --keep-hierarchy      Maintain source directory structure when exporting to Confluence.
  --flatten-hierarchy   Flatten directories with no index.md or README.md when exporting to Confluence.
  --generated-by GENERATED_BY
                        Add prompt to pages (default: 'This page has been generated with a tool.').
  --no-generated-by     Do not add 'generated by a tool' prompt to pages.
  --render-drawio       Render draw.io diagrams as image files and add as attachments. (Converter required.)
  --no-render-drawio    Inline draw.io diagram in Confluence page. (Marketplace app required.)
  --render-mermaid      Render Mermaid diagrams as image files and add as attachments. (Converter required.)
  --no-render-mermaid   Inline Mermaid diagram in Confluence page. (Marketplace app required.)
  --render-mermaid-format {png,svg}
                        Format for rendering Mermaid and draw.io diagrams (default: 'png').
  --heading-anchors     Place an anchor at each section heading with GitHub-style same-page identifiers.
  --no-heading-anchors  Don't place an anchor at each section heading.
  --ignore-invalid-url  Emit a warning but otherwise ignore relative URLs that point to ill-specified locations.
  --local               Write XHTML-based Confluence Storage Format files locally without invoking Confluence API.
  --headers [KEY=VALUE ...]
                        Apply custom headers to all Confluence API requests.
  --webui-links         Enable Confluence Web UI links. (Typically required for on-prem versions of Confluence.)

Using the Docker container

You can run the Docker container via docker run or via Dockerfile. Either can accept the environment variables or arguments similar to the Python options. The final argument ./ corresponds to mdpath in the command-line utility.

With docker run, you can pass the Confluence domain, user name, API key and space key directly on the command line:

docker run --rm --name md2conf -v $(pwd):/data leventehunyadi/md2conf:latest -d example.atlassian.net -u [email protected] -a 0123456789abcdef -s SPACE ./

Alternatively, you can use a separate file .env to pass these parameters as environment variables:

docker run --rm --env-file .env --name md2conf -v $(pwd):/data leventehunyadi/md2conf:latest ./

In each case, -v $(pwd):/data maps the current directory to the Docker container's WORKDIR such that md2conf can scan files and directories in the local file system.

Note that the entry point for the Docker container's base image is ENTRYPOINT ["python3", "-m", "md2conf"].

With the Dockerfile approach, you can extend the base image:

FROM leventehunyadi/md2conf:latest

ENV CONFLUENCE_DOMAIN='example.atlassian.net'
ENV CONFLUENCE_PATH='/wiki/'
ENV CONFLUENCE_USER_NAME='[email protected]'
ENV CONFLUENCE_API_KEY='0123456789abcdef'
ENV CONFLUENCE_SPACE_KEY='SPACE'

CMD ["./"]

Alternatively,

FROM leventehunyadi/md2conf:latest

CMD ["-d", "example.atlassian.net", "-u", "[email protected]", "-a", "0123456789abcdef", "-s", "SPACE", "./"]
