Skip to content

bitovi/n8n-nodes-markitdown

Repository files navigation

@bitovi/n8n-nodes-markitdown

This is an n8n community node. It lets you use Microsoft's Markitdown in your n8n workflows.

Microsoft's Markitdown is a powerful utility for converting various document formats (PDF, Word, PowerPoint, Excel, images, audio, and more) into clean, structured Markdown text. This makes it easy to extract and process content from different file types in your automation workflows.

n8n is a fair-code licensed workflow automation platform.

Installation
Operations
Compatibility
Usage
Resources
Version history

Installation

Follow the installation guide in the n8n community nodes documentation.

Requirements

  • A self-hosted n8n instance.
  • Make sure you have Markitdown installed 2. Simply update your own Dockerfile with the delcarations below
    1. You may use our custom image
  • Make sure to allow community nodes with N8N_COMMUNITY_PACKAGES_ENABLED=true
  • Once logged in to your N8N web UI, go to /settings/community-nodes and type @bitovi/n8n-nodes-markitdown

Option 1 (Recommended) - Use our pre-built Docker image

# Use our custom image
FROM bitovi/n8n-nodes-markitdown:latest

# Optional, put your customization here
...

Option 2 - Add to your existing Dockerfile

FROM n8nio/n8n:latest

# Switch to the root user for installations
USER root
RUN npm install -g pnpm

# === Python Dependencies for Alpine ===
# This uses Alpine's 'apk' package manager.
# 1. Create a temporary virtual package '.build-deps' with all build dependencies.
# 2. Use pip to install markitdown globally, adding '--break-system-packages' to handle PEP 668.
# 3. Ensure Python runtime packages remain installed.
# 4. Remove only the build dependencies to keep the image smaller.
RUN apk add --no-cache --virtual .build-deps git build-base python3-dev py3-pip && \
    apk add --no-cache python3 && \
    pip install markitdown --break-system-packages && \
    apk del .build-deps

# Ensure the Python scripts directory is in PATH for all users
ENV PATH="/usr/local/bin:$PATH"

# Switch back to the non-privileged 'node' user for security
USER node

# Set the working directory to n8n's default
WORKDIR /home/node/.n8n/nodes

# Install the n8n Markitdown nodes package
RUN npm install @bitovi/n8n-nodes-markitdown@latest --only=prod

# Set the main working directory back to n8n's default
WORKDIR /home/node

Operations

This node provides a single operation:

Convert to Markdown: Takes any supported file format and converts it to clean, structured Markdown text.

Input Parameters:

  • Input Binary Field: The name of the binary property containing the file to process (default: "data")

Output:

  • data: The converted Markdown content as a string in the JSON output
  • binary: Empty binary object (the output is provided as text in the JSON data)

Compatibility

  • Minimum n8n version: 1.0.0
  • Node.js version: >=18.10
  • Python version: 3.7+ (required for markitdown)
  • Tested with: n8n 1.14.1+

This node requires Python and the markitdown package to be installed in your n8n environment. See the installation instructions above for proper setup.

Usage

Basic Workflow Setup

  1. File Input: Connect a node that provides binary file data (like "Read Binary File", "HTTP Request", or "Google Drive" nodes)
  2. Markitdown Node: Add the Markitdown node to your workflow
  3. Configure: Set the "Input Binary Field" parameter to match the binary property name from your input node (usually "data")
  4. Output: The converted Markdown will be available in the data field of the output JSON

Example Use Cases

  • Document Processing: Convert Word documents, PDFs, or PowerPoint presentations to Markdown for further text processing
  • Content Extraction: Extract text content from images using OCR capabilities
  • Data Pipeline: Batch convert multiple documents as part of a larger data processing workflow
  • Content Management: Convert various document formats to Markdown for publishing or archival systems

Tips

  • The node processes files by creating temporary files, so ensure your n8n instance has sufficient disk space
  • Large files may take longer to process depending on the complexity and type of document
  • If processing fails, check that the markitdown command is properly installed and accessible in your environment

How to find the node

You can search "markitdown" in the n8n node search bar. The node will appear with the Microsoft logo.

Markitdown Node

File Types Supported

MarkItDown supports a wide variety of file formats:

  • PDF: Extract text and structure from PDF documents
  • Microsoft Office: PowerPoint (reading in top-to-bottom, left-to-right order), Word, Excel
  • Images: EXIF metadata extraction and OCR text recognition
  • Audio: EXIF metadata and speech transcription
  • Web: HTML pages and content
  • Text-based formats: CSV, JSON, XML
  • Archives: ZIP files (iterates over contents)
  • Online content: YouTube URLs
  • And more!

Resources

Need Help or Have Questions?

Need guidance on leveraging AI agents or N8N for your business? Our AI Agents workshop will equip you with the knowledge and tools necessary to implement successful and valuable agentic workflows.

License

MIT

About

No description, website, or topics provided.

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Contributors 6