🚀 PaperToSlides is an AI-driven tool designed to automatically convert academic papers in PDF format into polished presentation slides—perfect for research group meetings, conference rehearsals, and quick paper summaries.
- Completely refactored the `extract_pdf_to_markdown` function using the new MinerU API to improve PDF parsing performance.
- Added a conditional check that skips PDF parsing if the corresponding Markdown file already exists.
- Modified the LLM API call mechanism to read from a `.env` file. You can now switch between an OpenAI API key and a Gemini API key by specifying it in the `.env` file. A `.env.example` file is provided in the repo; remove the `.example` suffix to use it.
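
The last two changes can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `load_env_file`, `pick_api_key`, and `markdown_already_exists` are hypothetical helper names, the variable name `GEMINI_API_KEY` is an assumption (check `.env.example` for the real names), and the parser only handles plain `KEY=VALUE` lines.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Blank lines and lines starting with '#' are ignored, and quotes
    around a value are stripped. A missing file yields an empty dict.
    """
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("\"'")
    return values


def pick_api_key(env, provider):
    """Return the key for 'openai' or 'gemini' (variable names are assumed)."""
    names = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}
    key = env.get(names[provider], "")
    if not key:
        raise KeyError(f"{names[provider]} is not set in the .env file")
    return key


def markdown_already_exists(md_file_path):
    """True when the Markdown file is already present, so parsing can be skipped."""
    return os.path.isfile(md_file_path)
```

A caller would check `markdown_already_exists(...)` before invoking the PDF parser and pass the result of `pick_api_key(load_env_file(), "openai")` to the LLM client.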
Many thanks to Mr. Yamauchi for his valuable suggestions!
- 📄 Efficient Content Extraction: Utilizes MinerU for high-quality content extraction from academic PDFs.
- 🤖 AI-Powered Summarization: Integrates OpenAI’s API to interpret and summarize the paper's content, including both text and visual data.
- 🎨 Slide Generation: Produces a structured, ready-to-present PowerPoint file.
- 🖼️ Visual Preservation: Retains original figures, tables, and images, ensuring content integrity.
- 📊 Presentation-Ready: Tailored for academic settings, making it ideal for presentations, discussions, and research insights.
- 📘 Research Paper Presentations: Summarize and present papers with minimal manual preparation.
- 👥 Academic Group Meetings: Share findings efficiently in lab or study group settings.
- 🎤 Conference Rehearsals: Practice presenting key points ahead of conferences.
- 🔍 Quick Overviews: Generate concise summaries for rapid information sharing.
To get started, follow these steps:
1. Clone this repository:

       git clone https://github.com/yourusername/PaperToSlides.git
       cd PaperToSlides

2. Set up MinerU and dependencies:

       cd MinerU
       git clone https://github.com/opendatalab/MinerU.git && cd ..

3. Create a virtual environment:

       conda create -n MinerU python=3.10
       conda activate MinerU

4. Install dependencies:

       pip install magic-pdf[full]==0.7.0b1 --extra-index-url https://wheels.myhloli.com
       pip install python-pptx

5. Set up model weights using Git LFS:

       cd model
       git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit

6. Configuration: Update `magic-pdf.json` to specify the `models-dir` and `cuda` settings according to your environment.
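
As one way to make the configuration step repeatable, the sketch below rewrites the two settings in a `magic-pdf.json`-style file. The `models-dir` key comes from the step above; the `device-mode` key and its `"cuda"`/`"cpu"` values are assumptions about the config layout, and `update_magic_pdf_config` is a hypothetical helper, so compare against your own file before relying on it.

```python
import json
import os


def update_magic_pdf_config(config_path, models_dir, use_cuda):
    """Set the model directory and device in a magic-pdf.json-style file.

    Assumption: the device is stored under a "device-mode" key with the
    values "cuda" or "cpu"; confirm this against your own config file.
    """
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)

    config["models-dir"] = os.path.abspath(models_dir)
    config["device-mode"] = "cuda" if use_cuda else "cpu"  # assumed key name

    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
    return config
```

For example, `update_magic_pdf_config("magic-pdf.json", "model/PDF-Extract-Kit/models", use_cuda=True)` would point the config at the weights cloned in step 5, assuming they ended up in that directory.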
Below is a concise installation and troubleshooting guide written by Mr. Yamauchi. Special thanks to him for his contribution!
This document details the installation process, provides clarifications on the instructions in the README.md file, and outlines the errors I encountered along with their solutions.
Here are some clarifications regarding the steps outlined in the PaperToSlides README.md:
- Downloading the Source Code: For step 1 in the README, downloading the source code by selecting "Download ZIP" from the `< > Code` button on the GitHub repository page caused no particular issues.
- Creating the Model Directory: Before proceeding to step 5, create the `model` directory with `mkdir -p model`.
- Downloading Model Files (`git lfs clone`): The command `git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit` downloads the model files. However, it may not complete successfully depending on the execution environment. In some cases only small files were downloaded, resulting in the error message "Not in a Git repository." In such situations, the command has to be retried until it succeeds.

  Even if `git lfs clone` persistently fails, it was ultimately possible to get the tool working by running `python download_models.py` within the `./MinerU` directory (discussed later) and individually addressing the errors that arose afterwards. However, this method is extremely time-consuming and is not recommended.
- Running `download_models.py`: Regardless of whether `git lfs clone` succeeded, it often seems necessary to navigate to the `./MinerU` directory and execute `python download_models.py`. Some trial and error was required to get this script to complete successfully.
Important Notes:

- The total size of the downloaded model files is approximately 12 GB, and the download takes a significant amount of time.
- Executing this script generates a configuration file named `magic-pdf.json` in the user's home directory.
- Dependency errors may occur during execution. For instance, in my environment, it was necessary to run `pip install modelscope` after activating the virtual environment with `conda activate MinerU`.
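
When a clone fetches only small files, one possible explanation is that the large weights were left as Git LFS pointer stubs, which are small text files beginning with `version https://git-lfs.github.com/spec/v1`. The sketch below is one way to check for that and to compare the downloaded size against the roughly 12 GB mentioned above. Whether this matches the failure described here is an assumption, and `find_lfs_pointer_files` and `directory_size_gb` are hypothetical helper names.

```python
import os

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def _iter_files(root_dir):
    """Yield every file path under root_dir, skipping Git metadata."""
    for folder, dirs, files in os.walk(root_dir):
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            yield os.path.join(folder, name)


def find_lfs_pointer_files(root_dir):
    """Return paths that are LFS pointer stubs rather than real weight files."""
    pointers = []
    for path in _iter_files(root_dir):
        with open(path, "rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return sorted(pointers)


def directory_size_gb(root_dir):
    """Total size of the files under root_dir in GiB (excluding .git)."""
    total_bytes = sum(os.path.getsize(path) for path in _iter_files(root_dir))
    return total_bytes / 1024 ** 3
```

If `find_lfs_pointer_files("model/PDF-Extract-Kit")` returns a non-empty list, the download is incomplete and the affected files still need to be fetched.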
- Placing the PDF File: Place the target PDF file (the one you want to generate a slide outline from) inside the `./data` directory of the PaperToSlides installation.
- Editing `GenerateSlidesOutline.py`: Edit the following sections within the `GenerateSlidesOutline.py` script:
  - Setting the OpenAI API Key: Set your own OpenAI API key. Please refer to external resources for information on obtaining and using API keys.

        os.environ["OPENAI_API_KEY"] = "your_openai_api_key_here"

  - Specifying the Input PDF File Name: Specify the name of the PDF file you placed in the `./data` directory.

        pdf_file_name = "data/YourFileName.pdf"  # e.g., "data/Example.pdf"

  - Enabling the PDF Parsing Function: To enable the function that converts the PDF to Markdown, remove the comment symbol (`#`) from the beginning of the following two lines:

        # from MinerU import extract_pdf_to_markdown  # Remove '#' from this line
        # md_file_path = extract_pdf_to_markdown(pdf_file_name, local_dir)  # Remove '#' from this line

    Then comment out the following existing line by adding a `#` at the beginning, as it becomes unnecessary:

        # md_file_path = "output/Example/auto/Example.md"  # Comment out this line

- Executing the Script: After activating the virtual environment with `conda activate MinerU`, run the script:

      python ./GenerateSlidesOutline.py
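
The hard-coded path `output/Example/auto/Example.md` above suggests the Markdown location follows the PDF's base name. A small sketch of that relationship, plus an early check that the PDF was actually placed in `./data`, might look like this. Both `expected_markdown_path` and `check_input_pdf` are hypothetical helpers, and the `output/<name>/auto/<name>.md` layout is inferred from the example path rather than guaranteed.

```python
import os


def expected_markdown_path(pdf_file_name, output_dir="output"):
    """Build <output_dir>/<name>/auto/<name>.md for a given PDF path."""
    base = os.path.splitext(os.path.basename(pdf_file_name))[0]
    return os.path.join(output_dir, base, "auto", f"{base}.md")


def check_input_pdf(pdf_file_name):
    """Fail early, with a clear message, if the PDF is missing or misnamed."""
    if not pdf_file_name.lower().endswith(".pdf"):
        raise ValueError(f"Not a PDF file name: {pdf_file_name}")
    if not os.path.isfile(pdf_file_name):
        raise FileNotFoundError(f"Place the PDF in ./data first: {pdf_file_name}")
    return pdf_file_name
```

Running `check_input_pdf("data/Example.pdf")` before the main script gives a clearer error than a failure deep inside the PDF parser.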
Even after following the steps above, several errors may occur. Below are the main errors I encountered and their respective solutions (in no particular order):
- `ModuleNotFoundError: No module named 'magic_pdf.data'`
- `subprocess.CalledProcessError: Command '['magic-pdf', 'extract', ...]' returned non-zero exit status 2.`

These errors were caused by issues in how the `magic-pdf` PDF parsing library was invoked. To resolve them, the `extract_pdf_to_markdown` function within the `MinerU.py` script was significantly modified, and the way arguments are passed to the `magic-pdf` command was changed.
Modified `MinerU.py`:

    import os
    import subprocess


    def extract_pdf_to_markdown(pdf_file_name, output_dir):
        """
        Simple function to convert PDF file to Markdown (Modified)
        """
        # Create the output directory
        os.makedirs(output_dir, exist_ok=True)

        # Convert PDF to Markdown using the magic-pdf command
        pdf_base_name = os.path.splitext(os.path.basename(pdf_file_name))[0]
        # Note: The original code might have intended a different md_file_path structure.
        # This path is based on the original snippet but might need adjustment
        # depending on where magic-pdf actually places the output.
        md_file_path = os.path.join(output_dir, f"{pdf_base_name}.md")

        # # Original command structure (commented out)
        # cmd = [
        #     "magic-pdf", "extract", pdf_file_name,
        #     "--output-dir", output_dir,
        #     "--format", "markdown"
        # ]
        # subprocess.run(cmd, check=True)

        # MinerU.py after modification
        cmd = [
            "magic-pdf",
            "-p", pdf_file_name,  # Add "-p" before the input file path
            "-o", output_dir,  # Use "-o" for the output directory
            # (Note: The original Japanese comment mentioned either -o or --output-dir might work)
        ]

        # It's good practice to capture output for debugging
        try:
            subprocess.run(cmd, check=True, capture_output=True, text=True)
        except subprocess.CalledProcessError as e:
            print("Error executing magic-pdf command:")
            print(f"Command: {' '.join(e.cmd)}")
            print(f"Return Code: {e.returncode}")
            print(f"Stderr: {e.stderr}")
            print(f"Stdout: {e.stdout}")
            raise  # Re-raise the exception after printing details

        # Check and return the path of the generated markdown file
        # magic-pdf seems to create a nested structure like output_dir/pdf_base_name/auto/pdf_base_name.md
        expected_md_path = os.path.join(output_dir, f"{pdf_base_name}/auto/{pdf_base_name}.md")
        if os.path.exists(expected_md_path):
            return expected_md_path

        # If not found in the usual location, search within the output directory.
        # This part tries to find the .md file more robustly if the expected path is wrong.
        print(f"Warning: Expected Markdown file not found at {expected_md_path}. Searching in {output_dir}...")
        for root, _, files in os.walk(output_dir):
            for file in files:
                # Look for a markdown file containing the base name in its filename
                if file.endswith(".md") and pdf_base_name in file:
                    found_path = os.path.join(root, file)
                    print(f"Found potential Markdown file: {found_path}")
                    # It might be necessary to move/rename this file to the expected path
                    # depending on subsequent script logic.
                    return found_path  # Return the path found

        # If no file is found after searching, return the originally constructed (but likely incorrect) path,
        # or better, raise an error. The original code returned md_file_path.
        print(f"Error: Markdown file could not be located in {output_dir}")
        # Returning the original md_file_path might suppress errors later.
        # Raising an error is usually better.
        # return md_file_path  # Original behavior
        raise FileNotFoundError(f"Could not find the generated Markdown file in {output_dir} or its subdirectories.")
Disclaimer: The Python code modification above is based on the provided snippet. The exact arguments and output paths for `magic-pdf` might vary depending on its version and internal logic. Further adjustments may be needed.
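
Because the behavior depends on the installed `magic-pdf` version, it can help to confirm which version is actually present in the active environment before debugging further. The sketch below uses the standard-library `importlib.metadata` module; `installed_version` and `describe_magic_pdf` are hypothetical helper names, and the default of `0.7.0b1` is simply the version pinned in the install step.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


def describe_magic_pdf(pinned="0.7.0b1"):
    """Summarize how the installed magic-pdf compares with the pinned version."""
    found = installed_version("magic-pdf")
    if found is None:
        return "magic-pdf is not installed in this environment"
    if found != pinned:
        return f"magic-pdf {found} is installed; the README pins {pinned}"
    return f"magic-pdf {found} matches the pinned version"
```

Printing `describe_magic_pdf()` inside the `MinerU` conda environment shows whether a version mismatch could explain a command-line argument error.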
- Installing the OpenAI Library: If you encounter an error indicating that the `openai` module cannot be found during execution, activate the `MinerU` environment (`conda activate MinerU`) and install the library by running `conda install openai` or `pip install openai`. (The original memo mentioned running `conda install openai`.)
- `FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.cache/modelscope/hub/.../weights.pt'`: This error indicates that the pre-trained model (weight file) used for extracting elements such as mathematical formulas from the PDF could not be found. The cause is that the model files are not placed in the expected cache directory.
  - Confirm the location where the model files downloaded via `git lfs clone` (as discussed earlier) are stored. Typically, there should be a subdirectory named `models` (approximately 8.8 GB) within the `model/PDF-Extract-Kit/` directory of your installation.
  - Copy or move this entire `models` directory into the cache directory path shown in the error message, ensuring it resides within that path as a directory named `models`. Example: Copy or move the contents of `[Your_Download_Location]/model/PDF-Extract-Kit/models` to `~/.cache/modelscope/hub/models/opendatalab/PDF-Extract-Kit-1___0/models`. (Note: The exact cache path `/home/user/.cache/...` will vary depending on the user and environment.)
  - For this reason too, it is highly advisable to ensure that the model file download (`git lfs clone`) completes successfully in the first place.
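
The copy step can also be scripted. The following is a minimal sketch using `shutil.copytree`, with a hypothetical helper name `copy_models_to_cache`. Both paths must be supplied by you, since the cache location differs between users and environments, and the copy needs enough free disk space for a second copy of the models.

```python
import os
import shutil


def copy_models_to_cache(source_models_dir, cache_models_dir):
    """Copy the downloaded models directory into the cache location.

    Missing parent directories are created. Files already present at the
    destination are overwritten; unrelated files there are left alone.
    Requires Python 3.8+ for the dirs_exist_ok argument.
    """
    if not os.path.isdir(source_models_dir):
        raise FileNotFoundError(f"Source models directory not found: {source_models_dir}")
    shutil.copytree(source_models_dir, cache_models_dir, dirs_exist_ok=True)
    return cache_models_dir
```

With the example paths from the steps above, the destination would be `os.path.expanduser("~/.cache/modelscope/hub/models/opendatalab/PDF-Extract-Kit-1___0/models")`; use the path from your own error message instead if it differs.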
If formula recognition is not strictly required, this error can sometimes be bypassed by editing the `magic-pdf.json` file generated in the home directory and disabling the formula recognition feature (set `"formula-config": { "enable": false }`). However, this is generally not recommended, as it may significantly degrade the quality of the generated slides for documents containing mathematical formulas, such as academic papers.
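
If you do decide to try this workaround, a sketch like the one below toggles the setting and keeps a backup so the change is easy to undo. `set_formula_recognition` is a hypothetical helper; the `formula-config` and `enable` keys are taken from the text above, while the rest of the file is assumed to be ordinary JSON.

```python
import json
import shutil


def set_formula_recognition(config_path, enabled):
    """Toggle "formula-config" -> "enable" in a magic-pdf.json-style file.

    A backup copy (<config_path>.bak) is written before the file is changed.
    """
    shutil.copyfile(config_path, config_path + ".bak")

    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)

    # Create the section if it is missing, and keep any other keys in it.
    config.setdefault("formula-config", {})["enable"] = bool(enabled)

    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
    return config
```

Calling `set_formula_recognition(path, True)` afterwards, or restoring the `.bak` file, turns formula recognition back on.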
The details above cover the installation process and troubleshooting steps I experienced while setting up PaperToSlides. While the setup and error resolution can require patience and technical adjustments, I hope this detailed feedback proves helpful both to other users attempting to utilize this promising tool and potentially to the developers for improving the installation and user experience in the future.