This repository was archived by the owner on May 19, 2023. It is now read-only.
README.md: 31 additions & 5 deletions
@@ -2,15 +2,17 @@
A simple command line tool to split a PDF file according to the pages content.
-# Motivation
+## Motivation
This project automates a process that I used to do completely manually.
There is a folder that contains ~250 PDF files. Each file contains one or more documents belonging to one person, and each document needs to be split into its own file, named according to a fixed format,
e.g. a file is named `1234.pdf`, where `1234` is the ID of the person. This file contains the person's degree on the first 2 pages and the ID card on the 3rd page.
So the degree will be extracted from pages 1 and 2 and named `1234-DEGREE-BACH.pdf`, assuming it's a bachelor's, and the ID card will be extracted from page 3 and named `1234-IDCARD.pdf`.
The original files don't follow a guideline, so it's not possible to anticipate the order of a person's documents or whether any of them is present at all. The only part I could automate was splitting and naming the new files.
-As a result, I developed a CLI app specifically for this problem:
+As a result, I developed a simple CLI specifically for this problem.
+
+If you run [`main.py`](/pdf-splitter/main.py) using the `--help` argument you'll see the following:
```txt
$ py pdf-splitter/main.py -h
@@ -27,13 +29,37 @@ optional arguments:
first and last page numbers to split from the input file
```
-## Usage
+### Usage
+
+I'm using a Windows executable generated with [`PyInstaller`](https://github.com/pyinstaller/pyinstaller), which is available in [Releases](https://github.com/netotz/pdf-splitter/releases), because the client PC containing the PDFs folder doesn't have Python installed.
+
+Arguments `-s` and `-p` can be repeated as many times as needed, but both have to be repeated the same number of times.
+So if a file needs to be split into 3 new files, 3 file names need to be specified, each one with `-s`, along with 3 corresponding page ranges, each one with `-p`.
-I'm using an executable generated with `pyinstaller` because the client PC that has the files doesn't have Python installed.
+The page numbers start at 0.
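The pairing rule for `-s` and `-p` can be illustrated with a minimal `argparse` sketch. This is not the code in `main.py`: the function name `parse_args`, the positional `input` argument, and the choice to pass each page range as two integers after `-p` are assumptions made only for illustration. The sketch collects every occurrence of both flags, rejects mismatched counts, and pairs each output name with its page range without interpreting the page numbers any further.

```python
import argparse


def parse_args(argv):
    """Pair each -s name with the -p range given at the same position.

    Hypothetical helper: the real CLI may define its arguments differently.
    """
    parser = argparse.ArgumentParser(prog="pdf-splitter")
    # Assumption: the input file is passed as a positional argument.
    parser.add_argument("input", help="PDF file to split")
    # action="append" stores one entry per occurrence of the flag.
    parser.add_argument("-s", action="append", default=[], metavar="NAME")
    # Assumption: each -p takes two integers (first and last page).
    parser.add_argument(
        "-p",
        action="append",
        default=[],
        nargs=2,
        type=int,
        metavar=("FIRST", "LAST"),
    )
    args = parser.parse_args(argv)

    # Both flags must be repeated the same number of times.
    if len(args.s) != len(args.p):
        parser.error("-s and -p must be repeated the same number of times")

    # Pair the i-th name with the i-th page range.
    pairs = [(name, tuple(pages)) for name, pages in zip(args.s, args.p)]
    return args.input, pairs
```

Under these assumptions, calling `parse_args(["1234.pdf", "-s", "1234-DEGREE-BACH.pdf", "-p", "0", "1", "-s", "1234-IDCARD.pdf", "-p", "2", "2"])` pairs the degree with pages 0 and 1 and the ID card with page 2, while passing two `-s` flags and only one `-p` exits with an error.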
### Examples
-Following the previous example:
+Assume this is the folder, and you're using the executable:
+
+```txt
+folder/
+├─ pdf-splitter.exe
+├─ 1234.pdf
+```
+
+Following the previous example, running this inside `folder/`: