Merge Pdfs automatically in a folder based on the file names.
A python module to merge a bunch of Pdf files.
A few months ago now, I purchased this mobile scanner .
The IRIScan Anywhere scanner.
This scanner is very portable and I am using it very often to scan all sort of documents I am receiving over mail and that I want to archive to be sure that I have a digital copy.
It is also useful to scan and store the receipt I receive from stores.
The scanner can’t merge files during scanning. It must be done on post-processing through their software.
That’s why I wanted to automate the process a little.
Let’s consider that you scan (or have by any other method) a lot of different pdf files.
You need to rename your files according to the following pattern :
“file name1 - part1.pdf“
…
“file name1 - partN.pdf“
“file name2 - part1.pdf“
…
“file name2 - partN.pdf“
See a folder that is properly prepared on the image below :
The files that have the same name (“file name1“ for instance) are going to be merged together.
As a start, you need to install the PyPDF2 python module to your computer as it is necessary to manipulate the pdf files.
pip install PyPDF2
You could then call the module like this, when you are in the folder that has the “mergepdfs.py“ file :
mergepdfs.py ./path/to/the/pdf/files/I/want/to/merge
I use powershell to do this on Windows :
and you can see that the files were merged properly.
A “result” directory is created to store the merged files.
Then the files are named as were their parts and available in the “result” folder.