How to Merge PDF Files in Python using PyPDF2


PyPDF2 is a Python package that allows you to manipulate PDF files. You can use it for cropping, splitting, and similar tasks. In this tutorial, you will learn how to merge two or more PDF files using the PyPDF2 module, step by step.

Steps to Merge PDF Files in Python using PyPDF2

This section walks through all the steps needed to merge two or more PDF files. Follow them in order for a clear understanding.

Step 1: Import the necessary library

I am using only the PyPDF2 module, so let’s import it using the import statement.

import PyPDF2
If you get the error No module named PyPDF2, you have to install PyPDF2 on your system, for example with pip install PyPDF2.
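
To check whether the package is importable before running the rest of the tutorial, something like the following could be used. This is only a minimal sketch based on the standard library's importlib; the helper name is_installed is hypothetical and not part of PyPDF2.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Hypothetical check: report whether PyPDF2 can be imported
    if is_installed("PyPDF2"):
        print("PyPDF2 is available.")
    else:
        print("PyPDF2 is missing. Install it with: pip install PyPDF2")
```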

Step 2: Create a function

In this step, you will create a function that takes two parameters: the list of input PDF files and the name of the output file. Inside the function, the following lines merge the PDF files.

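def merge_pdfs(input_files, output_file):
    # Create the merger object that collects the pages of every input file
    merger = PyPDF2.PdfMerger()

    for input_file in input_files:
        # Open each PDF in binary read mode and append its pages
        with open(input_file, 'rb') as pdf_file:
            merger.append(pdf_file)

    # Write the combined pages to the output file
    with open(output_file, 'wb') as output:
        merger.write(output)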

Here you can see that I have called the PdfMerger() constructor from the PyPDF2 package. After that, a for loop iterates over the list of PDF files and opens each one in rb (binary read) mode. In each iteration, the append() function adds the pages of the current file after those already collected. Lastly, the write() function saves the merged result under the desired file name.
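
Because the loop opens every file in turn, a single wrong path stops the merge partway through. One possible safeguard, sketched here under the assumption that you want to check all paths before merging, is a small helper that reports the missing files. The name find_missing_files is hypothetical and uses only the standard library.

```python
from pathlib import Path


def find_missing_files(input_files):
    """Return the paths from input_files that do not exist as regular files."""
    return [str(path) for path in input_files if not Path(path).is_file()]
```

You could call it before the merge function and raise an error, or simply print the returned list, whenever it is not empty.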

Step 3: Call the function

After creating the function, you will call it. Put the paths of all the input PDFs in a list, and pass that list together with the path of the output file as arguments to the function created above.

Use the lines below to merge the PDFs.

input_files = ["files/file1.pdf", "files/file2.pdf", "files/file3.pdf"]
output_file = "merged.pdf"
merge_pdfs(input_files, output_file)
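
If there are many input files, typing every path by hand gets tedious. As a rough sketch, assuming all the PDFs sit in one folder and should be merged in file-name order, the list could be built with pathlib. The helper name list_pdfs is hypothetical.

```python
from pathlib import Path


def list_pdfs(folder):
    """Return the paths of all .pdf files directly inside folder, sorted by name."""
    # Note: the "*.pdf" pattern is case-sensitive on most Linux systems
    return sorted(str(p) for p in Path(folder).glob("*.pdf"))
```

For example, input_files = list_pdfs("files") would produce a list that can be passed to merge_pdfs() in place of the handwritten one.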

Below is the full code for the above steps.

import PyPDF2

def merge_pdfs(input_files, output_file):
    merger = PyPDF2.PdfMerger()

    for input_file in input_files:
        with open(input_file, 'rb') as pdf_file:
            merger.append(pdf_file)

    with open(output_file, 'wb') as output:
        merger.write(output)

if __name__ == "__main__":
    # List of input PDF files to merge
    input_files = ["files/file1.pdf", "files/file2.pdf", "files/file3.pdf"]

    # Output file name for the merged PDF
    output_file = "merged.pdf"

    merge_pdfs(input_files, output_file)


Conclusion

PyPDF2 is a convenient package for PDF file manipulation. The above steps allow you to merge PDF files in Python. A possible next step is to build a Flask API that takes PDF files as input and lets the user download the merged file.

I hope you have liked this tutorial. If you have any queries, you can contact us for more help.


Meet Sukesh ( Chief Editor ), a passionate and skilled Python programmer with a deep fascination for data science, NumPy, and Pandas. His journey in the world of coding began as a curious explorer and has evolved into a seasoned data enthusiast.
 