In this tutorial we will explore how to convert HTML files to PDF using Python.
Table of Contents
- Introduction
- Sample HTML file
- Convert HTML file to PDF using Python
- Convert Webpage to PDF using Python
- Conclusion
Introduction
There are several online tools that allow you to convert HTML files and webpages to PDF, and most of them are free.
Converting a single file by hand is simple, but automating the process is useful for testing HTML output and for saving webpages you need as PDF files.
To follow this tutorial we will need:
- wkhtmltopdf
- pdfkit
wkhtmltopdf is an open source command line tool to render HTML files into PDF using the Qt WebKit rendering engine.
To use it from Python, we will also need the pdfkit library, which is a Python wrapper for the wkhtmltopdf utility.
First, find the wkhtmltopdf installer for your operating system. For Windows, the latest installer is available from the downloads page of the wkhtmltopdf website. Download the .exe file and install it on your computer.
Remember the path to the directory where it will be installed.
In my case it is: C:\Program Files\wkhtmltopdf
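If you are unsure where the installer placed the executable, a small helper can look for it. This is only a minimal sketch and not part of pdfkit: the function name find_wkhtmltopdf and the fallback list are assumptions made for illustration, and the fallback entry is simply the install directory mentioned above with the bin folder appended.

```python
import os
import shutil

# Hypothetical fallback locations; adjust to match your own installation.
DEFAULT_CANDIDATES = [
    r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe',
]


def find_wkhtmltopdf(name='wkhtmltopdf', candidates=DEFAULT_CANDIDATES):
    """Return the path to the executable, or None if it cannot be found."""
    # First check the directories listed in the PATH environment variable
    found = shutil.which(name)
    if found:
        return found
    # Then fall back to the known install locations
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling find_wkhtmltopdf() returns a path string, or None when nothing is found, in which case you will need to locate the folder manually.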
If you don't have the pdfkit library installed, open Command Prompt (on Windows) and install it with the following command:
pip install pdfkit
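To confirm that the installation worked, you can check whether Python is able to find the module. The helper below is a small sketch using only the standard library; the name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


print('pdfkit installed:', is_installed('pdfkit'))
```

If this prints False, make sure pip installed the package into the same Python environment you are running the script with.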
Sample HTML file
To follow along we need an HTML file to work with.
This tutorial uses a small sample file named sample.html.
Opened in a browser, it shows the heading "Welcome to my YouTube channel!" followed by a one-line paragraph.
Opened in a code editor, it contains:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Hello!</title>
</head>
<body>
<h1>Welcome to my YouTube channel!</h1>
<p>This is a sample HTML file.</p>
</body>
</html>
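If you would rather create the file from Python than save it by hand, something like the following should work. It is a sketch only: the function name write_sample is an assumption, and the markup is copied from the listing above.

```python
from pathlib import Path

SAMPLE_HTML = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Hello!</title>
</head>
<body>
<h1>Welcome to my YouTube channel!</h1>
<p>This is a sample HTML file.</p>
</body>
</html>
"""


def write_sample(path='sample.html'):
    """Write the sample markup to disk and return the resulting Path."""
    target = Path(path)
    # Use UTF-8 to match the charset declared in the <meta> tag
    target.write_text(SAMPLE_HTML, encoding='utf-8')
    return target
```

Calling write_sample() with no arguments creates sample.html in the current working directory.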
Convert HTML file to PDF using Python
Let's start by converting an HTML file to PDF using Python.
The sample.html file is located in the same directory as the main.py file that holds the code.
First, we need to find the path to the wkhtmltopdf executable file, wkhtmltopdf.exe.
Recall that we installed it in C:\Program Files\wkhtmltopdf, so the .exe file is inside that folder. Navigating to it, you should see that the path to the executable file is: C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe
Now we have everything we need and can convert the HTML file to PDF using Python:
import pdfkit

# Define path to wkhtmltopdf.exe
path_to_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'

# Define path to HTML file
path_to_file = 'sample.html'

# Point pdfkit configuration to wkhtmltopdf.exe
config = pdfkit.configuration(wkhtmltopdf=path_to_wkhtmltopdf)

# Convert HTML file to PDF
pdfkit.from_file(path_to_file, output_path='sample.pdf', configuration=config)
After running the script, you should see sample.pdf created in the same directory as main.py. Opening it should show the same heading and paragraph as the HTML page.
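For repeated use, the conversion can be wrapped in a function that fails early when the input file is missing. The version below is a sketch under a few assumptions: the name html_file_to_pdf is made up, the options dictionary (page size and encoding) is just one possible choice that pdfkit forwards to wkhtmltopdf, and pdfkit is imported inside the function only so that the file check can run before the library is installed.

```python
import os


def html_file_to_pdf(html_path, pdf_path, wkhtmltopdf_path=None):
    """Convert an HTML file to PDF, failing early if the input is missing."""
    if not os.path.isfile(html_path):
        raise FileNotFoundError(f'HTML file not found: {html_path}')

    # Imported here so the check above works even before pdfkit is installed
    import pdfkit

    # With no explicit path, pdfkit looks for wkhtmltopdf on the system PATH
    config = None
    if wkhtmltopdf_path:
        config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)

    # Options are passed through to wkhtmltopdf as command line flags
    options = {'page-size': 'A4', 'encoding': 'UTF-8'}
    return pdfkit.from_file(
        html_path,
        output_path=pdf_path,
        options=options,
        configuration=config,
    )
```

With the paths used in this tutorial, the call would be html_file_to_pdf('sample.html', 'sample.pdf', r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe').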
Convert Webpage to PDF using Python
Using the pdfkit library you can also convert webpages to PDF.
Let's convert the wkhtmltopdf project page to PDF!
In this section we will reuse most of the code from the previous section, except that instead of an HTML file we will pass the URL of a webpage to the from_url() function of the pdfkit module:
import pdfkit

# Define path to wkhtmltopdf.exe
path_to_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'

# Define url
url = 'https://wkhtmltopdf.org/'

# Point pdfkit configuration to wkhtmltopdf.exe
config = pdfkit.configuration(wkhtmltopdf=path_to_wkhtmltopdf)

# Convert Webpage to PDF
pdfkit.from_url(url, output_path='webpage.pdf', configuration=config)
After running the script, you should see webpage.pdf created in the same directory. Opening it should show a PDF rendering of the wkhtmltopdf project page.
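If you need to save several webpages, the same call can be placed in a loop. The sketch below makes some assumptions: pdf_name_for_url and save_pages are hypothetical helper names, and the rule for turning a URL into a file name is just one simple choice. As before, pdfkit is imported inside the function so the naming helper can be tried on its own.

```python
import re
from urllib.parse import urlparse


def pdf_name_for_url(url):
    """Build a file name such as 'wkhtmltopdf.org.pdf' from a URL."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Replace characters that are awkward in file names with underscores
    slug = re.sub(r'[^A-Za-z0-9.-]+', '_', raw).strip('_')
    return (slug or 'webpage') + '.pdf'


def save_pages(urls, wkhtmltopdf_path):
    """Convert each URL to its own PDF and return the file names written."""
    # Imported here so pdf_name_for_url can be used without pdfkit installed
    import pdfkit

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    written = []
    for url in urls:
        name = pdf_name_for_url(url)
        pdfkit.from_url(url, output_path=name, configuration=config)
        written.append(name)
    return written
```

For example, save_pages(['https://wkhtmltopdf.org/'], path_to_wkhtmltopdf) would write a file named wkhtmltopdf.org.pdf to the current directory.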
Conclusion
In this article we explored how to convert HTML to PDF using Python and wkhtmltopdf.
Feel free to leave a comment below if you have any questions or suggestions for edits, and check out more of my Python for PDF tutorials.