Converting Word to PDF is a common task in our daily life as it it brings us several benefits such as:š
Wide compatibility: PDF supports a wide range of operating systems, devices and software applications. By converting Word documents to PDF, you can ensure consistent viewing and accessing of files across different platforms without any formatting or compatibility issues.
Reduce file size: PDF files tend to be smaller than Word documents. Converting Word to PDF makes files easier to share via email or upload to a website.
Easy to print: PDF is designed for high quality printing. Converting a Word document to PDF ensures that the layout, fonts, and images of the document will be consistent when printed, no matter what printer or settings are used.
āØ In this article, we will explore how to programmatically convert Word (Doc/ Docx) to PDF in Python using third-party Python library (Spire.Doc for Python). The following is the pip command to install it:
pip install Spire.Doc
Let's start!
šConvert Word to PDF with Python
from spire.doc import *
from spire.doc.common import *
# Create word document
document = Document()
# Load a doc or docx file
document.LoadFromFile("sample.docx")
#Save the document to PDF
document.SaveToFile("ToPDF.pdf", FileFormat.PDF)
document.Close()
In the ablove code, we just load a .doc or .docx file and then it can be converted to a PDF file through the Document.SaveToFile() method.
The result file:
The Python Word API also offers the ToPdfParameterList class that allows us to convert Word to PDF with additional conversion settings.
ā” We can encrypt the generated PDF file using the below code.
# Set open password and permission password for the generated PDF
openPsd = "abcde"
permissionPsd = "12345"
parameter.PdfSecurity.Encrypt(openPsd, permissionPsd, PdfPermissionsFlags.Default, PdfEncryptionKeySize.Key128Bit)
# Save the Word document to PDF
document.SaveToFile("ToPdfWithPassword.pdf", parameter)
ā” Embed fonts in the generated PDF to ensure consistent document appearance.
# Create a ToPdfParameterList object
parameter = ToPdfParameterList()
# Embed fonts in PDF
parameter.IsEmbeddedAllFonts = True
ā” In addition, the bookmarks of the original Word document can also be preserved when converting, or we can create new bookmarks based on the headings.
# Create a ToPdfParameterList object
parames = ToPdfParameterList()
# Create bookmarks from Word headings
parames.CreateWordBookmarksUsingHeadings = True
# Create bookmarks in PDF from existing bookmarks in Word
parames.CreateWordBookmarks = True
# Save the document to PDF
document.SaveToFile("output/ToPdfWithBookmarks.pdf", FileFormat.PDF)
document.Close()
Refer: https://www.e-iceblue.com/Introduce/doc-for-python.html
Also Learn:
Convert Word to Images in Python
Convert Word to HTML in Python
Convert HTML to Word in Python
Create, Read, or Update a Word Document in Python
Top comments (2)
For anyone wondering, spire.doc is not available for free. Author did a very bad job here not notifying us about the watermark.
Really thanks @frankyyyyy this comment saves me a lot of time.