pdf2htmlEX is a tool that converts PDF to HTML without losing text or formatting. It renders PDF files in HTML using modern web technologies, which makes it especially useful for converting academic papers with many formulas and figures.
This post will show you how to install pdf2htmlEX on Ubuntu 20.04 LTS.
At the time of writing, pdf2htmlEX is no longer packaged by Debian/Ubuntu, so you will need to install it from the Debian archive (*.deb) published by the pdf2htmlEX project.
To get started you will need to install the dependencies:
sudo apt update
sudo apt install -y libfontconfig1 libcairo2 libjpeg-turbo8
If you get an error about unmet dependencies, run the following to fix the broken packages:
sudo apt --fix-broken install
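To see which of the dependencies listed above are still missing before or after running apt, a small check along these lines may help. This is only a sketch: missing_packages is a hypothetical helper name, and it relies on dpkg-query reporting the status "install ok installed" for packages that are fully installed.

```shell
# Print each package name that dpkg does not report as fully installed.
# missing_packages is a hypothetical helper, not part of apt or pdf2htmlEX.
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query prints "install ok installed" for an installed package;
        # anything else (or no output at all) is treated as missing.
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
            echo "$pkg"
        fi
    done
}

# Lists whichever of the three dependencies are not installed yet.
missing_packages libfontconfig1 libcairo2 libjpeg-turbo8
```

If the last command prints nothing, all three dependencies are already in place.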
Download the *.deb package from the pdf2htmlEX repository's releases and give it a shorter name:
wget https://github.com/pdf2htmlEX/pdf2htmlEX/releases/download/v0.18.8.rc1/pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-bionic-x86_64.deb
mv pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-bionic-x86_64.deb pdf2htmlEX.deb
Install the package
sudo apt install ./pdf2htmlEX.deb
It is very important that you use a (relative or absolute) path to the *.deb file. It is the ./ in front of the pdf2htmlEX.deb file name which tells apt install that it is supposed to install a local file rather than a package name in apt install's internal package database.
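One way to avoid forgetting the path prefix is to normalize the file name before handing it to apt. The sketch below assumes a hypothetical helper, deb_path_for_apt, which leaves any argument that already contains a slash untouched and prefixes a bare file name with ./

```shell
# Return a form of the .deb path that always contains a directory component.
# deb_path_for_apt is a hypothetical helper used only for illustration.
deb_path_for_apt() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;     # already a relative or absolute path
        *)   printf './%s\n' "$1" ;;   # bare file name: prefix with ./
    esac
}

# Usage (not run here): sudo apt install "$(deb_path_for_apt pdf2htmlEX.deb)"
deb_path_for_apt pdf2htmlEX.deb
```

The last line only prints the normalized path (./pdf2htmlEX.deb), so the snippet can be tried without installing anything.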
Alternatively you could use the following commands:
sudo dpkg -i pdf2htmlEX.deb
sudo apt install -f
Test your installation
pdf2htmlEX -v
You should see something like this:
pdf2htmlEX version 0.18.8.rc1
Copyright 2012-2015 Lu Wang <coolwanglu@gmail.com> and other contributors
Libraries:
poppler 0.89.0
libfontforge (date) 20200314
cairo 1.16.0
Default data-dir: /usr/local/share/pdf2htmlEX
Poppler data-dir: /usr/local/share/pdf2htmlEX/poppler
Supported image format: png jpg svg
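Beyond checking the version, converting a real file is a more thorough test. The following is a minimal sketch, not an official procedure: smoke_test_pdf2htmlEX is a hypothetical wrapper, sample.pdf and the pdf2htmlEX-out directory are placeholder names, and the only pdf2htmlEX option used is --dest-dir, which sets the output directory.

```shell
# Convert one PDF as a smoke test; returns non-zero if a precondition fails.
# smoke_test_pdf2htmlEX is a hypothetical wrapper, not part of pdf2htmlEX.
smoke_test_pdf2htmlEX() {
    input="$1"
    outdir="${2:-./pdf2htmlEX-out}"   # placeholder default output directory

    if [ ! -f "$input" ]; then
        echo "input PDF not found: $input" >&2
        return 1
    fi
    if ! command -v pdf2htmlEX >/dev/null 2>&1; then
        echo "pdf2htmlEX is not on PATH" >&2
        return 1
    fi

    # Create the output directory first, then write the HTML into it.
    mkdir -p "$outdir" && pdf2htmlEX --dest-dir "$outdir" "$input"
}

# Example (sample.pdf is a placeholder for any PDF you have at hand):
# smoke_test_pdf2htmlEX sample.pdf
```

If the conversion succeeds, the output directory should contain an HTML file named after the input PDF, which you can open in a browser to compare against the original.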