
Alain Airom


A REST Implementation of ‘Docling’ with FastAPI

A sample application that exposes Docling as a REST service, using FastAPI in Python.


Introduction

This is yet another project request (much simplified here): a REST application, built with FastAPI, that uses Docling to convert PDF documents to Markdown.

First things first: in case you haven’t heard of it, what is Docling and what does it do? 😲😳

What is Docling

Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem.

Docling Features

  • 🗂️ Parsing of multiple document formats incl. PDF, DOCX, XLSX, HTML, images, and more
  • 📑 Advanced PDF understanding incl. page layout, reading order, table structure, code, formulas, image classification, and more
  • 🧬 Unified, expressive DoclingDocument representation format
  • ↪️ Various export formats and options, including Markdown, HTML, and lossless JSON
  • 🔒 Local execution capabilities for sensitive data and air-gapped environments
  • 🤖 Plug-and-play integrations incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
  • 🔍 Extensive OCR support for scanned PDFs and images
  • 💻 Simple and convenient CLI
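
To make the features above concrete, here is a minimal sketch of calling Docling directly from Python, with no web layer involved. It relies on `DocumentConverter.convert()` and on the `export_to_markdown()` and `export_to_dict()` methods of the resulting document. The helper name `convert_document` and its injectable `converter` parameter are my own assumptions for illustration; the latter simply makes the function easy to try out without loading Docling's models.

```python
from typing import Any, Optional


def convert_document(source: str, converter: Optional[Any] = None) -> dict:
    """Convert a document and return its Markdown and a JSON-ready dict.

    `source` is a local path or URL. `converter` is a hypothetical injection
    point: pass any object with a Docling-style `convert()` method, or leave
    it as None to build a default Docling DocumentConverter.
    """
    if converter is None:
        # Imported lazily so this module still loads where Docling is absent.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    result = converter.convert(source)
    document = result.document
    return {
        "markdown": document.export_to_markdown(),
        "json": document.export_to_dict(),
    }
```

With Docling installed, `convert_document("./docker-commands.pdf")["markdown"]` should give the same kind of Markdown that the REST service returns later in this article.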

Implementation

The first sample application is meant to run on a CPU-based platform.

The recommended steps are shown below.

  • Create a venv.
#macos/linux version
python3.11 -m venv myenv
source myenv/bin/activate
  • Install the required packages and dependencies.
pip install fastapi uvicorn docling python-multipart
  • Write the code! 😊
from pathlib import Path

from fastapi import FastAPI, UploadFile, File, HTTPException
from docling.document_converter import DocumentConverter

app = FastAPI()

# Build the converter once with Docling's default settings; creating it is
# expensive, so it should not be rebuilt on every request.
converter = DocumentConverter()

@app.post("/convert_pdf_to_markdown/")
async def convert_pdf_to_markdown(file: UploadFile = File(...)):
    if file.content_type != "application/pdf":
        raise HTTPException(status_code=400, detail="Invalid file type. Only PDF files are allowed.")

    # Keep only the final path component so a crafted filename cannot
    # point outside the working directory.
    safe_name = Path(file.filename or "upload.pdf").name
    temp_file_path = Path(f"./temp_{safe_name}")

    try:
        # Save the upload to disk so Docling can read it from a path.
        with open(temp_file_path, "wb") as f:
            f.write(await file.read())

        result = converter.convert(str(temp_file_path))
        markdown_output = result.document.export_to_markdown()

        return {"markdown": markdown_output}

    except Exception as e:
        raise HTTPException(status_code=500, detail=f"An error occurred during conversion: {str(e)}")

    finally:
        # Remove the temporary file whether or not the conversion succeeded.
        temp_file_path.unlink(missing_ok=True)
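The `content_type` check in the endpoint trusts a header sent by the client. As a hedged complement, the sketch below also looks at the first bytes of the upload, since PDF files begin with the marker `%PDF-`. The function name `looks_like_pdf` is hypothetical and is not part of FastAPI or Docling.

```python
PDF_SIGNATURE = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF file signature."""
    # Strict check: the marker must be the very first bytes of the payload.
    return data.startswith(PDF_SIGNATURE)
```

Inside the endpoint, one option would be to read the upload once (`contents = await file.read()`), reject it with a 400 error when `looks_like_pdf(contents)` is false, and write `contents` to the temporary file otherwise.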
  • Save the code as main.py, then run it in a terminal.
# uvicorn listens on port 8000 by default; pass --port to choose
# another one if it is already in use
uvicorn main:app --port 8080
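If you are not sure which port is available, a quick check with the standard library can help before starting the server. This is only a sketch: the helper names `port_is_free` and `first_free_port` and the candidate port list are my own assumptions, not something uvicorn provides.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts connections on host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, meaning
        # some process is already listening on that port.
        return sock.connect_ex((host, port)) != 0


def first_free_port(candidates=(8000, 8080, 8888)) -> int:
    """Return the first candidate port that is not in use."""
    for port in candidates:
        if port_is_free(port):
            return port
    raise RuntimeError("none of the candidate ports is free")
```

The value returned by `first_free_port()` can then be passed to `uvicorn main:app --port <value>`.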
  • Test your code in another Terminal (duh 🤓)
curl -X POST -F "file=@./docker-commands.pdf" http://127.0.0.1:8080/convert_pdf_to_markdown/
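The same request can be sent from Python instead of curl. The sketch below uses only the standard library and builds the multipart/form-data body by hand; the helper names `build_multipart` and `convert_via_rest` are hypothetical, and the URL assumes the service is running locally on port 8080 as in the step above.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, content: bytes,
                    content_type: str = "application/pdf") -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def convert_via_rest(
    pdf_path: str,
    url: str = "http://127.0.0.1:8080/convert_pdf_to_markdown/",
) -> str:
    """POST a PDF to the service and return the Markdown from its JSON reply."""
    path = Path(pdf_path)
    # The form field must be named "file" to match the endpoint's parameter.
    body, header = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["markdown"]
```

With the server running, `print(convert_via_rest("./docker-commands.pdf"))` would be the equivalent of the curl command.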
  • Sample output.
{"markdown":"## All Docker Commands\n\nHere's a comprehensive list of commonly used Docker commands, along with their usage:\n\n## Basic Docker Commands\n\n- 1. ocker version d Displays Docker version information. :\n- 2. ocker info d Provides detailed information about the   Docker installation. :\n- 3. ocker --help d Lists all available Docker commands   and options. :\n\n## mage Management Commands I\n\n- 1. ocker pull <image> d Downloads an image from a Docker   registry (e.g., Docker : Hub).\n- ○ Example: ocker pull nginx d 2. ocker images d Lists all Docker images available on   the system. : ○ Example: ocker images d 3. ocker rmi <image> d Deletes a Docker image from the   system. : ○ Example: ocker rmi nginx d 4. ocker build -t <name> <path> d Builds a Docker image   from a Dockerfile. : ○ Example: ocker build -t myapp:latest . d 5. ocker tag <source\\_image> <target\\_image> d Tags an   image with a new : name. ○ Example: ocker tag nginx:latest myrepo/nginx:v1 d 6. ocker save -o <file> <image> d Saves an image to a   tar archive. : ○ Example: ocker save -o nginx.tar nginx:latest d 7. ocker load -i <file> d Loads an image from a tar archive. :\n- ocker load -i nginx.tar\n- ○ Example: d\n\n## Container Management Commands\n\n- 1. ocker run <image> d Creates and starts a new container   from an image. :\n- ○ Example: d\n- 2. ocker run -d <image> d : background).\n- ○ Example: d\n- 3. ocker run -it <image> d Runs a container interactively with a terminal. :\n- ○ Example: d\n- 4. ocker ps d :\n- ○ Example: d\n- 5. ocker ps -a d :\n- ○ Example: ocker ps -a d\n- 6. ocker stop <container> d Stops a running container. :\n- ○ Example: ocker stop my\\_container d\n- 7. ocker start <container> d Starts a stopped container. :\n- ○ Example: ocker start my\\_container d\n- 8. ocker restart <container> d Restarts a container. :\n- ○ Example: ocker restart my\\_container d\n- 9. ocker rm <container> d Deletes a stopped container. 
:\n- ○ Example: ocker rm my\\_container d\n- 10. ocker exec -it <container> <command> d Executes a   command in a : unning container. r\n- ○ Example: ocker exec -it my\\_container bash d\n- 1. 1 ocker logs <container> d Displays logs from a container. :\n- ○ Example: ocker logs my\\_container d\n- 12. ocker attach <container> d Attaches to a running container's   console. :\n- ○ Example: ocker attach my\\_container d\n- 13.\n- ocker kill <container> d Forcefully stops a container. :\n- ○ Example: ocker kill my\\_container d\n\n```

\nocker run nginx Runs a container in detached   mode (in the ocker run -d nginx ocker run -it ubuntu bash Lists all running containers. ocker ps Lists all containers, including stopped   ones.\n

```\n\n## Container Networking Commands\n\n- 1. ocker network ls d Lists all Docker networks. :\n- ○ Example: ocker network ls d\n- 2. ocker network create <name> d Creates a new Docker   network. :\n- ○ Example: ocker network create my\\_network d\n- 3. ocker network rm <name> d Deletes a Docker network. :\n- ○ Example: ocker network rm my\\_network d\n- 4. ocker network connect <network> <container> d Connects   a container to : a network.\n- ○ Example: ocker network connect my\\_network my\\_container d\n- 5. ocker network disconnect <network> <container> d Disconnects   a : ontainer from a network. c\n- ○ Example: ocker network disconnect my\\_network my\\_container d\n\n## Volume Management Commands\n\n- 1. ocker volume ls d Lists all Docker volumes. :\n- ○ Example: ocker volume ls d\n- 2. ocker volume create <name> d Creates a new Docker volume. :\n- ○ Example: ocker volume create my\\_volume d\n- 3. ocker volume rm <name> d Deletes a Docker volume. :\n- ○ Example: ocker volume rm my\\_volume d\n- 4. ocker volume inspect <name> d Displays detailed information   about a volume. :\n- ○ Example: ocker volume inspect my\\_volume d 5. ocker run -v <volume>:/path/in/container <image> d Mounts a volume : nto a container. i ○ Example: ocker run -v my\\_volume:/data nginx d\n\n## Dockerfile Commands\n\n- 1. ocker build -f <Dockerfile> d Builds an image from   a specific Dockerfile. : ○ Example: ocker build -f Dockerfile . d\n\n## Docker Compose Commands\n\n- 1. ocker-compose up d Starts containers defined in a : ocker-compose.yml d ile. f\n- ocker-compose up\n- ○ Example: d\n- 2. ocker-compose down d Stops and removes containers,   networks, and volumes : reated by c ocker-compose up d .\n- ○ Example: ocker-compose down d\n- 3. ocker-compose ps d Lists containers created by Docker   Compose. :\n- ○ Example: ocker-compose ps d\n- 4. ocker-compose logs d Shows logs for containers managed   by Docker Compose. :\n- ○ Example: ocker-compose logs d\n- 5. 
ocker-compose build d Builds or rebuilds services   defined in a Compose file. :\n- ○ Example: ocker-compose build d\n\n## mage and Container Inspection I\n\n- 1. ocker inspect <container\\_or\\_image> d Returns low-level   information about : a container or image.\n- ○ Example: ocker inspect my\\_container d\n- 2. ocker top <container> d Displays running processes   in a container. :\n- ○ Example: ocker top my\\_container d\n- 3. ocker stats d Displays resource usage statistics of   running containers. :\n- ○ Example: ocker stats d\n\n## System Cleanup Commands\n\n- 1. ocker system df d :\n- ○ Example: d\n- 2. ocker system prune d : networks, dangling images).\n- ○ Example: d\n- 3. ocker image prune d :\n- ○ Example: d\n- 4. ocker container prune d Removes all stopped containers. :\n- ○\n- Example: d\n\n```

\nDisplays information about disk   usage by Docker. ocker system df Removes unused data (stopped containers, unused ocker system prune Removes unused and dangling images. ocker image prune\n

```\n\nocker container prune\n\n## Other Commands\n\n- 1. ocker commit <container> <image> d Creates a new image   from a : ontainer's changes. c\n- ○ Example: ocker commit my\\_container my\\_image d\n- 2. ocker export <container> d Exports a container's filesystem   to a tar archive. :\n- ○\n- Example: ocker export my\\_container > container.tar d\n- 3. ocker import <file> d Imports a tarball to create   an image. :\n- ○ Example: ocker import container.tar my\\_imported\\_image d\n\nThese Docker commands cover the most common activities when working with Docker, anging from managing containers, images, and volumes to orchestrating multi-container r applications with Docker Compose. Mastering these commands helps in efficiently creating, deploying, and managing containerized applications."}%            
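The reply is a single JSON object whose `markdown` field holds the whole document, with newlines escaped as `\n`. As a small sketch of post-processing it, the helpers below decode the reply and list the Markdown headings it contains. Both function names are hypothetical, and the heading detection is deliberately naive (it does not skip fenced code blocks).

```python
import json


def response_to_markdown(response_text: str) -> str:
    """Decode the service's JSON reply and return the Markdown string."""
    return json.loads(response_text)["markdown"]


def extract_headings(markdown: str) -> list[str]:
    """Return the text of every line that starts with a '#' heading marker."""
    return [
        line.lstrip("#").strip()
        for line in markdown.splitlines()
        if line.startswith("#")
    ]
```

Writing the decoded string to a file, for example with `Path("docker-commands.md").write_text(markdown, encoding="utf-8")`, gives a Markdown document that renders normally in any viewer.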

Et voilà!

Conclusion

This article showed a simple REST implementation of Docling's document conversion capabilities.

Stay tuned for more stories 🔜

