Joe Stech for AWS Community Builders

A serverless Python app to send public domain gutenberg.org ebooks to your Kindle with one click!

I wrote a little serverless web app that does the following:

  1. Downloads a page from gutenberg.org (Project Gutenberg is a non-profit project to provide free public domain ebooks to the public)
  2. If the URL is a book download page, adds a column to the "download this ebook" table containing a link and an email input field
  3. After the user puts their Kindle email in the input field and clicks the "send to kindle" link, the ebook is sent to the user's Kindle!

This was a super fun project that I'm going to walk you through from beginning to end. It uses AWS API Gateway, AWS Lambda, and AWS SES. Before we start digging into the details of how I built it, you can try it out by adding "joe@compellingsciencefiction.com" to your Kindle safe sender list and then appending any gutenberg.org path to https://sendtokindle.compellingsciencefiction.com, like this:

https://sendtokindle.compellingsciencefiction.com/ebooks/67368
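
Put another way, the path stays the same and only the host changes. A minimal sketch of that mapping, where `to_sendtokindle_url` is a hypothetical helper name used for illustration and not part of the app:

```python
from urllib.parse import urlsplit, urlunsplit

APP_HOST = "sendtokindle.compellingsciencefiction.com"


def to_sendtokindle_url(gutenberg_url: str) -> str:
    """Keep the path and query of a gutenberg.org URL, swap in the app's host."""
    parts = urlsplit(gutenberg_url)
    return urlunsplit(("https", APP_HOST, parts.path, parts.query, ""))


print(to_sendtokindle_url("https://www.gutenberg.org/ebooks/67368"))
```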

My little app (which is hosted at a subdomain of my science fiction site) will add the extra "send to kindle" column to the Gutenberg table. You just have to put your Kindle email address in the little box and click the send link.

Serverless architecture

[Diagram: architecture of the send to kindle application]

The architecture of the app is simple: an API Gateway endpoint that invokes a Lambda function. The Lambda downloads the requested ebook from Gutenberg, sends it to the specified email address, and saves some metadata in S3 (which book was requested and when). Here's the CDK code that generates the AWS resources:

from aws_cdk import (
aws_lambda as lambda_,
aws_apigateway as apigw,
aws_iam as iam,
aws_ecr as ecr,
aws_certificatemanager as acm,
aws_route53 as route53,
Duration,
Stack)
from constructs import Construct
import os

class SendToKindleStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # execution role
        lambda_role = iam.Role(self, id="sendtokindle-lambda",
            role_name='SendtokindleManagementRole',
            assumed_by=iam.ServicePrincipal("lambda.amazonaws.com"),
            managed_policies= [
                        iam.ManagedPolicy.from_aws_managed_policy_name("service-role/AWSLambdaVPCAccessExecutionRole"),
                        iam.ManagedPolicy.from_aws_managed_policy_name("service-role/AWSLambdaBasicExecutionRole"),
                        iam.ManagedPolicy.from_aws_managed_policy_name("AmazonS3FullAccess"),
                        iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSESFullAccess"),
                    ]
        )

        repo = ecr.Repository.from_repository_name(self, "SendToKindleRepo", "sendtokindlerepo")

        sendtokindle_management_lambda = lambda_.DockerImageFunction(self,
            "CSFsendtokindleManagementLambda",
            code=lambda_.DockerImageCode.from_ecr(
                repository=repo,
                tag=os.environ["CDK_DOCKER_TAG"]
                ),
            role=lambda_role,
            timeout=Duration.seconds(30)
        )

        api = apigw.LambdaRestApi(self,
            "csf-sendtokindle-management-endpoint",
            handler=sendtokindle_management_lambda,
            default_cors_preflight_options=apigw.CorsOptions(allow_origins=["*"])
        )

        custom_domain = apigw.DomainName(
            self,
            "custom-domain",
            domain_name="sendtokindle.compellingsciencefiction.com",
            certificate=acm.Certificate.from_certificate_arn(self,'cert',<cert arn str>),
            endpoint_type=apigw.EndpointType.EDGE
        )

        apigw.BasePathMapping(
            self,
            "base-path-mapping",
            domain_name=custom_domain,
            rest_api=api
        )

        hosted_zone = route53.HostedZone.from_hosted_zone_attributes(
            self,
            "hosted-zone",
            hosted_zone_id=<zone id str>,
            zone_name="compellingsciencefiction.com"
        )

        route53.CnameRecord(
            self,
            "cname",
            zone=hosted_zone,
            record_name="sendtokindle",
            domain_name=custom_domain.domain_name_alias_domain_name
        )

I've put placeholders in a couple of these fields for privacy, but otherwise this is the CDK code I deployed, verbatim.

As you can see, the longest part of this IaC is the DNS code! If you don't care about a custom domain and just want to use the API Gateway courtesy domain, you don't even need anything past the Gateway specification.

The Lambda Python code

This Lambda pulls double-duty: it returns the gutenberg.org HTML (with a modified table), and it also emails the ebook to the Kindle address when the request path points at an EPUB3 download (a path containing "epub3.image"). There are many improvements that could be made, but this quick and dirty Lambda works perfectly for what I need.

I'd like to point out a few fun things about this code:

  1. It downloads the ebook to an in-memory BytesIO object instead of a file, and then builds a MIME attachment with the in-memory file.
  2. The code isn't very robust about how it differentiates between calls -- there are a lot of ways to break this Lambda, in which case it'll just return a generic 502 server error.
  3. The code uses AWS SES to send emails via boto3. Be careful when using SES this way: it's definitely possible for someone to try to run up a big bill on my AWS account by spamming SES sends. You can set up limits on SES (and API Gateway) to mitigate attacks like this.
  4. I use BeautifulSoup to navigate the gutenberg.org HTML and add my column to the download table. I had to inspect the gutenberg.org HTML to find the correct place to insert my column.

from bs4 import BeautifulSoup
import requests
import uuid
import json
import boto3
import time
from email import encoders
from email.mime.base import MIMEBase
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from io import BytesIO
from urllib.request import urlopen
import urllib

BUCKET = "S3 bucket name str"

def response(code, body):
    return {
            'statusCode': code,
            'headers': {
                'Access-Control-Allow-Headers': 'Content-Type',
                'Access-Control-Allow-Origin': '*',
                'Access-Control-Allow-Methods': 'OPTIONS,POST,GET',
                'Content-Type': 'application/json',
            },
            'body': body
        }


def send_ebook(url, filename, email):
    epubfile = BytesIO()
    print(url)
    try:
        with urlopen(url, timeout=10) as connection:
            epubfile.write(connection.read())
    except urllib.error.HTTPError as e:
        print(e.read().decode())
        # don't email an empty attachment if the download failed
        raise
    from_email = "joe@compellingsciencefiction.com"
    to_email = email
    msg = MIMEMultipart()
    msg["Subject"] = "gutenberg ebook!"
    msg["From"] = from_email
    msg["To"] = to_email

    # Set message body
    body = MIMEText("your book!", "plain")
    msg.attach(body)

    epubfile.seek(0)
    part = MIMEApplication(epubfile.read())
    part.add_header("Content-Disposition",
                    "attachment",
                    filename=filename)
    msg.attach(part)

    # Convert message to string and send
    ses_client = boto3.client("ses", region_name="us-west-2")
    # named ses_response so it doesn't shadow the response() helper above
    ses_response = ses_client.send_raw_email(
        Source=from_email,
        Destinations=[to_email],
        RawMessage={"Data": msg.as_string()}
    )
    print(ses_response)


def handler(event, context):
    print(event)
    try:
        print(event['path'])
    except KeyError:
        pass

    if "epub3.image" in event['path']:
        # this is a request to send an ebook
        path = event['path']
        email = event['queryStringParameters']['email']
        filename = path.replace("/ebooks/","").replace("3.images","")
        send_ebook(f"https://www.gutenberg.org/ebooks{path}", filename, email)
        client = boto3.client('s3')
        upload_id = uuid.uuid4().hex
        payload = {
            "timestamp": time.time(),
            "book_url": path,
            "user_email": email
        }
        client.put_object(Bucket=BUCKET,Key=f'{upload_id}.json', Body=json.dumps(payload).encode('utf-8'))
        # json.dumps escapes quotes in path/email so the body is always valid JSON
        return response(200, json.dumps({"status": f"Sent {path} to {email}!"}))
    else:
        # return the gutenberg html with added column
        # event['path'] already starts with "/", so don't add another one
        r = requests.get(f"https://www.gutenberg.org{event['path']}")
        print(r.status_code)
        soup = BeautifulSoup(r.text.replace('"/','"https://www.gutenberg.org/').replace("https://www.gutenberg.org/ebooks","https://sendtokindle.compellingsciencefiction.com/ebooks"),features="html.parser")

        trs = soup.find_all("tr")
        for tr in trs:
            about = tr.get('about')
            if about and 'epub3' in about:
                print(tr)
                epubpath = f'{about.split("ebooks")[1]}'
                # pass the parser explicitly, as in the other BeautifulSoup calls
                soup.append(BeautifulSoup(f"""
                <script>
                function buildlink() {{
                  let text = document.getElementById("kindleemail").value;
                  document.getElementById("csflink").href = "{epubpath}?email=" + text;
                }}
                </script>
                """, "html.parser"))
                tr.append(BeautifulSoup(f"<td><a id='csflink' href='/{epubpath}'>Send<br>to<br>kindle<br>email:</a><br><input type='text' id='kindleemail' oninput='buildlink()'></td>", "html.parser"))
            else:
                tr.append(BeautifulSoup("<td class='noprint'></td>", "html.parser"))
        return {
            'statusCode': 200,
            'headers': {
                'Access-Control-Allow-Headers': 'Content-Type',
                'Access-Control-Allow-Origin': '*',
                'Access-Control-Allow-Methods': 'OPTIONS,POST,GET',
                'Content-Type': 'text/html;charset=utf-8',
            },
            'body': soup.prettify()
        }
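
The in-memory attachment trick (item 1 in the list above) can be tried on its own, without AWS. This is a minimal sketch rather than the deployed code: `build_message` is a hypothetical helper, and the payload bytes and addresses are placeholders.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from io import BytesIO


def build_message(payload: bytes, filename: str, sender: str, recipient: str) -> MIMEMultipart:
    """Build a multipart email whose attachment never touches the filesystem."""
    buffer = BytesIO()
    buffer.write(payload)  # in the Lambda, these are the downloaded ebook bytes
    buffer.seek(0)         # rewind before reading the bytes back out

    msg = MIMEMultipart()
    msg["Subject"] = "gutenberg ebook!"
    msg["From"] = sender
    msg["To"] = recipient
    msg.attach(MIMEText("your book!", "plain"))

    part = MIMEApplication(buffer.read())
    part.add_header("Content-Disposition", "attachment", filename=filename)
    msg.attach(part)
    return msg


# Placeholder bytes and addresses, purely for illustration.
msg = build_message(b"not a real epub", "67368.epub",
                    "sender@example.com", "reader@example.com")
attachment = msg.get_payload()[1]
print(attachment.get_filename())
print(len(attachment.get_payload(decode=True)), "bytes attached")
```

Calling `msg.as_string()` on the result gives the raw message that the Lambda hands to `send_raw_email`.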
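
On item 2 in the list above: one way to make the call differentiation less fragile is to match the whole path instead of checking for a substring. The sketch below is an assumption-laden illustration, not what is deployed. `classify_request` is a hypothetical name, and the path shape (`/67368.epub3.images`, optionally prefixed with `/ebooks`) is assumed from the Gutenberg download links.

```python
import re

# Assumed shape of a "send this ebook" path, e.g. /67368.epub3.images
# (an optional /ebooks prefix is tolerated).
EPUB_PATH = re.compile(r"^(?:/ebooks)?/(\d+)\.epub3\.images$")


def classify_request(path, query):
    """Decide what a request is asking for, based on its path and query string.

    `query` may be None: API Gateway passes None when there is no query string.
    """
    match = EPUB_PATH.match(path)
    if match:
        email = (query or {}).get("email")
        if not email:
            return "bad_request", {"reason": "missing email"}
        book_id = match.group(1)
        return "send", {
            "url": f"https://www.gutenberg.org/ebooks/{book_id}.epub3.images",
            "filename": f"{book_id}.epub",
            "email": email,
        }
    # Anything else is treated as a page to fetch from gutenberg.org.
    return "page", {"url": f"https://www.gutenberg.org{path}"}


print(classify_request("/67368.epub3.images", {"email": "reader@example.com"}))
print(classify_request("/ebooks/67368", None))
```

With something like this, the handler could return a 400 for a malformed send request instead of falling through to the generic 502.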

If any of this inspires you to create your own little utility web app, please tell me about it (you have my compellingsciencefiction.com email address now :). I love hearing about projects like this.
