DEV Community

chuongmep
chuongmep

Posted on

Backup Posts Dev.to to SQLite

Dev.to is a great platform for writing blog articles and sharing knowledge. However, I’m concerned that important posts might be deleted at any time due to account-related issues or other reasons beyond our control. Therefore, it’s a good idea to keep backups of all the posts we create to ensure they are preserved.

Good Things

  • You don't need to worries about your post will lost
  • You can use you data for analyst and explore
  • you can use backup data to build another blog or for another purpose.
  • dev.to as a cms system ...

Requirment

First at all, you need to import the library and api able to use API, the API you can generate at Settings > Extensions > Generate a new Key

Image description

Some import library for python:

import requests
import os
import json
import sqlite3
api_key = os.environ['DEVTOAPI']
Enter fullscreen mode Exit fullscreen mode

Backup Acticles To Markdown

def get_posts_response_data(api_key):
    # URL of the API endpoint
    url = "https://dev.to/api/articles/me/published"

    # Headers for the request
    headers = {
        "Content-Type": "application/json",
        "api-key": api_key
    }

    # Send GET request
    response = requests.get(url, headers=headers)

    # Check if request was successful
    if response.status_code == 200:
        # Parse JSON response
        response_data = response.json()
        return response_data
    else:
        # If request was unsuccessful, print error message
        print("Error:", response.text)

def save_dev_post_to_markdown(response,markdown_file_root_path,overwrite=False):
    for article in response:
        published_at = article['published_at']
        title = article['title']
        full_title = "{}-{}".format(published_at.split('T')[0],title)
        markdown_content = article['body_markdown']
        if overwrite == False and os.path.exists('{}/{}.md'.format(markdown_file_root_path,full_title)) == True:
            continue
        if os.path.exists(markdown_file_root_path) == False:
            os.mkdir(markdown_file_root_path)
        if '/' in title:
            title = title.replace('/', '-')
        with open('{}/{}.md'.format(markdown_file_root_path,full_title), 'w',encoding='utf-8') as f:
            f.write(markdown_content)
            print("File saved as {}.md".format(title))
Enter fullscreen mode Exit fullscreen mode

Now you can run function to generate result markdown to storage in folder acticles

# run the function
response = get_posts_response_data(api_key)
save_dev_post_to_markdown(response, 'acticles')
Enter fullscreen mode Exit fullscreen mode

This is result, markdown files generate with datetime-title and save into folder acticles created
Image description

Backup Acticles to SQLite

def insert_article(data,sql_path):
    # Connect to the SQLite database (or create it if it doesn't exist)
    conn = sqlite3.connect(sql_path)
    cursor = conn.cursor()

    # Create the table if it doesn't already exist
    cursor.execute('''
    CREATE TABLE IF NOT EXISTS articles (
        id INTEGER PRIMARY KEY,
        type_of TEXT,
        title TEXT,
        description TEXT,
        published BOOLEAN,
        published_at TEXT,
        slug TEXT,
        path TEXT,
        url TEXT,
        comments_count INTEGER,
        public_reactions_count INTEGER,
        page_views_count INTEGER,
        published_timestamp TEXT,
        body_markdown TEXT,
        positive_reactions_count INTEGER,
        cover_image TEXT,
        tag_list TEXT,
        canonical_url TEXT,
        reading_time_minutes INTEGER,
        user_name TEXT,
        user_username TEXT,
        user_github_username TEXT,
        user_website_url TEXT,
        user_profile_image TEXT
    )
    ''')

    # Prepare the data to insert
    article = {
        "type_of": data.get("type_of"),
        "id": data.get("id"),
        "title": data.get("title"),
        "description": data.get("description"),
        "published": data.get("published"),
        "published_at": data.get("published_at"),
        "slug": data.get("slug"),
        "path": data.get("path"),
        "url": data.get("url"),
        "comments_count": data.get("comments_count"),
        "public_reactions_count": data.get("public_reactions_count"),
        "page_views_count": data.get("page_views_count"),
        "published_timestamp": data.get("published_timestamp"),
        "body_markdown": data.get("body_markdown"),
        "positive_reactions_count": data.get("positive_reactions_count"),
        "cover_image": data.get("cover_image"),
        "tag_list": json.dumps(data.get("tag_list", [])),  # Store tags as JSON string
        "canonical_url": data.get("canonical_url"),
        "reading_time_minutes": data.get("reading_time_minutes"),
        "user_name": data["user"].get("name"),
        "user_username": data["user"].get("username"),
        "user_github_username": data["user"].get("github_username"),
        "user_website_url": data["user"].get("website_url"),
        "user_profile_image": data["user"].get("profile_image")
    }

    # Insert data into the table
    cursor.execute('''
        INSERT OR REPLACE INTO articles (
            id, type_of, title, description, published, published_at, slug, path, url,
            comments_count, public_reactions_count, page_views_count, published_timestamp, 
            body_markdown, positive_reactions_count, cover_image, tag_list, canonical_url, 
            reading_time_minutes, user_name, user_username, user_github_username, 
            user_website_url, user_profile_image
        ) VALUES (
            :id, :type_of, :title, :description, :published, :published_at, :slug, :path, :url, 
            :comments_count, :public_reactions_count, :page_views_count, :published_timestamp, 
            :body_markdown, :positive_reactions_count, :cover_image, :tag_list, :canonical_url, 
            :reading_time_minutes, :user_name, :user_username, :user_github_username, 
            :user_website_url, :user_profile_image
        )
    ''', article)

    # Commit the transaction and close the connection
    conn.commit()
    conn.close()
def insert_articles(response,sql_path):
    for article in response:
        insert_article(article,sql_path)
Enter fullscreen mode Exit fullscreen mode

And you also do same with generate markdown by execute function to save to sqlite :

# Save the response data to a SQLite database
sql_path = './database/articles.db'
insert_articles(response, sql_path)
Enter fullscreen mode Exit fullscreen mode

Result

Image description

Save Comments to SQLite


def create_comments_db(db_path):
    # Connect to the comments database (comments.db)
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    # Create the comments table if it doesn't exist
    cursor.execute('''
    CREATE TABLE IF NOT EXISTS comments (
        id_code TEXT PRIMARY KEY,
        article_id INTEGER,
        created_at TEXT,
        body_html TEXT,
        user_name TEXT,
        user_username TEXT,
        user_github_username TEXT,
        user_profile_image TEXT,
        children TEXT,
        FOREIGN KEY (article_id) REFERENCES articles (id)
    )
    ''')

    # Commit the changes and close the connection
    conn.commit()
    conn.close()

def insert_comment(data, article_id,sql_path):
    # Connect to the comments database
    conn = sqlite3.connect(sql_path)
    cursor = conn.cursor()

    # Prepare the data to insert
    comment = {
        "id_code": data.get("id_code"),
        "article_id": article_id,  # This links the comment to an article
        "created_at": data.get("created_at"),
        "body_html": data.get("body_html"),
        "user_name": data["user"].get("name"),
        "user_username": data["user"].get("username"),
        "user_github_username": data["user"].get("github_username"),
        "user_profile_image": data["user"].get("profile_image"),
        "children": json.dumps(data.get("children", []))  # Store children as a JSON string
    }

    # Insert comment into the table
    cursor.execute('''
        INSERT OR REPLACE INTO comments (
            id_code, article_id, created_at, body_html, user_name, user_username, 
            user_github_username, user_profile_image, children
        ) VALUES (
            :id_code, :article_id, :created_at, :body_html, :user_name, :user_username, 
            :user_github_username, :user_profile_image, :children
        )
    ''', comment)

    # Commit the changes and close the connection
    conn.commit()
    conn.close()
# Assuming article_id is 2222614 (use the correct article ID)
def get_all_acticle_ids(sql_path):
    conn = sqlite3.connect(sql_path)
    cursor = conn.cursor()
    cursor.execute('SELECT id FROM articles')
    article_ids = cursor.fetchall()
    conn.close()
    return article_ids

def patch_insert_comments(sql_acticles_path,sql_comments_path):
    article_ids = get_all_acticle_ids(sql_acticles_path)
    for article_id in article_ids:
        url = f'https://dev.to/api/comments?a_id={article_id[0]}'
        response = requests.get(url)
        comments = response.json()
        for comment in comments:
            insert_comment(comment, article_id[0],sql_comments_path)
Enter fullscreen mode Exit fullscreen mode

Now you also able to save into sqlite by execute the funtion :

# path insert comments
sql_comments_path = './database/comments.db'
create_comments_db(sql_comments_path)
patch_insert_comments(sql_path,sql_comments_path)
Enter fullscreen mode Exit fullscreen mode

So finally you can see beautiful result like this :

Image description

Backup Automation :

You can create notebook to run pipline automation :

name: Run Jupyter Notebooks Weekly

on:
  schedule:
    - cron: '0 0 * * 0'  # Runs every Sunday at 00:00 UTC
  workflow_dispatch: # Manual trigger

jobs:
  run-notebooks:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: 3.9

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install notebook nbconvert nbformat

    - name: Run Jupyter Notebooks
      env:  # Add your environment variables here
        DEVTOAPI: ${{ secrets.DEVTOAPI }}
      run: |
        for notebook in $(find . -name "*.ipynb"); do
          echo "Running $notebook"
          jupyter nbconvert --to notebook --execute "$notebook" --output "$notebook"
        done
Enter fullscreen mode Exit fullscreen mode

Cheers !!

Open Source :

Top comments (0)