Abhiraj Adhikary

📊 Dropbox User Sentiment Analysis using Airbyte 🪼 and Motherduck 🦆

In this blog, we'll dive 🌊🤿 into building a Streamlit-based dashboard that analyzes Dropbox user sentiment, using Airbyte for data extraction and Motherduck (DuckDB) for storage and querying. This post continues from our previous discussion in "Leveraging Airbyte 🪼 and Motherduck 🦆 for Sentiment Analysis" and explores how these technologies integrate with Streamlit to create an interactive data analysis application.


📁 Folder Structure Overview

DROPBOX-REVIEWS-ANALYSIS
├── .devcontainer
│   ├── devcontainer.json
├── .streamlit
│   ├── config.toml
├── assets
│   ├── main.png
├── dropbox-reviews-analytics
│   ├── src
│   │   ├── config
│   │   │   ├── __init__.py
│   │   │   ├── config.py
│   │   ├── utils
│   │   │   ├── __init__.py
│   │   │   ├── database.py
│   │   ├── app.py
├── .env
├── venv
├── .gitignore
├── LICENSE.md
├── README.md
├── requirements.txt
  • .devcontainer/devcontainer.json: Configures development environment.
  • .streamlit/config.toml: Streamlit's UI style and configuration.
  • assets: Stores static assets like images.
  • src/config/config.py: Handles environment variables.
  • src/utils/database.py: Queries data from Motherduck.
  • src/app.py: Streamlit dashboard and logic.
  • .env: Stores environment variables securely.

👉 Tips On Folder Structure

  • DROPBOX-REVIEWS-ANALYSIS: This is the outer folder of the repo on GitHub (don't create this folder when building the project on your own).
  • dropbox-reviews-analytics: This is the main folder, where you add src, followed by config and utils inside it.

🎏 Streamlit Setup

Streamlit is an open-source Python library that enables developers to create interactive web apps for data science and machine learning projects.

📜 Code Snippet: Streamlit Core Structure

import streamlit as st
import plotly.express as px
from utils.database import get_reviews_for_sentiment

st.set_page_config(page_title="Dropbox Analysis", page_icon="🗳️", layout="wide")

# Title
st.markdown("## 🗳️ Dropbox Sentiment Analysis")

# Sidebar
sentiment_type = st.sidebar.selectbox("Sentiment Analysis Type", ["Polarity", "Subjectivity"])

# Fetch and Display Data
reviews_df = get_reviews_for_sentiment()
st.dataframe(reviews_df)

When you run app.py with streamlit run src/app.py, the dashboard launches at http://localhost:8501.


📊 Core Logic of Sentiment Analysis

Sentiment analysis is powered by TextBlob to determine the polarity (positive/negative sentiment) and subjectivity (factual/opinionated content) of reviews.

🧠 Sentiment Analysis Function

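from textblob import TextBlob

def get_sentiment(text):
    # sentiment_type is the sidebar selection defined in app.py
    blob = TextBlob(str(text))
    # polarity ranges from -1.0 (negative) to 1.0 (positive);
    # subjectivity ranges from 0.0 (factual) to 1.0 (opinionated)
    return blob.sentiment.polarity if sentiment_type == "Polarity" else blob.sentiment.subjectivity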

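A raw polarity number can be hard to read at a glance. As a minimal sketch, assuming TextBlob's documented polarity range of -1.0 to 1.0, a score could be bucketed into coarse labels before it is shown. The function name label_polarity and the 0.1 threshold are illustrative assumptions and are not part of the original project.

```python
def label_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label.

    The threshold is a hypothetical cut-off: scores within
    [-threshold, threshold] are treated as neutral.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError(f"polarity out of range: {polarity}")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"
```

Applied to a column of polarity scores, this would give a categorical field that could be counted or used to color a chart; the right threshold depends on the review data and should be tuned rather than taken from this sketch.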

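📈 Visualization Example

# Score every review, then plot the distribution of the chosen metric
reviews_df['sentiment'] = reviews_df['content'].apply(get_sentiment)
fig = px.histogram(reviews_df, x='sentiment', title='Sentiment Distribution')
st.plotly_chart(fig)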


🦆 Database Integration with Motherduck

🔗 database.py

import duckdb
from config.config import MOTHERDUCK_TOKEN

def get_connection():
    # Open a Motherduck connection using the token loaded in config.py
    return duckdb.connect(f"md:?token={MOTHERDUCK_TOKEN}")

def get_reviews_for_sentiment():
    # Return review text and score as a DataFrame, skipping rows with no text
    conn = get_connection()
    query = """
    SELECT content, score FROM dropbox_reviews WHERE content IS NOT NULL
    """
    return conn.execute(query).fetch_df()


This code fetches the Dropbox review data from Motherduck. It reads MOTHERDUCK_TOKEN from config.py, which loads it from .env, so the token never appears in the source code.

🗂️ config.py

import os
from dotenv import load_dotenv

load_dotenv()
MOTHERDUCK_TOKEN = os.getenv("MOTHERDUCK_TOKEN")


🔄 Connection Between app.py and database.py

app.py imports get_reviews_for_sentiment from database.py, so the query results flow directly into the dashboard as a DataFrame.


⚙️ Why devcontainer.json and config.toml?

  • devcontainer.json: Provides a consistent development environment; anyone who uses Docker for containerization can open the project in it.
  • config.toml: Controls Streamlit UI customization (e.g., colors, fonts, themes).

Example config.toml:

[theme]
primaryColor="#0061FE"
backgroundColor="#0E1117"
secondaryBackgroundColor="#262730"
textColor="#FAFAFA"
font="Monospace"

⚠️ Deployment Challenges

  • Avoid pinning exact library versions in requirements.txt; for example, write plotly instead of plotly==5.24.1.
  • Ensure .env is configured correctly in deployment environments.
  • Validate database connection tokens during runtime.
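One way to validate the token at runtime is to fail fast when the variable is missing or empty, instead of letting the database connection fail later with a less obvious error. The helper below is a hypothetical sketch: require_env is not part of the project, and it uses only the standard library.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of the original project.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

In config.py this could stand in for the bare os.getenv call, for example MOTHERDUCK_TOKEN = require_env("MOTHERDUCK_TOKEN"), placed after load_dotenv() has run. Note that this only checks that a token is present; whether Motherduck accepts it is still only known once the connection is attempted.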

Backend Deployment Flow:

  1. Load environment variables from .env.
  2. Establish connection with Motherduck DB.
  3. Fetch and process data.
  4. Render dashboard in Streamlit.
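The four steps above can be sketched as a single function that takes each stage as a callable. This is only an illustration of the ordering, under the assumption that rows arrive as (content, score) pairs like the columns selected in database.py; run_flow and its parameters are hypothetical names, and the real app wires these stages together through Streamlit and DuckDB instead.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

# (content, score), mirroring the columns selected in database.py
Row = Tuple[Optional[str], int]

def run_flow(
    load_token: Callable[[], Optional[str]],
    connect: Callable[[str], Any],
    fetch: Callable[[Any], Sequence[Row]],
    render: Callable[[List[Row]], None],
) -> int:
    """Run the four stages in order and return the number of rows rendered."""
    token = load_token()                 # 1. load environment variables
    if not token:
        raise RuntimeError("MOTHERDUCK_TOKEN is not set")
    conn = connect(token)                # 2. establish the database connection
    rows = [                             # 3. fetch, then drop reviews with no text
        row for row in fetch(conn)
        if row[0] and row[0].strip()
    ]
    render(rows)                         # 4. hand the data to the dashboard
    return len(rows)
```

Passing the stages in as arguments keeps the sketch runnable without a Motherduck account: each stage can be replaced by a stub, which is also a convenient way to check the ordering before deploying.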

🎯 Conclusion

We successfully built a Dropbox Reviews Sentiment Analysis Dashboard using Airbyte, Motherduck, and Streamlit. This project demonstrates the power of data analysis and visualization.

👨‍💻 Check out the complete code on GitHub.
📺 Live project: https://airbyte-motherduck-hackathon-sentiment-analysis.streamlit.app

Sentiment Analysis #Happy Coding! #Airbyte🪼🦆