In this post, we'll dive 🤿 into building a Streamlit-based dashboard for analyzing Dropbox user sentiment, using Airbyte for data extraction and Motherduck (DuckDB) for storage and querying. It continues our previous post, "Leveraging Airbyte 🪼 and Motherduck 🦆 for Sentiment Analysis", and explores how these technologies integrate with Streamlit to create an interactive data analysis application.
Folder Structure Overview
DROPBOX-REVIEWS-ANALYSIS
├── .devcontainer
│   └── devcontainer.json
├── .streamlit
│   └── config.toml
├── assets
│   └── main.png
├── dropbox-reviews-analytics
│   └── src
│       ├── config
│       │   ├── __init__.py
│       │   └── config.py
│       ├── utils
│       │   ├── __init__.py
│       │   └── database.py
│       └── app.py
├── .env
├── venv
├── .gitignore
├── LICENSE.md
├── README.md
└── requirements.txt
- .devcontainer/devcontainer.json: Configures development environment.
- .streamlit/config.toml: Streamlit's UI style and configuration.
- assets: Stores static assets like images.
- src/config/config.py: Handles environment variables.
- src/utils/database.py: Queries data from Motherduck.
- src/app.py: Streamlit dashboard and logic.
- .env: Stores environment variables such as MOTHERDUCK_TOKEN; keep it out of version control.
Tips on Folder Structure
- DROPBOX-REVIEWS-ANALYSIS: This is the outer folder of the repo on GitHub. If you are building the project on your own, you don't need to create it.
- dropbox-reviews-analytics: This is the main project folder. Create src inside it, then config and utils inside src.
Streamlit Setup
Streamlit is an open-source Python library that enables developers to create interactive web apps for data science and machine learning projects.
Code Snippet: Streamlit Core Structure
import streamlit as st
import plotly.express as px
from utils.database import get_reviews_for_sentiment
st.set_page_config(page_title="Dropbox Analysis", page_icon="🗳️", layout="wide")
# Title
st.markdown("## 🗳️ Dropbox Sentiment Analysis")
# Sidebar
sentiment_type = st.sidebar.selectbox("Sentiment Analysis Type", ["Polarity", "Subjectivity"])
# Fetch and Display Data
reviews_df = get_reviews_for_sentiment()
st.dataframe(reviews_df)
Run the app with the command streamlit run src/app.py, and the dashboard launches at http://localhost:8501.
Core Logic of Sentiment Analysis
Sentiment analysis is powered by TextBlob, which scores each review for polarity (from -1.0 for negative to 1.0 for positive sentiment) and subjectivity (from 0.0 for factual to 1.0 for opinionated content).
Sentiment Analysis Function
from textblob import TextBlob

def get_sentiment(text):
    # sentiment_type is the value picked in the sidebar selectbox in app.py
    blob = TextBlob(str(text))
    return blob.sentiment.polarity if sentiment_type == "Polarity" else blob.sentiment.subjectivity
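Raw polarity scores are easier to read on a dashboard once they are grouped into labels. The snippet below is a minimal sketch of that idea and is not part of the original project: the helper name label_polarity and the 0.1 cut-off are assumptions chosen for illustration. It relies only on the fact that TextBlob polarity lies between -1.0 and 1.0.

```python
def label_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a TextBlob-style polarity in [-1.0, 1.0] to a coarse label.

    The threshold is an illustrative assumption, not a TextBlob default.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError(f"polarity out of range: {polarity}")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Example scores such as get_sentiment() might return in "Polarity" mode.
labels = [label_polarity(p) for p in (0.8, 0.05, -0.4)]
print(labels)
```

In the app, a helper like this could be applied to the polarity column of the reviews DataFrame when the sidebar is set to Polarity; it makes little sense for subjectivity scores.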
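Visualization Example
# Score every review first so the 'sentiment' column exists
reviews_df['sentiment'] = reviews_df['content'].apply(get_sentiment)
fig = px.histogram(reviews_df, x='sentiment', title='Sentiment Distribution')
st.plotly_chart(fig)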
🦆 Database Integration with Motherduck
database.py
import duckdb
from config.config import MOTHERDUCK_TOKEN

def get_connection():
    return duckdb.connect(f"md:?token={MOTHERDUCK_TOKEN}")

def get_reviews_for_sentiment():
    conn = get_connection()
    query = """
        SELECT content, score FROM dropbox_reviews WHERE content IS NOT NULL
    """
    return conn.execute(query).fetch_df()
This code fetches Dropbox review data using MOTHERDUCK_TOKEN, which is stored in .env and loaded through config.py.
config.py
import os
from dotenv import load_dotenv
load_dotenv()
MOTHERDUCK_TOKEN = os.getenv("MOTHERDUCK_TOKEN")
Connection Between app.py and database.py
app.py imports get_reviews_for_sentiment from database.py, so the query results flow straight into the dashboard as a DataFrame.
Why devcontainer.json and config.toml?
- devcontainer.json: Provides a consistent development environment. Anyone who wants to use Docker for containerization can use it.
- config.toml: Controls Streamlit UI customization (e.g., colors, fonts, themes).
Example config.toml:
[theme]
primaryColor="#0061FE"
backgroundColor="#0E1117"
secondaryBackgroundColor="#262730"
textColor="#FAFAFA"
font="monospace"
⚠️ Deployment Challenges
- Avoid pinning exact library versions in requirements.txt; for example, write plotly instead of plotly==5.24.1.
- Ensure .env is configured correctly in deployment environments.
- Validate database connection tokens at runtime.
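The last point above, validating the token at runtime, can be sketched as a small guard that fails fast with a readable message instead of a connection error deep inside DuckDB. This is an illustrative sketch: require_token is a hypothetical helper, and only the variable name MOTHERDUCK_TOKEN comes from the project. It checks that a value is present, not that Motherduck accepts it.

```python
import os


def require_token(name: str = "MOTHERDUCK_TOKEN") -> str:
    """Return the named environment variable or raise a clear error.

    Only presence is checked; whether the token is accepted is decided
    later, when the connection is opened.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to .env locally or to the "
            "deployment platform's environment settings."
        )
    return value


# Demonstration with a placeholder value (not a real token).
os.environ["MOTHERDUCK_TOKEN"] = "example-token"
token = require_token()
print(f"token loaded ({len(token)} characters)")
```

In config.py, a call like this would come after load_dotenv(), so that values from .env are already visible in the environment.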
Backend Deployment Flow:
- Load environment variables from .env.
- Establish connection with Motherduck DB.
- Fetch and process data.
- Render dashboard in Streamlit.
🎯 Conclusion
We built a Dropbox reviews sentiment analysis dashboard using Airbyte, Motherduck, and Streamlit. The project shows how review data extracted with Airbyte and stored in Motherduck can be scored with TextBlob and explored interactively in Streamlit.
👨‍💻 Check out the complete code on GitHub.
Live project: https://airbyte-motherduck-hackathon-sentiment-analysis.streamlit.app