Level09

Flask Apps from the Future: Unleash Semantic Search and RAG in 10 Minutes or Less

In the era of AI-powered applications, the ability to search and retrieve information based on meaning rather than just keywords has become a game-changer. Enter Retrieval-Augmented Generation (RAG), a powerful approach that combines the precision of information retrieval with the flexibility of generative AI. Today, I'm excited to share how you can implement a complete RAG system in just 15 minutes using Enferno, a modern Flask framework.

The Magic of Semantic Search

Before diving into the implementation, let's understand why semantic search is revolutionary. Traditional keyword search is like looking for a book in a library using only the title—if the exact words aren't there, you're out of luck. Semantic search, on the other hand, understands the meaning behind your query, making it possible to find relevant information even when the exact keywords aren't present.

Consider these examples from our demo application:

  • Query: "animals with amazing abilities"
    • Result: Documents about cats' sleeping patterns and communication methods
    • No mention of "amazing abilities" in the documents, but the semantic connection is made
  • Query: "space phenomena that defy intuition"
    • Result: Information about Venus's day being longer than its year and Saturn's density
    • The system understands the conceptual link between "defying intuition" and unusual space facts

This is the power of vector embeddings—transforming text into mathematical representations that capture meaning, not just words.
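
To make that concrete, here is a minimal sketch of how similarity between embeddings can be scored. The three-dimensional vectors are made up for illustration (a real model produces hundreds of dimensions from the text itself), and the function name is hypothetical; only the cosine formula is standard.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; a real model would derive these from the text.
documents = {
    "Cats' sleeping patterns": [0.9, 0.1, 0.0],
    "Venus's day is longer than its year": [0.1, 0.9, 0.1],
}
query_vector = [0.8, 0.2, 0.1]  # stand-in for "animals with amazing abilities"

# Rank documents by similarity to the query, most similar first.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)
print(ranked[0])
```

The query shares no words with either title here; the ranking comes entirely from how closely the vectors point in the same direction.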

Enter Enferno: The Flask Framework on Steroids

Enferno is a modern Python web framework built on Flask, designed for rapid development of secure and scalable applications. It combines best practices with pre-configured components, making it the perfect foundation for our RAG implementation.

Building RAG with Enferno: A 15-Minute Journey

I've created a complete, ready-to-use RAG implementation that you can clone and run in minutes. The repository is available at github.com/level09/rag.

Let's walk through the key components that make this possible:

1. Vector Storage with PostgreSQL and pgvector

The heart of our RAG system is the ability to store and query vector embeddings efficiently. PostgreSQL with the pgvector extension provides a powerful solution:

from pgvector.sqlalchemy import Vector  # SQLAlchemy column type provided by the pgvector package

class Document(db.Model):
    __tablename__ = 'documents'
    # ...
    embedding = db.Column(Vector(384))  # 384-dimensional embeddings

This simple model definition enables us to store document embeddings directly in our database, making them queryable using vector similarity operations.
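
As the table grows, an approximate index can help keep similarity queries fast. The following is a sketch only: it assumes pgvector 0.5.0 or later (the release that added HNSW indexes), the index name and helper function are made up, and nothing here is taken from the repository.

```python
# Hypothetical index name; the table and column match the model above.
# vector_cosine_ops is the operator class that pairs with cosine distance.
CREATE_EMBEDDING_INDEX = (
    "CREATE INDEX IF NOT EXISTS documents_embedding_hnsw_idx "
    "ON documents USING hnsw (embedding vector_cosine_ops)"
)


def create_embedding_index(session):
    """Create the index through a SQLAlchemy session (e.g. db.session)."""
    # Imported lazily so the SQL constant can be inspected without SQLAlchemy.
    from sqlalchemy import text

    session.execute(text(CREATE_EMBEDDING_INDEX))
    session.commit()
```

For a demo-sized dataset the index is optional, since pgvector falls back to an exact scan without it.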

2. Generating Embeddings with Sentence Transformers

To convert text into meaningful vector representations, we use the sentence-transformers library:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2', device='cpu')

# Generate embedding
embedding = model.encode(text, normalize_embeddings=True)

The all-MiniLM-L6-v2 model is a lightweight yet powerful transformer that generates 384-dimensional embeddings, capturing the semantic essence of text in a compact form.
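
The normalize_embeddings=True flag scales each vector to unit length. The sketch below uses plain Python and made-up numbers rather than real model output, and the helper names are hypothetical. It shows why normalization is convenient: for unit vectors, the dot product already equals the cosine similarity.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Made-up raw vectors standing in for un-normalized model output.
raw_a = [3.0, 4.0]
raw_b = [4.0, 3.0]

unit_a = l2_normalize(raw_a)  # length 5 -> [0.6, 0.8]
unit_b = l2_normalize(raw_b)  # length 5 -> [0.8, 0.6]

# For unit vectors the dot product is the cosine similarity.
similarity = dot(unit_a, unit_b)
print(round(similarity, 2))
```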

3. Vector Search with PostgreSQL

The magic happens when we search for documents using vector similarity:

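import sqlalchemy as sa

query_embedding = model.encode(search_query, normalize_embeddings=True)
# pgvector accepts a bracketed text literal such as '[0.1, 0.2, ...]'
query_embedding_str = str(query_embedding.tolist())

# SQL query using cosine distance; all values are passed as bind parameters
query = sa.text('''
    SELECT
        id, title, content,
        embedding <=> CAST(:embedding AS vector) AS distance
    FROM documents
    ORDER BY embedding <=> CAST(:embedding AS vector)
    LIMIT :limit
''')
# limit is the maximum number of rows to return (assumed to be defined by the caller)
results = db.session.execute(query, {'embedding': query_embedding_str, 'limit': limit})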

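The <=> operator computes the cosine distance between vectors, which is 1 minus their cosine similarity, so smaller values mean closer matches. Sorting by it in ascending order returns the documents most semantically similar to our query first.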
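
Because cosine distance is 1 minus cosine similarity, a distance returned by the query can be turned into a similarity score for display or filtering. Here is a small sketch, assuming each result row is a dict with a distance key and that the threshold is a similarity between 0 and 1. The function names, row shape, and 0.5 threshold are illustrative, not taken from the repository.

```python
def distance_to_similarity(distance):
    """Convert a cosine distance (0 = same direction) to a cosine similarity."""
    return 1.0 - distance


def filter_by_threshold(rows, threshold):
    """Keep rows whose similarity meets the threshold, most similar first."""
    scored = [
        {**row, "similarity": distance_to_similarity(row["distance"])}
        for row in rows
    ]
    kept = [row for row in scored if row["similarity"] >= threshold]
    return sorted(kept, key=lambda row: row["similarity"], reverse=True)


# Hypothetical rows, shaped like the result of the SELECT above.
rows = [
    {"id": 1, "title": "Saturn's density", "distance": 0.35},
    {"id": 2, "title": "Cats' sleeping patterns", "distance": 0.80},
]
matches = filter_by_threshold(rows, threshold=0.5)
print([row["title"] for row in matches])
```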

4. A Beautiful UI for Interaction

What good is powerful search without an intuitive interface? Our implementation includes a clean, responsive UI built with Vue.js and Vuetify:

  • Document creation form for adding content
  • Search interface with adjustable similarity threshold
  • Results display with color-coded similarity scores
  • One-click sample data seeding

Real-World Applications: Beyond the Demo

While our demo focuses on a simple document search system, the applications of this technology are vast:

  1. Customer Support Knowledge Bases: Help agents find relevant information even when customer queries don't match documentation exactly
  2. Legal Document Analysis: Search through case law and contracts based on concepts, not just keywords
  3. Research Assistants: Help researchers find relevant papers across disciplines with different terminology
  4. E-commerce Search: Understand what customers are looking for even when they use non-standard descriptions
  5. Content Recommendation: Suggest related articles based on semantic similarity rather than tag matching

Getting Started in 5 Simple Steps

Ready to try it yourself? Follow these steps:

  1. Clone and Setup:
     git clone https://github.com/level09/rag.git
     cd rag
     ./setup.sh
  2. Configure PostgreSQL with pgvector:
     sudo -u postgres psql
     CREATE DATABASE enferno_rag;
     \c enferno_rag
     CREATE EXTENSION IF NOT EXISTS vector;
  3. Update your .env file:
     SQLALCHEMY_DATABASE_URI=postgresql://postgres:postgres@localhost/enferno_rag
  4. Initialize the Application:
     flask create-db
     flask install
  5. Run and Explore:
     flask run

Visit http://localhost:5000 and start exploring the power of semantic search!

The Future of RAG with Enferno

This implementation is just the beginning. The combination of Enferno's rapid development capabilities with modern AI techniques opens up exciting possibilities:

  • Multi-modal RAG: Extend the system to handle images and other media types
  • Hybrid Search: Combine vector search with traditional keyword search for even better results
  • LLM Integration: Connect the retrieval system to large language models for complete RAG pipelines
  • Fine-tuned Embeddings: Train domain-specific embedding models for specialized applications
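
As one illustration of the LLM integration item, retrieved documents are typically packed into a prompt before being sent to a model. The sketch below stops short of calling any model. The function name, prompt wording, document shape, and character budget are all assumptions for illustration.

```python
def build_rag_prompt(question, documents, max_chars=2000):
    """Assemble a prompt from retrieved documents, most relevant first.

    `documents` is a list of dicts with "title" and "content" keys, already
    ordered by similarity. Documents are added until `max_chars` would be
    exceeded; the rest are left out.
    """
    context_parts = []
    used = 0
    for doc in documents:
        snippet = f"[{doc['title']}]\n{doc['content']}"
        if used + len(snippet) > max_chars:
            break
        context_parts.append(snippet)
        used += len(snippet)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved documents, in the order the search returned them.
docs = [
    {"title": "Venus", "content": "A day on Venus is longer than its year."},
    {"title": "Saturn", "content": "Saturn is less dense than water."},
]
prompt = build_rag_prompt("Which planet has a day longer than its year?", docs)
print(prompt)
```

The resulting string would then be passed to whichever language model you choose; capping the context size keeps the prompt within that model's input limit.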

Conclusion: The 15-Minute Revolution

Building a RAG system used to require deep expertise in machine learning, vector databases, and web development. With Enferno and the power of modern Python libraries, we've reduced this to a 15-minute implementation that anyone with basic Flask knowledge can understand and extend.

The repository at github.com/level09/rag provides a complete, production-ready starting point for your semantic search journey. Clone it, customize it, and supercharge your applications with the power of meaning-based search.

Happy coding!


Want to learn more about Enferno? Visit the official documentation or check out the main repository.
