How About Ditching the Hype: Do We Really Need a Specialized Vector Database?

#database #chatgpt #python #discuss

With the rise of generative AI, vector databases have surged in popularity, largely because they power Retrieval Augmented Generation (RAG) applications. Look across the database landscape, though, and a clear trend emerges: nearly every database provider is adding vector search to its offering. It's a strategic move, since vector search is central to capturing a substantial share of RAG workloads.

Some of the major releases:

Databricks: Databricks introduces new generative AI tools.
pgvector and pgvecto.rs: Postgres extensions that provide vector similarity search.
Cloudflare launches Vectorize: A vector database for shipping AI-powered applications to production, fast.
MongoDB Atlas Vector Search: Vector Search capability designed to meet the demands of data.
Elastic: Vector search powers the next generation of search experiences.
Oracle Integrated Vector Database: Integrated Vector Database to Augment Generative AI.
Sqlite-vss: A SQLite extension for efficient vector search, based on Faiss.
PlanetScale: Adding vector storage and search to MySQL.
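
To make the "bolt-on" idea concrete, here is a minimal sketch of vector search living inside an ordinary relational table, using only Python's standard-library `sqlite3`. It is not how pgvector, sqlite-vss, or any product listed above works internally (those ship dedicated types and index structures); it only shows the shape of the approach. The `docs` table, its columns, the `cosine_similarity` SQL function, and the toy 3-dimensional vectors are all made up for illustration.

```python
import json
import math
import sqlite3


def cosine_similarity(a_json: str, b_json: str) -> float:
    """Cosine similarity between two JSON-encoded vectors."""
    a = json.loads(a_json)
    b = json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with hypothetical documents and toy embeddings."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so it can be called from SQL.
    conn.create_function("cosine_similarity", 2, cosine_similarity)
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)"
    )
    rows = [
        (1, "postgres tuning guide", [0.9, 0.1, 0.0]),
        (2, "vector index internals", [0.1, 0.9, 0.1]),
        (3, "cooking with cast iron", [0.0, 0.1, 0.9]),
    ]
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        [(doc_id, body, json.dumps(vec)) for doc_id, body, vec in rows],
    )
    return conn


def search(conn: sqlite3.Connection, query_vec: list, k: int = 2) -> list:
    """Exact (brute-force) top-k search: every row is scored, no index is used."""
    cur = conn.execute(
        "SELECT id, body, cosine_similarity(embedding, ?) AS score "
        "FROM docs ORDER BY score DESC LIMIT ?",
        (json.dumps(query_vec), k),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for doc_id, body, score in search(conn, [0.2, 0.8, 0.0]):
        print(doc_id, body, round(score, 3))
```

The query scans and scores every row, which is fine for a handful of documents and is exactly where real extensions differ: they add approximate nearest-neighbor indexes so the scan is not needed at scale.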

So, the big question is: will all this effort make the distinction between vector databases and other databases disappear over time?

Open thoughts 🤔

Why might customers consider moving to a separate database for vector search when their current database provider already offers vector search capabilities?
Will these databases come with RAG capabilities right out of the box, or will libraries like LangChain and LlamaIndex be used as ETL pipelines on top of these databases to facilitate RAG?
Conversely, can these extensions or bolt-on vector search features meet the scalability, latency, cost, and index freshness requirements of applications?
What if a specialized architectural change is needed to handle vector search due to the massive embedding size?
Perhaps both options will coexist, but for smaller workloads, the difference in performance and cost between specialized vector databases and built-in support may not be significant enough to justify maintaining a new database.
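
A quick back-of-the-envelope calculation helps frame the last two points. The sketch below estimates the raw storage for embeddings alone, assuming 32-bit floats (4 bytes per value) and no index overhead or compression; the 1536-dimension figure and the workload sizes are illustrative assumptions, not numbers from any of the products above.

```python
def embedding_storage_bytes(
    num_vectors: int, dimensions: int, bytes_per_value: int = 4
) -> int:
    """Raw bytes needed to store the vectors, ignoring index and metadata overhead."""
    return num_vectors * dimensions * bytes_per_value


def format_gb(num_bytes: int) -> str:
    """Format a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"


if __name__ == "__main__":
    # Hypothetical workloads: a small app vs. a large corpus, both at 1536 dimensions.
    for label, count in [("small", 10_000), ("large", 1_000_000)]:
        size = embedding_storage_bytes(count, 1536)
        print(f"{label}: {count:,} vectors -> {format_gb(size)}")
```

Under these assumptions, ten thousand vectors take roughly 61 MB while a million take about 6 GB before any index is built, which is one way to see why the answer may differ between small and large workloads.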

Top comments (1)

Gaurav Tarlok Kakkar • Oct 5 '23

Let me know if I missed any other major release.
