We built pgai Vectorizer to simplify embedding management for AI applications—without needing a separate database or complex infrastructure. Since launch, developers have created over 3,000 vectorizers on Timescale Cloud, with many more self-hosted.
But developers wanted pgai Vectorizer to work with familiar application-building tools and support more embedding providers. Today, we're making that happen with two key updates:
SQLAlchemy support: The most widely used Python ORM, SQLAlchemy lets you work with databases using Python instead of raw SQL. Now, pgai Vectorizer integrates directly, so you can store and query vector embeddings like any other column using SQLAlchemy queries.
More embedding models via LiteLLM: LiteLLM provides a simple API for multiple embedding providers. Now, pgai Vectorizer works with OpenAI, Cohere, Hugging Face, Mistral, Azure OpenAI, AWS Bedrock, and Google Vertex—all from a single SQL command.
These updates make working with embeddings in Python simpler and more flexible—no extra complexity, just tools that fit into your existing stack.
(More on why we built pgai Vectorizer: Vector Databases Are the Wrong Abstraction.)
SQLAlchemy: Store and Query Vectors Like Any Other Column
SQLAlchemy, a Python ORM that abstracts SQL queries, now integrates with pgai Vectorizer, allowing you to store and query embeddings just like any other database column—without writing raw SQL.
Embed and retrieve vectors like any other column: Define embeddings as structured fields inside your models.
Run vector similarity queries inside ORM expressions: No need for raw SQL or external vector stores.
Use Alembic migrations: Keep schema changes version-controlled.
This makes it easier to store, retrieve, and update embeddings inside Postgres while keeping everything ORM-native. For example, you can create a vectorizer declaratively from an Alembic migration:
# Inside an Alembic migration's upgrade(): create a vectorizer for the
# "blog" table, splitting the content column into chunks and embedding
# each chunk with OpenAI's text-embedding-3-small at 768 dimensions.
op.create_vectorizer(
    source="blog",
    embedding=OpenAIConfig(
        model='text-embedding-3-small',
        dimensions=768
    ),
    chunking=CharacterTextSplitterConfig(
        chunk_column='content',
    ),
    formatting=PythonTemplateConfig(template='$title - $chunk')
)
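To make that concrete, here is a minimal sketch of the application side. It assumes the vectorizer_relationship helper from pgai's SQLAlchemy integration and pgvector's cosine_distance comparator; the BlogPost model, the find_similar helper, and the precomputed query vector are illustrative:

from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column
from pgai.sqlalchemy import vectorizer_relationship

class Base(DeclarativeBase):
    pass

class BlogPost(Base):
    __tablename__ = "blog"
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str]
    content: Mapped[str]

    # Embeddings generated by the vectorizer above, surfaced as a
    # related model rather than a hand-managed column.
    content_embeddings = vectorizer_relationship(dimensions=768)

def find_similar(session: Session, query_vector: list[float], k: int = 5):
    # Order chunk embeddings by cosine distance to the query vector;
    # cosine_distance is pgvector's SQLAlchemy comparator.
    return (
        session.query(BlogPost.content_embeddings)
        .order_by(
            BlogPost.content_embeddings.embedding.cosine_distance(query_vector)
        )
        .limit(k)
        .all()
    )

Each result row carries the chunk text alongside its vector, and the vectorizer keeps the embedding table in sync as blog rows change, so application code never creates or updates embeddings by hand.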
LiteLLM: Swap Embedding Providers With a Single SQL Command
Choosing the right embedding model affects cost, performance, and accuracy, but switching providers shouldn't require rewriting your application. LiteLLM removes that friction: pgai Vectorizer can now target any LiteLLM-supported embedding provider, and swapping between them takes a single SQL command.
Switch providers instantly: Test OpenAI, Cohere, Hugging Face, Mistral, Azure OpenAI, AWS Bedrock, and Google Vertex with minimal effort.
No query rewrites: Applications continue working regardless of the embedding provider.
Keep old embeddings while switching: Ensures smooth transitions without downtime.
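For example, creating a Cohere-backed vectorizer through LiteLLM is a single function call: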
SELECT ai.create_vectorizer(
    'my_table'::regclass,
    embedding => ai.embedding_litellm(
        'cohere/embed-english-v3.0',
        1024,
        api_key_name => 'COHERE_API_KEY'
    ),
    chunking => ai.chunking_recursive_character_text_splitter('contents')
);
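Switching is then just another create_vectorizer call with a different model string. Here is a minimal sketch, assuming psycopg for the connection; the connection string, the mistral/mistral-embed model (1024 dimensions), and the use of a destination parameter to keep the two embedding tables separate are all illustrative assumptions:

import psycopg

DB_URL = "postgresql://user:password@localhost:5432/postgres"  # placeholder

with psycopg.connect(DB_URL) as conn:
    # Point the same source table at a different LiteLLM provider.
    # Keeping the new embeddings in their own destination table lets
    # the existing Cohere embeddings keep serving queries while the
    # new vectorizer backfills.
    conn.execute("""
        SELECT ai.create_vectorizer(
            'my_table'::regclass,
            destination => 'my_table_mistral_embeddings',
            embedding => ai.embedding_litellm(
                'mistral/mistral-embed',
                1024,
                api_key_name => 'MISTRAL_API_KEY'
            ),
            chunking => ai.chunking_recursive_character_text_splitter('contents')
        );
    """)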
LiteLLM support is available for self-hosted deployments now and is coming to Timescale Cloud soon.
Get Started in Minutes
Pgai Vectorizer’s SQLAlchemy and LiteLLM integrations are available now, making it easier than ever to store, query, and experiment with vector embeddings inside Postgres.
Try it today:
Define a vector column using SQLAlchemy
Test different embedding providers with LiteLLM
For self-hosted deployments, an external worker is required to process embeddings. In Timescale Cloud, the worker is managed for you: it handles the API calls and scales with your workload.
What’s Next?
Pgai Vectorizer is just the start. We’re building a broader AI-native database experience where Postgres supports smarter indexing, advanced AI search, and deeper integrations with AI tools. Expect more enhancements that make Postgres the best place to build AI applications.
Join our Discord community to share feedback and see what others are building.