egor romanov for Supabase

Posted on Oct 23, 2023 • Edited on Feb 7 • Originally published at supabase.com

pgvector vs Pinecone: cost and performance

At Supabase, we believe that a combination of Postgres and pgvector serves as a better alternative to single-purpose databases like Pinecone for AI tasks. This isn't the first time a Postgres-based solution has successfully rivaled specialized databases designed for specific data types. Timescale for time-series data and Greenplum for analytics are just a few examples.

We decided to put Postgres vector performance to the test and run a direct comparison between pgvector and Pinecone.

What is Pinecone?

Pinecone is a fully managed cloud vector database that is suitable only for storing and searching vector data. It offers a straightforward start-up and scalability. It employs a proprietary ANN index and does not support exact nearest neighbors search or fine-tuning of the index. The only setting that lets you adjust the balance between query accuracy and speed is the choice of pod type when creating an index.

So, before we dive into their performance, let us first introduce Pinecone's offerings.

Pinecone has 3 Pod types for indexes

An index on Pinecone is made up of pods, which are units of cloud resources (vCPU, RAM, disk) that provide storage and compute for each index.

Type	Capacity / Vectors	QPS	Accuracy	Price per unit per month
s1	5,000,000 768d (~2,500,000 1536d)	Slowest	0.98	$80
p1	1,000,000 768d (~500,000 1536d)	Medium	0.99	$80
p2	1,100,000 768d (~550,000 1536d)	Fastest	0.94	$120

Pods can be scaled in two dimensions, vertically and horizontally:

vertical scaling can be used to fit more vectors on a single pod: x1, x2, x4, x8;
horizontal scaling increases the number of pods or creates replicas to boost queries per second (QPS). This works linearly for Pinecone: doubling the number of replica pods doubles your QPS.
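To make the scaling arithmetic concrete, here is a minimal sketch. It assumes that the monthly price grows linearly with both the size multiplier and the number of pods, which is consistent with the p1.x2 ($160) and p2.x2 ($240) prices quoted in the results below. The function names are ours and purely illustrative.

```python
# Monthly price of an x1 pod in USD, taken from the table above.
BASE_PRICE = {"s1": 80, "p1": 80, "p2": 120}


def monthly_cost(pod_type: str, size: int, pods: int) -> int:
    """Cost of `pods` identical pods (primary plus replicas).

    Assumes price scales linearly with the size multiplier (x1, x2, x4, x8).
    """
    return BASE_PRICE[pod_type] * size * pods


def pods_within_budget(pod_type: str, size: int, budget: int) -> int:
    """Largest number of pods whose total cost stays within `budget`."""
    return budget // (BASE_PRICE[pod_type] * size)


# Pod sizes are the ones used in the experiments later in this post.
for pod_type, size in [("s1", 1), ("p1", 2), ("p2", 2)]:
    n = pods_within_budget(pod_type, size, budget=480)
    print(f"{pod_type}.x{size}: {n} pods, ${monthly_cost(pod_type, size, n)}/month")
```

With a $480 budget this gives six s1.x1 pods, three p1.x2 pods, and two p2.x2 pods, matching the setups benchmarked below.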

Benchmarking methodology

We used the ANN Benchmarks methodology, a standard for benchmarking vector databases. Our tests used the dbpedia dataset of 1,000,000 OpenAI embeddings (1536 dimensions) and the inner product distance metric for both Pinecone and pgvector.

To compare Pinecone and pgvector on equal grounds, we opted for the following setups:

pgvector: A single Supabase 2XL instance costing approximately $410/month (8-core ARM CPU and 32 GB RAM), with an HNSW index built using the following parameters: m='36', ef_construction='128'.
Pinecone: A pod scaled vertically to the smallest size that fits the dbpedia dataset into the index on a single pod. We then added replicas to match the budget (slightly exceeding it in all cases, at ~$480/month).
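For reference, the pgvector index described above could be created with a statement like the one this sketch builds. The `items` table and `embedding` column are hypothetical names. `vector_ip_ops` is pgvector's operator class for inner product, chosen here to match the distance metric used in the benchmark.

```python
# Hypothetical table and column names; the build parameters are the ones
# reported in this benchmark.
TABLE = "items"
COLUMN = "embedding"


def hnsw_index_sql(table: str, column: str, m: int, ef_construction: int) -> str:
    """Build a CREATE INDEX statement for a pgvector HNSW index.

    vector_ip_ops selects the inner product operator class.
    """
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_ip_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


print(hnsw_index_sql(TABLE, COLUMN, m=36, ef_construction=128))
```

Higher values of m and ef_construction generally trade longer build times and more memory for better recall.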

To reduce network latency, we placed our clients in the same cloud provider and region as the database. Experiments were run in a parallel configuration, varying the number of concurrent clients from 5 to 100 to determine the maximum QPS for each setup.
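The sweep over concurrent clients can be pictured roughly as follows. This is only a sketch of the idea, not the harness we actually ran: `fake_query` is a hypothetical stand-in for a real database call, and the thread-based clients are an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(run_query, n_clients: int, queries_per_client: int) -> float:
    """Run queries from `n_clients` threads and return overall queries/second."""

    def client() -> None:
        for _ in range(queries_per_client):
            run_query()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        futures = [pool.submit(client) for _ in range(n_clients)]
        for future in futures:
            future.result()  # re-raise any error from a worker
    elapsed = time.perf_counter() - start
    return n_clients * queries_per_client / elapsed


def fake_query() -> None:
    """Stand-in for a real vector search (hypothetical 1 ms latency)."""
    time.sleep(0.001)


# Sweep the client count and keep the best throughput seen.
results = {n: measure_qps(fake_query, n, queries_per_client=20) for n in (5, 25, 50, 100)}
best_clients = max(results, key=results.get)
print(f"max QPS {results[best_clients]:.0f} at {best_clients} clients")
```

Reporting the maximum over the sweep avoids penalizing a setup that simply needs more concurrency to saturate.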

Measuring accuracy in Pinecone

There is no available information on Pinecone's proprietary ANN index. Likewise, Pinecone doesn't provide information about query accuracy, nor does it support exact nearest neighbors search (KNN). So to measure Pinecone's accuracy, we had to compare its results with pgvector's exact search (KNN without indexes) for the same queries. This seems to be the only way to measure Pinecone's index accuracy.
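The comparison boils down to checking how many of the true top-k neighbors the approximate index returns. A minimal sketch, assuming each result is a list of record ids ordered by similarity (the function names are illustrative):

```python
def accuracy_at_k(ann_ids: list, exact_ids: list, k: int = 10) -> float:
    """Fraction of the true top-k neighbors that the ANN result also returned."""
    truth = set(exact_ids[:k])
    return len(truth.intersection(ann_ids[:k])) / k


def mean_accuracy_at_k(ann_results: list, exact_results: list, k: int = 10) -> float:
    """Average accuracy@k over a batch of queries."""
    scores = [accuracy_at_k(a, e, k) for a, e in zip(ann_results, exact_results)]
    return sum(scores) / len(scores)


# Toy example: the ANN result misses one of the ten true neighbors (id 10).
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
print(accuracy_at_k(approx, exact))  # 0.9
```

Here `exact` would come from pgvector's exact search and `approx` from the index under test.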

Benchmarking Results

Pinecone with s1 pod type

As the index can fit in a single s1.x1 pod ($80/month), we created five additional replicas. Our Pinecone setup consisted of six s1 pods (totaling $480/month). We measured Pinecone's accuracy for the dbpedia dataset using the s1 pod type, achieving a score of 0.98 at the 10 nearest neighbors (accuracy@10).

To match the measured 0.98 accuracy@10 of Pinecone s1 pods, we set ef_search=32 for pgvector (HNSW) queries and observed the following result.

The pgvector HNSW index can manage 1185% more queries per second while being $70 cheaper per month.
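For readers unfamiliar with ef_search: it is a query-time pgvector setting (`hnsw.ef_search`) that controls how many candidates the HNSW search keeps, so higher values mean better recall and slower queries. A sketch of how a query might apply it, assuming a DB-API cursor (for example from psycopg) and a hypothetical `items` table with an `embedding` column:

```python
def top_k_by_inner_product(cursor, query_vector, k=10, ef_search=32):
    """Return ids of the k nearest rows by inner product.

    `cursor` is any DB-API cursor; `items` and `embedding` are
    hypothetical names used for illustration.
    """
    # hnsw.ef_search is pgvector's query-time candidate list size.
    cursor.execute(f"SET hnsw.ef_search = {int(ef_search)}")
    # pgvector accepts vectors as text literals such as '[0.1,0.2,0.3]'.
    literal = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    # <#> returns the negative inner product, so ascending order puts
    # the largest inner products first.
    cursor.execute(
        "SELECT id FROM items ORDER BY embedding <#> %s::vector LIMIT %s",
        (literal, k),
    )
    return [row[0] for row in cursor.fetchall()]
```

The same function covers the later experiments by passing ef_search=40 or ef_search=10.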

Interestingly, we'd often heard that pgvector's IVFFlat index was too slow before HNSW support was introduced. However, even the pgvector IVFFlat index on the same compute outperforms the Pinecone s1 pods and manages 143% more queries per second.

p1 pod type

With Pinecone p1 pods, we can fit the dbpedia dataset into the index on a single p1.x2 pod ($160/month), so adding two replicas kept us within our budget. Our second experiment therefore used a Pinecone setup of three p1.x2 pods (totaling $480/month). The measured accuracy@10 for the p1.x2 pod and dbpedia dataset was 0.99.

To match the 0.99 accuracy of Pinecone's p1.x2, we set ef_search=40 for pgvector (HNSW) queries.

pgvector demonstrated much better performance again with over 4x better QPS than the Pinecone setup, while still being $70 cheaper per month. As Pinecone can linearly scale by adding more replicas, you can estimate that you would need 12-13 p1.x2 pods to match pgvector performance. This equates to approximately $2000 per month versus ~$410 per month for a 2XL on Supabase.
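The 12-13 pod estimate follows directly from linear scaling. A rough sketch of the arithmetic, where the 4.0x and 4.3x speedups are illustrative bounds for "over 4x" rather than measured values:

```python
import math

P1_X2_PRICE = 160  # USD per month, from this section
CURRENT_PODS = 3   # pods in the benchmarked Pinecone setup


def pods_to_match(speedup: float, current_pods: int = CURRENT_PODS) -> int:
    """Pods needed to close a QPS gap, assuming QPS scales linearly with pods."""
    return math.ceil(current_pods * speedup)


# Illustrative bounds only; the post reports "over 4x" without an exact figure.
for speedup in (4.0, 4.3):
    pods = pods_to_match(speedup)
    print(f"{speedup}x -> {pods} pods, ${pods * P1_X2_PRICE}/month")
```

That puts the cost between $1,920 and $2,080 per month, hence the figure of roughly $2000.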

p2 pod type

This is Pinecone's fastest pod type, but the increased QPS comes with an accuracy trade-off. We measured accuracy@10=0.94 for the p2 pods and the dbpedia dataset. It is possible to fit the index on a single p2.x2 pod ($240/month), so we could add one replica. Thus, Pinecone's setup for the third experiment consisted of two p2.x2 pods (totaling $480/month).

To match Pinecone's 0.94 accuracy, we set ef_search=10 for pgvector (HNSW) queries. In this test, pgvector's accuracy was actually one percentage point better, at 0.95 accuracy@10, and it was still significantly faster despite the better accuracy.

Here's something important to highlight: pgvector is faster than Pinecone's fastest pod type, even with an accuracy@10=0.99 compared to Pinecone's 0.94. Pinecone's most expensive option sacrifices 5% accuracy just to match pgvector's speed.

Additional thoughts on Pinecone vs. pgvector

It's only fair to note that Pinecone can be cheaper than pgvector: a single p1.x2 pod without replicas costs about $160 per month and would still achieve approximately 60 QPS with accuracy@10=0.99. At a comparable budget on Supabase you would use a Large ($110) or XL ($210) compute add-on, where the index might not fit in RAM, so you would fall back to exact KNN search without any indexes. The upside is that you could potentially store more vectors, similar to the s1 pod on Pinecone.

Real user stories indicate this might not be problematic. For instance, Quivr expanded to 1 million vectors without using any indexes.

pgvector's hidden cost-saving benefits

There are also a couple of benefits from a developer experience perspective that we often take for granted when using Postgres:

Postgres offers numerous features applicable to your vectors: database backups, row-level security, client libraries and ORMs for 18 languages, complete ACID compliance, and bulk updates and deletes (metadata updates in seconds).
Having all your data in a single Postgres instance (or a cluster) reduces roundtrips in production and allows running the entire developer setup locally.
Implementing additional databases can increase operational complexity and the learning curve.
Postgres is battle-tested and robust, whereas most specialized vector databases haven't had time to demonstrate their reliability.

Start using HNSW

All new Supabase databases automatically ship with pgvector v0.5.0, which includes the new HNSW indexes. Try it out today and let us know what you think!

Top comments (15)

awalias • Oct 23 '23

For anyone interested there's a good guide here on how to generate vector embeddings easily with pgvector on Supabase: supabase.com/docs/guides/ai/quicks...

and another guide here on how to build Q&A into something like a documentation site: supabase.com/docs/guides/ai/exampl...

Lastly, we have an awesome Python library for interacting with vectors called vecs, which you can find here: supabase.com/docs/guides/ai/python...

And it works a bit like this:

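import vecs

# placeholder connection string; replace with your own database credentials
DB_CONNECTION = "postgresql://<user>:<password>@<host>:<port>/<db_name>"

# create a vecs client and a collection of 3-dimensional vectors
vx = vecs.create_client(DB_CONNECTION)
docs = vx.get_or_create_collection(name="docs", dimension=3)

# add records to the collection
docs.upsert(
    records=[
        (
         "vec0",           # the vector's identifier
         [0.1, 0.2, 0.3],  # the vector. list or np.array
         {"year": 1973}    # associated metadata
        ),
        (
         "vec1",
         [0.7, 0.8, 0.9],
         {"year": 2012}
        )
    ]
)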