Introduction
Hello world! Today we're going to be talking about embeddings. Used in recommendation systems, semantic search, and Retrieval Augmented Generation (RAG) pipelines, embeddings are at the core of retrieving contextual information for a given query. The retrieved context can then be passed to an LLM to generate an answer - or alternatively, the resulting embeddings can be stored in a database for later querying.
In this article, we're going to get into why you might want local embeddings instead of embeddings from managed models, as well as the technical implementation for creating and using your local embeddings.
Interested in checking out the code for this example? Have a look at the example on our GitHub repo.
Why local embeddings?
Embeddings, while an important part of the RAG ecosystem, can be relatively expensive to create, especially as you start storing embeddings for larger and larger collections of documents. Some simple use cases where this can happen might be:
- If you're running a directory that aggregates a list of businesses
- If you have a large amount of products that you sell (a supermarket or large grocery store being a good example of this)
- If you're storing user-submitted uploads
For all the convenience that managed models provide, there are also data concerns. If you're working on a commercial project, you may not want to hand over your data to OpenAI - especially in industries where data privacy is absolutely paramount. In this case, a local model gives you exactly what you need: text embeddings, with the guarantee that your data stays entirely on your own infrastructure.
Building your project
Without further ado, let's get to building! Before we start, don't forget to have the Rust programming language installed.
Getting started
To get started, we'll spin up a new project:
cargo init local-embeddings-rig
cd local-embeddings-rig
Next, we'll need to add our dependencies:
cargo add rig-core serde rig-fastembed tokio anyhow -F rig-core/derive,serde/derive,tokio/macros,tokio/rt-multi-thread
Note that this one-line command adds the following dependencies:

- `rig-core`: the `rig` crate, which houses all the traits (interfaces) we need to create the embeddings. We've added the `derive` feature to get the `Embed` derive macro, which will make it much easier to embed whatever we need to.
- `rig-fastembed`: the `fastembed-rs` integration for `rig`.
- `serde`: a de/serialization library. We've added the `derive` feature for the `Serialize` and `Deserialize` macros, both of which are extremely helpful for serializing and deserializing simple structs without hassle.
- `tokio`: the async runtime, with the `macros` and `rt-multi-thread` features required by the `#[tokio::main]` attribute we'll use later.
- `anyhow`: flexible error handling, which we'll use for the return type of our `main` function.
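If you'd like to check what that command did, the `[dependencies]` section of your `Cargo.toml` should end up looking roughly like this (the versions below are placeholders - use whatever `cargo add` resolved for you):

```toml
[dependencies]
rig-core = { version = "*", features = ["derive"] }
rig-fastembed = "*"
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
anyhow = "1"
```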
Let's get embedding with Rig
Next, we'll go into `main.rs` and define a struct which will serve as the payload we want to associate with our embeddings.
use rig::Embed;
use rig_fastembed::FastembedModel;
use serde::{Deserialize, Serialize};
// Shape of data that needs to be RAG'ed.
// The definitions field will be used to generate embeddings.
#[derive(Embed, Clone, Deserialize, Debug, Serialize, Eq, PartialEq, Default)]
struct WordDefinition {
    id: String,
    word: String,
    #[embed]
    definitions: Vec<String>,
}
Once done, we'll define our main function (which will use the Tokio runtime macro to allow for easy async) as well as a new `rig_fastembed::Client`, which we'll use to create our embedding model.
#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create a Fastembed client
    let fastembed_client = rig_fastembed::Client::new();
    let embedding_model = fastembed_client.embedding_model(&FastembedModel::AllMiniLML6V2Q);

    // .. some more code goes here

    Ok(())
}
The `rig-core` library exposes an `EmbeddingsBuilder` struct which we can use to store our documents in. Note that the field we annotated earlier with `#[embed]` will be used as the text source for embedding, but if you want to change this, you can create custom behavior by implementing `rig::Embed` manually, as sketched below.
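For illustration, here is a minimal sketch of what a manual implementation might look like. It assumes the `TextEmbedder` and `EmbedError` types from `rig::embeddings::embed` (check the `rig::Embed` docs for the exact trait signature in your version of `rig`), and you'd drop `Embed` from the struct's derive list to avoid a conflicting implementation:

```rust
use rig::embeddings::embed::{Embed, EmbedError, TextEmbedder};

// Hypothetical manual impl: prefix each definition with the word itself,
// so the embedded text carries a little more context than the definition alone.
impl Embed for WordDefinition {
    fn embed(&self, embedder: &mut TextEmbedder) -> Result<(), EmbedError> {
        for definition in &self.definitions {
            // Each call queues one string to be turned into an embedding.
            embedder.embed(format!("{}: {}", self.word, definition));
        }
        Ok(())
    }
}
```

For this tutorial, though, the derive macro does everything we need. Next, let's define some documents to embed: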
let documents = vec![
    WordDefinition {
        id: "doc0".to_string(),
        word: "flurbo".to_string(),
        definitions: vec![
            "A green alien that lives on cold planets.".to_string(),
            "A fictional digital currency that originated in the animated series Rick and Morty.".to_string()
        ]
    },
    WordDefinition {
        id: "doc1".to_string(),
        word: "glarb-glarb".to_string(),
        definitions: vec![
            "An ancient tool used by the ancestors of the inhabitants of planet Jiro to farm the land.".to_string(),
            "A fictional creature found in the distant, swampy marshlands of the planet Glibbo in the Andromeda galaxy.".to_string()
        ]
    },
    WordDefinition {
        id: "doc2".to_string(),
        word: "linglingdong".to_string(),
        definitions: vec![
            "A term used by inhabitants of the sombrero galaxy to describe humans.".to_string(),
            "A rare, mystical instrument crafted by the ancient monks of the Nebulon Mountain Ranges on the planet Quarm.".to_string()
        ]
    },
];
use rig::embeddings::EmbeddingsBuilder;

let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
    .documents(documents)?
    .build()
    .await?;
And that's it! The `embeddings` variable will now be an iterator that outputs `(embedding, T)` pairs. You can use this however you want. For example:
for (embedding, document) in embeddings {
    println!("{embedding}");
    println!("{document:?}");
}
Using embeddings with vector stores in Rig
Next, we create our vector store and add the embeddings to it. Recall that `embeddings` yields `(embedding, doc)` tuples; `from_documents_with_id_f` takes those tuples, as well as a closure (an anonymous function) that derives an ID for each document. We also have integrations for production-ready vector stores such as Qdrant, LanceDB, and MongoDB if you'd like an alternative to the in-memory store - check them out on our GitHub repo.
use rig::vector_store::{VectorStoreIndex, in_memory_store::InMemoryVectorStore};

// Create vector store with the embeddings
let vector_store = InMemoryVectorStore::from_documents_with_id_f(embeddings, |doc| doc.id.clone());
// Create vector store index
let index = vector_store.index(embedding_model);
Finally, to use our vector store, we call `top_n` to grab the results, annotating the call with the type we want it to return. Note that the returned type of the `results` variable is `Vec<(f64, String, T)>` (where `T` in this case is the `WordDefinition` struct). However, because we don't strictly need to spell out the element type (it's obvious from context), we've elided it in the final type annotation (`Vec<_>`).
let results = index
    .top_n::<WordDefinition>("I need to buy something in a fictional universe. What type of money can I use for this?", 1)
    .await?
    .into_iter()
    .map(|(score, id, doc)| (score, id, doc.word))
    .collect::<Vec<_>>();

println!("Results: {:?}", results);
let id_results = index
    .top_n_ids("I need to buy something in a fictional universe. What type of money can I use for this?", 1)
    .await?
    .into_iter()
    .collect::<Vec<_>>();

println!("ID results: {:?}", id_results);
So what's next?
Thanks for reading! If you have any thoughts, feel free to let us know in the comments.
For additional Rig resources and community engagement:
- Check out more examples in our gallery.
- Contribute or report issues on our GitHub.
- Join discussions in our Discord community!