In my previous Firestore for Text Embedding and Similarity Search post, I talked about how Firestore and LangChain can help you store text embeddings and run similarity searches against them. With multimodal embedding models, you can generate embeddings not only for text but also for images and video. In this post, I will show you how to store image embeddings in Firestore and later use them for similarity search.
Image embeddings support in FirestoreVectorStore
As a recap from the previous post, the Firestore for LangChain project provides a FirestoreVectorStore, which simplifies the storage and retrieval of embeddings. Initially, FirestoreVectorStore only supported text embeddings, but we recently added a new add_images method to store image embeddings. Likewise, we added the similarity_search_image method to run similarity searches with image embeddings.
Let’s take a closer look at how you can use FirestoreVectorStore in LangChain to work with image embeddings.
First, you need to create a multimodal embedding model that can embed images:
from langchain_google_vertexai import VertexAIEmbeddings

embedding = VertexAIEmbeddings(
    model_name="multimodalembedding",
    project=PROJECT_ID,
    location="us-central1",
)
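FirestoreVectorStore calls this model for you, but you can also invoke it directly to see what an image embedding looks like. A minimal sketch, assuming the embed_image method that langchain_google_vertexai exposes for multimodal models (the image path is just an illustrative example):

# Embed a single image directly with the multimodal model
vector = embedding.embed_image("./images/landmark2.png")
print(len(vector))  # the multimodal model produces 1408-dimensional vectors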
Then, you create a Firestore-backed vector store with the embedding model:
from langchain_google_firestore import FirestoreVectorStore

vector_store = FirestoreVectorStore(
    collection=COLLECTION_NAME,
    embedding_service=embedding,
)
Now you can add images stored locally, in Google Cloud Storage, or anywhere on the web:
ids = ["landmark1.png", "landmark2.png", "landmark3.png"]
image_paths = [
    "gs://your-storage-bucket/landmark1.png",
    "./images/landmark2.png",
    "https://your-website/images/landmark3.png",
]

vector_store.add_images(image_paths, ids=ids)
This creates embeddings for each image and saves them to Firestore.
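If you want to double-check what was written, you can read one of the documents back with the plain Firestore client. This is just a sanity-check sketch; the field names reflect FirestoreVectorStore defaults ("content", "metadata", "embedding") and may differ if you customized them:

from google.cloud import firestore

# Fetch the document stored under one of the ids used above
db = firestore.Client(project=PROJECT_ID)
doc = db.collection(COLLECTION_NAME).document("landmark1.png").get()
print(doc.to_dict().keys())  # expect content, metadata, and embedding fields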
Afterwards, you can perform a similarity search with a text query:
vector_store.similarity_search("stadium", k=3)
You can also perform a similarity search with an image. In that case, the image is first embedded, and the resulting embedding is then used to retrieve similar images:
image_path = "../images/landmark4.png"
vector_store.similarity_search_image(image_path, k=3)
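Both search methods return LangChain Document objects rather than raw images, so a typical pattern is to iterate over the results and pull out what you need. A small sketch; exactly what ends up in page_content and metadata depends on how you configured the store:

results = vector_store.similarity_search_image(image_path, k=3)
for doc in results:
    # page_content holds the stored content (for example, the base64
    # encoding when store_encodings is enabled); metadata holds the rest
    print(doc.metadata)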
You can play with these examples in the FirestoreVectorStore notebook.
Sometimes it’s useful to store base64-encoded images as content in the Firestore document for easy retrieval. In that case, you can set the store_encodings flag to True:
vector_store.add_images(image_paths, ids=ids, store_encodings=True)
Note that Firestore documents have a size limit of 1 MiB (1,048,576 bytes; see the Firestore documentation for details), and you might run into it if you embed large images and store their base64 encodings.
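If you plan to store encodings, a quick pre-flight check can save you from failed writes. Here is a minimal sketch; it only measures the base64 payload, ignoring the embedding and metadata that also count toward the limit:

import base64

MAX_DOC_BYTES = 1_048_576  # Firestore's 1 MiB document size limit

def encoding_fits(image_path: str) -> bool:
    # Compare the base64 payload alone against the limit; the real document
    # also holds the 1408-float embedding and metadata, so leave headroom
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read())
    return len(encoded) < MAX_DOC_BYTES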
Sample: Image embedding storage and retrieval
Let’s see a sample of how to store and retrieve image embeddings with FirestoreVectorStore. The full source code is in main.py.
Firestore database and index
First, create a Firestore database for image embeddings:
gcloud firestore databases create --database image-database --location=europe-west1
Next, create the Firestore index that we’ll need for retrieval later. The dimension of 1408 matches the vectors produced by the multimodal embedding model. If you forget this step, your first query will fail with an error message that includes the exact command to create the index:
gcloud firestore indexes composite create --project=your-project-id \
  --database="image-database" --collection-group=ImageCollection --query-scope=COLLECTION \
  --field-config=vector-config='{"dimension":"1408","flat": "{}"}',field-path=embedding
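Index creation can take a few minutes. Assuming the standard gcloud listing command, you can check whether it’s ready:

gcloud firestore indexes composite list --project=your-project-id --database="image-database"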
Add image embeddings
Let’s add some images.
Add landmark1.png from a Cloud Storage URL:
python main.py --project_id=genai-atamel --image_paths gs://genai-atamel-firestore-images/landmark1.png
Add landmark2.png and landmark3.png from a local folder:
python main.py --project_id=genai-atamel --image_paths ../images/landmark2.png ../images/landmark3.png
Add another image from a URL:
python main.py --project_id=genai-atamel --image_paths https://atamel.dev/img/mete-512.jpg
At this point, you should see the images and their embeddings saved in Firestore.
Retrieve images
Now you can retrieve and display images by keyword using the similarity_search method. For example, you can search for stadium:
python main.py --project_id=genai-atamel --search_by_keyword="stadium"
You get back the picture of the Colosseum.
You can also search for similar images using the similarity_search_image method. For example, you can pass in landmark4.png, a picture of a temple:
python main.py --project_id=genai-atamel --search_by_image="../images/landmark4.png"
You get back another temple as the most similar image.
That’s cool!
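To see how the pieces fit together end to end, here’s a rough sketch of what a CLI like main.py could look like. The flag names match the commands above; the database and collection names and the client wiring are my assumptions, so refer to the actual main.py for the real implementation:

import argparse

from google.cloud import firestore
from langchain_google_firestore import FirestoreVectorStore
from langchain_google_vertexai import VertexAIEmbeddings

parser = argparse.ArgumentParser()
parser.add_argument("--project_id", required=True)
parser.add_argument("--image_paths", nargs="*", default=[])
parser.add_argument("--search_by_keyword")
parser.add_argument("--search_by_image")
args = parser.parse_args()

# Multimodal embedding model plus a vector store backed by the named database
embedding = VertexAIEmbeddings(
    model_name="multimodalembedding", project=args.project_id, location="us-central1"
)
client = firestore.Client(project=args.project_id, database="image-database")
vector_store = FirestoreVectorStore(
    collection="ImageCollection", embedding_service=embedding, client=client
)

if args.image_paths:
    vector_store.add_images(args.image_paths)
if args.search_by_keyword:
    print(vector_store.similarity_search(args.search_by_keyword, k=3))
if args.search_by_image:
    print(vector_store.similarity_search_image(args.search_by_image, k=3))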
Conclusion
In this post, you learned how to use LangChain and Firestore to store and retrieve image embeddings with a multimodal embedding model. This allows LLMs to answer questions grounded not only in text but also in images. Try out the new image embedding support in FirestoreVectorStore and let us know what you think!
Further reading
- Multimodal image storage and retrieval with Firestore Vector Store
- Build LLM-powered applications using LangChain
- Firestore for LangChain
- FirestoreVectorStore notebook