Firestore for Image Embeddings


Firestore and LangChain

In my previous Firestore for Text Embedding and Similarity Search post, I talked about how Firestore and LangChain can help you store text embeddings and run similarity searches against them. With multimodal embedding models, you can generate embeddings not only for text but also for images and video. In this post, I will show you how to store image embeddings in Firestore and later use them for similarity search.

Image embedding support in FirestoreVectorStore

As a recap from the previous post, the Firestore for LangChain project provides a FirestoreVectorStore class that simplifies storing and retrieving embeddings.

Initially, FirestoreVectorStore only supported text embeddings, but we recently added a new add_images method to store image embeddings. Likewise, we added the similarity_search_image method to run similarity searches with image embeddings.

Let’s take a closer look at how you can use FirestoreVectorStore in LangChain to work with image embeddings.

First, you need to create a multimodal embedding model that can embed images:

from langchain_google_vertexai import VertexAIEmbeddings

# PROJECT_ID is your Google Cloud project id
embedding = VertexAIEmbeddings(
    model_name="multimodalembedding",
    project=PROJECT_ID,
    location="us-central1",
)
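
If you want to sanity-check the model before wiring it into a vector store, you can embed a single image directly. A minimal sketch, assuming your version of langchain_google_vertexai exposes the embed_image method:

# Embed one image and inspect the vector size
# (embed_image is available in recent langchain_google_vertexai releases)
vector = embedding.embed_image("./images/landmark1.png")
print(len(vector))  # the multimodalembedding model returns 1408-dimensional vectors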

Then, you create a Firestore-backed vector store with the embedding model:

from langchain_google_firestore import FirestoreVectorStore

# COLLECTION_NAME is the Firestore collection the embeddings are stored in
vector_store = FirestoreVectorStore(
    collection=COLLECTION_NAME,
    embedding_service=embedding,
)

Now, you can add images stored locally, in Google Cloud Storage, or anywhere on the web as follows:

ids = ["landmark1.png", "landmark2.png", "landmark3.png"]
image_paths = [
    "gs://your-storage-bucket/landmark1.png",
    "./images/landmark2.png",
    "https://your-website/images/landmark3.png",
]

vector_store.add_images(image_paths, ids=ids)

This creates embeddings for each image and saves them to Firestore.
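
If you're curious what add_images actually writes, you can read one of the documents back with the plain Firestore client. A minimal sketch, assuming the default field names (content, embedding, metadata) that FirestoreVectorStore uses:

from google.cloud import firestore

# Read back one of the stored documents to see what add_images wrote
client = firestore.Client(project=PROJECT_ID)
doc = client.collection(COLLECTION_NAME).document("landmark1.png").get()
print(doc.to_dict().keys())  # e.g. content, embedding, metadata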

Afterwards, you can perform a similarity search with a text query:

vector_store.similarity_search("stadium", k=3)

You can also perform a similarity search with an image. In that case, the image is first embedded and the resulting embedding is then used to retrieve similar images:

image_path = "../images/landmark4.png"
vector_store.similarity_search_image(image_path, k=3)
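
Both search methods return LangChain Document objects, so handling the results is straightforward. A small sketch of what that might look like:

# Each result is a LangChain Document: page_content holds the stored
# content field and metadata holds the stored metadata field
results = vector_store.similarity_search_image(image_path, k=3)
for doc in results:
    print(doc.page_content, doc.metadata)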

You can play with these examples in the FirestoreVectorStore notebook.

Sometimes, it’s useful to store base64-encoded images as content in the Firestore document for easy retrieval. In that case, you can set the store_encodings flag to True:

vector_store.add_images(image_paths, ids=ids, store_encodings=True)

Note that Firestore documents have a 1 MiB (1,048,576 bytes) size limit (see the Firestore documentation for details), and you might run into it if you embed large images and store their base64 encodings.
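
If you want to guard against that limit, you can estimate the encoded size before opting in. A rough sketch for local files (the exact per-document overhead varies, so the headroom factor is just an assumption):

import os

MAX_DOC_BYTES = 1_048_576  # Firestore's 1 MiB document size limit

def fits_in_document(image_path: str) -> bool:
    # base64 inflates size by roughly 4/3, and the embedding and
    # metadata fields also count toward the document limit
    encoded_size = (os.path.getsize(image_path) * 4) // 3
    return encoded_size < MAX_DOC_BYTES * 0.9  # leave some headroom

if fits_in_document("./images/landmark2.png"):
    vector_store.add_images(["./images/landmark2.png"], store_encodings=True)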

Sample: Image embedding storage and retrieval

Let’s see a sample of how to store and retrieve image embeddings with FirestoreVectorStore. The full source code is in main.py.
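
The script wires the pieces from the previous section behind a few command-line flags. Here is a simplified sketch of its overall shape; the flag names match the commands below, but the client and store_encodings details are assumptions on my part, and the real script also displays the retrieved images:

import argparse

from google.cloud import firestore
from langchain_google_firestore import FirestoreVectorStore
from langchain_google_vertexai import VertexAIEmbeddings

def main():
    parser = argparse.ArgumentParser(description="Store and search image embeddings")
    parser.add_argument("--project_id", required=True)
    parser.add_argument("--image_paths", nargs="*", default=[])
    parser.add_argument("--search_by_keyword")
    parser.add_argument("--search_by_image")
    args = parser.parse_args()

    embedding = VertexAIEmbeddings(
        model_name="multimodalembedding",
        project=args.project_id,
        location="us-central1",
    )
    # Point the vector store at the non-default "image-database" database
    client = firestore.Client(project=args.project_id, database="image-database")
    vector_store = FirestoreVectorStore(
        collection="ImageCollection",
        embedding_service=embedding,
        client=client,
    )

    if args.image_paths:
        vector_store.add_images(args.image_paths, store_encodings=True)
    if args.search_by_keyword:
        print(vector_store.similarity_search(args.search_by_keyword, k=1))
    if args.search_by_image:
        print(vector_store.similarity_search_image(args.search_by_image, k=1))

if __name__ == "__main__":
    main()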

Firestore database and index

First, create a Firestore database for image embeddings:

gcloud firestore databases create --database image-database --location=europe-west1

Next, create the Firestore index that retrieval needs. If you forget this step, your first query will fail with an error that includes the exact command you need to run to create the index:

gcloud firestore indexes composite create --project=your-project-id \
 --database="image-database" --collection-group=ImageCollection --query-scope=COLLECTION \
 --field-config=vector-config='{"dimension":"1408","flat": "{}"}',field-path=embedding
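
The dimension of 1408 is not arbitrary: it matches the size of the vectors produced by the multimodalembedding model.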

Add image embeddings

Let’s add some images.

Add landmark1.png from a Cloud Storage URL:

python main.py --project_id=genai-atamel --image_paths gs://genai-atamel-firestore-images/landmark1.png

Add landmark2.png and landmark3.png from a local folder:

python main.py --project_id=genai-atamel --image_paths ../images/landmark2.png ../images/landmark3.png

Add another image from a URL:

python main.py --project_id=genai-atamel --image_paths https://atamel.dev/img/mete-512.jpg

At this point, you should see images and their embeddings saved to Firestore:

Firestore with images

Retrieve images

Now, you can retrieve and display images with a keyword using the similarity_search method. For example, you can retrieve by the keyword stadium:

python main.py --project_id=genai-atamel --search_by_keyword="stadium"

You get back the picture of the Colosseum:

Colosseum
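
For reference, displaying a retrieved image boils down to decoding its stored content. A minimal sketch, assuming the images were added with store_encodings=True (so page_content holds the base64 encoding) and that Pillow is installed:

import base64
import io

from PIL import Image

# Decode the top match's stored base64 content and display it
results = vector_store.similarity_search("stadium", k=1)
image_bytes = base64.b64decode(results[0].page_content)
Image.open(io.BytesIO(image_bytes)).show()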

You can also search similar images using the similarity_search_image method. For example, you can pass in landmark4.png, a picture of a temple:

python main.py --project_id=genai-atamel --search_by_image="../images/landmark4.png"

You get back another temple as the most similar image:

Temple

That’s cool!

Conclusion

In this post, you learned how to use LangChain and Firestore to store and retrieve image embeddings with a multimodal embedding model. This allows LLMs to answer questions grounded not only in text but also in images. Try out the new image embedding support in FirestoreVectorStore and let us know what you think!
