DeepEval and Vertex AI
Introduction

When you’re working with Large Language Models (LLMs), it’s crucial to have an evaluation framework in place. Only by constantly evaluating and testing your LLM outputs can you tell whether the changes you make to your prompts, or the outputs you get back from the LLM, are actually improvements.
In this blog post, we’ll look into one of those evaluation frameworks, DeepEval, an open-source evaluation framework for LLMs. It allows you to “unit test” LLM outputs in a similar way to Pytest.
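To give a flavour of what that looks like, here is a minimal sketch of a Pytest-style DeepEval test. It assumes the deepeval package is installed; the metric choice, threshold, and example strings are illustrative, and the metric needs an LLM judge configured (by default an OpenAI key) to actually score the output.

```python
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase


def test_answer_relevancy():
    # Metric that scores how relevant the answer is to the question;
    # the test fails if the score drops below the threshold.
    metric = AnswerRelevancyMetric(threshold=0.7)

    # A test case wraps the prompt and the LLM's actual output.
    test_case = LLMTestCase(
        input="What is the capital of France?",
        actual_output="The capital of France is Paris.",
    )

    # assert_test raises (and fails the Pytest run) if any metric fails.
    assert_test(test_case, [metric])
```

Run it like any other test suite, e.g. with `deepeval test run test_example.py` or plain `pytest`.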