Tracing with Langtrace and Gemini


Large Language Models (LLMs) feel like a totally new technology with totally new problems. That’s true to some extent, but at the same time, they come with the same old problems we had to tackle in traditional software.

For example, how do you figure out which LLM calls are taking too long or failing? At the bare minimum, you need logging, but ideally you use a full observability framework like OpenTelemetry with logging, tracing, metrics, and more. You need the good old software engineering practices, such as observability, applied to new technologies like LLMs.

In this post, I’ll talk about one part of observability, tracing, and show you how you can trace your LLM calls in an OpenTelemetry-compliant way using Langtrace.

Introduction to Langtrace

Langtrace is an open-source observability tool that collects and analyzes traces to help you improve your LLM apps. It has an SDK to collect traces from LLM APIs, vector databases, and LLM frameworks. The traces are OpenTelemetry-compatible and can be exported to Langtrace or any other observability stack (Grafana, Datadog, Honeycomb, etc.). There’s also a web-based Langtrace Dashboard where you can view and analyze your traces.

Langtrace
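Since the traces are standard OpenTelemetry spans, you can also mirror them to your console while developing, before wiring up a full backend. Here’s a minimal sketch; write_spans_to_console is my assumption about the SDK’s init options, so check the docs for your version:

import os
from langtrace_python_sdk import langtrace

# Send traces to Langtrace and also echo each span to stdout for local debugging.
# write_spans_to_console is assumed here; verify it against your SDK version.
langtrace.init(
    api_key=os.environ["LANGTRACE_API_KEY"],
    write_spans_to_console=True,
)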

Let’s take a look at how to trace with Langtrace and Gemini on Google AI and Vertex AI. All the code is in main.py.

Setup

First, sign up for Langtrace and create a project:

Langtrace: create a project

Then, create an API key:

Langtrace: create an API key

Set it as an environment variable:

export LANGTRACE_API_KEY=your-langtrace-api-key

It’s also a good idea to create a Python virtual environment:

python -m venv .venv
source .venv/bin/activate

Install Langtrace:

pip install langtrace-python-sdk

Langtrace and Gemini on Google AI

Let’s now look at how to trace LLM calls with Langtrace and Gemini running on Google AI.

First, get an API key for Gemini and set it as an environment variable:

export GEMINI_API_KEY=your-gemini-api-key

Install the Google AI Python SDK for Gemini:

pip install google-generativeai

Now, you can initialize Langtrace and Gemini:

import os
from langtrace_python_sdk import langtrace  # Must precede any LLM module imports
import google.generativeai as genai

langtrace.init(api_key=os.environ["LANGTRACE_API_KEY"])

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

And generate some content with Gemini on Google AI:

def generate_googleai_1():
    response = model.generate_content("What is Generative AI?")
    print(response.text)

    response = model.generate_content("Why is sky blue?")
    print(response.text)
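
Before running it: main.py invokes these functions by name from the command line, which needs a small dispatcher. The actual main.py may differ, but a minimal sys.argv-based sketch could look like this:

import sys

if __name__ == "__main__":
    # Call the function whose name was passed on the command line,
    # e.g. `python main.py generate_googleai_1`.
    globals()[sys.argv[1]]()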

Run it:

python main.py generate_googleai_1

In a few seconds, you’ll see traces for two LLM calls:

Langtrace dashboard

You can also see more details within each trace:

Langtrace trace details

Grouping traces

In the previous example, the two LLM calls showed up as two separate traces. Sometimes, it’s useful to group related calls into a single trace. You can do that with the @with_langtrace_root_span decorator:

from langtrace_python_sdk import with_langtrace_root_span

@with_langtrace_root_span("generate_googleai")
def generate_googleai_2():
    response = model.generate_content("What is Generative AI?")
    print(response.text)

    response = model.generate_content("Why is sky blue?")
    print(response.text)

Run it:

python main.py generate_googleai_2

You’ll now see traces grouped together:

Langtrace dashboard

You can also see more details within each trace:

Langtrace trace details
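
By the way, since Langtrace registers a standard OpenTelemetry tracer provider, you could also create the root span yourself with the plain OpenTelemetry API instead of the decorator. A sketch, assuming the instrumented Gemini calls pick up the current span as their parent:

from opentelemetry import trace

# Reuses the `model` defined earlier for Google AI.
tracer = trace.get_tracer(__name__)

def generate_googleai_3():
    # LLM spans started inside this block become children of "generate_googleai".
    with tracer.start_as_current_span("generate_googleai"):
        response = model.generate_content("What is Generative AI?")
        print(response.text)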

Langtrace and Gemini on Vertex AI

You can do everything I explained with Gemini on Vertex AI as well. Let’s take a quick look at how.

Make sure gcloud is set up with your Google Cloud project and the project ID is set as an environment variable:

gcloud config set core/project your-google-cloud-project-id
export GOOGLE_CLOUD_PROJECT_ID=your-google-cloud-project-id

Make sure you’re logged in:

gcloud auth application-default login

Install the Vertex AI Python SDK for Gemini:

pip install google-cloud-aiplatform

Now, generate some content with Gemini on Vertex AI:

import vertexai  # Like google.generativeai, import after langtrace so calls get traced
from vertexai.generative_models import GenerativeModel

@with_langtrace_root_span("generate_vertexai")
def generate_vertexai():
    vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT_ID"], location="us-central1")
    model = GenerativeModel("gemini-1.5-flash-002")

    response = model.generate_content("What is Generative AI?")
    print(response.text)

    response = model.generate_content("Why is sky blue?")
    print(response.text)

Run it:

python main.py generate_vertexai

In a few seconds, you’ll see traces for the two LLM calls:

Langtrace dashboard

Nice!

Metrics

Last but not least, after sending traces, if you switch to the Metrics tab, you can see metrics on token counts, costs, and more:

Langtrace metrics

Conclusion

Langtrace is quite useful for tracing your LLM calls and getting some basic metrics. It can also be used to manage prompts and run evaluations, which I might talk about in a future blog post.

Here are some links for further reading:


See also