📓 Add Dataframe Quickstart
If your application was run (and logged) outside of TruLens, TruVirtual can be used to ingest and evaluate the logs.
This notebook walks through how to quickly log a dataframe of prompts, responses and contexts (optional) to TruLens as traces, and how to run evaluations with the trace data.
# !pip install trulens trulens-providers-openai openai
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
Create or load a dataframe
The dataframe should minimally include columns named query and response. You can also include a column named contexts if you wish to evaluate retrieval systems or RAGs.
import pandas as pd
data = {
    "query": ["Where is Germany?", "What is the capital of France?"],
    "response": ["Germany is in Europe", "The capital of France is Paris"],
    "contexts": [
        ["Germany is a country located in Europe."],
        [
            "France is a country in Europe and its capital is Paris.",
            "Germany is a country located in Europe",
        ],
    ],
}
df = pd.DataFrame(data)
df.head()
Create a virtual app for tracking purposes.
This can be initialized simply, or you can track application metadata by passing a dict to VirtualApp(). For simplicity, we'll leave it empty here.
from trulens.apps.virtual import VirtualApp
virtual_app = VirtualApp()
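If you do want to track application metadata, you can pass a dict when constructing the app instead. The sketch below assumes the dict-style initialization shown in the TruVirtual documentation; the keys and values are purely illustrative.
# Optional: attach application metadata for tracking (keys here are illustrative, not required fields).
virtual_app_with_metadata = VirtualApp({
    "llm": {"model": "gpt-4o"},
    "template": "Description of the prompt template used by the original app.",
})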
Next, let's define feedback functions.
The add_dataframe method we plan to use will load the prompt, context, and response into virtual records. We should define our feedback functions to access this data in the structure in which it will be stored. We can do so as follows:
- prompt: selected using .on_input()
- response: selected using .on_output()
- context: selected using VirtualApp.select_context()
from trulens.core import Feedback
from trulens.providers.openai import OpenAI
# Initialize provider class
provider = OpenAI()
# Select context to be used in feedback.
context = VirtualApp.select_context()
# Question/statement relevance between question and each context chunk.
f_context_relevance = (
    Feedback(
        provider.context_relevance_with_cot_reasons, name="Context Relevance"
    )
    .on_input()
    .on(context)
)
# Define a groundedness feedback function
f_groundedness = (
    Feedback(
        provider.groundedness_measure_with_cot_reasons, name="Groundedness"
    )
    .on(context.collect())
    .on_output()
)
# Question/answer relevance between overall question and answer.
f_qa_relevance = Feedback(
    provider.relevance_with_cot_reasons, name="Answer Relevance"
).on_input_output()
Start a TruLens logging session
from trulens.core import TruSession
from trulens.dashboard import run_dashboard
session = TruSession()
run_dashboard(session)
Register the virtual app
We can now register our virtual app, including any feedback functions we'd like to use for evaluation.
from trulens.apps.virtual import TruVirtual
virtual_recorder = TruVirtual(
    app_name="RAG",
    app_version="simple",
    app=virtual_app,
    feedbacks=[f_context_relevance, f_groundedness, f_qa_relevance],
)
Add the dataframe to TruLens
We can then add the dataframe to TruLens using the virtual recorder's add_dataframe method. Doing so will immediately log the traces and kick off the computation of evaluations. After some time, the evaluation results will be accessible both from the SDK (e.g., session.get_leaderboard) and in the TruLens dashboard.
If you wish to skip evaluations and only log traces, you can simply skip the sections of this notebook where feedback functions are defined and exclude them from the construction of the virtual_recorder, as sketched below.
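A minimal log-only recorder might look like the following; the app_version label is illustrative.
# Omitting the `feedbacks` argument logs traces without running any evaluations.
logging_only_recorder = TruVirtual(
    app_name="RAG",
    app_version="logs_only",
    app=virtual_app,
)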
virtual_records = virtual_recorder.add_dataframe(df)
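Once the feedback computations have finished, you can pull aggregate results from the SDK. The call below is a sketch assuming the standard TruSession leaderboard API.
# View aggregate evaluation results for this app (also visible in the dashboard).
session.get_leaderboard(app_ids=[virtual_recorder.app_id])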