Viewing Results
TruLens provides a broad set of capabilities for evaluating and tracking applications. In addition, TruLens ships with native tools for examining traces and evaluations, in the form of a complete dashboard and components that can be added to Streamlit apps.
TruLens Dashboard
To view and examine application logs and feedback results, TruLens provides a built-in Streamlit dashboard. The dashboard has two pages: the Leaderboard, which displays aggregate feedback results and metadata for each application version, and the Evaluations page, where you can examine individual traces and feedback results more closely. The dashboard is launched with run_dashboard, and it reads from the database URL you specify when creating TruSession().
Launch the TruLens dashboard
from trulens.core import TruSession
from trulens.dashboard import run_dashboard

session = TruSession(database_url=...)  # defaults to default.sqlite if no URL is given
run_dashboard(session)
By default, the dashboard finds an unused port to run on, but you can also specify a port explicitly. In either case, the function outputs a link to the running dashboard.
Specify a port
from trulens.dashboard import run_dashboard
run_dashboard(port=8502)
Note
If you are running in Google Colab, run_dashboard() will output a link to a tunnel website along with an IP address; enter that IP address into the tunnel website to reach the dashboard.
Streamlit Components
In addition to the complete dashboard, several of the dashboard components can be used on their own and added to existing Streamlit dashboards.
Streamlit is an easy way to turn Python scripts into shareable web applications, and has become a popular way to interact with generative AI technology. Several TruLens UI components can now be added to Streamlit dashboards via the TruLens Streamlit module.
Consider the app.py below, which consists of a simple RAG application that is already logged and evaluated with TruLens. Notice in particular that we capture both the application's response and its record.
Simple Streamlit app with TruLens
import streamlit as st
from trulens.core import TruSession
from base import rag      # a RAG app with a query method
from base import tru_rag  # the same app wrapped by TruLens

session = TruSession()

def generate_and_log_response(input_text):
    # Invoke the app inside a recording context so TruLens logs the call.
    with tru_rag as recording:
        response = rag.query(input_text)
        record = recording.get()
    return record, response

with st.form("my_form"):
    text = st.text_area("Enter text:", "How do I launch a streamlit app?")
    submitted = st.form_submit_button("Submit")
    if submitted:
        record, response = generate_and_log_response(text)
        st.info(response)
With the record in hand, we can easily add TruLens components that display its evaluation results using trulens_feedback. This renders the feedback results as clickable pills as the feedback becomes available.
Display feedback results
from trulens.dashboard import streamlit as trulens_st

if submitted:
    trulens_st.trulens_feedback(record=record)
In addition to the feedback results, we can also display the record's trace to help with debugging, using trulens_trace from the TruLens Streamlit module.
Display the trace
from trulens.dashboard import streamlit as trulens_st

if submitted:
    trulens_st.trulens_trace(record=record)
Lastly, we can display the TruLens leaderboard using render_leaderboard from the TruLens Streamlit module, to understand aggregate performance across application versions.
Display the application leaderboard
from trulens.dashboard.Leaderboard import render_leaderboard
render_leaderboard()
In combination, these Streamlit components let you make evaluation front-and-center in your app. This is particularly useful for developer playground use cases, or to assure users of your app's reliability.
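As a minimal sketch of how the pieces above might fit together, the app below reuses the hypothetical base module from the earlier example and renders the answer, its feedback pills, and its trace in one place. It is a sketch under those assumptions, not a prescribed layout.
A combined playground app
import streamlit as st
from trulens.core import TruSession
from trulens.dashboard import streamlit as trulens_st
from base import rag, tru_rag  # the hypothetical example app from above

session = TruSession()

with st.form("playground"):
    text = st.text_area("Enter text:", "How do I launch a streamlit app?")
    submitted = st.form_submit_button("Submit")
    if submitted:
        # Run the app inside a recording context and capture its record.
        with tru_rag as recording:
            response = rag.query(text)
            record = recording.get()

        # Show the answer alongside its evaluation results and trace.
        st.info(response)
        trulens_st.trulens_feedback(record=record)
        trulens_st.trulens_trace(record=record)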