🦙 LlamaIndex Integration¶
TruLens provides TruLlama, a deep integration with LlamaIndex to allow you to inspect and evaluate the internals of your application built using LlamaIndex. This is done through the instrumentation of key LlamaIndex classes and methods. To see all classes and methods instrumented, see Appendix: LlamaIndex Instrumented Classes and Methods.
In addition to the default instrumentation, TruLlama exposes the select_context and select_source_nodes methods for evaluations that require access to retrieved context or source nodes. These methods remove the need to know the JSON structure of your app ahead of time, and make your evaluations reusable across different apps.
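For example, both selectors return a lens into the app's record that you can pass to Feedback.on. Below is a minimal sketch, assuming a query_engine like the one built in the example that follows:
from trulens.apps.llamaindex import TruLlama

# Lens pointing at the text of each retrieved context chunk.
context = TruLlama.select_context(query_engine)

# Lens pointing at the full source nodes (text plus metadata such as scores).
source_nodes = TruLlama.select_source_nodes()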
Example usage¶
Below is a quick example of usage. First, we'll create a standard LlamaIndex query engine from Paul Graham's essay, What I Worked On.
Create a Llama-Index Query Engine
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
To instrument a Llama-Index query engine, all that's required is to wrap it using TruLlama.
Instrument a Llama-Index Query Engine
from trulens.apps.llamaindex import TruLlama
tru_query_engine_recorder = TruLlama(query_engine)
with tru_query_engine_recorder as recording:
    print(query_engine.query("What did the author do growing up?"))
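The recording context captures a record for each call made inside the with block. As a minimal sketch of inspecting what was recorded (assuming the code above has run):
from trulens.core import TruSession

session = TruSession()

# Retrieve the record captured by the `with` block above.
record = recording.get()
print(record.app_id)

# View aggregate results across all recorded apps.
print(session.get_leaderboard())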
To properly evaluate LLM apps we often need to point our evaluation at an internal step of our application, such as the retrieved context. Doing so allows us to evaluate for metrics including context relevance and groundedness.
For LlamaIndex applications where the source nodes are used, select_context can be used to access the retrieved text for evaluation.
Evaluating retrieved context for Llama-Index query engines
import numpy as np
from trulens.core import Feedback
from trulens.providers.openai import OpenAI
provider = OpenAI()
context = TruLlama.select_context(query_engine)
f_context_relevance = (
    Feedback(provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
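Once defined, the feedback function can be registered with the recorder so it runs on every recorded call. A minimal sketch; app_name and app_version are illustrative values:
tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",  # illustrative name
    app_version="base",  # illustrative version
    feedbacks=[f_context_relevance],
)

with tru_query_engine_recorder as recording:
    query_engine.query("What did the author do growing up?")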
You can find the full quickstart here: Llama-Index Quickstart.
Async Support¶
TruLlama also provides async support for LlamaIndex through the aquery, achat, and astream_chat methods. This allows you to track and evaluate async applications.
As an example, below is a LlamaIndex async chat engine (achat).
Instrument an async Llama-Index app
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from trulens.apps.llamaindex import TruLlama
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()
tru_chat_recorder = TruLlama(chat_engine)
with tru_chat_recorder as recording:
    llm_response_async = await chat_engine.achat(
        "What did the author do growing up?"
    )
    print(llm_response_async)
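The aquery method on query engines is tracked the same way. A minimal sketch, assuming the query_engine built earlier:
tru_query_engine_recorder = TruLlama(query_engine)

with tru_query_engine_recorder as recording:
    response = await query_engine.aquery("What did the author do growing up?")
    print(response)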
Streaming Support¶
TruLlama also provides streaming support for LlamaIndex. This allows you to track and evaluate streaming applications.
As an example, below is a LlamaIndex chat engine with streaming.
Create a streaming Llama-Index chat engine
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine(streaming=True)
Just like with other methods, wrap your streaming chat engine with TruLlama and use it as before.
You can also print the response tokens as they are generated using the response_gen attribute.
Instrument a streaming Llama-Index app
tru_chat_engine_recorder = TruLlama(chat_engine)
with tru_chat_engine_recorder as recording:
    response = chat_engine.stream_chat("What did the author do growing up?")
    for c in response.response_gen:
        print(c)
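Async streaming is handled through astream_chat. A minimal sketch; it assumes LlamaIndex's async token generator, async_response_gen:
tru_chat_engine_recorder = TruLlama(chat_engine)

with tru_chat_engine_recorder as recording:
    response = await chat_engine.astream_chat("What did the author do growing up?")
    async for token in response.async_response_gen():
        print(token, end="")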
For examples of using TruLlama, check out the TruLens Cookbook.
Appendix: LlamaIndex Instrumented Classes and Methods¶
The modules, classes, and methods that TruLens instruments can be retrieved from the appropriate Instrument subclass.
Example
from trulens.apps.llamaindex import LlamaInstrument
LlamaInstrument().print_instrumentation()
Instrumenting other classes/methods¶
Additional classes and methods can be instrumented using the trulens.core.instruments.Instrument methods and decorators. Examples of such usage can be found in the custom app used in the custom_example.ipynb notebook, which can be found in examples/expositional/end2end_apps/custom_app/custom_app.py. More information about these decorators can be found in the docs/trulens/tracking/instrumentation/index.ipynb notebook.
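As a rough sketch of the decorator approach, assuming the instrument decorator from trulens.apps.custom (MyRAG is a hypothetical class for illustration, not part of the examples above):
from trulens.apps.custom import TruCustomApp, instrument

class MyRAG:
    # Hypothetical app class for illustration.
    @instrument
    def retrieve(self, query: str) -> list:
        # ... your retrieval logic here ...
        return []

    @instrument
    def respond(self, query: str) -> str:
        chunks = self.retrieve(query)
        # ... your generation logic here ...
        return "answer"

rag = MyRAG()
tru_rag = TruCustomApp(rag)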
Inspecting instrumentation¶
The specific objects (of the above classes) and methods instrumented for a particular app can be inspected using the App.print_instrumented method, as exemplified in the next cell. Unlike Instrument.print_instrumentation, this function only shows what in an app was actually instrumented.
Example
tru_chat_engine_recorder.print_instrumented()