trulens.apps.virtual¶
Virtual Apps¶
This module facilitates the ingestion and evaluation of application logs that were generated outside of TruLens. It allows for the creation of a virtual representation of your application, enabling the evaluation of logged data within the TruLens framework.
To begin, construct a virtual application representation, either as a plain dictionary or with the VirtualApp class, which offers a more structured way to store the application information relevant for feedback evaluation.
Constructing a Virtual Application
virtual_app = {
    'llm': {'modelname': 'some llm component model name'},
    'template': 'information about the template used in the app',
    'debug': 'optional fields for additional debugging information'
}
# Converting the dictionary to a VirtualApp instance
from trulens.core import Select
from trulens.apps.virtual import VirtualApp
virtual_app = VirtualApp(virtual_app)
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024
Incorporate components into the virtual app for evaluation by using the Select class. This approach lets you reuse the same setup when defining feedback functions.
Incorporating Components into the Virtual App
# Setting up a virtual app with a retriever component
from trulens.core import Select
retriever_component = Select.RecordCalls.retriever
virtual_app[retriever_component] = 'this is the retriever component'
With the virtual app configured, it is ready to store logged data. VirtualRecord offers a structured way to build records from your data for ingestion into TruLens. It differs from direct Record creation in that calls are specified through selectors.
Below is an example of adding records for a context retrieval component, emphasizing that only the data intended for tracking or evaluation needs to be provided.
Adding Records for a Context Retrieval Component
from trulens.apps.virtual import VirtualRecord
# Selector for the context retrieval component's `get_context` call
context_call = retriever_component.get_context
# Creating virtual records
rec1 = VirtualRecord(
    main_input='Where is Germany?',
    main_output='Germany is in Europe',
    calls={
        context_call: {
            'args': ['Where is Germany?'],
            'rets': ['Germany is a country located in Europe.']
        }
    }
)
rec2 = VirtualRecord(
    main_input='Where is Germany?',
    main_output='Poland is in Europe',
    calls={
        context_call: {
            'args': ['Where is Germany?'],
            'rets': ['Poland is a country located in Europe.']
        }
    }
)
data = [rec1, rec2]
For existing datasets, such as a dataframe of prompts, contexts, and responses, iterate through the dataframe to create virtual records for each entry.
Creating Virtual Records from a DataFrame
import pandas as pd
# Example dataframe
data = {
    'prompt': ['Where is Germany?', 'What is the capital of France?'],
    'response': ['Germany is in Europe', 'The capital of France is Paris'],
    'context': [
        'Germany is a country located in Europe.',
        'France is a country in Europe and its capital is Paris.'
    ]
}
df = pd.DataFrame(data)
# Ingesting data from the dataframe into virtual records
data_dict = df.to_dict('records')
data = []
for record in data_dict:
    rec = VirtualRecord(
        main_input=record['prompt'],
        main_output=record['response'],
        calls={
            context_call: {
                'args': [record['prompt']],
                'rets': [record['context']]
            }
        }
    )
    data.append(rec)
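If pandas is not available, the same row-to-record mapping can be sketched with the standard library alone. The helper `row_to_record_kwargs` and the string key `CONTEXT_CALL_KEY` below are hypothetical and not part of TruLens: the helper only builds the keyword arguments that the loop above passes to VirtualRecord, and a plain string stands in for the `context_call` selector.

```python
import csv
import io

# Hypothetical stand-in for the `context_call` selector; real code would
# use the selector object itself as the dictionary key.
CONTEXT_CALL_KEY = "retriever.get_context"


def row_to_record_kwargs(row, call_key=CONTEXT_CALL_KEY):
    """Build the keyword arguments for one virtual record from one data row."""
    return {
        "main_input": row["prompt"],
        "main_output": row["response"],
        "calls": {
            call_key: {
                "args": [row["prompt"]],
                "rets": [row["context"]],
            }
        },
    }


# A small in-memory CSV with the same columns as the dataframe example.
raw = io.StringIO(
    "prompt,response,context\n"
    "Where is Germany?,Germany is in Europe,"
    "Germany is a country located in Europe.\n"
)
record_kwargs = [row_to_record_kwargs(row) for row in csv.DictReader(raw)]
```

Each resulting dictionary could then be unpacked into VirtualRecord once the string key is swapped for the real selector.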
After constructing the virtual records, feedback functions can be developed in the same manner as with non-virtual applications, using the newly added context_call selector for reference. The same process can be repeated for any additional selector you add.
Developing Feedback Functions
from trulens.providers.openai import OpenAI
from trulens.core.feedback.feedback import Feedback
# Initializing the feedback provider
openai = OpenAI()
# Defining the context for feedback using the virtual `get_context` call
context = context_call.rets[:]
# Creating a feedback function for context relevance
f_context_relevance = Feedback(openai.context_relevance).on_input().on(context)
These feedback functions are then passed to TruVirtual to construct the recorder, which can handle most configurations applicable to non-virtual apps.
Integrating Feedback Functions into TruVirtual
from trulens.apps.virtual import TruVirtual
# Setting up the virtual recorder
virtual_recorder = TruVirtual(
    app_name='a virtual app',
    app_version='base',
    app=virtual_app,
    feedbacks=[f_context_relevance]
)
To process the records and run any feedback functions associated with the recorder, use the add_record method.
Example: "Logging records and running feedback functions"
```python
# Ingesting records into the virtual recorder
for record in data:
    virtual_recorder.add_record(record)
```
Metadata about your application can also be included in the VirtualApp for evaluation purposes, offering a flexible way to store additional information about the components of an LLM app.
Storing metadata in a VirtualApp
# Example of storing metadata in a VirtualApp
virtual_app = {
    'llm': {'modelname': 'some llm component model name'},
    'template': 'information about the template used in the app',
    'debug': 'optional debugging information'
}
from trulens.core import Select
from trulens.apps.virtual import VirtualApp
virtual_app = VirtualApp(virtual_app)
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024
This approach is particularly beneficial for evaluating the components of an LLM app.
Evaluating components of an LLM application
# Adding a retriever component to the virtual app
retriever_component = Select.RecordCalls.retriever
virtual_app[retriever_component] = 'this is the retriever component'
Attributes¶
virtual_module
module-attribute
¶
virtual_module = Module(
    package_name="trulens",
    module_name="trulens.apps.virtual",
)
Module to represent the module of virtual apps.
Virtual apps will record this as their module.
virtual_class
module-attribute
¶
virtual_class = Class(
    module=virtual_module, name="VirtualApp"
)
Class to represent the class of virtual apps.
Virtual apps will record this as their class.
virtual_object
module-attribute
¶
virtual_object = Obj(cls=virtual_class, id=0)
Object to represent instances of virtual apps.
Virtual apps will record this as their instance.
virtual_method_root
module-attribute
¶
virtual_method_root = Method(
    cls=virtual_class, obj=virtual_object, name="root"
)
Method call to represent the root call of virtual apps.
Virtual apps will record this as their root call.
virtual_method_call
module-attribute
¶
virtual_method_call = Method(
    cls=virtual_class,
    obj=virtual_object,
    name="method_name_not_set",
)
Method call to represent virtual app calls that do not provide this information.
Method name will be replaced by the last attribute in the selector provided by user.
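The name substitution described above can be illustrated with a small sketch. It assumes, purely for illustration, that a selector is written as a dotted string; real selectors are Lens objects, and `method_name_from_selector` is a hypothetical helper rather than a TruLens function.

```python
def method_name_from_selector(
    selector_path: str, default: str = "method_name_not_set"
) -> str:
    """Return the last dotted component of a selector path, or the default."""
    last = selector_path.rsplit(".", 1)[-1]
    # An empty path has no last attribute, so keep the placeholder name.
    return last or default


name = method_name_from_selector("app.retriever.get_context")
fallback = method_name_from_selector("")
```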
Classes¶
VirtualApp
¶
Bases: dict
A dictionary meant to represent the components of a virtual app.
TruVirtual will refer to this class as the wrapped app. All calls will be under VirtualApp.root.
Functions¶
select_context
classmethod
¶
select_context()
Select the context of the virtual app. This is fixed to return the default path.
__setitem__
¶
Allow setitem to work on Lenses instead of just strings. Uses Lens.set if a lens is given.
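As a rough illustration of a dictionary whose item assignment accepts a path as well as a string key, consider the sketch below. `PathDict` is hypothetical and uses a tuple of keys as a stand-in for a Lens; it does not reproduce how Lens.set works internally.

```python
class PathDict(dict):
    """Hypothetical dict that accepts either a string key or a tuple path."""

    def __setitem__(self, key, value):
        if isinstance(key, tuple):
            # Walk (and create) nested dicts for all but the last step.
            node = self
            for step in key[:-1]:
                node = node.setdefault(step, {})
            # Use dict's own setter so nested plain dicts also work.
            dict.__setitem__(node, key[-1], value)
        else:
            super().__setitem__(key, value)


app = PathDict({"llm": {"modelname": "some llm component model name"}})
app[("llm", "maxtokens")] = 1024  # path-style assignment
app["template"] = "information about the template used in the app"
```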
root
¶
root()
All virtual calls will have this on top of the stack as if their app was called using this as the main/root method.
VirtualRecord
¶
Bases: Record
Virtual records for virtual apps.
Many arguments are filled in with default values if not provided. See Record for all arguments. Only the arguments that are required by this class or filled in with default values are listed here.
PARAMETER | DESCRIPTION |
---|---|
calls | A dictionary of calls to be recorded. The keys are selectors and the values are dictionaries with the keys listed in the next section. |
cost | Defaults to zero cost. |
perf | Defaults to time spanning the processing of this virtual record. Note that individual calls also include perf. Time span is extended to make sure it is not of duration zero. |
Call values are dictionaries containing arguments to the RecordAppCall constructor. A value can also be a list of such dictionaries, which corresponds to what happens in non-virtual apps when the same method is called several times in a single app invocation. The following defaults are used if not provided.
PARAMETER | TYPE | DEFAULT |
---|---|---|
stack | List[RecordAppCallMethod] | Two frames: a root call followed by a call by virtual_object, method name derived from the last element of the selector of this call. |
args | JSON | [] |
rets | JSON | [] |
perf | Perf | Time spanning the processing of this virtual call. |
pid | int | 0 |
tid | int | 0 |
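The defaulting behavior in the table can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `with_call_defaults` is a hypothetical helper, and the `stack` and `perf` defaults are left out because they depend on TruLens types.

```python
import copy

# Simple defaults taken from the table; `stack` and `perf` are omitted here.
CALL_DEFAULTS = {"args": [], "rets": [], "pid": 0, "tid": 0}


def with_call_defaults(call):
    """Return a copy of `call` (or of each call in a list) with defaults filled in."""
    if isinstance(call, list):
        # A selector may map to several calls of the same method.
        return [with_call_defaults(single) for single in call]
    filled = copy.deepcopy(CALL_DEFAULTS)
    filled.update(call)
    return filled


single = with_call_defaults({"args": ["Where is Germany?"]})
several = with_call_defaults([{"rets": ["first"]}, {"rets": ["second"]}])
```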
Attributes¶
cost
class-attribute
instance-attribute
¶
Costs associated with the record.
ts
class-attribute
instance-attribute
¶
Timestamp of last update.
This is usually set whenever a record is changed in any way.
main_input
class-attribute
instance-attribute
¶
The app's main input.
main_output
class-attribute
instance-attribute
¶
The app's main output if there was no error.
main_error
class-attribute
instance-attribute
¶
The app's main error if there was an error.
calls
class-attribute
instance-attribute
¶
calls: List[RecordAppCall] = []
The collection of calls recorded.
Note that these can be converted into a json structure with the same paths as the app that generated this record via layout_calls_as_app.
Invariant: calls are ordered by .perf.end_time.
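To make the ordering invariant concrete, the sketch below sorts stand-in call objects by their end time. `CallStub` is hypothetical and flattens `.perf.end_time` into a single `end_time` field; it is not the RecordAppCall class.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CallStub:
    """Hypothetical stand-in for a recorded call."""

    name: str
    end_time: datetime


start = datetime(2024, 1, 1)
calls = [
    CallStub("generate", start + timedelta(seconds=2)),
    CallStub("get_context", start + timedelta(seconds=1)),
]

# Restore the invariant: earliest-finishing call first.
ordered = sorted(calls, key=lambda call: call.end_time)
```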
experimental_otel_spans
class-attribute
instance-attribute
¶
EXPERIMENTAL(otel-tracing): OTEL spans representation of this record.
This will be filled in only if the otel-tracing experimental feature is enabled.
feedback_and_future_results
class-attribute
instance-attribute
¶
feedback_and_future_results: Optional[
List[Tuple[FeedbackDefinition, Future[FeedbackResult]]]
] = Field(None, exclude=True)
Map of feedbacks to the futures of their results.
These are only filled for records that were just produced. This will not be filled in when read from database. Also, will not fill in when using FeedbackMode.DEFERRED.
feedback_results
class-attribute
instance-attribute
¶
feedback_results: Optional[List[Future[FeedbackResult]]] = (
Field(None, exclude=True)
)
Only the futures part of the above for backwards compatibility.
feedback_results_as_completed
property
¶
feedback_results_as_completed: Iterable[FeedbackResult]
Generate feedback results as they are completed.
Wraps feedback_results in as_completed.
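The general pattern of yielding results in completion order can be sketched with the standard library's `concurrent.futures.as_completed`. The `score` function is a hypothetical stand-in for a feedback computation, and whether the property uses exactly this standard-library function is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def score(value):
    """Hypothetical stand-in for a feedback function."""
    return value * 2


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(score, value) for value in (1, 2, 3)]
    # as_completed yields each future as soon as it finishes, so the
    # arrival order may differ from the submission order.
    results = sorted(future.result() for future in as_completed(futures))
```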
Functions¶
wait_for_feedback_results
¶
wait_for_feedback_results(
feedback_timeout: Optional[float] = None,
) -> Dict[FeedbackDefinition, FeedbackResult]
Wait for feedback results to finish.
PARAMETER | DESCRIPTION |
---|---|
feedback_timeout | Timeout in seconds for each feedback function. If not given, will use the default timeout. |
RETURNS | DESCRIPTION |
---|---|
Dict[FeedbackDefinition, FeedbackResult] | A mapping of feedback functions to their results. |
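A per-feedback timeout, as opposed to one timeout for the whole wait, can be sketched with plain futures. `wait_for_results` and the feedback names used as keys are hypothetical; the real method is keyed by FeedbackDefinition objects.

```python
from concurrent.futures import ThreadPoolExecutor


def wait_for_results(named_futures, timeout=None):
    """Collect each future's result, applying `timeout` to each one separately."""
    return {
        name: future.result(timeout=timeout)
        for name, future in named_futures.items()
    }


with ThreadPoolExecutor(max_workers=2) as pool:
    pending = {
        "context_relevance": pool.submit(lambda: 0.9),
        "groundedness": pool.submit(lambda: 0.7),
    }
    finished = wait_for_results(pending, timeout=5.0)
```

Because the timeout applies to each future in turn, the whole call can block for longer than a single timeout value.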
get
¶
Get a value from the record using a path.
PARAMETER | DESCRIPTION |
---|---|
path
|
Path to the value.
TYPE:
|
layout_calls_as_app
¶
layout_calls_as_app() -> Munch
Layout the calls in this record into the structure that follows that of the app that created this record.
This uses the paths stored in each RecordAppCall which are paths into the app.
Note: We cannot create a validated AppDefinition class (or subclass) object here as the layout of records differs in these ways:
- Records do not include anything that is not an instrumented method and hence have most of the structure of an app missing.
- Records have RecordAppCall as their leaves where method definitions would be in the AppDefinition structure.
TruVirtual
¶
Bases: App
Recorder for virtual apps.
Virtual apps are data-only: they cannot be executed, but previously computed results can be added to them using add_record. The VirtualRecord class may be useful for creating records for this. Fields used by non-virtual apps can also be specified here; see App and AppDefinition for constructor arguments.
The app field¶
You can store any information you would like by passing in a dictionary to TruVirtual in the app field. This may involve an index of components or versions, or anything else. You can refer to these values for evaluating feedback.
Usage
You can use VirtualApp to create the app structure or a plain dictionary. Using VirtualApp lets you use Selectors to define components:
virtual_app = VirtualApp()
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024
Example
virtual_app = dict(
    llm=dict(
        modelname="some llm component model name"
    ),
    template="information about the template I used in my app",
    debug="all of these fields are completely optional"
)
virtual = TruVirtual(
    app_name="my_virtual_app",
    app_version="base",
    app=virtual_app
)
Attributes¶
tru_class_info
instance-attribute
¶
tru_class_info: Class
Class information of this pydantic object for use in deserialization.
Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.
app_id
class-attribute
instance-attribute
¶
Unique identifier for this app.
Computed deterministically from app_name and app_version. Leaving it here for it to be dumped when serializing. Also making it read-only as it should not be changed after creation.
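The idea of a deterministic identifier can be sketched with a hash of the name and version. The hashing scheme, the `app_` prefix, and the helper `deterministic_app_id` are all assumptions made for illustration; this page does not state how TruLens actually derives app_id.

```python
import hashlib


def deterministic_app_id(app_name: str, app_version: str) -> str:
    """Hypothetical id: the same name and version always give the same value."""
    key = f"{app_name}\n{app_version}".encode("utf-8")
    return "app_" + hashlib.sha256(key).hexdigest()[:16]


first = deterministic_app_id("a virtual app", "base")
second = deterministic_app_id("a virtual app", "base")
other = deterministic_app_id("a virtual app", "v2")
```

A practical consequence of any such scheme is that re-creating a recorder with the same name and version refers to the same app, while changing the version yields a distinct one.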
app_version
instance-attribute
¶
app_version: AppVersion
Version tag for this app. Default is "base".
feedback_definitions
class-attribute
instance-attribute
¶
feedback_definitions: Sequence[FeedbackDefinitionID] = []
Feedback functions to evaluate on each record.
feedback_mode
class-attribute
instance-attribute
¶
feedback_mode: FeedbackMode = WITH_APP_THREAD
How to evaluate feedback functions upon producing a record.
record_ingest_mode
instance-attribute
¶
record_ingest_mode: RecordIngestMode = record_ingest_mode
Mode of records ingestion.
initial_app_loader_dump
class-attribute
instance-attribute
¶
initial_app_loader_dump: Optional[SerialBytes] = None
Serialization of a function that loads an app.
Dump is of the initial app state before any invocations. This can be used to create a new session.
Warning
Experimental work in progress.
app_extra_json
instance-attribute
¶
app_extra_json: JSON
Info to store about the app and to display in dashboard.
This can be used even if app itself cannot be serialized. app_extra_json, then, can stand in place for whatever data the user might want to keep track of about the app.
feedbacks
class-attribute
instance-attribute
¶
Feedback functions to evaluate on each record.
session
class-attribute
instance-attribute
¶
session: TruSession = Field(
default_factory=TruSession, exclude=True
)
Session for this app.
recording_contexts
class-attribute
instance-attribute
¶
recording_contexts: ContextVar[_RecordingContext] = Field(
None, exclude=True
)
Sequences of records produced by this class when used as a context manager are stored in a RecordingContext.
Using a context var so that context managers can be nested.
instrumented_methods
class-attribute
instance-attribute
¶
instrumented_methods: Dict[int, Dict[Callable, Lens]] = (
Field(exclude=True, default_factory=dict)
)
Mapping of instrumented methods (by id(.) of owner object and the function) to their path in this app.
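The shape of this mapping can be illustrated with plain Python objects. `Retriever`, `path_for`, and the string path are hypothetical; the real mapping stores Lens objects as values.

```python
from typing import Callable, Dict, Optional


class Retriever:
    """Hypothetical component with one instrumented method."""

    def get_context(self, query):
        return [query]


retriever = Retriever()

# Outer key: id() of the owner object. Inner key: the function itself.
instrumented: Dict[int, Dict[Callable, str]] = {}
instrumented.setdefault(id(retriever), {})[Retriever.get_context] = (
    "app.retriever.get_context"
)


def path_for(obj, func) -> Optional[str]:
    """Look up the path recorded for `func` on `obj`, if any."""
    return instrumented.get(id(obj), {}).get(func)
```

Keying by `id(obj)` lets two instances of the same class map the same function to different paths.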
records_with_pending_feedback_results
class-attribute
instance-attribute
¶
records_with_pending_feedback_results: BlockingSet[
Record
] = Field(exclude=True, default_factory=BlockingSet)
Records produced by this app which might have yet to finish feedback runs.
manage_pending_feedback_results_thread
class-attribute
instance-attribute
¶
Thread for manager of pending feedback results queue.
See _manage_pending_feedback_results.
selector_check_warning
class-attribute
instance-attribute
¶
selector_check_warning: bool = False
Selector checking is disabled for virtual apps.
selector_nocheck
class-attribute
instance-attribute
¶
selector_nocheck: bool = True
The selector check must be disabled for virtual apps.
This is because methods that could be called are not known in advance of creating virtual records.
Functions¶
on_method_instrumented
¶
Called by instrumentation system for every function requested to be instrumented by this app.
get_method_path
¶
Get the path of the instrumented function method relative to this app.
wrap_lazy_values
¶
wrap_lazy_values(
rets: Any,
wrap: Callable[[T], T],
on_done: Callable[[T], T],
context_vars: Optional[ContextVarsOrValues],
) -> Any
Wrap any lazy values in the return value of a method call to invoke on_done when the value is ready.
This is used to handle library-specific lazy values that are hidden in containers not visible otherwise. Visible lazy values like iterators, generators, awaitables, and async generators are handled elsewhere.
PARAMETER | DESCRIPTION |
---|---|
rets | The return value of the method call. TYPE: Any |
wrap | A callback to be called when the lazy value is ready. Should return the input value or a wrapped version of it. TYPE: Callable[[T], T] |
on_done | Called when the lazy value is done and is no longer lazy, as opposed to a lazy value that evaluates to another lazy value. Should return the value or wrapper. TYPE: Callable[[T], T] |
context_vars | The contextvars to be captured by the lazy value. If not given, all contexts are captured. TYPE: Optional[ContextVarsOrValues] |
RETURNS | DESCRIPTION |
---|---|
Any | The return value with lazy values wrapped. |
get_methods_for_func
¶
Get the methods (rather the inner functions) matching the given func and the path of each.
on_new_record
¶
on_new_record(func) -> Iterable[_RecordingContext]
Called at the start of record creation.
on_add_record
¶
on_add_record(
ctx: _RecordingContext,
func: Callable,
sig: Signature,
bindings: BoundArguments,
ret: Any,
error: Any,
perf: Perf,
cost: Cost,
existing_record: Optional[Record] = None,
final: bool = False,
) -> Record
Called by instrumented methods if they use _new_record to construct a "record call list".
load
staticmethod
¶
load(obj, *args, **kwargs)
Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.
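Looking up a class from stored class information can be sketched with `importlib`. The dictionary layout with `module` and `name` keys and the helper `load_from_class_info` are assumptions for illustration; the actual structure of tru_class_info is not reproduced here.

```python
import importlib


def load_from_class_info(class_info, *args, **kwargs):
    """Import the recorded module, fetch the recorded class, and instantiate it."""
    module = importlib.import_module(class_info["module"])
    cls = getattr(module, class_info["name"])
    return cls(*args, **kwargs)


# A standard-library class is used so the sketch runs without TruLens.
info = {"module": "collections", "name": "OrderedDict"}
obj = load_from_class_info(info, [("a", 1)])
```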
model_validate
classmethod
¶
model_validate(*args, **kwargs) -> Any
Deserialize a jsonized version of the app into an instance of the class it was serialized from.
Note
This process uses extra information stored in the jsonized object and handled by WithClassInfo.
continue_session
staticmethod
¶
continue_session(
app_definition_json: JSON, app: Any
) -> AppDefinition
Instantiate the given app with the given state app_definition_json.
Warning
This is an experimental feature with ongoing work.
PARAMETER | DESCRIPTION |
---|---|
app_definition_json | The json serialized app. TYPE: JSON |
app | The app to continue the session with. TYPE: Any |
RETURNS | DESCRIPTION |
---|---|
AppDefinition
|
A new |
new_session
staticmethod
¶
new_session(
app_definition_json: JSON,
initial_app_loader: Optional[Callable] = None,
) -> AppDefinition
Create an app instance at the start of a session.
Warning
This is an experimental feature with ongoing work.
Create a copy of the json serialized app with the enclosed app being initialized to its initial state before any records are produced (i.e. blank memory).
get_loadable_apps
staticmethod
¶
get_loadable_apps()
Gets a list of all of the loadable apps.
Warning
This is an experimental feature with ongoing work.
These are the apps that have initial_app_loader_dump set.
wait_for_feedback_results
¶
Wait for all feedback functions to complete.
PARAMETER | DESCRIPTION |
---|---|
feedback_timeout | Timeout in seconds for waiting for feedback results for each feedback function. Note that this is not the total timeout for this entire blocking call. |
RETURNS | DESCRIPTION |
---|---|
List[Record] | A list of records that have been waited on. Note a record will be included even if a feedback computation for it failed or timed out. |
This applies to all feedbacks on all records produced by this app. This call will block until finished and if new records are produced while this is running, it will include them.
select_context
classmethod
¶
Try to find retriever components in the given app and return a lens to access the retrieved contexts that would appear in a record were these components to execute.
main_call
¶
If available, a single text to a single text invocation of this app.
main_acall
async
¶
If available, a single text to a single text invocation of this app.
main_input
¶
main_input(
func: Callable, sig: Signature, bindings: BoundArguments
) -> JSON
Determine (guess) the main input string for a main app call.
PARAMETER | DESCRIPTION |
---|---|
func | The main function we are targeting in this determination. TYPE: Callable |
sig | The signature of the above. TYPE: Signature |
bindings | The arguments to be passed to the function. TYPE: BoundArguments |
RETURNS | DESCRIPTION |
---|---|
JSON | The main input string. |
main_output
¶
main_output(
func: Callable,
sig: Signature,
bindings: BoundArguments,
ret: Any,
) -> JSON
Determine (guess) the "main output" string for a given main app call.
This is for functions whose output is not a string.
PARAMETER | DESCRIPTION |
---|---|
func | The main function whose main output we are guessing. TYPE: Callable |
sig | The signature of the above function. TYPE: Signature |
bindings | The arguments that were passed to that function. TYPE: BoundArguments |
ret | The return value of the function. TYPE: Any |
awith_
async
¶
awith_(
func: CallableMaybeAwaitable[A, T], *args, **kwargs
) -> T
Call the given async func with the given *args and **kwargs while recording, producing func results.
The record of the computation is available through other means like the database or dashboard. If you need a record of this execution immediately, you can use awith_record or the App as a context manager instead.
with_
async
¶
with_(func: Callable[[A], T], *args, **kwargs) -> T
Call the given async func with the given *args and **kwargs while recording, producing func results.
The record of the computation is available through other means like the database or dashboard. If you need a record of this execution immediately, you can use awith_record or the App as a context manager instead.
with_record
¶
with_record(
func: Callable[[A], T],
*args,
record_metadata: JSON = None,
**kwargs
) -> Tuple[T, Record]
Call the given func with the given *args and **kwargs, producing its results as well as a record of the execution.
awith_record
async
¶
awith_record(
func: Callable[[A], Awaitable[T]],
*args,
record_metadata: JSON = None,
**kwargs
) -> Tuple[T, Record]
Call the given func with the given *args and **kwargs, producing its results as well as a record of the execution.
dummy_record
¶
dummy_record(
cost: Cost = Cost(),
perf: Perf = now(),
ts: datetime = now(),
main_input: str = "main_input are strings.",
main_output: str = "main_output are strings.",
main_error: str = "main_error are strings.",
meta: Dict = {"metakey": "meta are dicts"},
tags: str = "tags are strings",
) -> Record
Create a dummy record with some of the expected structure without actually invoking the app.
The record is a guess of what an actual record might look like but will be missing information that can only be determined after a call is made.
All args are Record fields except these:
- `record_id` is generated using the default id naming schema.
- `app_id` is taken from this recorder.
- `calls` field is constructed based on instrumented methods.
instrumented
¶
instrumented() -> Iterable[Tuple[Lens, ComponentView]]
Iteration over instrumented components and their categories.
format_instrumented_methods
¶
format_instrumented_methods() -> str
Build a string containing a listing of instrumented methods.
print_instrumented_components
¶
print_instrumented_components() -> None
Print instrumented components and their categories.
__init__
¶
__init__(
app: Optional[Union[VirtualApp, JSON]] = None,
**kwargs: Any
)
Virtual app for logging existing app results.
add_record
¶
add_record(
record: Record,
feedback_mode: Optional[FeedbackMode] = None,
) -> Record
Add the given record to the database and evaluate any pre-specified feedbacks on it.
The class VirtualRecord may be useful for creating records for virtual models. If feedback_mode is specified, that mode is used for this record only.
add_dataframe
¶
add_dataframe(
df, feedback_mode: Optional[FeedbackMode] = None
) -> List[Record]
Add the given dataframe as records to the database and evaluate any pre-specified feedbacks on them.
The class VirtualRecord may be useful for creating records for virtual models. If feedback_mode is specified, that mode is used for these records only.