
trulens.benchmark.generate.generate_test_set


Classes

GenerateTestSet

This class is responsible for generating a test set using the provided application callable.

Functions
__init__
__init__(app_callable: Callable)

Initialize the GenerateTestSet class.

PARAMETER DESCRIPTION
app_callable

The application callable to be used for generating the test set.

TYPE: Callable
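
The callable just needs to accept a query and return a response. A minimal sketch of a stand-in callable (the function name and behavior here are illustrative, not part of TruLens):

```python
# Hypothetical stand-in for a RAG app: app_callable can be any function
# that accepts a query string and returns a response, e.g. rag_chain.invoke.
def my_rag_app(query: str) -> str:
    # A real application would retrieve context and call an LLM here.
    return f"Answer to: {query}"

# GenerateTestSet would then be constructed as:
# test = GenerateTestSet(app_callable=my_rag_app)
```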

_generate_themes
_generate_themes(test_breadth: int) -> str

Generates themes from the context available to a RAG application. These themes, which define the test breadth, are used as categories for test set generation.

PARAMETER DESCRIPTION
test_breadth

The breadth of the test.

TYPE: int

RETURNS DESCRIPTION
str

A string containing the generated themes, which `_format_themes` parses into a list of test categories.

TYPE: str

_format_themes
_format_themes(themes: str, test_breadth: int) -> list

Formats the themes into a Python list using an LLM.

PARAMETER DESCRIPTION
themes

The themes to be formatted.

TYPE: str

test_breadth

The breadth of the test.

TYPE: int

RETURNS DESCRIPTION
list

A list of formatted themes.

TYPE: list

_generate_test_prompts
_generate_test_prompts(
    test_category: str,
    test_depth: int,
    examples: Optional[list] = None,
) -> str

Generate raw test prompts for a given category, optionally using few-shot examples.

PARAMETER DESCRIPTION
test_category

The category for which to generate test prompts.

TYPE: str

test_depth

The depth of the test prompts.

TYPE: int

examples

An optional list of examples to guide the style of the questions.

TYPE: Optional[list] DEFAULT: None

RETURNS DESCRIPTION
str

A string containing test prompts.

TYPE: str

_format_test_prompts
_format_test_prompts(raw_test_prompts: str) -> list

Format the raw test prompts into a Python list using an LLM.

PARAMETER DESCRIPTION
raw_test_prompts

The raw test prompts to be formatted.

TYPE: str

RETURNS DESCRIPTION
list

A list of formatted test prompts.

TYPE: list

_generate_and_format_test_prompts
_generate_and_format_test_prompts(
    test_category: str,
    test_depth: int,
    examples: Optional[list] = None,
) -> list

Generate test prompts for a given category, optionally using few-shot examples.

PARAMETER DESCRIPTION
test_category

The category for which to generate test prompts.

TYPE: str

test_depth

The depth of the test prompts.

TYPE: int

examples

An optional list of examples to guide the style of the questions.

TYPE: Optional[list] DEFAULT: None

RETURNS DESCRIPTION
list

A list of test prompts.

TYPE: list

generate_test_set
generate_test_set(
    test_breadth: int,
    test_depth: int,
    examples: Optional[list] = None,
) -> dict

Generate a test set, optionally using the few-shot examples provided.

PARAMETER DESCRIPTION
test_breadth

The breadth of the test set.

TYPE: int

test_depth

The depth of the test set.

TYPE: int

examples

An optional list of examples to guide the style of the questions.

TYPE: Optional[list] DEFAULT: None

RETURNS DESCRIPTION
dict

A dictionary containing the test set.

TYPE: dict
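
As a sketch of the shape of the result (assuming, as the methods above suggest, that the dictionary maps each generated theme to a list of test prompts; the helper below is illustrative, not TruLens code):

```python
# Illustrative sketch of the returned structure: one key per theme
# (test_breadth themes), each mapping to a list of test_depth prompts.
def sketch_test_set(themes: list, test_depth: int) -> dict:
    return {
        theme: [f"prompt {i + 1} about {theme}" for i in range(test_depth)]
        for theme in themes
    }

# With test_breadth = 3 themes and test_depth = 2:
test_set = sketch_test_set(["planning", "reflection", "memory"], test_depth=2)
```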

Example
# Instantiate GenerateTestSet with your app callable, in this case: rag_chain.invoke
test = GenerateTestSet(app_callable=rag_chain.invoke)

# Generate a test set of the specified breadth and depth without examples
test_set = test.generate_test_set(test_breadth=3, test_depth=2)

# Generate a test set of the specified breadth and depth with few-shot examples
examples = [
    "Why is it hard for AI to plan very far into the future?",
    "How could letting AI reflect on what went wrong help it improve in the future?",
]
test_set_with_examples = test.generate_test_set(
    test_breadth=3, test_depth=2, examples=examples
)