trulens.providers.huggingface

Additional Dependency Required

To use this module, you must have the trulens-providers-huggingface package installed.

pip install trulens-providers-huggingface

Classes

Huggingface

Bases: HuggingfaceBase

Out of the box feedback functions calling Huggingface APIs.

Attributes
tru_class_info instance-attribute
tru_class_info: Class

Class information of this pydantic object for use in deserialization.

Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.

Functions
__rich_repr__
__rich_repr__() -> Result

Requirement for pretty printing using the rich package.

load staticmethod
load(obj, *args, **kwargs)

Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.
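
The class lookup described above can be pictured with a small standalone sketch. It does not use TruLens itself: the `class_info` and `fields` keys and the `load_with_class_info` helper are hypothetical names chosen for illustration, and the real layout of tru_class_info may differ.

```python
import importlib
from typing import Any, Dict


def load_with_class_info(obj: Dict[str, Any]) -> Any:
    """Rebuild an object from a dict that records which class produced it."""
    # Hypothetical key standing in for the stored class information.
    info = obj["class_info"]
    module = importlib.import_module(info["module"])
    cls = getattr(module, info["name"])
    # Delegate construction to the class that was looked up.
    return cls(**obj["fields"])


# A standard-library class is used so the sketch runs without TruLens.
serialized = {
    "class_info": {"module": "datetime", "name": "timedelta"},
    "fields": {"days": 1, "hours": 2},
}
restored = load_with_class_info(serialized)
print(type(restored).__name__, restored.total_seconds())
```

The point of the pattern is that the caller does not need to know the concrete class in advance; the serialized data carries enough information to find it.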

model_validate classmethod
model_validate(*args, **kwargs) -> Any

Deserialize a jsonized version of the app into an instance of the class it was serialized from.

Note

This process uses extra information stored in the jsonized object and handled by WithClassInfo.

language_match
language_match(
    text1: str, text2: str
) -> Tuple[float, Dict]

Uses Huggingface's papluca/xlm-roberta-base-language-detection model.

Applies language detection to text1 and text2 and calculates the probit difference for the language detected on text1. The function is: 1.0 - |probit_language_text1(text1) - probit_language_text1(text2)|

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.language_match).on_input_output()
PARAMETER DESCRIPTION
text1

Text to evaluate.

TYPE: str

text2

Comparative text to evaluate.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, Dict]

The float is a value between 0 and 1, with 0 being "different languages" and 1 being "same languages".

TYPE: Tuple[float, Dict]
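
As a rough illustration of the formula, the sketch below computes the score from two dictionaries of per-language scores. It is not the provider's code: the `language_match_score` helper, the shape of the returned dict, and the example scores are all assumptions made for illustration, and no model is called.

```python
from typing import Dict, Tuple


def language_match_score(
    scores1: Dict[str, float], scores2: Dict[str, float]
) -> Tuple[float, Dict[str, Dict[str, float]]]:
    """Compute 1.0 - |p_lang1(text1) - p_lang1(text2)| from language scores."""
    # Language detected on text1: the label with the highest score.
    lang1 = max(scores1, key=scores1.get)
    p1 = scores1[lang1]
    # Score of that same language on text2 (0.0 if the label is absent).
    p2 = scores2.get(lang1, 0.0)
    score = 1.0 - abs(p1 - p2)
    return score, {"text1_scores": scores1, "text2_scores": scores2}


# Made-up scores, for illustration only.
english = {"en": 0.98, "fr": 0.02}
french = {"en": 0.03, "fr": 0.97}

same, _ = language_match_score(english, english)
different, _ = language_match_score(english, french)
print(same, different)
```

Two texts in the same language give a score near 1, while a text in another language drives the score toward 0.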

groundedness_measure_with_nli
groundedness_measure_with_nli(
    source: str, statement: str
) -> Tuple[float, dict]

A measure to track if the source material supports each sentence in the statement using an NLI model.

First, the response is split into statements using a sentence tokenizer. The NLI model then evaluates each statement against the entire source.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# `context` is a selector for the retrieved context, defined elsewhere in your app.
f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
PARAMETER DESCRIPTION
source

The source that should support the statement

TYPE: str

statement

The statement to check for groundedness.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, dict]

A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded) and a dict containing the reasons for the evaluation.
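
The split-then-score flow can be sketched without any model. In the sketch below, `token_overlap` is a crude stand-in for the NLI model, the regex splitter stands in for the sentence tokenizer, and averaging the per-sentence scores is an assumption about how they are combined; all helper names are hypothetical.

```python
import re
from typing import Callable, Dict, List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def token_overlap(source: str, sentence: str) -> float:
    """Stand-in for an NLI model: share of sentence words found in source."""
    words = re.findall(r"\w+", sentence.lower())
    source_words = set(re.findall(r"\w+", source.lower()))
    if not words:
        return 0.0
    return sum(w in source_words for w in words) / len(words)


def groundedness_sketch(
    source: str,
    statement: str,
    scorer: Callable[[str, str], float] = token_overlap,
) -> Tuple[float, Dict[str, float]]:
    """Score each sentence against the entire source, then average."""
    reasons = {s: scorer(source, s) for s in split_sentences(statement)}
    if not reasons:
        return 0.0, reasons
    return sum(reasons.values()) / len(reasons), reasons


source = "The sky is blue. Grass is green."
statement = "The sky is blue. Apples are purple."
score, reasons = groundedness_sketch(source, statement)
print(score, reasons)
```

Each sentence is always compared with the whole source rather than with a single matching passage, which mirrors the description above.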

context_relevance
context_relevance(prompt: str, context: str) -> float

Uses Huggingface's truera/context_relevance model, a model that computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.

Example
import numpy as np
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# `context` is a selector for the retrieved context, defined elsewhere in your app.
feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
PARAMETER DESCRIPTION
prompt

The given prompt.

TYPE: str

context

Comparative contextual information.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 and 1. 0 being irrelevant and 1 being a relevant context for addressing the prompt.

TYPE: float
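
When several contexts are retrieved for one prompt, an aggregation step reduces the per-context scores to a single number. The sketch below illustrates that idea only: `keyword_overlap` is a hypothetical stand-in for the relevance model, `aggregated_relevance` is an illustrative helper, and the standard library's `fmean` is used as the aggregator so the example has no third-party dependencies.

```python
from statistics import fmean
from typing import Callable, Sequence


def keyword_overlap(prompt: str, context: str) -> float:
    """Stand-in scorer: share of prompt words that also appear in context."""
    prompt_words = set(prompt.lower().split())
    context_words = set(context.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & context_words) / len(prompt_words)


def aggregated_relevance(
    prompt: str,
    contexts: Sequence[str],
    scorer: Callable[[str, str], float] = keyword_overlap,
    aggregate: Callable[[Sequence[float]], float] = fmean,
) -> float:
    """Score every context against the prompt, then reduce to one value."""
    per_context = [scorer(prompt, c) for c in contexts]
    return aggregate(per_context)


contexts = ["paris is the capital of france", "bananas are yellow"]
mean_score = aggregated_relevance("capital of france", contexts)
best_score = aggregated_relevance("capital of france", contexts, aggregate=max)
print(mean_score, best_score)
```

Choosing the mean rewards retrievals where most contexts are relevant, while an aggregator such as `max` only asks that at least one context is.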

positive_sentiment
positive_sentiment(text: str) -> float

Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that uses a sentiment classifier on text.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.positive_sentiment).on_output()
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (negative sentiment) and 1 (positive sentiment).

TYPE: float

toxic
toxic(text: str) -> float

A function that uses a toxic comment classifier on text.

Uses Huggingface's martin-ha/toxic-comment-model model.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()
feedback = Feedback(huggingface_provider.toxic).on_output()
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (not toxic) and 1 (toxic).

TYPE: float

pii_detection
pii_detection(text: str) -> float

NER model to detect PII.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection).on_input()
PARAMETER DESCRIPTION
text

A text prompt that may contain PII.

TYPE: str

RETURNS DESCRIPTION
float

The likelihood that PII is contained in the input text.

TYPE: float

pii_detection_with_cot_reasons
pii_detection_with_cot_reasons(text: str)

NER model to detect PII, with reasons.

Example
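from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a pii_detection_with_cot_reasons feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()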

PARAMETER DESCRIPTION
text

A text prompt that may contain a name.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, str]

A tuple containing the likelihood that PII is contained in the input text and a string describing what PII is detected (if any).

hallucination_evaluator
hallucination_evaluator(
    model_output: str, retrieved_text_chunks: str
) -> float

Evaluates the hallucination score for a combined input of two statements as a float 0<x<1 representing a true/false boolean. If the return value is greater than 0.5, the statement is evaluated as true. If it is less than 0.5, the statement is evaluated as a hallucination.

Example
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

# Pass the model output and the retrieved text chunks as separate arguments.
score = huggingface_provider.hallucination_evaluator(
    "The sky is blue.", "Apples are red, the grass is green."
)
PARAMETER DESCRIPTION
model_output

This is what an LLM returns based on the text chunks retrieved during RAG

TYPE: str

retrieved_text_chunks

These are the text chunks you have retrieved during RAG

TYPE: str

RETURNS DESCRIPTION
float

Hallucination score

TYPE: float
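
The 0.5 threshold can be made explicit with a small helper. This is a sketch under assumptions, not the provider's code: `combine_inputs` and `interpret_score` are hypothetical names, joining the two inputs with a `[SEP]` separator is an assumption about how they are combined, and the handling of a score of exactly 0.5 is left open because the description does not cover it.

```python
def combine_inputs(model_output: str, retrieved_text_chunks: str) -> str:
    """Join the two inputs into one string (assumed `[SEP]` separator)."""
    return f"{model_output} [SEP] {retrieved_text_chunks}"


def interpret_score(score: float, threshold: float = 0.5) -> str:
    """Map a hallucination score in [0, 1] to a readable verdict."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score}")
    if score > threshold:
        return "true"
    if score < threshold:
        return "hallucination"
    # The description does not say how a score of exactly 0.5 is treated.
    return "undetermined"


combined = combine_inputs("The sky is blue.", "Apples are red, the grass is green.")
print(combined)
print(interpret_score(0.9), interpret_score(0.1), interpret_score(0.5))
```

Keeping the threshold as a parameter makes it easy to be stricter or more lenient than the default when deciding what counts as a hallucination.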

__init__
__init__(
    name: str = "huggingface",
    endpoint: Optional[Endpoint] = None,
    **kwargs
)

Create a Huggingface Provider with out of the box feedback functions.

Example
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

HuggingfaceLocal

Bases: HuggingfaceBase

Out of the box feedback functions using HuggingFace models locally.

Attributes
tru_class_info instance-attribute
tru_class_info: Class

Class information of this pydantic object for use in deserialization.

Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.

endpoint class-attribute instance-attribute
endpoint: Optional[Endpoint] = None

Endpoint supporting this provider.

Remote API invocations are handled by the endpoint.

Functions
__rich_repr__
__rich_repr__() -> Result

Requirement for pretty printing using the rich package.

load staticmethod
load(obj, *args, **kwargs)

Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.

model_validate classmethod
model_validate(*args, **kwargs) -> Any

Deserialize a jsonized version of the app into an instance of the class it was serialized from.

Note

This process uses extra information stored in the jsonized object and handled by WithClassInfo.

language_match
language_match(
    text1: str, text2: str
) -> Tuple[float, Dict]

Uses Huggingface's papluca/xlm-roberta-base-language-detection model.

Applies language detection to text1 and text2 and calculates the probit difference for the language detected on text1. The function is: 1.0 - |probit_language_text1(text1) - probit_language_text1(text2)|

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.language_match).on_input_output()
PARAMETER DESCRIPTION
text1

Text to evaluate.

TYPE: str

text2

Comparative text to evaluate.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, Dict]

The float is a value between 0 and 1, with 0 being "different languages" and 1 being "same languages".

TYPE: Tuple[float, Dict]

groundedness_measure_with_nli
groundedness_measure_with_nli(
    source: str, statement: str
) -> Tuple[float, dict]

A measure to track if the source material supports each sentence in the statement using an NLI model.

First, the response is split into statements using a sentence tokenizer. The NLI model then evaluates each statement against the entire source.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# `context` is a selector for the retrieved context, defined elsewhere in your app.
f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
PARAMETER DESCRIPTION
source

The source that should support the statement

TYPE: str

statement

The statement to check for groundedness.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, dict]

A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded) and a dict containing the reasons for the evaluation.

context_relevance
context_relevance(prompt: str, context: str) -> float

Uses Huggingface's truera/context_relevance model, a model that computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.

Example
import numpy as np
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# `context` is a selector for the retrieved context, defined elsewhere in your app.
feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
PARAMETER DESCRIPTION
prompt

The given prompt.

TYPE: str

context

Comparative contextual information.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 and 1. 0 being irrelevant and 1 being a relevant context for addressing the prompt.

TYPE: float

positive_sentiment
positive_sentiment(text: str) -> float

Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that uses a sentiment classifier on text.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.positive_sentiment).on_output()
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (negative sentiment) and 1 (positive sentiment).

TYPE: float

toxic
toxic(text: str) -> float

A function that uses a toxic comment classifier on text.

Uses Huggingface's martin-ha/toxic-comment-model model.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()
feedback = Feedback(huggingface_provider.toxic).on_output()
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (not toxic) and 1 (toxic).

TYPE: float

pii_detection
pii_detection(text: str) -> float

NER model to detect PII.

Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection).on_input()
PARAMETER DESCRIPTION
text

A text prompt that may contain PII.

TYPE: str

RETURNS DESCRIPTION
float

The likelihood that PII is contained in the input text.

TYPE: float

pii_detection_with_cot_reasons
pii_detection_with_cot_reasons(text: str)

NER model to detect PII, with reasons.

Example
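from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a pii_detection_with_cot_reasons feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()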

PARAMETER DESCRIPTION
text

A text prompt that may contain a name.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, str]

A tuple containing the likelihood that PII is contained in the input text and a string describing what PII is detected (if any).

hallucination_evaluator
hallucination_evaluator(
    model_output: str, retrieved_text_chunks: str
) -> float

Evaluates the hallucination score for a combined input of two statements as a float 0<x<1 representing a true/false boolean. If the return value is greater than 0.5, the statement is evaluated as true. If it is less than 0.5, the statement is evaluated as a hallucination.

Example
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()

# Pass the model output and the retrieved text chunks as separate arguments.
score = huggingface_provider.hallucination_evaluator(
    "The sky is blue.", "Apples are red, the grass is green."
)
PARAMETER DESCRIPTION
model_output

This is what an LLM returns based on the text chunks retrieved during RAG

TYPE: str

retrieved_text_chunks

These are the text chunks you have retrieved during RAG

TYPE: str

RETURNS DESCRIPTION
float

Hallucination score

TYPE: float