trulens.providers.huggingface¶
Additional Dependency Required
To use this module, you must have the trulens-providers-huggingface
package installed.
pip install trulens-providers-huggingface
Classes¶
Huggingface
¶
Bases: HuggingfaceBase
Out of the box feedback functions calling Huggingface APIs.
Attributes¶
tru_class_info
instance-attribute
¶
tru_class_info: Class
Class information of this pydantic object for use in deserialization.
Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.
Functions¶
load
staticmethod
¶
load(obj, *args, **kwargs)
Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.
model_validate
classmethod
¶
model_validate(*args, **kwargs) -> Any
Deserialize a jsonized version of the app into an instance of the class it was serialized from.
Note
This process uses extra information stored in the jsonized object and handled by WithClassInfo.
language_match
¶
Uses Huggingface's papluca/xlm-roberta-base-language-detection model.
A function that uses language detection on text1 and text2 and calculates the difference in the probability of text1's detected language between the two texts. The function is: 1.0 - abs(probit_language_text1(text1) - probit_language_text1(text2)).
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()
feedback = Feedback(huggingface_provider.language_match).on_input_output()
PARAMETER | DESCRIPTION |
---|---|
text1 | Text to evaluate. TYPE: str |
text2 | Comparative text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 and 1: 0 being "different languages" and 1 being "same languages". |
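The provider methods can also be called directly. A minimal sketch with hypothetical strings, assuming HuggingFace API credentials (e.g. HUGGINGFACE_API_KEY) are configured for the endpoint:
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# Same language on both sides: score should be near 1.0.
same = huggingface_provider.language_match("Hello, how are you?", "Nice to meet you.")

# Different languages: score should be near 0.0.
different = huggingface_provider.language_match("Hello, how are you?", "Bonjour, comment allez-vous ?")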
groundedness_measure_with_nli
¶
A measure to track if the source material supports each sentence in the statement using an NLI model.
First, the response is split into statements using a sentence tokenizer. Each statement is then checked against the entire source using a natural language inference (NLI) model.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# Here `context` is a selector pointing at your app's retrieved source text.
f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
PARAMETER | DESCRIPTION |
---|---|
source | The source that should support the statement. TYPE: str |
statement | The statement to check for groundedness. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Tuple[float, dict] | A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded) and a dict containing the reasons for the evaluation. |
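A direct-call sketch (hypothetical strings), unpacking the score and reasons from the returned tuple:
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# The statement is supported by the source, so the score should be near 1.0.
score, reasons = huggingface_provider.groundedness_measure_with_nli(
    source="The Eiffel Tower is in Paris. It was completed in 1889.",
    statement="The Eiffel Tower is located in Paris.",
)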
context_relevance
¶
Uses Huggingface's truera/context_relevance model, which computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.
Example
import numpy as np
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# Here `context` is a selector pointing at your app's retrieved context.
feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
PARAMETER | DESCRIPTION |
---|---|
prompt | The given prompt. TYPE: str |
context | Comparative contextual information. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 and 1: 0 being irrelevant and 1 being a relevant context for addressing the prompt. |
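A direct-call sketch (hypothetical strings):
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# A context that answers the prompt should score near 1.0.
score = huggingface_provider.context_relevance(
    prompt="What is the capital of France?",
    context="Paris is the capital and largest city of France.",
)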
positive_sentiment
¶
Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that uses a sentiment classifier on text.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()
feedback = Feedback(huggingface_provider.positive_sentiment).on_output()
PARAMETER | DESCRIPTION |
---|---|
text | Text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 (negative sentiment) and 1 (positive sentiment). |
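A direct-call sketch (hypothetical strings):
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# Clearly positive text should score near 1.0; clearly negative text near 0.0.
positive = huggingface_provider.positive_sentiment("I love this, it works beautifully!")
negative = huggingface_provider.positive_sentiment("This is terrible and a waste of time.")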
toxic
¶
A function that uses a toxic comment classifier on text.
Uses Huggingface's martin-ha/toxic-comment-model model.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()
feedback = Feedback(huggingface_provider.toxic).on_output()
PARAMETER | DESCRIPTION |
---|---|
text | Text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 (not toxic) and 1 (toxic). |
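A direct-call sketch (hypothetical string):
from trulens.providers.huggingface import Huggingface

huggingface_provider = Huggingface()

# Benign text should score near 0.0 (not toxic).
score = huggingface_provider.toxic("Thanks for your help, have a great day!")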
pii_detection
¶
NER model to detect PII.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection).on_input()
PARAMETER | DESCRIPTION |
---|---|
text | A text prompt that may contain PII. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | The likelihood that PII is contained in the input text. |
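A direct-call sketch (hypothetical string containing obvious PII):
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# A name and an email address should push the likelihood toward 1.0.
likelihood = hugs.pii_detection("My name is John Doe and my email is john.doe@example.com.")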
pii_detection_with_cot_reasons
¶
pii_detection_with_cot_reasons(text: str)
NER model to detect PII, with reasons.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Define a PII detection feedback function with reasons using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()

PARAMETER | DESCRIPTION |
---|---|
text | A text prompt that may contain a name. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Tuple[float, str] | A tuple containing the likelihood that PII is contained in the input text and a string describing what PII is detected (if any). |
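A direct-call sketch of the chain-of-thought variant (hypothetical string):
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Returns the likelihood plus a description of the PII that was detected.
likelihood, reasons = hugs.pii_detection_with_cot_reasons(
    "My name is John Doe and my email is john.doe@example.com."
)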
hallucination_evaluator
¶
Evaluates the hallucination score for a combined input of two statements as a float between 0 and 1 representing a true/false boolean. If the score is greater than 0.5, the statement is evaluated as true (consistent with the retrieved text); if it is less than 0.5, the statement is evaluated as a hallucination.
Example
from trulens.providers.huggingface import Huggingface
huggingface_provider = Huggingface()
score = huggingface_provider.hallucination_evaluator(
    "The sky is blue.",  # model_output
    "Apples are red , the grass is green.",  # retrieved_text_chunks
)
PARAMETER | DESCRIPTION |
---|---|
model_output | What the LLM returned based on the text chunks retrieved during RAG. TYPE: str |
retrieved_text_chunks | The text chunks retrieved during RAG. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | Hallucination score. |
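Continuing from the example above (where score was computed), a sketch of applying the 0.5 threshold described in this method's docstring:
if score > 0.5:
    print("Judged consistent with the retrieved text.")
else:
    # Scores at or below 0.5 are treated as hallucination here.
    print("Judged a hallucination.")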
HuggingfaceLocal
¶
Bases: HuggingfaceBase
Out of the box feedback functions using HuggingFace models locally.
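A minimal sketch of swapping in the local provider; an assumption here is that the local-model dependencies (e.g. transformers) are installed, since models are downloaded and run locally rather than via the HuggingFace API:
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

local_provider = HuggingfaceLocal()

# The same out-of-the-box feedback functions, evaluated with local models.
f_sentiment = Feedback(local_provider.positive_sentiment).on_output()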
Attributes¶
tru_class_info
instance-attribute
¶
tru_class_info: Class
Class information of this pydantic object for use in deserialization.
Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.
endpoint
class-attribute
instance-attribute
¶
Endpoint supporting this provider.
Remote API invocations are handled by the endpoint.
Functions¶
load
staticmethod
¶
load(obj, *args, **kwargs)
Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.
model_validate
classmethod
¶
model_validate(*args, **kwargs) -> Any
Deserialize a jsonized version of the app into an instance of the class it was serialized from.
Note
This process uses extra information stored in the jsonized object and handled by WithClassInfo.
language_match
¶
Uses Huggingface's papluca/xlm-roberta-base-language-detection model.
A function that uses language detection on text1 and text2 and calculates the difference in the probability of text1's detected language between the two texts. The function is: 1.0 - abs(probit_language_text1(text1) - probit_language_text1(text2)).
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

feedback = Feedback(huggingface_provider.language_match).on_input_output()
PARAMETER | DESCRIPTION |
---|---|
text1 | Text to evaluate. TYPE: str |
text2 | Comparative text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 and 1: 0 being "different languages" and 1 being "same languages". |
groundedness_measure_with_nli
¶
A measure to track if the source material supports each sentence in the statement using an NLI model.
First, the response is split into statements using a sentence tokenizer. Each statement is then checked against the entire source using a natural language inference (NLI) model.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

# Here `context` is a selector pointing at your app's retrieved source text.
f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
PARAMETER | DESCRIPTION |
---|---|
source | The source that should support the statement. TYPE: str |
statement | The statement to check for groundedness. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Tuple[float, dict] | A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded) and a dict containing the reasons for the evaluation. |
context_relevance
¶
Uses Huggingface's truera/context_relevance model, which computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.
Example
import numpy as np
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

# Here `context` is a selector pointing at your app's retrieved context.
feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
PARAMETER | DESCRIPTION |
---|---|
prompt | The given prompt. TYPE: str |
context | Comparative contextual information. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 and 1: 0 being irrelevant and 1 being a relevant context for addressing the prompt. |
positive_sentiment
¶
Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that uses a sentiment classifier on text.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

feedback = Feedback(huggingface_provider.positive_sentiment).on_output()
PARAMETER | DESCRIPTION |
---|---|
text | Text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 (negative sentiment) and 1 (positive sentiment). |
toxic
¶
A function that uses a toxic comment classifier on text.
Uses Huggingface's martin-ha/toxic-comment-model model.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

feedback = Feedback(huggingface_provider.toxic).on_output()
PARAMETER | DESCRIPTION |
---|---|
text | Text to evaluate. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | A value between 0 (not toxic) and 1 (toxic). |
pii_detection
¶
NER model to detect PII.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

hugs = HuggingfaceLocal()

# Define a pii_detection feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection).on_input()
PARAMETER | DESCRIPTION |
---|---|
text | A text prompt that may contain PII. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | The likelihood that PII is contained in the input text. |
pii_detection_with_cot_reasons
¶
pii_detection_with_cot_reasons(text: str)
NER model to detect PII, with reasons.
Example
from trulens.core import Feedback
from trulens.providers.huggingface import HuggingfaceLocal

hugs = HuggingfaceLocal()

# Define a PII detection feedback function with reasons using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()

PARAMETER | DESCRIPTION |
---|---|
text | A text prompt that may contain a name. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Tuple[float, str] | A tuple containing the likelihood that PII is contained in the input text and a string describing what PII is detected (if any). |
hallucination_evaluator
¶
Evaluates the hallucination score for a combined input of two statements as a float between 0 and 1 representing a true/false boolean. If the score is greater than 0.5, the statement is evaluated as true (consistent with the retrieved text); if it is less than 0.5, the statement is evaluated as a hallucination.
Example
from trulens.providers.huggingface import HuggingfaceLocal

huggingface_provider = HuggingfaceLocal()

score = huggingface_provider.hallucination_evaluator(
    "The sky is blue.",  # model_output
    "Apples are red , the grass is green.",  # retrieved_text_chunks
)
PARAMETER | DESCRIPTION |
---|---|
model_output | What the LLM returned based on the text chunks retrieved during RAG. TYPE: str |
retrieved_text_chunks | The text chunks retrieved during RAG. TYPE: str |

RETURNS | DESCRIPTION |
---|---|
float | Hallucination score. |