❄️ Snowflake Quickstart with Cortex LLM Functions¶
In this quickstart you will learn build and evaluate a RAG application with Snowflake Cortex LLM Functions.
Building and evaluating RAG applications with Snowflake Cortex offers developers a unique opportunity to leverage a top-tier, enterprise-focused LLM that is both cost-effective and open-source. Cortex excels in enterprise tasks like SQL generation and coding, providing a robust foundation for developing intelligent applications with significant cost savings.
In this example, we will use Arctic Embed (snowflake-arctic-embed-m
) as our embedding model via HuggingFace, and LLM of your choice for both generation and as the LLM judge to power TruLens feedback functions. The LLM models are fully-mananaged by Cortex LLM functions
Note, you'll need to have an active Snowflake account to run Cortex LLM functions from Snowflake's data warehouse.
# !pip install trulens trulens-providers-cortex chromadb sentence-transformers snowflake-snowpark-python snowflake-ml-python>=1.7.1
import os
from snowflake.snowpark import Session
from trulens.core.utils.keys import check_keys
check_keys("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_USER_PASSWORD")
connection_params = {
"account": os.environ["SNOWFLAKE_ACCOUNT"],
"user": os.environ["SNOWFLAKE_USER"],
"password": os.environ["SNOWFLAKE_USER_PASSWORD"],
"role": os.environ.get("SNOWFLAKE_ROLE", "ENGINEER"),
"database": os.environ.get("SNOWFLAKE_DATABASE"),
"schema": os.environ.get("SNOWFLAKE_SCHEMA"),
"warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE"),
}
# Create a Snowflake session
snowpark_session = Session.builder.configs(connection_params).create()
Get Data¶
In this case, we'll just initialize some simple text in the notebook.
university_info = """
The University of Washington, founded in 1861 in Seattle, is a public research university
with over 45,000 students across three campuses in Seattle, Tacoma, and Bothell.
As the flagship institution of the six public universities in Washington state,
UW encompasses over 500 buildings and 20 million square feet of space,
including one of the largest library systems in the world.
"""
Create Vector Store¶
Create a chromadb vector store in memory.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")
document_embeddings = model.encode([university_info])
import chromadb
chroma_client = chromadb.Client()
vector_store = chroma_client.get_or_create_collection(name="Universities")
Add the university_info to the embedding database.
vector_store.add(
"uni_info", documents=university_info, embeddings=document_embeddings
)
Build RAG from scratch¶
Build a custom RAG from scratch, and add TruLens custom instrumentation.
from trulens.apps.custom import instrument
from trulens.core import TruSession
session = TruSession()
session.reset_database()
from snowflake.cortex import Complete
class RAG_from_scratch:
@instrument
def retrieve(self, query: str) -> list:
"""
Retrieve relevant text from vector store.
"""
results = vector_store.query(
query_embeddings=model.encode([query], prompt_name="query"),
n_results=2,
)
return results["documents"]
@instrument
def generate_completion(self, query: str, context_str: list) -> str:
"""
Generate answer from context.
"""
prompt = f"""
We have provided context information below.
{context_str}
Given this information, please answer the question: {query}
"""
resp = Complete(model='mistral-large2', prompt=[{'role': 'user', 'content': prompt}], session=snowpark_session)
return resp
@instrument
def query(self, query: str) -> str:
context_str = self.retrieve(query)
completion = self.generate_completion(query, context_str)
return completion
rag = RAG_from_scratch()
Set up feedback functions.¶
Here we'll use groundedness, answer relevance and context relevance to detect hallucination.
import numpy as np
from trulens.core import Feedback
from trulens.core import Select
from trulens.providers.cortex import Cortex
provider = Cortex(
snowpark_session=snowpark_session,
model_engine="llama3.1-8b",
)
# Define a groundedness feedback function
f_groundedness = (
Feedback(
provider.groundedness_measure_with_cot_reasons, name="Groundedness"
)
.on(Select.RecordCalls.retrieve.rets.collect())
.on_output()
)
# Question/answer relevance between overall question and answer.
f_answer_relevance = (
Feedback(provider.relevance_with_cot_reasons, name="Answer Relevance")
.on(Select.RecordCalls.retrieve.args.query)
.on_output()
)
# Question/statement relevance between question and each context chunk.
f_context_relevance = (
Feedback(
provider.context_relevance_with_cot_reasons, name="Context Relevance"
)
.on(Select.RecordCalls.retrieve.args.query)
.on(Select.RecordCalls.retrieve.rets.collect())
.aggregate(np.mean)
)
f_coherence = Feedback(
provider.coherence_with_cot_reasons, name="coherence"
).on_output()
Construct the app¶
Wrap the custom RAG with TruCustomApp, add list of feedbacks for eval
from trulens.apps.custom import TruCustomApp
tru_rag = TruCustomApp(
rag,
app_name="RAG",
app_version="v1",
feedbacks=[
f_groundedness,
f_answer_relevance,
f_context_relevance,
f_coherence,
],
)
Run the app¶
Use tru_rag
as a context manager for the custom RAG-from-scratch app.
with tru_rag as recording:
resp = rag.query("When is University of Washington founded?")
resp
session.get_leaderboard(app_ids=[])
from trulens.dashboard import run_dashboard
run_dashboard(session)