Integration: Snowflake
A Snowflake integration that allows table retrieval from a Snowflake database.
Installation
Use pip to install the Snowflake integration:
pip install snowflake-haystack
Usage
Once installed, initialize the SnowflakeTableRetriever to use it with Haystack 2.0:
from haystack.utils import Secret
from haystack_integrations.components.retrievers.snowflake import SnowflakeTableRetriever
# Provide your Snowflake credentials during initialization.
# The API key is read from the SNOWFLAKE_API_KEY environment variable.
executor = SnowflakeTableRetriever(
    user="<ACCOUNT-USER>",
    account="<ACCOUNT-IDENTIFIER>",
    api_key=Secret.from_env_var("SNOWFLAKE_API_KEY"),
    warehouse="<WAREHOUSE-NAME>",
)
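Because the API key above is read from the SNOWFLAKE_API_KEY environment variable, it can help to fail early with a clear message when that variable is missing. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper introduced here for illustration and is not part of snowflake-haystack:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper: not provided by Haystack or snowflake-haystack.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("SNOWFLAKE_API_KEY")` before constructing the retriever surfaces a missing credential at startup rather than at query time.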
Ensure you have SELECT access to the tables before querying the database (see the Snowflake documentation on access control privileges for details):
response = executor.run(query="""select * from database_name.schema_name.table_name""")
During component initialization, you can provide the schema and database names so that you do not need to include them in the SQL query:
executor = SnowflakeTableRetriever(
    ...
    schema_name="<SCHEMA-NAME>",
    database="<DB-NAME>",
)
response = executor.run(query="""select * from table_name""")
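The two query styles differ only in how the table name is qualified. As a rough sketch, a small helper can build either form; `select_all` is a hypothetical name used for illustration, it is not part of the integration, and it does no quoting or escaping, so it is only suitable for trusted identifiers:

```python
from typing import Optional


def select_all(table: str, schema: Optional[str] = None, database: Optional[str] = None) -> str:
    """Build a SELECT statement, qualifying the table only with the parts given.

    Hypothetical helper: performs no quoting or escaping of identifiers.
    """
    if database and not schema:
        raise ValueError("A database qualifier also requires a schema")
    parts = [part for part in (database, schema, table) if part]
    return f"select * from {'.'.join(parts)}"


# Fully qualified, for a retriever created without schema and database:
print(select_all("table_name", schema="schema_name", database="database_name"))
# Unqualified, for a retriever that already knows the schema and database:
print(select_all("table_name"))
```

The resulting string can be passed as the `query` argument of `executor.run`.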
The SnowflakeTableRetriever returns a pandas DataFrame and a Markdown version of the table:
print(response["dataframe"].head(2)) # Pandas dataframe
# Column 1 Column 2
# 0 Value1 Value2
# 1 Value1 Value2
print(response["table"]) # Markdown
# | Column 1 | Column 2 |
# |:----------|:----------|
# | Value1 | Value2 |
# | Value1 | Value2 |
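To make the shape of the Markdown output concrete, here is a standard-library sketch that renders rows in a similar left-aligned layout. It only illustrates the format; `rows_to_markdown` is a hypothetical function, not the code the integration uses, and the exact column padding of the real output may differ:

```python
from typing import Any, Sequence


def rows_to_markdown(columns: Sequence[str], rows: Sequence[Sequence[Any]]) -> str:
    """Render rows as a left-aligned Markdown table (illustrative only)."""
    header = "| " + " | ".join(columns) + " |"
    # ":---" marks a left-aligned column; the dash count follows the header width.
    divider = "|" + "|".join(":" + "-" * (len(col) + 1) for col in columns) + "|"
    body = ["| " + " | ".join(str(value) for value in row) + " |" for row in rows]
    return "\n".join([header, divider, *body])


print(rows_to_markdown(["Column 1", "Column 2"], [["Value1", "Value2"], ["Value1", "Value2"]]))
```

A plain-text table like this is convenient to place directly into an LLM prompt.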
Using SnowflakeTableRetriever within a pipeline:
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack_integrations.components.retrievers.snowflake import SnowflakeTableRetriever
executor = SnowflakeTableRetriever(
    user="<ACCOUNT-USER>",
    account="<ACCOUNT-IDENTIFIER>",
    api_key=Secret.from_env_var("SNOWFLAKE_API_KEY"),
    warehouse="<WAREHOUSE-NAME>",
)
pipeline = Pipeline()
pipeline.add_component("builder", PromptBuilder(template="Describe this table: {{ table }}"))
pipeline.add_component("snowflake", executor)
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))
pipeline.connect("snowflake.table", "builder.table")
pipeline.connect("builder", "llm")
pipeline.run(data={"query": "select employee, salary from table limit 10;"})
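The call above returns a dictionary keyed by component name, and the generator's text appears under its `replies` output. The sketch below shows one way to read the first reply; `first_reply` is a hypothetical helper, and `example_result` is a hand-written stand-in for the pipeline output rather than a real response:

```python
from typing import Any, Dict


def first_reply(result: Dict[str, Dict[str, Any]], component: str = "llm") -> str:
    """Return the first generated reply from a pipeline result dictionary.

    Hypothetical helper: assumes the named component exposes a "replies" list.
    """
    replies = result.get(component, {}).get("replies", [])
    if not replies:
        raise ValueError(f"No replies found for component '{component}'")
    return replies[0]


# Stand-in for the dictionary returned by pipeline.run(...); not real output.
example_result = {"llm": {"replies": ["The table lists employees and their salaries."]}}
print(first_reply(example_result))
```

Raising on an empty result makes a failed or empty generation visible instead of silently returning nothing.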
Examples
You can find a code example showing how to use the SnowflakeTableRetriever under the example/ folder of this repo.
License
snowflake-haystack is distributed under the terms of the Apache-2.0 license.