Commit 874c08b

Update doc
1 parent f27b026 commit 874c08b

File tree: 1 file changed (+17, -21 lines)

docs/website/docs/dlt-ecosystem/destinations/snowflake.md

Lines changed: 17 additions & 21 deletions
@@ -6,8 +6,8 @@ keywords: [Snowflake, destination, data warehouse]
 
 # Snowflake
 
-## Install dlt with Snowflake
-**To install the dlt library with Snowflake dependencies, run:**
+## Install `dlt` with Snowflake
+**To install the `dlt` library with Snowflake dependencies, run:**
 ```sh
 pip install dlt[snowflake]
 ```
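As a quick sanity check after installing the extra (not part of the diff; the distribution name `snowflake-connector-python` is an assumption), you can confirm that the Snowflake Python dbapi client mentioned below is importable:

```py
# Sketch: verify the Snowflake dbapi client pulled in by `dlt[snowflake]`.
# The distribution name "snowflake-connector-python" is assumed here.
from importlib.metadata import version

import snowflake.connector  # the dbapi client referenced in the docs

print("snowflake-connector-python", version("snowflake-connector-python"))
```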
@@ -25,7 +25,7 @@ pip install -r requirements.txt
 ```
 This will install `dlt` with the `snowflake` extra, which contains the Snowflake Python dbapi client.
 
-**3. Create a new database, user, and give dlt access.**
+**3. Create a new database, user, and give `dlt` access.**
 
 Read the next chapter below.
 
@@ -39,19 +39,11 @@ username = "loader"
 host = "kgiotue-wn98412"
 warehouse = "COMPUTE_WH"
 role = "DLT_LOADER_ROLE"
-application = "dltHub_dlt"
 ```
 In the case of Snowflake, the **host** is your [Account Identifier](https://docs.snowflake.com/en/user-guide/admin-account-identifier). You can get it in **Admin**/**Accounts** by copying the account URL: https://kgiotue-wn98412.snowflakecomputing.com and extracting the host name (**kgiotue-wn98412**).
 
 The **warehouse** and **role** are optional if you assign defaults to your user. In the example below, we do not do that, so we set them explicitly.
 
-:::note
-The `application` field enables Snowflake to identify details about connections made to
-Snowflake instances. Snowflake will use this identifier to better understand the usage patterns
-associated with specific partner integrations. It is set to `dltHub_dlt` by default, if you prefer not to share the application ID,
-just set `application` to an empty string (`""`).
-:::
-
 ### Setup the database user and permissions
 The instructions below assume that you use the default account setup that you get after creating a Snowflake account. You should have a default warehouse named **COMPUTE_WH** and a Snowflake account. Below, we create a new database, user, and assign permissions. The permissions are very generous. A more experienced user can easily reduce `dlt` permissions to just one schema in the database.
 ```sql
@@ -64,7 +56,7 @@ CREATE ROLE DLT_LOADER_ROLE;
 GRANT ROLE DLT_LOADER_ROLE TO USER loader;
 -- give database access to new role
 GRANT USAGE ON DATABASE dlt_data TO DLT_LOADER_ROLE;
--- allow dlt to create new schemas
+-- allow `dlt` to create new schemas
 GRANT CREATE SCHEMA ON DATABASE dlt_data TO ROLE DLT_LOADER_ROLE
 -- allow access to a warehouse named COMPUTE_WH
 GRANT USAGE ON WAREHOUSE COMPUTE_WH TO DLT_LOADER_ROLE;
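The grants above are normally run from the Snowflake UI or a SQL worksheet. Purely as an illustrative sketch (not part of this commit), the same statements could be executed from Python with the dbapi client installed by the `snowflake` extra; the admin credentials and the `ACCOUNTADMIN` role below are placeholders:

```py
# Sketch only: execute the role/permission setup with the Snowflake Python
# dbapi client. Credentials and the admin role are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="YOUR_ADMIN_USER",          # placeholder
    password="YOUR_ADMIN_PASSWORD",  # placeholder
    account="kgiotue-wn98412",       # the Account Identifier used as `host` above
    role="ACCOUNTADMIN",             # assumed role with privileges to grant
)
statements = [
    "CREATE ROLE DLT_LOADER_ROLE",
    "GRANT ROLE DLT_LOADER_ROLE TO USER loader",
    "GRANT USAGE ON DATABASE dlt_data TO ROLE DLT_LOADER_ROLE",
    "GRANT CREATE SCHEMA ON DATABASE dlt_data TO ROLE DLT_LOADER_ROLE",
    "GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE DLT_LOADER_ROLE",
]
with conn.cursor() as cur:
    for statement in statements:
        cur.execute(statement)
conn.close()
```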
@@ -150,22 +142,22 @@ Names of tables and columns in [schemas](../../general-usage/schema.md) are kept
 
 ## Staging support
 
-Snowflake supports S3 and GCS as file staging destinations. dlt will upload files in the parquet format to the bucket provider and will ask Snowflake to copy their data directly into the db.
+Snowflake supports S3 and GCS as file staging destinations. `dlt` will upload files in the parquet format to the bucket provider and will ask Snowflake to copy their data directly into the db.
 
 Alternatively to parquet files, you can also specify jsonl as the staging file format. For this, set the `loader_file_format` argument of the `run` command of the pipeline to `jsonl`.
 
 ### Snowflake and Amazon S3
 
-Please refer to the [S3 documentation](./filesystem.md#aws-s3) to learn how to set up your bucket with the bucket_url and credentials. For S3, the dlt Redshift loader will use the AWS credentials provided for S3 to access the S3 bucket if not specified otherwise (see config options below). Alternatively, you can create a stage for your S3 Bucket by following the instructions provided in the [Snowflake S3 documentation](https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration).
+Please refer to the [S3 documentation](./filesystem.md#aws-s3) to learn how to set up your bucket with the bucket_url and credentials. For S3, the `dlt` Redshift loader will use the AWS credentials provided for S3 to access the S3 bucket if not specified otherwise (see config options below). Alternatively, you can create a stage for your S3 Bucket by following the instructions provided in the [Snowflake S3 documentation](https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration).
 The basic steps are as follows:
 
 * Create a storage integration linked to GCS and the right bucket
 * Grant access to this storage integration to the Snowflake role you are using to load the data into Snowflake.
 * Create a stage from this storage integration in the PUBLIC namespace, or the namespace of the schema of your data.
 * Also grant access to this stage for the role you are using to load data into Snowflake.
-* Provide the name of your stage (including the namespace) to dlt like so:
+* Provide the name of your stage (including the namespace) to `dlt` like so:
 
-To prevent dlt from forwarding the S3 bucket credentials on every command, and set your S3 stage, change these settings:
+To prevent `dlt` from forwarding the S3 bucket credentials on every command, and set your S3 stage, change these settings:
 
 ```toml
 [destination]
@@ -175,7 +167,7 @@ stage_name="PUBLIC.my_s3_stage"
 To run Snowflake with S3 as the staging destination:
 
 ```py
-# Create a dlt pipeline that will load
+# Create a `dlt` pipeline that will load
 # chess player data to the Snowflake destination
 # via staging on S3
 pipeline = dlt.pipeline(
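The hunk above is cut off mid-call by the diff context. Purely as a hedged sketch of how such a pipeline might be assembled (the `staging="filesystem"` value, the names, and the sample rows are assumptions, not content of this commit), also showing the `loader_file_format="jsonl"` option mentioned under staging support; the GCS and Azure variants further down would look the same apart from the bucket configuration:

```py
import dlt

# Sketch only: load data to Snowflake via files staged on S3.
# The staging destination name, pipeline/dataset names, and sample rows
# are illustrative assumptions.
pipeline = dlt.pipeline(
    pipeline_name="chess_pipeline",
    destination="snowflake",
    staging="filesystem",   # files go to the configured S3 bucket / stage
    dataset_name="chess_data",
)

players = [{"username": "magnus", "rating": 2830}]

# parquet is the default staging format; jsonl can be requested per run
load_info = pipeline.run(players, table_name="players", loader_file_format="jsonl")
print(load_info)
```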
@@ -194,7 +186,7 @@ Please refer to the [Google Storage filesystem documentation](./filesystem.md#go
 * Grant access to this storage integration to the Snowflake role you are using to load the data into Snowflake.
 * Create a stage from this storage integration in the PUBLIC namespace, or the namespace of the schema of your data.
 * Also grant access to this stage for the role you are using to load data into Snowflake.
-* Provide the name of your stage (including the namespace) to dlt like so:
+* Provide the name of your stage (including the namespace) to `dlt` like so:
 
 ```toml
 [destination]
@@ -204,7 +196,7 @@ stage_name="PUBLIC.my_gcs_stage"
 To run Snowflake with GCS as the staging destination:
 
 ```py
-# Create a dlt pipeline that will load
+# Create a `dlt` pipeline that will load
 # chess player data to the Snowflake destination
 # via staging on GCS
 pipeline = dlt.pipeline(
@@ -225,7 +217,7 @@ Please consult the Snowflake Documentation on [how to create a stage for your Az
 * Grant access to this storage integration to the Snowflake role you are using to load the data into Snowflake.
 * Create a stage from this storage integration in the PUBLIC namespace, or the namespace of the schema of your data.
 * Also grant access to this stage for the role you are using to load data into Snowflake.
-* Provide the name of your stage (including the namespace) to dlt like so:
+* Provide the name of your stage (including the namespace) to `dlt` like so:
 
 ```toml
 [destination]
@@ -235,7 +227,7 @@ stage_name="PUBLIC.my_azure_stage"
 To run Snowflake with Azure as the staging destination:
 
 ```py
-# Create a dlt pipeline that will load
+# Create a `dlt` pipeline that will load
 # chess player data to the Snowflake destination
 # via staging on Azure
 pipeline = dlt.pipeline(
@@ -262,5 +254,9 @@ This destination [integrates with dbt](../transformations/dbt/dbt.md) via [dbt-s
 ### Syncing of `dlt` state
 This destination fully supports [dlt state sync](../../general-usage/state#syncing-state-with-destination)
 
+### Snowflake connection identifier
+We enable Snowflake to identify that the connection is created by `dlt`. Snowflake will use this identifier to better understand the usage patterns
+associated with `dlt` integration. The connection identifier is `dltHub_dlt`.
+
 <!--@@@DLT_TUBA snowflake-->
 