docs/website/docs/dlt-ecosystem/destinations/snowflake.md (17 additions, 21 deletions)
@@ -6,8 +6,8 @@ keywords: [Snowflake, destination, data warehouse]

# Snowflake

-## Install dlt with Snowflake
-**To install the dlt library with Snowflake dependencies, run:**
+## Install `dlt` with Snowflake
+**To install the `dlt` library with Snowflake dependencies, run:**
```sh
pip install dlt[snowflake]
```
@@ -25,7 +25,7 @@ pip install -r requirements.txt
```
This will install `dlt` with the `snowflake` extra, which contains the Snowflake Python dbapi client.

-**3. Create a new database, user, and give dlt access.**
+**3. Create a new database, user, and give `dlt` access.**

Read the next chapter below.

@@ -39,19 +39,11 @@ username = "loader"
host = "kgiotue-wn98412"
warehouse = "COMPUTE_WH"
role = "DLT_LOADER_ROLE"
-application = "dltHub_dlt"
```
In the case of Snowflake, the **host** is your [Account Identifier](https://docs.snowflake.com/en/user-guide/admin-account-identifier). You can get it in **Admin**/**Accounts** by copying the account URL: https://kgiotue-wn98412.snowflakecomputing.com and extracting the host name (**kgiotue-wn98412**).

The **warehouse** and **role** are optional if you assign defaults to your user. In the example below, we do not do that, so we set them explicitly.

-:::note
-The `application` field enables Snowflake to identify details about connections made to
-Snowflake instances. Snowflake will use this identifier to better understand the usage patterns
-associated with specific partner integrations. It is set to `dltHub_dlt` by default; if you prefer not to share the application ID,
-just set `application` to an empty string (`""`).
-:::
-
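Instead of `secrets.toml`, the same credentials can also be passed in code as a connection string. Below is a minimal sketch that reuses the example values above; the pipeline name is a placeholder and `<password>` must be replaced with your own password.

```py
import dlt

# A sketch: pass the Snowflake connection string directly to the pipeline
# instead of configuring it in secrets.toml. Values mirror the example above.
pipeline = dlt.pipeline(
    pipeline_name="snowflake_demo",  # placeholder name
    destination="snowflake",
    dataset_name="dlt_data",
    credentials=(
        "snowflake://loader:<password>@kgiotue-wn98412/dlt_data"
        "?warehouse=COMPUTE_WH&role=DLT_LOADER_ROLE"
    ),
)
```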
### Set up the database user and permissions
The instructions below assume that you use the default account setup that you get after creating a Snowflake account. You should have a default warehouse named **COMPUTE_WH**. Below, we create a new database, user, and assign permissions. The permissions are very generous. A more experienced user can easily reduce `dlt` permissions to just one schema in the database.
```sql
@@ -64,7 +56,7 @@ CREATE ROLE DLT_LOADER_ROLE;
GRANT ROLE DLT_LOADER_ROLE TO USER loader;
-- give database access to new role
GRANT USAGE ON DATABASE dlt_data TO DLT_LOADER_ROLE;
--- allow dlt to create new schemas
+-- allow `dlt` to create new schemas
GRANT CREATE SCHEMA ON DATABASE dlt_data TO ROLE DLT_LOADER_ROLE;
-- allow access to a warehouse named COMPUTE_WH
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO DLT_LOADER_ROLE;
@@ -150,22 +142,22 @@ Names of tables and columns in [schemas](../../general-usage/schema.md) are kept

## Staging support

-Snowflake supports S3 and GCS as file staging destinations. dlt will upload files in the parquet format to the bucket provider and will ask Snowflake to copy their data directly into the db.
+Snowflake supports S3 and GCS as file staging destinations. `dlt` will upload files in the parquet format to the bucket provider and will ask Snowflake to copy their data directly into the db.

As an alternative to parquet files, you can also specify jsonl as the staging file format. For this, set the `loader_file_format` argument of the `run` command of the pipeline to `jsonl`.
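For illustration, a minimal sketch of a pipeline that stages files via the filesystem destination and requests jsonl staging files; the pipeline name, dataset name, and sample rows are placeholders.

```py
import dlt

# A sketch: stage load files in the bucket configured for the filesystem
# destination and switch the staging file format from parquet to jsonl.
pipeline = dlt.pipeline(
    pipeline_name="snowflake_jsonl_demo",  # placeholder name
    destination="snowflake",
    staging="filesystem",                  # uses the configured staging bucket
    dataset_name="dlt_data",
)

load_info = pipeline.run(
    [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}],  # placeholder rows
    table_name="example_rows",
    loader_file_format="jsonl",            # use jsonl instead of the default parquet
)
print(load_info)
```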

### Snowflake and Amazon S3

-Please refer to the [S3 documentation](./filesystem.md#aws-s3) to learn how to set up your bucket with the bucket_url and credentials. For S3, the dlt Redshift loader will use the AWS credentials provided for S3 to access the S3 bucket if not specified otherwise (see config options below). Alternatively, you can create a stage for your S3 Bucket by following the instructions provided in the [Snowflake S3 documentation](https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration).
+Please refer to the [S3 documentation](./filesystem.md#aws-s3) to learn how to set up your bucket with the `bucket_url` and credentials. For S3, the `dlt` Snowflake loader will use the AWS credentials provided for S3 to access the S3 bucket if not specified otherwise (see config options below). Alternatively, you can create a stage for your S3 bucket by following the instructions provided in the [Snowflake S3 documentation](https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration).
The basic steps are as follows:

* Create a storage integration linked to S3 and the right bucket.
* Grant access to this storage integration to the Snowflake role you are using to load the data into Snowflake.
* Create a stage from this storage integration in the PUBLIC namespace, or the namespace of the schema of your data.
* Also grant access to this stage for the role you are using to load data into Snowflake.
-* Provide the name of your stage (including the namespace) to dlt like so:
+* Provide the name of your stage (including the namespace) to `dlt` like so:

-To prevent dlt from forwarding the S3 bucket credentials on every command, and set your S3 stage, change these settings:
+To prevent `dlt` from forwarding the S3 bucket credentials on every command and to set your S3 stage, change these settings:
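As a hedged sketch of such settings, the stage name and staging bucket can be supplied through `dlt`'s configuration, for example via environment variables (equivalent entries can live in `secrets.toml`/`config.toml`); the stage and bucket names below are placeholders.

```py
import os

# A sketch, assuming the snowflake destination reads `stage_name` and the
# filesystem staging destination reads `bucket_url` from dlt's configuration.
# Replace the placeholder stage and bucket names with your own.
os.environ["DESTINATION__SNOWFLAKE__STAGE_NAME"] = "PUBLIC.my_s3_stage"
os.environ["DESTINATION__FILESYSTEM__BUCKET_URL"] = "s3://my-staging-bucket"
```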