Commit fc52484

Sm doc update (#452)
1 parent f6c7390 commit fc52484

12 files changed: +27 -20 lines changed

athena-docdb/README.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -30,8 +30,8 @@ To support these two SQL statements we'd need to add two environment variables t
3030
1. **docdb_instance_1** - The value should be the DocumentDB connection details in the format of:mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0
3131
2. **docdb_instance_2** - The value should be the DocumentDB connection details in the format of: mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0
3232

33-
You can also optionally use SecretsManager for part or all of the value for the preceeding connection details. For example, if I set a Lambda environment variable for **docdb_instance_1** to be "mongodb://${docdb_instance_1_creds}@myhostname.com:123/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0" the Athena Federation
34-
SDK will automatically attempt to retrieve a secret from AWS SecretsManager named "docdb_instance_1_creds" and inject that value in place of "${docdb_instance_1_creds}". Basically anything between ${...} is attempted as a secret in SecretsManager. If no such secret exists, the text isn't replaced.
33+
You can also optionally use AWS Secrets Manager for part or all of the value for the preceding connection details. For example, if I set a Lambda environment variable for **docdb_instance_1** to be "mongodb://${docdb_instance_1_creds}@myhostname.com:123/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0" the Athena Federation
34+
SDK will automatically attempt to retrieve a secret from AWS Secrets Manager named "docdb_instance_1_creds" and inject that value in place of "${docdb_instance_1_creds}". Basically anything between ${...} is attempted as a secret in SecretsManager. If no such secret exists, the text isn't replaced. To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
3535

3636

3737
### Setting Up Databases & Tables
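
The `${...}` substitution described in the athena-docdb README hunk above can be sketched roughly as follows. This is a minimal illustration, not the Athena Federation SDK's actual implementation: the class name `SecretSubstitutionSketch`, the `resolve` method, and the `lookup` function are hypothetical, and a plain in-memory map stands in for AWS Secrets Manager. It only shows the documented behavior that text between `${` and `}` is tried as a secret name and left unchanged when no such secret exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not part of the Athena Federation SDK.
final class SecretSubstitutionSketch {
    // Matches ${name}; the secret name is captured in group 1.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /**
     * Replaces each ${name} with lookup.apply(name). When the lookup returns
     * null (no such secret), the placeholder text is left as it was.
     */
    static String resolve(String raw, Function<String, String> lookup) {
        Matcher matcher = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = lookup.apply(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group(0);
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: an in-memory map stands in for Secrets Manager.
        Map<String, String> fakeSecrets = new HashMap<>();
        fakeSecrets.put("docdb_instance_1_creds", "user:pass");

        String raw = "mongodb://${docdb_instance_1_creds}@myhostname.com:123/?ssl=true";
        // prints mongodb://user:pass@myhostname.com:123/?ssl=true
        System.out.println(resolve(raw, fakeSecrets::get));
    }
}
```

In a real connector the lookup would call Secrets Manager, which is why the Lambda function's VPC needs internet access or a VPC endpoint, as the added README sentence notes.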

athena-elasticsearch/README.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -43,15 +43,15 @@ For any other type of Elasticsearch instance (e.g. self-hosted), the associated
4343
must be specified in the **domain_mapping** variable. This also determines which credentials will
4444
be used to access the endpoint. If **auto_discover_endpoint**=**true**, then AWS credentials will
4545
be used to authenticate to Elasticsearch. Otherwise, username/password credentials retrieved from
46-
Amazon Secrets Manager via the **domain_mapping** variable will be used.
46+
Amazon Secrets Manager via the **domain_mapping** variable will be used.*
4747

4848
3. **domain_mapping** - Used only when **auto_discover_endpoint**=**false**,
4949
this is the mapping between the domain names and their associated endpoints. The variable can
5050
accommodate multiple Elasticsearch endpoints using the following format:
5151
`domain1=endpoint1,domain2=endpoint2,domain3=endpoint3,...` For the purpose of authenticating to
5252
an Elasticsearch endpoint, this connector supports substitution strings injected with the format
5353
`${SecretName}:` with username and password retrieved from AWS Secrets Manager (see example
54-
below). The colon `:` at the end of the expression serves as a separator from the rest of the
54+
below).* The colon `:` at the end of the expression serves as a separator from the rest of the
5555
endpoint.
5656
```
5757
Example (using secret elasticsearch-creds):
@@ -83,6 +83,8 @@ my_bucket).
8383
above bucket where large responses spill. You should configure an S3 lifecycle on this
8484
location to delete old spills after X days/hours.
8585

86+
*To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
87+
8688
## Setting Up Databases & Tables
8789

8890
A Glue table can be set up as a supplemental metadata definition source. To enable

athena-federation-integ-test/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -144,6 +144,7 @@ Integration-Test framework provides the following public API allowing access to
144144
return secretCredentials;
145145
}
146146
```
147+
To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
147148

148149
**Environment variables** - Parameters used by the connectors' internal logic:
149150
* **spill_bucket** - The S3 bucket used for spilling excess data.

athena-federation-sdk/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ For those seeking to write their own connectors, we recommend you being by going
1515
* **Federated Metadata** - It is not always practical to centralize table metadata in a centralized meta-store. As such, this SDK allows Athena to delegate portions of its query planning to your connector in order to retrieve metadata about your data source.
1616
* **Glue DataCatalog Support** - You can optionally enable a pre-built Glue MetadataHandler in your connector which will first attempt to fetch metadata from Glue about any table being queried before given you an opportunitiy to modify or re-write the retrieved metadata. This can be handy when you are using a custom format it S3 or if your data source doesn't have its own source of metadata (e.g. redis).
1717
* **Federated UDFs** - Athena can delegate calls for batchable Scalar UDFs to your Lambda function, allowing you to write your own custom User Defined Functions.
18-
* **AWS Secrets Manager Integration** - If your connectors need passwords or other sensitive information, you can optionally use the SDK's built in tooling to resolve secrets. For example, if you have a config with a jdbc connection string you can do: "jdbc://${username}:${password}@hostname:post?options" and the SDK will automatically replace ${username} and ${password} with AWS Secrets Manager secrets of the same name.
18+
* **AWS Secrets Manager Integration** - If your connectors need passwords or other sensitive information, you can optionally use the SDK's built in tooling to resolve secrets. For example, if you have a config with a jdbc connection string you can do: "jdbc://${username}:${password}@hostname:post?options" and the SDK will automatically replace ${username} and ${password} with AWS Secrets Manager secrets of the same name. To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
1919
* **Federated Identity** - When Athena federates a query to your connector, you may want to perform Authz based on the identitiy of the entity that executed the Athena Query.
2020
* **Partition Pruning** - Athena will call you connector to understand how the table being queried is partitioned as well as to obtain which partitions need to be read for a given query. If your source supports partitioning, this give you an opportunity to use the query predicate to perform partition prunning.
2121
* **Parallelized & Pipelined Reads** - Athena will parallelize reading your tables based on the partitioning information you provide. You also have the opportunity to tell Athena how (and if) it should split each partition into multiple (potentially concurrent) read operations. Behind the scenes Athena will parallelize reading the split (work units) you've created and pipeline reads to reduce the performance impact of reading a remote source.

athena-hbase/README.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -29,11 +29,11 @@ You can also provide one or more properties which define the HBase connection de
2929

3030
To support these two SQL statements we'd need to add two environment variables to our Lambda function:
3131

32-
1. **hbase_instance_1** - The value should be the HBase connection details in the format of: master_hostname:zookeeper_port:hbase_port
33-
2. **hbase_instance_2** - The value should be the HBase connection details in the format of: master_hostname:zookeeper_port:hbase_port
34-
35-
You can also optionally use SecretsManager for part or all of the value for the preceeding connection details. For example, if I set a Lambda environment variable for **hbase_instance_1** to be "${hbase_host_1}:${hbase_master_port_1}:${hbase_zookeeper_port_1}" the Athena Federation SDK will automatically attempt to retrieve a secret from AWS SecretsManager named "hbase_host_1" and inject that value in place of "${hbase_host_1}". It wil do the same for the other secrets: hbase_zookeeper_port_1, hbase_master_port_1. Basically anything between ${...} is attempted as a secret in SecretsManager. If no such secret exists, the text isn't replaced.
32+
1. **hbase_instance_1** - The value should be the HBase connection details in the format of: master_hostname:hbase_port:zookeeper_port
33+
2. **hbase_instance_2** - The value should be the HBase connection details in the format of: master_hostname:hbase_port:zookeeper_port
3634

35+
You can also optionally use SecretsManager for part or all of the value for the preceding connection details. For example, if I set a Lambda environment variable for **hbase_instance_1** to be "${hbase_host_1}:${hbase_master_port_1}:${hbase_zookeeper_port_1}" the Athena Federation SDK will automatically attempt to retrieve a secret from AWS SecretsManager named "hbase_host_1" and inject that value in place of "${hbase_host_1}". It wil do the same for the other secrets: hbase_zookeeper_port_1, hbase_master_port_1. Basically anything between ${...} is attempted as a secret in SecretsManager. If no such secret exists, the text isn't replaced.
36+
To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
3737

3838
### Setting Up Databases & Tables
3939

athena-hbase/athena-hbase.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,7 @@ Parameters:
4646
Description: 'The name or prefix of a set of names within Secrets Manager that this function should have access to. (e.g. hbase-*).'
4747
Type: String
4848
HBaseConnectionString:
49-
Description: 'The HBase connection details to use by default in the format: master_hostname:zookeeper_port:hbase_port and optionally using SecretsManager (e.g. ${secret_name}).'
49+
Description: 'The HBase connection details to use by default in the format: master_hostname:hbase_port:zookeeper_port and optionally using SecretsManager (e.g. ${secret_name}).'
5050
Type: String
5151
Resources:
5252
ConnectorConfig:

athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactory.java

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@
3535
/**
3636
* Creates and Caches HBase Connection Instances, using the connection string as the cache key.
3737
*
38-
* @Note Connection String format is expected to be host:zookeeper_port:master_port
38+
* @Note Connection String format is expected to be host:master_port:zookeeper_port
3939
*/
4040
public class HbaseConnectionFactory
4141
{
@@ -78,7 +78,7 @@ public synchronized Map<String, String> getClientConfigs()
7878
/**
7979
* Gets or Creates an HBase connection for the given connection string.
8080
*
81-
* @param conStr HBase connection details, format is expected to be host:zookeeper_port:master_port
81+
* @param conStr HBase connection details, format is expected to be host:master_port:zookeeper_port
8282
* @return An HBase connection if the connection succeeded, else the function will throw.
8383
*/
8484
public synchronized Connection getOrCreateConn(String conStr)

athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/connection/HbaseConnectionFactory.java

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@
3333
/**
3434
* Creates and Caches HBase Connection Instances, using the connection string as the cache key.
3535
*
36-
* @Note Connection String format is expected to be host:zookeeper_port:master_port
36+
* @Note Connection String format is expected to be host:master_port:zookeeper_port
3737
*/
3838
public class HbaseConnectionFactory
3939
{
@@ -78,7 +78,7 @@ public synchronized Map<String, String> getClientConfigs()
7878
/**
7979
* Gets or Creates an HBase connection for the given connection string.
8080
*
81-
* @param conStr HBase connection details, format is expected to be host:zookeeper_port:master_port
81+
* @param conStr HBase connection details, format is expected to be host:master_port:zookeeper_port
8282
* @return An HBase connection if the connection succeeded, else the function will throw.
8383
*/
8484
public synchronized HBaseConnection getOrCreateConn(String conStr)
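
The corrected ordering in the Javadoc above (`host:master_port:zookeeper_port`) can be illustrated with a small parser. This is a hedged sketch only: `HbaseConnectionStringSketch` and its fields are hypothetical names, not the connector's real parsing code, and the ports in `main` are illustrative values. The point is simply that the second field is the HBase master port and the third is the ZooKeeper port.

```java
// Hypothetical sketch; the real HbaseConnectionFactory may parse differently.
final class HbaseConnectionStringSketch {
    final String host;
    final int masterPort;
    final int zookeeperPort;

    private HbaseConnectionStringSketch(String host, int masterPort, int zookeeperPort) {
        this.host = host;
        this.masterPort = masterPort;
        this.zookeeperPort = zookeeperPort;
    }

    /**
     * Parses "host:master_port:zookeeper_port". Throws IllegalArgumentException
     * when the string does not have exactly three fields or a port is not a number
     * (NumberFormatException is a subclass of IllegalArgumentException).
     */
    static HbaseConnectionStringSketch parse(String conStr) {
        String[] parts = conStr.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected host:master_port:zookeeper_port but got: " + conStr);
        }
        return new HbaseConnectionStringSketch(
                parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        // Illustrative values only.
        HbaseConnectionStringSketch c = parse("myhost:16000:2181");
        System.out.println(c.host + " master=" + c.masterPort + " zookeeper=" + c.zookeeperPort);
    }
}
```

A caller that still passes the old `host:zookeeper_port:master_port` order would parse without error here but connect to the wrong ports, which is why the README, YAML description, and Javadoc were all updated together in this commit.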

athena-jdbc/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -156,7 +156,7 @@ See respective database documentation for conversion between JDBC and database t
156156

157157
We support two ways to input database user name and password:
158158

159-
1. **AWS Secrets Manager:** The name of the secret in AWS Secrets Manager can be embedded in JDBC connection string, which is used to replace with `username` and `password` values from Secret. Support is tightly integrated for AWS RDS database instances. When using AWS RDS, we highly recommend using AWS Secrets Manager, including credential rotation. If your database is not using AWS RDS, store credentials as JSON in the following format `{“username”: “${username}”, “password”: “${password}”}.`.
159+
1. **AWS Secrets Manager:** The name of the secret in AWS Secrets Manager can be embedded in JDBC connection string, which is used to replace with `username` and `password` values from Secret. Support is tightly integrated for AWS RDS database instances. When using AWS RDS, we highly recommend using AWS Secrets Manager, including credential rotation. If your database is not using AWS RDS, store credentials as JSON in the following format `{“username”: “${username}”, “password”: “${password}”}.`. To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
160160
2. **Connection String:** Username and password can be specified as properties in the JDBC connection string.
161161

162162
# Partitions and Splits

athena-redis/README.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,11 +23,13 @@ The Athena Redis Connector exposes several configuration options via Lambda envi
2323

2424
To enable a Glue Table for use with Redis, you can set the following properties on the Table. redis-endpoint , redis-value-type, and one of redis-keys-zset or redis-key-prefix. Also note that any Glue database which may contain redis tables should have "redis-db-flag" somewhere in the URI property of the Database. You can set this from the Glue Console by editing the database.
2525

26-
1. **redis-endpoint** - (required) The hostname:port:password of the redis server that data for this table should come from. (e.g. athena-federation-demo.cache.amazonaws.com:6379) Alternatively, you can store the endpoint or part of the endpoint in SecretsManager by using ${secret_name} as the table property value.
26+
1. **redis-endpoint** - (required) The hostname:port:password of the redis server that data for this table should come from. (e.g. athena-federation-demo.cache.amazonaws.com:6379) Alternatively, you can store the endpoint or part of the endpoint in AWS Secrets Manager by using ${secret_name} as the table property value.*
2727
2. **redis-keys-zset** - (required if not using # 3) A comma separated list of keys whose value is a zset. Each of the values in the zset is then treated as a key that is part of this table. You must set either this or redis-key-prefix. (e.g. active-orders,pending-orders)
2828
3. **redis-key-prefix** - (required if not using # 2) A comma separated list of key prefixes to scan for values that should be part of this table. You must set either this or redis-keys-zset on the table. (e.g. accounts-*,acct-)
2929
4. **redis-value-type** - (required) Defines how the value for the keys defined by either redis-key-prefix or redis-keys-zset will be mapped to your table. literal maps to a single column. zset also maps to a single column but each key can essentially store N rows. hash allows for each key to be a row with multiple columns. (e.g. hash or literal or zset)
30-
30+
31+
*To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html#vpc-endpoint-create) to connect to Secrets Manager.
32+
3133
### Data Types
3234

3335
All Redis values are retrieved as the basic String data type. From there they are converted to one of the below Apache Arrow data types used by the Athena Query Federation SDK based on how you've defined your table(s) in Glue's DataCatalog.
