Commit 9ddcdc9

Update README format
1 parent c71ac3a commit 9ddcdc9

1 file changed, +38 -52 lines changed

dataflow/run_template/README.md

Lines changed: 38 additions & 52 deletions
@@ -1,55 +1,52 @@
 # Run template

-[`main.py`](main.py) - Script to run an [Apache Beam] template on [Google Cloud Dataflow].
+[![Open in Cloud Shell](http://gstatic.com/cloudssh/images/open-btn.svg)](https://console.cloud.google.com/cloudshell/editor)

-The following examples show how to run the [`Word_Count` template], but you can run any other template.
+This sample demonstrate how to run an
+[Apache Beam](https://beam.apache.org/)
+template on [Google Cloud Dataflow](https://cloud.google.com/dataflow/docs/).
+For more information, see the
+[Running templates](https://cloud.google.com/dataflow/docs/guides/templates/running-templates)
+docs page.

-For the `Word_Count` template, we require to pass an `output` Cloud Storage path prefix, and optionally we can pass an `inputFile` Cloud Storage file pattern for the inputs.
+The following examples show how to run the
+[`Word_Count` template](https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/templates/WordCount.java),
+but you can run any other template.
+
+For the `Word_Count` template, we require to pass an `output` Cloud Storage path prefix,
+and optionally we can pass an `inputFile` Cloud Storage file pattern for the inputs.
 If `inputFile` is not passed, it will take `gs://apache-beam-samples/shakespeare/kinglear.txt` as default.

 ## Before you begin

-1. Install the [Cloud SDK].
-
-1. [Create a new project].
-
-1. [Enable billing].
-
-1. [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=dataflow,compute_component,logging,storage_component,storage_api,bigquery,pubsub,datastore.googleapis.com,cloudfunctions.googleapis.com,cloudresourcemanager.googleapis.com): Dataflow, Compute Engine, Stackdriver Logging, Cloud Storage, Cloud Storage JSON, BigQuery, Pub/Sub, Datastore, Cloud Functions, and Cloud Resource Manager.
-
-1. Setup the Cloud SDK to your GCP project.
-
-```bash
-gcloud init
-```
+Follow the
+[Getting started with Google Cloud Dataflow](../README.md)
+page, and make sure you have a Google Cloud project with billing enabled
+and a *service account JSON key* set up in your `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
+Additionally, for this sample you need the following:

 1. Create a Cloud Storage bucket.

-```bash
-gsutil mb gs://your-gcs-bucket
+```sh
+export BUCKET=your-gcs-bucket
+gsutil mb gs://$BUCKET
 ```

-## Setup
-
-The following instructions will help you prepare your development environment.
-
-1. [Install Python and virtualenv].
-
 1. Clone the `python-docs-samples` repository.

-```bash
-git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
-```
+```sh
+git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
+```

 1. Navigate to the sample code directory.

-```bash
+```sh
 cd python-docs-samples/dataflow/run_template
 ```

 1. Create a virtual environment and activate it.

-```bash
+```sh
 virtualenv env
 source env/bin/activate
 ```
@@ -58,18 +55,18 @@ The following instructions will help you prepare your development environment.

 1. Install the sample requirements.

-```bash
+```sh
 pip install -U -r requirements.txt
 ```

 ## Running locally

-To run a Dataflow template from the command line.
+* [`main.py`](main.py)
+* [REST API dataflow/projects.templates.launch](https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.templates/launch)

-> NOTE: To run locally, you'll need to [create a service account key] as a JSON file.
-> Then export an environment variable called `GOOGLE_APPLICATION_CREDENTIALS` pointing it to your service account file.
+To run a Dataflow template from the command line.

-```bash
+```sh
 python main.py \
 --project <your-gcp-project> \
 --job wordcount-$(date +'%Y%m%d-%H%M%S') \
@@ -80,10 +77,10 @@ python main.py \

 ## Running in Python

-To run a Dataflow template from Python.
+* [`main.py`](main.py)
+* [REST API dataflow/projects.templates.launch](https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.templates/launch)

-> NOTE: To run locally, you'll need to [create a service account key] as a JSON file.
-> Then export an environment variable called `GOOGLE_APPLICATION_CREDENTIALS` pointing it to your service account file.
+To run a Dataflow template from Python.

 ```py
 import main as run_template
@@ -101,9 +98,12 @@ run_template.run(

 ## Running in Cloud Functions

+* [`main.py`](main.py)
+* [REST API dataflow/projects.templates.launch](https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.templates/launch)
+
 To deploy this into a Cloud Function and run a Dataflow template via an HTTP request as a REST API.

-```bash
+```sh
 PROJECT=$(gcloud config get-value project) \
 REGION=$(gcloud config get-value functions/region)

@@ -121,17 +121,3 @@ curl -X POST "https://$REGION-$PROJECT.cloudfunctions.net/run_template" \
 -d inputFile=gs://apache-beam-samples/shakespeare/kinglear.txt \
 -d output=gs://<your-gcs-bucket>/wordcount/outputs
 ```
-
-[Apache Beam]: https://beam.apache.org/
-[Google Cloud Dataflow]: https://cloud.google.com/dataflow/docs/
-[`Word_Count` template]: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/templates/WordCount.java
-
-[Cloud SDK]: https://cloud.google.com/sdk/docs/
-[Create a new project]: https://console.cloud.google.com/projectcreate
-[Enable billing]: https://cloud.google.com/billing/docs/how-to/modify-project
-[Create a service account key]: https://console.cloud.google.com/apis/credentials/serviceaccountkey
-[Creating and managing service accounts]: https://cloud.google.com/iam/docs/creating-managing-service-accounts
-[GCP Console IAM page]: https://console.cloud.google.com/iam-admin/iam
-[Granting roles to service accounts]: https://cloud.google.com/iam/docs/granting-roles-to-service-accounts
-
-[Install Python and virtualenv]: https://cloud.google.com/python/setup
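
The README in this commit links each "Running" section to the `projects.templates.launch` REST method, which takes a project ID, the template's Cloud Storage path, and a body carrying the job name and template parameters. As a rough illustration of how the `Word_Count` inputs described in the README might map onto that call, here is a minimal sketch. `build_launch_request` is a hypothetical helper that is not part of `main.py`, the template path `gs://dataflow-templates/latest/Word_Count` is an assumption, and the timestamp uses UTC rather than the local time that the shell `date` command would give.

```python
from datetime import datetime, timezone


def build_launch_request(project, bucket, input_file=None, now=None):
    """Assemble arguments for a templates.launch call (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Timestamped job name in the same shape as the shell example:
    # wordcount-YYYYmmdd-HHMMSS.
    job_name = now.strftime("wordcount-%Y%m%d-%H%M%S")

    # `output` is required by Word_Count; `inputFile` is optional and the
    # template falls back to its own default when it is omitted.
    parameters = {"output": f"gs://{bucket}/wordcount/outputs"}
    if input_file:
        parameters["inputFile"] = input_file

    return {
        "projectId": project,
        # Assumed location of the Google-provided template.
        "gcsPath": "gs://dataflow-templates/latest/Word_Count",
        "body": {"jobName": job_name, "parameters": parameters},
    }


if __name__ == "__main__":
    request = build_launch_request("your-gcp-project", "your-gcs-bucket")
    print(request)
```

With the `google-api-python-client` package, a dictionary of this shape could plausibly be passed as `dataflow.projects().templates().launch(**request).execute()`, though the sample's own `main.py` remains the reference for how the call is actually made.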
