Member

@tizhou86 tizhou86 commented Jan 5, 2017

@cxwangyi I tweaked a few places. Guoyan is going through this tutorial step by step, and we will make further tweaks after he completes it.

Collaborator

@wangkuiyi wangkuiyi left a comment

Nice fixes!


```
-$ aws kms --region=us-west-2 create-key --description="kube-aws assets"
+$ aws kms --region=us-west-1 create-key --description="kube-aws assets"
Collaborator

Why was everything here changed to us-west-1?

Member Author

CoreOS defaults to west-1, whereas Debian previously defaulted to west-2.

You will use the `KeyMetadata.Arn` string to identify your KMS key in the init step.

-And then you need to add several inline policies in your user permission.
+And then you need to add several inline policies in your user permission, which located under `iam resources/users/your own account`, click the `add inline policy` at the right-botton corner.
Collaborator

This sentence is ungrammatical. A user permission cannot be "located" somewhere on the screen; buttons or combo lists can be located somewhere on the screen.

Member Author

Right, I will go through this step again over the weekend and add a picture.

]
}
```
NOTICE: you need to substitute `YOUR_CLUSTER_NAME` above for your own cluster name.
Collaborator

substitute ==> replace ... by ...

Member Author

Ok.
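
Editor's note: the placeholder replacement discussed in this thread could be scripted rather than done by hand. The following is only a minimal sketch under stated assumptions: the policy fragment, the `render_policy` helper, and the cluster name `paddle-cluster` are hypothetical and are not taken from the tutorial, whose real policy text is only partially visible in this diff.

```python
import json

# Hypothetical policy fragment for illustration only; the real inline policy
# comes from the tutorial and is not reproduced here.
POLICY_TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudformation:DescribeStacks"],
      "Resource": ["arn:aws:cloudformation:us-west-1:*:stack/YOUR_CLUSTER_NAME/*"]
    }
  ]
}
"""


def render_policy(template: str, cluster_name: str) -> dict:
    """Replace the YOUR_CLUSTER_NAME placeholder and parse the result as JSON."""
    if not cluster_name:
        raise ValueError("cluster_name must not be empty")
    rendered = template.replace("YOUR_CLUSTER_NAME", cluster_name)
    # json.loads raises an error if the substitution produced invalid JSON.
    return json.loads(rendered)


# "paddle-cluster" is an assumed example name, not one used by the tutorial.
policy = render_policy(POLICY_TEMPLATE, "paddle-cluster")
print(json.dumps(policy, indent=2))
```

Parsing the rendered text before pasting it into the IAM console catches a stray quote or brace early, which is easier than debugging a rejected policy.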

####Upload Training Data File

-Here we will use PaddlePaddle's official recommendation demo as the content for this training, we put the training data file into a directory named by job name, which located in EFS sharing volume, the tree structure for the directory looks like:
+Here we will use PaddlePaddle's official recommendation demo as the content for this training, we put the training data file into a directory named by job name, which located under EFS sharing volume, the tree structure for the directory looks like:
Collaborator

located under ==> in an

Member Author

Ok.

```

-The `paddle-cluster-job` directory is the job name for this training, this training includes 3 PaddlePaddle node, we store the pre-divided data under `paddle-cluster-job/data` directory, directory 0, 1, 2 each represent 3 nodes' trainer_id. the training data in in recommendation directory, the training results and logs will be in the output directory.
+The `paddle-cluster-job` directory is the job name for this training, this training includes 3 PaddlePaddle node, we store the pre-divided data under `paddle-cluster-job/data` directory, directory 0, 1, 2 each represent 3 nodes' trainer_id. the training configuration file is in recommendation directory, the training results and logs will be in the output directory after the training.
Collaborator

recommendation directory ==> directory recommendation

Member Author

Ok.

```

-The `paddle-cluster-job` directory is the job name for this training, this training includes 3 PaddlePaddle node, we store the pre-divided data under `paddle-cluster-job/data` directory, directory 0, 1, 2 each represent 3 nodes' trainer_id. the training data in in recommendation directory, the training results and logs will be in the output directory.
+The `paddle-cluster-job` directory is the job name for this training, this training includes 3 PaddlePaddle node, we store the pre-divided data under `paddle-cluster-job/data` directory, directory 0, 1, 2 each represent 3 nodes' trainer_id. the training configuration file is in recommendation directory, the training results and logs will be in the output directory after the training.
Collaborator

I don't think "after the training" is needed.

Member Author

OK, sounds good.
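
Editor's note: the job directory layout described in the quoted paragraph can be sketched in a few lines. This is an illustration under assumptions, not the tutorial's own tooling: the `create_job_layout` helper is hypothetical, a temporary directory stands in for the EFS mount point, and placing `recommendation` and `output` directly under the job directory is an assumption, since the tree listing itself is not visible in this diff.

```python
import tempfile
from pathlib import Path
from typing import List


def create_job_layout(efs_root: Path, job_name: str, num_trainers: int) -> List[Path]:
    """Create one data directory per trainer_id plus the config and output directories."""
    job_dir = efs_root / job_name
    created = []
    # Directories data/0, data/1, ... hold the pre-divided data for each trainer_id.
    for trainer_id in range(num_trainers):
        data_dir = job_dir / "data" / str(trainer_id)
        data_dir.mkdir(parents=True, exist_ok=True)
        created.append(data_dir)
    # Assumed siblings of data/: training configuration, and results and logs.
    for name in ("recommendation", "output"):
        extra_dir = job_dir / name
        extra_dir.mkdir(parents=True, exist_ok=True)
        created.append(extra_dir)
    return created


# A temporary directory stands in for the EFS mount point in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    paths = create_job_layout(Path(tmp), "paddle-cluster-job", 3)
    layout = sorted(p.relative_to(tmp).as_posix() for p in paths)
    for entry in layout:
        print(entry)
```

Deriving the per-trainer directories from `num_trainers` keeps the layout in step with the node count, so a 3-node job always gets directories 0, 1, and 2.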

In yaml file, we describe the Docker image we use for this training, the node number we need to startup, the volume mounting information and all the necessary parameters we need for `paddle pserver` and `paddle train` processes.

-The yaml file content is as follows:
+The yaml file content is as follows, you need to fill in your own docker image uri:
Collaborator

docker ==> Docker (proper nouns are capitalized)
url ==> URL (abbreviations should be uppercase)

Member Author

Ok.

@wangkuiyi
Collaborator

Sorry, it seems this PR is still open because I did not follow up on it in time?

It looks like it conflicts with the develop branch. How large is the conflict?

@helinwang
Contributor

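The tutorial on the develop branch now uses a k8s persistent volume that points directly at EFS, instead of first mounting EFS locally and then using a k8s host volume. I don't think this PR needs to be merged anymore. Thanks to @tizhou86 for the improvements and to @wangkuiyi for the review.
Given all this, I will close this PR for now.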

@helinwang helinwang closed this Feb 8, 2017
zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019
lizexu123 pushed a commit to lizexu123/Paddle that referenced this pull request Feb 23, 2024