install sm-spark-cli only if sagemaker workflows is enabled #915
Changes from all commits
```diff
@@ -1,9 +1,18 @@
 #!/bin/bash
 RESOURCE_METADATA_FILE=/opt/ml/metadata/resource-metadata.json
+DZ_DOMAIN_ID=$(jq -r '.AdditionalMetadata.DataZoneDomainId' < $RESOURCE_METADATA_FILE)
+DZ_PROJECT_ID=$(jq -r '.AdditionalMetadata.DataZoneProjectId' < $RESOURCE_METADATA_FILE)
+DZ_DOMAIN_REGION=$(jq -r '.AdditionalMetadata.DataZoneDomainRegion' < $RESOURCE_METADATA_FILE)
+DZ_ENDPOINT=$(jq -r '.AdditionalMetadata.DataZoneEndpoint' < $RESOURCE_METADATA_FILE)
+
-# install sm-spark-cli
-sudo curl -LO https://github.com/aws-samples/amazon-sagemaker-spark-ui/releases/download/v0.9.1/amazon-sagemaker-spark-ui.tar.gz && \
-sudo tar -xvzf amazon-sagemaker-spark-ui.tar.gz && \
-sudo chmod +x amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
-sudo amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
-rm -rf ~/.m2 && \
-sudo rm -rf amazon-sagemaker-spark-ui*
+# install sm-spark-cli if workflows blueprint is enabled
+if [ "$(python /etc/sagemaker-ui/workflows/workflow_client.py check-blueprint --region "$DZ_DOMAIN_REGION" --domain-id "$DZ_DOMAIN_ID" --endpoint "$DZ_ENDPOINT" --project-id "$DZ_PROJECT_ID")" = "True" ]; then
+echo "Workflows blueprint is enabled. Installing sm-spark-cli."
+# install sm-spark-cli
+sudo curl -LO https://github.com/aws-samples/amazon-sagemaker-spark-ui/releases/download/v0.9.1/amazon-sagemaker-spark-ui.tar.gz && \
+sudo tar -xvzf amazon-sagemaker-spark-ui.tar.gz && \
+sudo chmod +x amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
+sudo amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
+rm -rf ~/.m2 && \
+sudo rm -rf amazon-sagemaker-spark-ui*
+fi
```
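The gate above compares the checker's stdout to the literal string "True", which is why every failure path inside the checker must print "False" rather than raise. A minimal Python sketch of the same pattern; the checker commands here are stand-ins, not the real workflow_client.py invocation:

```python
import subprocess
import sys


def should_install(cmd):
    """Run a checker command and gate on its exact stdout.

    Mirrors the shell gate: [ "$(python ... check-blueprint ...)" = "True" ].
    Anything other than the literal string "True" (including a traceback or an
    empty output from a crash) fails closed and skips the install.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.strip() == "True"


# Stand-in checker commands in place of the real workflow_client.py:
print(should_install([sys.executable, "-c", "print('True')"]))   # True
print(should_install([sys.executable, "-c", "print('False')"]))  # False
```

Failing closed is the safe default here: a transient API error means the Spark history server simply is not installed on that startup, rather than the script hanging or erroring out.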
```diff
@@ -47,23 +47,45 @@ def stop_local_runner(session: requests.Session, **kwargs):
     )
     return _validate_response("StopLocalRunner", response)


-def check_blueprint(region: str, domain_id: str, endpoint: str, **kwargs):
+def check_blueprint(region: str, domain_id: str, endpoint: str, project_id: str, **kwargs):
     DZ_CLIENT = boto3.client("datazone")
     # add correct endpoint for gamma env
     if endpoint != "":
         DZ_CLIENT = boto3.client("datazone", endpoint_url=endpoint)
     try:
+        # check if workflows blueprint is enabled in project profile
+        project_profile_id = DZ_CLIENT.get_project(
+            domainIdentifier=domain_id, identifier=project_id
+        )["projectProfileId"]
+        project_blueprints = DZ_CLIENT.get_project_profile(
+            domainIdentifier=domain_id, identifier=project_profile_id
+        )["environmentConfigurations"]
+        proj_blueprint_ids = [proj_env_config["environmentBlueprintId"] for proj_env_config in project_blueprints]
         blueprint_id = DZ_CLIENT.list_environment_blueprints(
             managed=True, domainIdentifier=domain_id, name="Workflows"
         )["items"][0]["id"]
-        blueprint_config = DZ_CLIENT.get_environment_blueprint_configuration(
-            domainIdentifier=domain_id, environmentBlueprintIdentifier=blueprint_id
-        )
-        enabled_regions = blueprint_config["enabledRegions"]
-        print(str(region in enabled_regions))
+        if blueprint_id in proj_blueprint_ids:
+            blueprint_config = DZ_CLIENT.get_environment_blueprint_configuration(
+                domainIdentifier=domain_id, environmentBlueprintIdentifier=blueprint_id
+            )
+            enabled_regions = blueprint_config["enabledRegions"]
+            print(str(region in enabled_regions))
+        else:
+            print("False")
     except:
-        print("False")
+        # fallback to checking if only workflows blueprint exists
+        try:
+            blueprint_id = DZ_CLIENT.list_environment_blueprints(
+                managed=True, domainIdentifier=domain_id, name="Workflows"
+            )["items"][0]["id"]
+            blueprint_config = DZ_CLIENT.get_environment_blueprint_configuration(
+                domainIdentifier=domain_id, environmentBlueprintIdentifier=blueprint_id
+            )
+            enabled_regions = blueprint_config["enabledRegions"]
+            print(str(region in enabled_regions))
+        except:
+            print("False")
```

Review thread on the name="Workflows" lookup:

Contributor: IIRC users can modify blueprint names, and we've had issues in the past relying on the blueprint name. Can we get the environment blueprint by looking at the type or another field instead of the name?

Contributor (author): The workflows blueprint is a managed blueprint provided by SageMaker, so customers won't be able to edit the blueprint. Unfortunately, list environment blueprints only accepts searching by

Contributor: Got it, thanks for clarifying. But the benefit is at least we don't hang anymore, so while not 100% foolproof, it does still fix this edge case.
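The decision logic in check_blueprint can be exercised without AWS credentials by stubbing the DataZone client, which makes the two conditions (blueprint attached to the project profile, and enabled in the region) easy to see. This is a simplified sketch, not the code under review: the stub's responses and all IDs are invented, and the function returns a bool where the real one prints.

```python
# Stub whose method names mirror the boto3 DataZone calls in the diff;
# all IDs and response data are hypothetical.
class StubDataZoneClient:
    def get_project(self, domainIdentifier, identifier):
        return {"projectProfileId": "pp-1"}

    def get_project_profile(self, domainIdentifier, identifier):
        return {"environmentConfigurations": [{"environmentBlueprintId": "bp-workflows"}]}

    def list_environment_blueprints(self, managed, domainIdentifier, name):
        return {"items": [{"id": "bp-workflows"}]}

    def get_environment_blueprint_configuration(self, domainIdentifier, environmentBlueprintIdentifier):
        return {"enabledRegions": ["us-east-1", "us-west-2"]}


def is_workflows_enabled(client, region, domain_id, project_id):
    """True iff the Workflows blueprint is attached to the project's
    profile AND enabled in the given region."""
    profile_id = client.get_project(
        domainIdentifier=domain_id, identifier=project_id
    )["projectProfileId"]
    configs = client.get_project_profile(
        domainIdentifier=domain_id, identifier=profile_id
    )["environmentConfigurations"]
    attached = {c["environmentBlueprintId"] for c in configs}
    blueprint_id = client.list_environment_blueprints(
        managed=True, domainIdentifier=domain_id, name="Workflows"
    )["items"][0]["id"]
    if blueprint_id not in attached:
        return False
    config = client.get_environment_blueprint_configuration(
        domainIdentifier=domain_id, environmentBlueprintIdentifier=blueprint_id
    )
    return region in config["enabledRegions"]


print(is_workflows_enabled(StubDataZoneClient(), "us-east-1", "dzd-example", "prj-example"))  # True
print(is_workflows_enabled(StubDataZoneClient(), "eu-west-1", "dzd-example", "prj-example"))  # False
```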
```diff
 COMMAND_REGISTRY = {
@@ -94,6 +116,9 @@ def main():
     check_blueprint_parser.add_argument(
         "--endpoint", type=str, required=True, help="Datazone endpoint for blueprint check"
     )
+    check_blueprint_parser.add_argument(
+        "--project-id", type=str, required=True, help="Datazone Project ID for blueprint check"
+    )

     args = parser.parse_args()
```
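The new --project-id flag is just another required option on the check-blueprint subcommand. A self-contained argparse sketch of how it parses; the parser construction here is illustrative, not copied from workflow_client.py:

```python
import argparse

# Illustrative reconstruction of the check-blueprint subcommand's arguments.
parser = argparse.ArgumentParser(prog="workflow_client.py")
subparsers = parser.add_subparsers(dest="command", required=True)

check_blueprint_parser = subparsers.add_parser("check-blueprint")
check_blueprint_parser.add_argument("--region", type=str, required=True)
check_blueprint_parser.add_argument("--domain-id", type=str, required=True)
check_blueprint_parser.add_argument("--endpoint", type=str, required=True)
check_blueprint_parser.add_argument(
    "--project-id", type=str, required=True, help="Datazone Project ID for blueprint check"
)

# argparse maps "--project-id" to args.project_id; an empty --endpoint is
# accepted, matching the script's "use default endpoint if empty" convention.
args = parser.parse_args([
    "check-blueprint",
    "--region", "us-east-1",
    "--domain-id", "dzd-example",
    "--endpoint", "",
    "--project-id", "prj-example",
])
print(args.project_id)  # prj-example
```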
```diff
@@ -1,9 +1,18 @@
 #!/bin/bash
 RESOURCE_METADATA_FILE=/opt/ml/metadata/resource-metadata.json
+DZ_DOMAIN_ID=$(jq -r '.AdditionalMetadata.DataZoneDomainId' < $RESOURCE_METADATA_FILE)
+DZ_PROJECT_ID=$(jq -r '.AdditionalMetadata.DataZoneProjectId' < $RESOURCE_METADATA_FILE)
+DZ_DOMAIN_REGION=$(jq -r '.AdditionalMetadata.DataZoneDomainRegion' < $RESOURCE_METADATA_FILE)
+DZ_ENDPOINT=$(jq -r '.AdditionalMetadata.DataZoneEndpoint' < $RESOURCE_METADATA_FILE)
+
-# install sm-spark-cli
-sudo curl -LO https://github.com/aws-samples/amazon-sagemaker-spark-ui/releases/download/v0.9.1/amazon-sagemaker-spark-ui.tar.gz && \
-sudo tar -xvzf amazon-sagemaker-spark-ui.tar.gz && \
-sudo chmod +x amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
-sudo amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
-rm -rf ~/.m2 && \
-sudo rm -rf amazon-sagemaker-spark-ui*
+# install sm-spark-cli if workflows blueprint is enabled
+if [ "$(python /etc/sagemaker-ui/workflows/workflow_client.py check-blueprint --region "$DZ_DOMAIN_REGION" --domain-id "$DZ_DOMAIN_ID" --endpoint "$DZ_ENDPOINT" --project-id "$DZ_PROJECT_ID")" = "True" ]; then
+echo "Workflows blueprint is enabled. Installing sm-spark-cli."
+# install sm-spark-cli
+sudo curl -LO https://github.com/aws-samples/amazon-sagemaker-spark-ui/releases/download/v0.9.1/amazon-sagemaker-spark-ui.tar.gz && \
+sudo tar -xvzf amazon-sagemaker-spark-ui.tar.gz && \
+sudo chmod +x amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
+sudo amazon-sagemaker-spark-ui/install-scripts/studio/install-history-server.sh && \
+rm -rf ~/.m2 && \
+sudo rm -rf amazon-sagemaker-spark-ui*
+fi
```
Review thread on the echo line:

Contributor: nit: can we put this print statement in the install script to keep the post startup script clean?

Contributor (author): Extending this by one line is as clean as we can get while still safely gating the install.