Update Notebook Testing script to do parallel processing #2102
Conversation
Hello @holtskinner, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team,
Gemini here, providing a summary of this pull request authored by @holtskinner. The primary goal of this PR, as indicated by the title, is to enhance the notebook testing script by enabling parallel execution of notebooks. Previously, the script processed notebooks sequentially, launching one, waiting for it to complete, and then moving to the next. This change refactors the script to launch all notebook executions concurrently and then monitor their status until all are finished. This approach should significantly reduce the overall execution time for the notebook test suite.
The changes involve a substantial rewrite of the main execution logic in the shell script, introducing functions to manage the launching, monitoring, and reporting phases of the test run.
Highlights
- Parallel Execution: The core change is the transition from sequential to parallel execution of notebooks. The script now launches all notebook tests using gcloud colab executions create without waiting for each to finish (a rough sketch of this pattern follows this list).
- Refactored Script Structure: The script logic has been broken down into three distinct functions: launch_notebook (to start executions), monitor_executions (to track their status), and publish_results (to report the final outcome via Pub/Sub).
- Concurrent Monitoring: A new monitoring loop (monitor_executions) is introduced that periodically checks the status of all pending notebook executions until they have all completed (either successfully or failed).
- Simplified Data Handling: The script now uses readarray to read the list of notebooks and < for reading file contents into variables, which are generally more robust methods in Bash.
- Improved Error Handling: The set -e command has been added at the beginning of the script to ensure that the script exits immediately if any command fails.
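The launch-then-poll pattern summarized above could look roughly like the sketch below. This is only an illustration of the approach, not the PR's actual script: REGION, notebooks.txt, EXTRA_CREATE_FLAGS, the polling interval, and the omitted gcloud colab executions create flags are all assumptions.

```bash
#!/bin/bash
# Minimal sketch of the launch-then-monitor pattern described above.
# REGION, notebooks.txt, EXTRA_CREATE_FLAGS, and the 30s poll interval are
# illustrative placeholders, not values taken from the PR.
set -e

declare -A operation_map
completed_success=()
completed_failure=()

launch_notebook() {
  local path="$1"
  local op
  # Start the execution and record its operation name without waiting for it.
  op=$(gcloud colab executions create \
        --execution-timeout="1h30m" \
        --format="value(name)" \
        $EXTRA_CREATE_FLAGS)  # notebook- and region-specific flags omitted here
  operation_map["$path"]="$op"
}

monitor_executions() {
  local pending=("$@")
  while ((${#pending[@]} > 0)); do
    local still_pending=()
    for path in "${pending[@]}"; do
      local id="${operation_map["$path"]}"
      local status
      status=$(gcloud colab executions describe "$id" \
                 --region="$REGION" --format="value(jobState)")
      case "$status" in
        JOB_STATE_SUCCEEDED) completed_success+=("$path") ;;
        JOB_STATE_FAILED)    completed_failure+=("$path") ;;
        *)                   still_pending+=("$path") ;;  # still queued or running
      esac
    done
    pending=("${still_pending[@]}")
    sleep 30
  done
}

readarray -t notebooks < notebooks.txt
for nb in "${notebooks[@]}"; do launch_notebook "$nb"; done
monitor_executions "${notebooks[@]}"
```

The actual script also has a publish_results step that turns completed_success and completed_failure into a Pub/Sub message; that part is omitted from this sketch.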
Changelog
- .cloud-build/executor/test_notebooks.sh
  - Removed the sequential loop for executing notebooks.
  - Introduced launch_notebook, monitor_executions, and publish_results functions.
  - Implemented parallel launching of notebook executions using gcloud colab executions create.
  - Added a monitoring loop to check the status of all running executions concurrently.
  - Used readarray -t to read notebooks into an array (see the short example after this changelog).
  - Used < for reading file contents into variables.
  - Added set -e for stricter error handling.
  - Removed the --verbosity=debug flag from gcloud colab executions create.
  - Simplified the logic for collecting results and publishing the Pub/Sub message.
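For reference, here is a small, hedged illustration of the two Bash data-handling constructs called out above; the file names are placeholders, not necessarily those used in the script.

```bash
# readarray -t reads one entry per line into an array, trimming the newlines.
readarray -t notebooks < "notebooks_to_test.txt"
echo "Found ${#notebooks[@]} notebooks"

# $(< file) reads a whole file into a variable without spawning cat.
notebook_content=$(< "notebook_source.py")
```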
Scripts run side-by-side,
No longer must they hide,
Faster tests now fly,
Beneath the cloud-build sky,
Efficiency is our guide.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces parallel processing to the notebook testing script, a significant improvement for efficiency. The script is now more modular with the use of functions. While the core logic for parallelization is well-implemented, there are a few areas related to error handling and JSON report formatting that should be addressed to ensure robustness and correctness.
Summary of Findings
- JSON Formatting for Notebook Lists: The failed_notebooks and successful_notebooks fields in the Pub/Sub JSON message are formatted as single comma-separated strings instead of proper JSON arrays of strings. This should be corrected to ensure the consumer can parse these fields as lists (e.g., ["item1", "item2"] or []).
- Error Handling for gcloud describe: The script currently suppresses errors from gcloud colab executions describe and treats any failure of this command as if the notebook job itself failed. This can mask underlying infrastructure or gcloud command issues. Error handling should be improved to distinguish these cases.
- Robustness of Operation ID Usage: Accessing operation IDs from the operation_map should include a check to ensure an ID is actually found before attempting to use it with gcloud commands, to prevent errors if a path is unexpectedly missing from the map.
- Verbosity of gcloud create command: The --verbosity=debug flag was removed from the gcloud colab executions create command. Re-adding it could aid in debugging notebook launch issues.
Merge Readiness
The pull request introduces valuable parallelization to the notebook testing script. However, there are a few high and medium severity issues related to error handling and JSON reporting that should be addressed before merging to ensure the script's robustness and the correctness of its output. Specifically, the JSON formatting for lists of notebooks and the error handling for gcloud commands need attention. I am unable to approve pull requests directly, but I recommend making these changes for a more reliable script. Other reviewers should then perform a final check.
    for path in "${pending[@]}"; do
        local id="${operation_map["$path"]}"

        local status=$(gcloud colab executions describe "$id" --region="$REGION" --format="value(jobState)" 2>/dev/null || echo "JOB_STATE_FAILED")
The current error handling for gcloud colab executions describe might mask the true cause of failures and lead to misreporting.
Specifically, 2>/dev/null || echo "JOB_STATE_FAILED" means:
- Any error output from gcloud describe (e.g., network issues, permission problems, invalid ID format before the API query) is suppressed (2>/dev/null).
- If the gcloud describe command exits with a non-zero status for any reason, it is unconditionally treated as if the notebook job itself reached JOB_STATE_FAILED.

This prevents distinguishing between an actual failed notebook execution and an issue with the gcloud command or environment (e.g., a transient network error or a quota issue for describe calls).
Consider explicitly checking the gcloud command's exit status and logging its output/error for better diagnostics.
Suggested change:

    -local status=$(gcloud colab executions describe "$id" --region="$REGION" --format="value(jobState)" 2>/dev/null || echo "JOB_STATE_FAILED")
    +local gcloud_status_cmd_output
    +# Try to get the job state. If gcloud command fails, log its output and set a failure status.
    +if ! gcloud_status_cmd_output=$(gcloud colab executions describe "$id" --region="$REGION" --format="value(jobState)" 2>&1); then
    +    echo "Error: Failed to retrieve status for notebook '$path' (ID: $id). Marking as failed. gcloud output: $gcloud_status_cmd_output"
    +    status="JOB_STATE_FAILED_OR_ERROR" # Use a distinct status or JOB_STATE_FAILED
    +else
    +    status="$gcloud_status_cmd_output"
    +fi
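One possible refinement on top of this suggestion, shown only as a hedged sketch and not part of the proposed change, is to retry transient describe failures a few times before marking the notebook as failed, so gcloud or network hiccups are less likely to be misreported as notebook failures:

```bash
# Sketch only: retry `describe` up to 3 times before treating the notebook as failed.
# Assumes this fragment lives inside monitor_executions, where $id, $path, and
# $REGION are already defined.
local attempts=0 status=""
until status=$(gcloud colab executions describe "$id" \
                 --region="$REGION" --format="value(jobState)" 2>&1); do
  attempts=$((attempts + 1))
  if (( attempts >= 3 )); then
    echo "Error: describe failed ${attempts} times for '$path' (ID: $id): $status"
    status="JOB_STATE_FAILED"
    break
  fi
  sleep 10  # brief back-off between retries (assumed interval)
done
```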
    local fail_list=$(IFS=, ; echo "${completed_failure[*]}")
    local pass_list=$(IFS=, ; echo "${completed_success[*]}")

    local message="{\"total_count\":$total,\"failed_count\":$failed,\"failed_notebooks\":\"$fail_list\",\"successful_notebooks\":\"$pass_list\",\"successful_count\":$passed,\"execution_date\":\"$TIMESTAMP\"}"
The current construction of the failed_notebooks and successful_notebooks fields in the JSON message (line 85) does not produce a valid JSON array of strings.
Variables $fail_list and $pass_list (from lines 82-83) are simple comma-separated strings. For example, if completed_failure contains (nb1.ipynb nb2.ipynb), then $fail_list becomes nb1.ipynb,nb2.ipynb. The resulting JSON will be "failed_notebooks":"nb1.ipynb,nb2.ipynb", which is a single JSON string, not an array.
A correct JSON array of strings should look like "failed_notebooks":["nb1.ipynb","nb2.ipynb"], or "failed_notebooks":[] if the list is empty. This discrepancy can lead to parsing errors or incorrect data interpretation by the Pub/Sub message consumer.
To fix this, you'll need to modify how the lists of failed and successful notebooks are formatted into JSON strings. This typically involves iterating through the completed_failure and completed_success arrays, quoting each item, separating them with commas, and enclosing the result in square brackets (e.g., ["item1","item2"]). Special care is needed for empty arrays (resulting in []) and for proper escaping of characters within notebook names if they could conflict with JSON syntax (though unlikely for typical paths). One way to do this is sketched after the suggested change below.
Once you have variables, say fail_list_json_array and pass_list_json_array, containing these correctly formatted JSON array strings, this line should be updated to use them without the surrounding quotes.
Suggested change:

    -local message="{\"total_count\":$total,\"failed_count\":$failed,\"failed_notebooks\":\"$fail_list\",\"successful_notebooks\":\"$pass_list\",\"successful_count\":$passed,\"execution_date\":\"$TIMESTAMP\"}"
    +local message="{\"total_count\":$total,\"failed_count\":$failed,\"failed_notebooks\":$fail_list_json_array,\"successful_notebooks\":$pass_list_json_array,\"successful_count\":$passed,\"execution_date\":\"$TIMESTAMP\"}"
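As a hedged sketch of the approach described above (the helper name and its placement are assumptions, not part of the PR), the arrays could be converted like this:

```bash
# Sketch: turn a Bash array into a JSON array string, e.g. ["a.ipynb","b.ipynb"] or [].
# Assumes notebook paths contain no double quotes or backslashes.
to_json_array() {
  local items=("$@") json=""
  if [[ ${#items[@]} -eq 0 ]]; then
    echo "[]"
    return
  fi
  local item
  for item in "${items[@]}"; do
    json+="\"${item}\","
  done
  echo "[${json%,}]"   # strip the trailing comma and wrap in brackets
}

fail_list_json_array=$(to_json_array "${completed_failure[@]}")
pass_list_json_array=$(to_json_array "${completed_success[@]}")
```

If jq happens to be available in the build image, it could do this conversion instead, but that is an extra dependency assumption rather than something the script currently relies on.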
    --execution-timeout="1h30m" \
    --format="value(name)")
The --verbosity=debug flag was previously used with the gcloud colab executions create command (line 40 in the old code) but has been removed in this version. Was this removal intentional?
Retaining this flag can be very helpful for diagnosing issues that might occur during the notebook launch phase, providing more detailed logs from gcloud.
If its removal wasn't a specific requirement, consider re-adding it.
Suggested change:

     --execution-timeout="1h30m" \
    +--verbosity=debug \
     --format="value(name)")
    local still_pending=()

    for path in "${pending[@]}"; do
        local id="${operation_map["$path"]}"
There's a potential issue if operation_map["$path"] does not yield a valid operation ID. If $id is empty or unset (e.g., if launch_notebook failed to populate operation_map for some reason before set -e could act, or if $path is unexpectedly not in the map), the subsequent gcloud colab executions describe "$id" ... command on line 52 would effectively become gcloud colab executions describe "" ..., which will fail.
It would be more robust to check that $id is non-empty before attempting to use it.
Suggested change:

     local id="${operation_map["$path"]}"
    +if [[ -z "$id" ]]; then
    +    echo "Error: Could not find operation ID for path '$path'. Marking as failed and skipping status check."
    +    completed_failure+=("$path (Error: Missing Operation ID)")
    +    continue # Skip to the next path in the pending loop
    +fi