Parallel pipelines can create entities in DB #2446
Walkthrough
The updates focus on enhancing the artifact and model versioning system in ZenML, introducing a retry mechanism for creation processes, and improving error handling for entity existence. Changes include the addition of a new constant for maximum retries, refactoring of artifact creation logic, and implementation of efficient registration and reuse strategies for pipelines. Additionally, integration tests have been expanded to cover parallel creation scenarios for models and pipelines, ensuring robustness in heavily parallelized environments.
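The "registration and reuse" idea from the walkthrough can be illustrated with a small create-or-fetch sketch. This is not the ZenML implementation: `InMemoryStore`, `get_or_create`, and the local `EntityExistsError` class are hypothetical stand-ins, and the only assumption is that the store rejects duplicate names with an "entity exists" error.

```python
import threading
from typing import Dict


class EntityExistsError(Exception):
    """Local stand-in for the ZenML exception of the same name."""


class InMemoryStore:
    """Hypothetical store that enforces unique entity names."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entities: Dict[str, int] = {}

    def create(self, name: str) -> int:
        # The check and the write happen under one lock, like a DB
        # uniqueness constraint: exactly one creator can win.
        with self._lock:
            if name in self._entities:
                raise EntityExistsError(name)
            self._entities[name] = len(self._entities) + 1
            return self._entities[name]

    def get(self, name: str) -> int:
        with self._lock:
            return self._entities[name]


def get_or_create(store: InMemoryStore, name: str) -> int:
    """Create the entity, or reuse it if another worker created it first."""
    try:
        return store.create(name)
    except EntityExistsError:
        # Losing the race is not a failure: fetch what the winner created.
        return store.get(name)
```

With this shape, any number of parallel callers asking for the same name end up with the same entity ID, and none of them fails because another caller registered it first.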
…il-to-create-artifacts
Review Status
Actionable comments generated: 10
Configuration used: .coderabbit.yaml
Files selected for processing (8)
- src/zenml/artifacts/utils.py (4 hunks)
- src/zenml/constants.py (1 hunks)
- src/zenml/model/model.py (5 hunks)
- src/zenml/new/pipelines/pipeline.py (3 hunks)
- src/zenml/zen_stores/sql_zen_store.py (4 hunks)
- tests/integration/functional/model/test_model_version.py (5 hunks)
- tests/integration/functional/pipelines/test_pipeline_parallel.py (1 hunks)
- tests/integration/functional/pipelines/util_parallel_pipeline_script.py (1 hunks)
Additional comments: 15
tests/integration/functional/pipelines/util_parallel_pipeline_script.py (3)
- 8-10: The `register_artifact` step function is correctly defined with caching disabled, which is suitable for testing parallel executions where caching could interfere with the test's integrity. The return value simulates an artifact registration process.
- 13-16: The `parallel_` pipeline function iterates over a range of `steps_count` and calls `register_artifact` for each iteration. This setup is appropriate for testing parallel executions of artifact registration. However, it's important to ensure that the `steps_count` and `run_id` parameters are correctly passed and used, especially in a parallel execution context.
- 19-21: The script execution entry point correctly parses command line arguments to extract `run_prefix`, `i`, and `steps_count`. It's crucial that these arguments are validated and correctly converted to their expected types (e.g., `steps_count` and `i` should be integers) to avoid runtime errors.
tests/integration/functional/pipelines/test_pipeline_parallel.py (1)
- 22-59: The test method `test_parallel_runs_can_register_same_artifact` is well-structured and follows a clear logic to test parallel artifact registration. It uses subprocesses to execute the pipeline script in parallel, which is a suitable approach for this test scenario. The assertions at the end of the test method are comprehensive, checking for the completion status of pipeline runs, the registration of all artifacts, their values, and unique versions. This thorough approach ensures that the parallel execution logic works as expected.
src/zenml/constants.py (1)
- 318-320: The introduction of `MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION` with a value of 10 is a sensible addition to handle parallelized tests for versioned entity creation. The comment "empirical value to pass heavy parallelized tests" provides context for the choice of value, though it might be beneficial to include more detail on how this value was determined or any specific scenarios it addresses.
tests/integration/functional/model/test_model_version.py (2)
- 14-14: The import of `multiprocessing` is necessary for the new test that validates parallel model version creation. This aligns with the PR's objective to improve parallel handling.
- 119-120: The function `parallel_model_version_creation` is introduced to simulate the parallel creation of model versions. It directly calls a method on the `Model` class to either get an existing model version or create a new one. This function is crucial for the new test that assesses the system's ability to handle parallel model version creation without conflicts or errors.
src/zenml/model/model.py (2)
- 16-16: The import of the `time` module is correctly added to support the sleep functionality used in the retry mechanism. This is a necessary addition for implementing delays between retries.
- 29-29: The import of `MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION` is correctly added and is essential for defining the maximum number of retries in the retry mechanism for creating model versions. This constant plays a crucial role in controlling the retry behavior.
src/zenml/artifacts/utils.py (5)
- 20-20: The import of the `time` module is correctly added to support the sleep functionality used in the retry mechanism for artifact version creation. This is a necessary addition for implementing delays between retries.
- 27-30: The addition of the `MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION` constant is correctly implemented. It's well-placed within the imports section, ensuring that it's available throughout the file. This constant is crucial for controlling the retry behavior in artifact and model version creation processes.
- 38-41: The inclusion of the `EntityExistsError` in the imports section is appropriate, given its usage in the updated `save_artifact` function to handle cases where an artifact version already exists. This change aligns with the PR's objective to improve error handling in parallel execution scenarios.
- 118-118: The documentation for the `save_artifact` function has been updated to include `EntityExistsError` under the `Raises` section. This accurately reflects the changes made to the function's implementation, ensuring that users are aware of the potential exceptions that can be raised.
- 248-250: Raising `EntityExistsError` when the artifact version creation fails after all retries is appropriate and aligns with the PR's objectives to improve error handling. This ensures that the caller is informed of the failure to create a unique artifact version, which is crucial in parallel execution environments.
src/zenml/new/pipelines/pipeline.py (1)
- 57-57: The import of `EntityExistsError` is correctly added to handle specific exceptions related to entity existence conflicts during pipeline registration. This aligns with the PR objectives of improving error handling for parallel operations.
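The retry behavior discussed in the comments above (a bounded number of attempts, a cooldown that grows with the retry count, and `EntityExistsError` raised once the budget is exhausted) can be sketched as follows. This is a standalone illustration under stated assumptions, not the code from `src/zenml/artifacts/utils.py`: `create_with_retries`, `COOLDOWN_SECONDS`, and the injectable `sleep` parameter are hypothetical names, and `EntityExistsError` is redefined locally so the block runs on its own.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Values quoted from the PR description; everything else is illustrative.
MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION = 10
COOLDOWN_SECONDS = 0.2  # hypothetical name for the base cooldown


class EntityExistsError(Exception):
    """Local stand-in for the ZenML exception of the same name."""


def create_with_retries(
    create: Callable[[], T],
    max_retries: int = MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION,
    cooldown: float = COOLDOWN_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `create` until it succeeds or the retry budget is used up."""
    for retry_num in range(1, max_retries + 1):
        try:
            return create()
        except EntityExistsError:
            if retry_num == max_retries:
                # Budget exhausted: surface the conflict to the caller.
                raise
            # Cooldown grows linearly with the retry count:
            # cooldown * 1, cooldown * 2, cooldown * 3, ...
            sleep(cooldown * retry_num)
    raise ValueError("max_retries must be at least 1")
```

In this sketch `create` would be a callable that picks the next free version name and tries to insert it. Passing `sleep` as a parameter is a design choice made here only so the back-off schedule can be checked in tests without actually waiting.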
…il-to-create-artifacts
LGTM. Let's let the CAB member know whenever it's merged.
Aside from a small nitpick (feel free to ignore), everything looks good!
* fix parallel artifacts registration
* remove excessive warnings
* parallel safe model versions
* increase cool down a bit
* coderabbitai
* coderabbitai
* update test signature
* PR suggestions from Alex
* kudos to windows
* give some more retries for docker CIs
* try to fix test case
* fix parallel tests
Describe changes
This PR solves a few parallelization issues we had:
- `save_artifact` logic is improved, so it is now tolerant to parallel creation of an Artifact and has retry logic to create new Artifact Versions for those without an explicit version name
- `MAX_RETRIES_FOR_VERSIONED_ENTITY_CREATION` constant introduced and set to `10` retries for now, with 0.2 seconds of cooldown growing by retry count (e.g. `0.2 * retry_num`). `10` is quite empirical and might need some further tuning.
Tiny side improvement:
Pre-requisites
Please ensure you have done the following:
… `develop` and the open PR is targeting `develop`. If your branch wasn't based on develop, read the Contribution guide on rebasing branch to develop.
Types of changes
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests