-
Notifications
You must be signed in to change notification settings - Fork 86
NC | Concurrency & refactoring | Add delay, version move checks and GPFS refactoring #8419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6cd3731
to
15972de
Compare
15972de
to
fe34ee2
Compare
@romayalon Could you please add a short explanation about "Added to the retries a delay, the delay is the sum of a base of 70 + random(0,50)." in the PR description? (Why the delay solves the issue, Why we chose it, etc.). |
Delay is common when using retries mechanism, specifically in multithreading, if another request moved the files in a way that caused a failure to our current request, retrying right away might still have the issue if the second request didn't finish its work, therefore we wait a bit and let the other request finish its work. I saw that usually above 100 ms is good, and having an addition of random ms is nice to have. |
fe34ee2
to
46d8b9f
Compare
@romayalon Why the base is 70 and not 100? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…PFS refactoring Signed-off-by: Romy <[email protected]>
2c23a0c
to
bf7d5be
Compare
Explain the changes
1.1. Added to the retries a delay, the delay is the sum of a base of 70 + random(0,50). Delay is common when using retries mechanism, specifically in multithreading, if another request moved the files in a way that caused a failure to our current request, retrying right away might still have the issue if the second request didn't finish its work, therefore we wait a bit and let the other request finish its work. I saw that usually above 100 ms is good, and having an addition of random ms is nice to have.
1.2. moved is_gpfs ? check to inside _open_files_gpfs() function
1.3. _delete_single_object_versions() - added 2 calls to _check_version_moved() for checking if the version moved between the latest version location and .versions/ at the time of the deletion in order to make sure that we will delete the version even if it moved. the check will throw VERSION_MOVED error that will trigger a retry, in the next try we will locate the new location of the version and remove it.
1.4. _check_version_moved() function receives key, version_id and cur_path, if cur_path is the latest version location, we will check if it was moved to .versions/, if cur_path is in .versions/ we will check if the latest version has the same version_id as the version_id param.
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
sudo jest --testRegex=jest_tests/test_versioning_conc -t 'concurrent delete objects by version id/latest'