Skip to content

Conversation

@lucyhe
Copy link
Contributor

@lucyhe lucyhe commented Mar 27, 2024

This change is required to unblock https://github.com/Kaggle/kaggleazure/pull/28591, which updates the Kernel MT to only use short URLs.

Testing
Added unit test and ran ./test

@lucyhe lucyhe marked this pull request as ready for review March 29, 2024 19:54
@lucyhe lucyhe requested review from calderjo and djherbis March 29, 2024 19:55
@lucyhe lucyhe marked this pull request as draft March 29, 2024 21:18
@lucyhe lucyhe marked this pull request as ready for review March 29, 2024 23:01
layer = hub.KerasLayer("https://kaggle.com/models/foo/foomodule/frameworks/TensorFlow2/variations/barvar/versions/2")
layer = hub.KerasLayer(model_url)
self.assertEqual([1, 1], layer(test_inputs).shape)
# Delete the files that were created in KaggleJwtHandler's do_POST method
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was needed or else os.makedirs and os.symlink would fail when the new test tried to recreate the same files. (Alternative solution was to use a different model URL for the new test, but I thought it was cleaner to do the proper cleanup after each test.)


url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$")
short_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/(?P<framework>[^\\/]+)/(?P<variation>[^\\/]+)/(?P<version>[0-9]+)$")
long_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering if we should load these via an ENV var that we can mutate, but I suppose it is really rare that we mutate this, and it might be fine if only new images can support changes like this going forward.


url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$")
short_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/(?P<framework>[^\\/]+)/(?P<variation>[^\\/]+)/(?P<version>[0-9]+)$")
long_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a unified version of the two regex's if we wanted to still only need one:

https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)(/frameworks)?/(?P<framework>[^\\/]+)(/variations)?/(?P<variation>[^\\/]+)(/versions)?/(?P<version>[0-9]+)$

https://regex101.com/r/JbWJNI/1

All I did was making the "extra" parts of the path optional, though I suppose this does make each part independently optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind if we keep short_url_pattern and long_url_pattern? I lean towards that because it feels more readable and avoids the "independently optional parts" issue you mention

@lucyhe lucyhe merged commit 5afea0a into main Apr 2, 2024
@lucyhe lucyhe deleted the lh/short branch April 2, 2024 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants