Remove differences between init and preprocess kwargs for fast image processors #36186
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Might be a bit early to review, but what do you think of using …
Yes, that would be nice, although the kwargs are not exactly the same right now, like … Will experiment with that and see if it breaks anything, but it might be too much scope for this PR? cc @ArthurZucker
@yonigozlan Yeah, some kwargs are not there. For …
ArthurZucker
left a comment
Thanks! IMO let's focus on fast image processors first, making sure they are as efficient, fast, and standardized as possible! 🤗
Definitely agree on defining all kwargs in one typed dict. It would also make things simpler and more coherent in processors, where the custom ImagesKwargs could just be imported from the fast image processors/image processors, so we would have one source of truth. I also noticed that a lot of processors should have custom ImagesKwargs but don't. Will address that in another PR as well :)
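A minimal sketch of the "one source of truth" idea, using only `typing.TypedDict`. The names `DefaultKwargs`, `MyModelImagesKwargs`, and `MyModelProcessorKwargs` are hypothetical and chosen for illustration; they are not the actual transformers classes.

```python
from typing import TypedDict, get_type_hints


class DefaultKwargs(TypedDict, total=False):
    # Hypothetical shared kwargs accepted by every image processor.
    do_resize: bool
    do_normalize: bool


class MyModelImagesKwargs(DefaultKwargs, total=False):
    # Hypothetical model-specific extension, defined once next to the image processor.
    max_patches: int


class MyModelProcessorKwargs(TypedDict, total=False):
    # The multimodal processor reuses the same definition instead of redeclaring it.
    images_kwargs: MyModelImagesKwargs


# The full set of accepted image kwargs, including the inherited ones.
accepted_image_kwargs = set(get_type_hints(MyModelImagesKwargs))
```

Because the processor-level dict refers to the image-processor-level one by name, adding a key in one place would update both, which is the coherence benefit described above.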
Great! Then this should be ready for your review @ArthurZucker
ArthurZucker
left a comment
Very much needed thanks!
model_input_names = ["pixel_values"]
valid_init_kwargs = DefaultFastImageProcessorInitKwargs
valid_preprocess_kwargs = DefaultFastImageProcessorPreprocessKwargs
valid_kwargs = DefaultFastImageProcessorKwargs
I am not a fan of having this here, but no worries
What does this PR do?
In slow processors, accepted init and call kwargs are not always the same for no apparent reason, which can make things confusing for users. (see vllm-project/vllm#13143 (comment))
DefaultFastImageProcessorInitKwargs and DefaultFastImageProcessorPreprocessKwargs were used to follow this, but with the way fast image processors handle kwargs, they can easily be merged into one DefaultFastImageProcessorKwargs, so that the init and call functions accept the same kwargs.
Propagating this change to slow image processors will require a lot more diffs, however. As discussed with @hmellor, should we try to make this change for slow image processors as well, or focus on encouraging the use of fast image processors, @ArthurZucker?
P.S. Qwen2_VL and Qwen2_5_VL are a bit more difficult to fix, so I'll open a separate PR for these
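To illustrate what accepting the same kwargs at init and at call time could look like, here is a rough, self-contained sketch. `UnifiedKwargs` and `ToyFastImageProcessor` are hypothetical stand-ins, and the default values are made up; only the `valid_kwargs` attribute name comes from the diff above. This is not the transformers implementation.

```python
from typing import Any, TypedDict


class UnifiedKwargs(TypedDict, total=False):
    # Hypothetical stand-in for a single merged kwargs definition.
    do_resize: bool
    size: int
    do_normalize: bool


class ToyFastImageProcessor:
    # Toy class for illustration only.
    valid_kwargs = UnifiedKwargs

    def __init__(self, **kwargs: Any) -> None:
        self._check(kwargs)
        # Init kwargs become instance-level defaults (default values are assumptions).
        self.defaults = {"do_resize": True, "size": 224, "do_normalize": True, **kwargs}

    def preprocess(self, images: list, **kwargs: Any) -> dict:
        self._check(kwargs)
        # Per-call kwargs override the defaults set at init.
        settings = {**self.defaults, **kwargs}
        return {"num_images": len(images), "settings": settings}

    def _check(self, kwargs: dict) -> None:
        # Both entry points validate against the same typed dict.
        unknown = set(kwargs) - set(self.valid_kwargs.__annotations__)
        if unknown:
            raise TypeError(f"Unexpected kwargs: {sorted(unknown)}")


processor = ToyFastImageProcessor(size=384)
out = processor.preprocess(["img"], do_normalize=False)
```

With a single typed dict, any key that is valid in `__init__` is also valid in `preprocess`, so users no longer need to remember which kwargs belong to which entry point.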