Add FAST #35476
Conversation
…ug in test file test_image_processing_fast
```python
rename_key_mappings = {
    "module.backbone": "backbone.textnet",
    "first_conv": "stem",
    "bn": "batch_norm",
    "ver": "vertical",
    "hor": "horizontal",
    "module.neck": "neck",
    "module.det_head": "text_detection_head",
    "neck.reduce_layer1": "neck.reduce_layers.0",
    "neck.reduce_layer2": "neck.reduce_layers.1",
    "neck.reduce_layer3": "neck.reduce_layers.2",
    "neck.reduce_layer4": "neck.reduce_layers.3",
    "final.conv.weight": "final_conv.weight",
    "neck.reduce_layers.1.rbr_identity.weight": "neck.reduce_layers.1.identity.weight",
    "neck.reduce_layers.1.rbr_identity.bias": "neck.reduce_layers.1.identity.bias",
    "neck.reduce_layers.1.rbr_identity.running_mean": "neck.reduce_layers.1.identity.running_mean",
    "neck.reduce_layers.1.rbr_identity.running_var": "neck.reduce_layers.1.identity.running_var",
    "neck.reduce_layers.1.rbr_identity.num_batches_tracked": "neck.reduce_layers.1.identity.num_batches_tracked",
}
```
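For context, a mapping table like this is typically applied by substring replacement over every key of the original checkpoint's state dict. A minimal sketch, assuming simple old-to-new substring substitution (the helper name `rename_state_dict_keys` is hypothetical, and real conversion scripts must watch for ordering collisions, e.g. `"bn"` matching inside longer names):

```python
def rename_state_dict_keys(state_dict, mappings):
    # Apply each old -> new substring mapping, in order, to every checkpoint key.
    # Naive substring replacement: mapping order matters if patterns overlap.
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in mappings.items():
            new_key = new_key.replace(old, new)
        renamed[new_key] = value
    return renamed

# Toy example using a subset of the mappings above
mappings = {"module.backbone": "backbone.textnet", "first_conv": "stem", "bn": "batch_norm"}
state_dict = {"module.backbone.first_conv.weight": 1, "module.backbone.bn.bias": 2}
print(rename_state_dict_keys(state_dict, mappings))
# -> {'backbone.textnet.stem.weight': 1, 'backbone.textnet.batch_norm.bias': 2}
```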
```python
def get_model_config(model_config, model_type, size, min_area, bounding_box_type, loss_bg):
    model_config_map = {
        "tiny": {
            "config_url": tiny_config_url,
            "expected_logits": torch.tensor([-9.9181, -13.0701, -12.5045, -12.6523]),
            "expected_boxes": [(151, 151), (160, 56), (355, 74), (346, 169)],
        },
        "small": {
            "config_url": small_config_url,
            "expected_logits": torch.tensor([-13.1852, -17.2011, -16.9553, -16.8269]),
            "expected_boxes": [(154, 151), (155, 61), (351, 63), (350, 153)],
        },
        "base": {
            "config_url": base_config_url,
            "expected_logits": torch.tensor([-28.7481, -34.1635, -25.7430, -22.0260]),
            "expected_boxes": [(157, 149), (158, 66), (348, 68), (347, 151)],
        },
    }

    if model_type not in model_config_map:
        raise ValueError(f"Unknown model type: {model_type}")

    logits_config = model_config_map[model_type]
    config = prepare_config(
        logits_config["config_url"],
        size,
        model_config["detection_head"]["pooling_size"],
        min_area,
        bounding_box_type,
        loss_bg,
    )

    return config, logits_config["expected_logits"], logits_config["expected_boxes"]


def prepare_config(size_config_url, size, pooling_size, min_area, bounding_box_type, loss_bg):
    config_dict = json.loads(requests.get(size_config_url).text)
```
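`prepare_config` fetches the size config as raw JSON over HTTP and parses it with `json.loads`. The parsing step can be exercised offline by substituting a local JSON string for `requests.get(size_config_url).text`; the field names below are made up for illustration, not the actual FAST config schema:

```python
import json

# Stand-in for the HTTP response body; the fields here are hypothetical.
config_text = '{"model": {"backbone": "textnet_tiny", "detection_head": {"pooling_size": 9}}}'
config_dict = json.loads(config_text)

# Nested values are then read off the parsed dict as usual.
print(config_dict["model"]["detection_head"]["pooling_size"])  # -> 9
```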
There is a much, much simpler way to convert weights now, see convert_mllama:

```python
ORIGINAL_TO_CONVERTED_KEY_MAPPING = {
```

That is now standard in transformers!
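In that style, the mapping is a dict from regex patterns to replacements, applied to each checkpoint key with `re.sub`. A rough sketch of the idea, with illustrative patterns rather than the actual FAST table (note that `re.sub` also accepts a callable replacement, which is handy for shifting 1-based layer indices to 0-based):

```python
import re

# Illustrative regex-based key mapping, not the real FAST conversion table.
ORIGINAL_TO_CONVERTED_KEY_MAPPING = {
    r"^module\.backbone": "backbone.textnet",
    # Callable replacement: reduce_layer1 -> reduce_layers.0, etc.
    r"neck\.reduce_layer(\d+)": lambda m: f"neck.reduce_layers.{int(m.group(1)) - 1}",
}

def convert_key(key):
    # Apply every pattern in turn; non-matching patterns leave the key unchanged.
    for pattern, replacement in ORIGINAL_TO_CONVERTED_KEY_MAPPING.items():
        key = re.sub(pattern, replacement, key)
    return key

print(convert_key("module.backbone.neck.reduce_layer1.weight"))
# -> backbone.textnet.neck.reduce_layers.0.weight
```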
Have you tested the convert file?

```shell
python src/transformers/models/fast/convert_fast_original_to_hf.py --checkpoint_url https://github.com/czczup/FAST/releases/download/release/fast_tiny_ic17mlt_640.pth --checkpoint_config_filename fast_tiny_ic17mlt_640.py
```

And for the different sizes: replace `tiny` with `base` and `small` to test.

And have you tested that the convert file works on all three ckpts :)?
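The three runs can be scripted in one loop. This is a dry-run sketch that only echoes the commands instead of executing them, and it assumes the `small` and `base` checkpoint filenames follow the same pattern as the `tiny` one:

```shell
# Dry run: print the conversion command for each size rather than running it.
# Assumes fast_small_ic17mlt_640 / fast_base_ic17mlt_640 exist by analogy with tiny.
for size in tiny small base; do
  echo python src/transformers/models/fast/convert_fast_original_to_hf.py \
    --checkpoint_url "https://github.com/czczup/FAST/releases/download/release/fast_${size}_ic17mlt_640.pth" \
    --checkpoint_config_filename "fast_${size}_ic17mlt_640.py"
done
```

Dropping the `echo` turns the dry run into the real three conversions.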
```diff
             "expected_boxes": [(148, 151), (157, 53), (357, 72), (347, 170)],
         },
         "small": {
             "config_url": small_config_url,
             "expected_logits": torch.tensor([-13.1852, -17.2011, -16.9553, -16.8269]),
-            "expected_boxes": [(154, 151), (155, 61), (351, 63), (350, 153)],
+            "expected_boxes": [(151, 152), (152, 58), (352, 60), (351, 154)],
         },
         "base": {
             "config_url": base_config_url,
             "expected_logits": torch.tensor([-28.7481, -34.1635, -25.7430, -22.0260]),
-            "expected_boxes": [(157, 149), (158, 66), (348, 68), (347, 151)],
+            "expected_boxes": [(154, 150), (155, 63), (349, 65), (349, 152)],
         },
     }
```
Why are the `expected_boxes` changed here? We must make sure the boxes match the original implementation, and I recall the ones you changed used to match the og FAST repo!
It's mentioned on top of the PR:
Colab to replicate the author's logits: https://colab.research.google.com/drive/1bdkNiRI2bl7rBcgGYXe2UeobX78TUGYY?usp=sharing
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, fast
For each of the 3 types I am off by a little bit. The logits are correct; could it be that the post-processing and rounding are different at the end? Does the result have to match the results from the Colab exactly?

```shell
python src/transformers/models/fast/convert_fast_original_to_hf.py --checkpoint_url https://github.com/czczup/FAST/releases/download/release/fast_tiny_ic17mlt_640.pth --checkpoint_config_filename fast_tiny_ic17mlt_640.py
```
If they don't match, you should look at the post-processing logic!
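A one-pixel drift like the one reported above can come purely from how float box corners are quantized to integer pixels at the end. A quick illustration of truncation vs. rounding on a made-up corner coordinate (this is not FAST's actual post-processing code):

```python
# A hypothetical float box corner, quantized two different ways.
corner = (154.6, 150.7)

truncated = tuple(int(c) for c in corner)   # truncate toward zero -> (154, 150)
rounded = tuple(round(c) for c in corner)   # round to nearest     -> (155, 151)

# The two conventions land on neighboring pixels, so an implementation that
# truncates and one that rounds disagree by one pixel per coordinate.
print(truncated, rounded)
```

If the reference implementation rounds where the port truncates (or vice versa), every reported corner can be off by exactly one pixel even though the logits agree.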
What does this PR do?
This PR adds FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation.
It should be merged after the first PR for its backbone, TextNet, is merged: #34979
Colab to replicate the author's logits: https://colab.research.google.com/drive/1bdkNiRI2bl7rBcgGYXe2UeobX78TUGYY?usp=sharing
What's left: