
Add FAST #35476


Open

wants to merge 162 commits into main

Conversation


@jadechoghari jadechoghari commented Jan 1, 2025

What does this PR do?

This PR adds FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation.

It should be merged after the first PR for its backbone, TextNet (#34979), is merged.

Colab to replicate the author's logits: https://colab.research.google.com/drive/1bdkNiRI2bl7rBcgGYXe2UeobX78TUGYY?usp=sharing

What's left:

  • Fix `make quality` failing due to a documentation issue
  • Complete full model documentation

@jadechoghari jadechoghari requested a review from qubvel April 16, 2025 19:50
Comment on lines 35 to 93
rename_key_mappings = {
    "module.backbone": "backbone.textnet",
    "first_conv": "stem",
    "bn": "batch_norm",
    "ver": "vertical",
    "hor": "horizontal",
    "module.neck": "neck",
    "module.det_head": "text_detection_head",
    "neck.reduce_layer1": "neck.reduce_layers.0",
    "neck.reduce_layer2": "neck.reduce_layers.1",
    "neck.reduce_layer3": "neck.reduce_layers.2",
    "neck.reduce_layer4": "neck.reduce_layers.3",
    "final.conv.weight": "final_conv.weight",
    "neck.reduce_layers.1.rbr_identity.weight": "neck.reduce_layers.1.identity.weight",
    "neck.reduce_layers.1.rbr_identity.bias": "neck.reduce_layers.1.identity.bias",
    "neck.reduce_layers.1.rbr_identity.running_mean": "neck.reduce_layers.1.identity.running_mean",
    "neck.reduce_layers.1.rbr_identity.running_var": "neck.reduce_layers.1.identity.running_var",
    "neck.reduce_layers.1.rbr_identity.num_batches_tracked": "neck.reduce_layers.1.identity.num_batches_tracked",
}


def get_model_config(model_config, model_type, size, min_area, bounding_box_type, loss_bg):
    model_config_map = {
        "tiny": {
            "config_url": tiny_config_url,
            "expected_logits": torch.tensor([-9.9181, -13.0701, -12.5045, -12.6523]),
            "expected_boxes": [(151, 151), (160, 56), (355, 74), (346, 169)],
        },
        "small": {
            "config_url": small_config_url,
            "expected_logits": torch.tensor([-13.1852, -17.2011, -16.9553, -16.8269]),
            "expected_boxes": [(154, 151), (155, 61), (351, 63), (350, 153)],
        },
        "base": {
            "config_url": base_config_url,
            "expected_logits": torch.tensor([-28.7481, -34.1635, -25.7430, -22.0260]),
            "expected_boxes": [(157, 149), (158, 66), (348, 68), (347, 151)],
        },
    }

    if model_type not in model_config_map:
        raise ValueError(f"Unknown model type: {model_type}")

    logits_config = model_config_map[model_type]
    config = prepare_config(
        logits_config["config_url"],
        size,
        model_config["detection_head"]["pooling_size"],
        min_area,
        bounding_box_type,
        loss_bg,
    )

    return config, logits_config["expected_logits"], logits_config["expected_boxes"]


def prepare_config(size_config_url, size, pooling_size, min_area, bounding_box_type, loss_bg):
    config_dict = json.loads(requests.get(size_config_url).text)

Contributor Author

There is a much simpler way to convert weights now, see convert_mllama:

ORIGINAL_TO_CONVERTED_KEY_MAPPING = {

that is now standard in transformers!
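For illustration, a minimal sketch of that regex-based convention, with patterns borrowed from this PR's explicit `rename_key_mappings` dict (the exact patterns in the final script are an assumption):

```python
import re

# Hedged sketch of the ORIGINAL_TO_CONVERTED_KEY_MAPPING convention used in
# newer transformers conversion scripts (e.g. convert_mllama): regex
# substitutions are applied in order to rename every checkpoint key.
ORIGINAL_TO_CONVERTED_KEY_MAPPING = {
    r"module\.backbone": "backbone.textnet",
    r"module\.neck": "neck",
    r"module\.det_head": "text_detection_head",
    r"first_conv": "stem",
    r"\bbn\b": "batch_norm",
    # reduce_layerN is 1-indexed in the original checkpoint but 0-indexed
    # in the HF model, so a callable replacement shifts the index.
    r"neck\.reduce_layer(\d+)": lambda m: f"neck.reduce_layers.{int(m.group(1)) - 1}",
}


def convert_key(old_key: str) -> str:
    """Apply every substitution in order to rename one state-dict key."""
    for pattern, replacement in ORIGINAL_TO_CONVERTED_KEY_MAPPING.items():
        old_key = re.sub(pattern, replacement, old_key)
    return old_key


print(convert_key("module.backbone.first_conv.weight"))
# backbone.textnet.stem.weight
```

This replaces the long literal dict with a handful of patterns, which is why the convention scales better across models.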

@jadechoghari jadechoghari left a comment

Have you tested the convert file?

python src/transformers/models/fast/convert_fast_original_to_hf.py --checkpoint_url https://github.com/czczup/FAST/releases/download/release/fast_tiny_ic17mlt_640.pth --checkpoint_config_filename fast_tiny_ic17mlt_640.py

And for the other sizes: replace `tiny` with `small` and `base` to test.
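A small helper can build the three commands; this sketch assumes the `small` and `base` release filenames follow the same `fast_{size}_ic17mlt_640` pattern as `tiny` (verify against the FAST release page before running):

```python
# Hypothetical helper: construct the conversion command for each checkpoint
# size. The small/base URLs are an assumption extrapolated from the tiny one.
BASE_URL = "https://github.com/czczup/FAST/releases/download/release"


def conversion_command(size):
    name = f"fast_{size}_ic17mlt_640"
    return [
        "python",
        "src/transformers/models/fast/convert_fast_original_to_hf.py",
        "--checkpoint_url", f"{BASE_URL}/{name}.pth",
        "--checkpoint_config_filename", f"{name}.py",
    ]


for size in ("tiny", "small", "base"):
    print(" ".join(conversion_command(size)))
```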

@jadechoghari

And have you tested that the convert file works on all three checkpoints? :)

Comment on lines 84 to 97
        "expected_boxes": [(148, 151), (157, 53), (357, 72), (347, 170)],
    },
    "small": {
        "config_url": small_config_url,
        "expected_logits": torch.tensor([-13.1852, -17.2011, -16.9553, -16.8269]),
-       "expected_boxes": [(154, 151), (155, 61), (351, 63), (350, 153)],
+       "expected_boxes": [(151, 152), (152, 58), (352, 60), (351, 154)],
    },
    "base": {
        "config_url": base_config_url,
        "expected_logits": torch.tensor([-28.7481, -34.1635, -25.7430, -22.0260]),
-       "expected_boxes": [(157, 149), (158, 66), (348, 68), (347, 151)],
+       "expected_boxes": [(154, 150), (155, 63), (349, 65), (349, 152)],
    },
}

Contributor Author

Why are the expected_boxes changed here? We must make sure the boxes match the original implementation, and I recall the ones you changed used to match the original FAST repo!

Contributor Author

It's mentioned at the top of the PR:
Colab to replicate the author's logits: https://colab.research.google.com/drive/1bdkNiRI2bl7rBcgGYXe2UeobX78TUGYY?usp=sharing

github-actions bot commented Jul 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, fast

@TeddyLiang01

For each of the 3 sizes I am off by a little bit. The logits are correct; could it be that the post-processing and rounding differ at the end? Do the results have to match the Colab results exactly?

python src/transformers/models/fast/convert_fast_original_to_hf.py --checkpoint_url https://github.com/czczup/FAST/releases/download/release/fast_tiny_ic17mlt_640.pth --checkpoint_config_filename fast_tiny_ic17mlt_640.py
Traceback (most recent call last):
  File "/Users/teddy/transformers/src/transformers/models/fast/convert_fast_original_to_hf.py", line 355, in
    convert_fast_checkpoint(
  File "/Users/teddy/transformers/src/transformers/models/fast/convert_fast_original_to_hf.py", line 311, in convert_fast_checkpoint
    raise ValueError(f"Expected {expected_slice_boxes}, but got {text_locations[0]['boxes'][0]}")
ValueError: Expected [(151, 151), (160, 56), (355, 74), (346, 169)], but got [(148, 151), (157, 53), (357, 72), (347, 170)]

@jadechoghari

If they don't match, you should look at the post-processing logic!
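To tell rounding drift in post-processing apart from a real regression, the corners can be compared with a pixel tolerance instead of exact equality. This is a debugging sketch, not the convert script's actual check, and the `tol` threshold is an assumption:

```python
# Sketch: compare detected box corners against expected ones within a pixel
# tolerance, so small rounding differences in post-processing don't trip an
# exact-equality check. The tol=5 default is an assumed threshold.
def boxes_close(expected, actual, tol=5):
    """True if every corner is within `tol` pixels of its expected position."""
    return len(expected) == len(actual) and all(
        abs(ex - ax) <= tol and abs(ey - ay) <= tol
        for (ex, ey), (ax, ay) in zip(expected, actual)
    )


# The tiny-checkpoint mismatch from the traceback above:
expected = [(151, 151), (160, 56), (355, 74), (346, 169)]
actual = [(148, 151), (157, 53), (357, 72), (347, 170)]
print(boxes_close(expected, actual))  # True: every corner is within 5 px
```

If the corners stay within a few pixels like this, the logits agree and only the rounding differs; a larger gap would point at a genuine post-processing bug.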

6 participants