Add Got-OCR 2 Fast image processor and refactor slow one #36185

yonigozlan · 2025-02-13T21:42:25Z

What does this PR do?

Refactor the slow image processor of Got-OCR 2 so that a single call to preprocess is enough, with no separate call to crop_to_patches.
Also add a fast image processor, with some nice speedups :)
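For readers unfamiliar with patch cropping, here is a minimal sketch of the general idea: pick a grid whose aspect ratio matches the image, then compute one crop box per cell. This is an illustration only, not the code in this PR; `choose_grid`, `patch_boxes`, and the default `max_patches=12` are hypothetical names and values chosen for the example.

```python
from itertools import product


def choose_grid(width, height, min_patches=1, max_patches=12):
    """Pick the (cols, rows) grid whose aspect ratio is closest to the image's.

    Hypothetical helper for illustration; not the library's implementation.
    """
    target = width / height
    candidates = [
        (cols, rows)
        for cols, rows in product(range(1, max_patches + 1), repeat=2)
        if min_patches <= cols * rows <= max_patches
    ]
    # Closest aspect ratio wins; ties go to the grid with fewer patches.
    return min(candidates, key=lambda g: (abs(g[0] / g[1] - target), g[0] * g[1]))


def patch_boxes(cols, rows, patch_size):
    """Return (left, top, right, bottom) boxes for a cols x rows canvas, row-major."""
    return [
        (c * patch_size, r * patch_size, (c + 1) * patch_size, (r + 1) * patch_size)
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    cols, rows = choose_grid(640, 480)
    print(cols, rows, len(patch_boxes(cols, rows, patch_size=384)))
```

Folding this step into preprocess means the caller gets the patches and the final tensors from one call instead of chaining two.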

import time

import requests
from PIL import Image
from transformers import AutoImageProcessor

def benchmark_image_processor(image_processor, images, benchmark_it=10, warmup_it=10):
    # warm up
    for _ in range(warmup_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    # benchmark
    start_time = time.time()
    for _ in range(benchmark_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    end_time = time.time()

    return (end_time - start_time) / benchmark_it

image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
checkpoint = "stepfun-ai/GOT-OCR-2.0-hf"
image_processor_fast = AutoImageProcessor.from_pretrained(checkpoint, use_fast=True)
image_processor_slow = AutoImageProcessor.from_pretrained(checkpoint)
device = "cuda"
batch_size = 4

slow_time_one = benchmark_image_processor(image_processor_slow, image, benchmark_it=10)
fast_time_one = benchmark_image_processor(image_processor_fast, image, benchmark_it=10)
slow_time_batch = benchmark_image_processor(image_processor_slow, [image]*batch_size, benchmark_it=10)
fast_time_batch = benchmark_image_processor(image_processor_fast, [image]*batch_size, benchmark_it=10)

print(f"slow_time_one: {slow_time_one}, fast_time_one: {fast_time_one}, speedup: {slow_time_one/fast_time_one}")
print(f"slow_time_batch: {slow_time_batch}, fast_time_batch: {fast_time_batch}, speedup: {slow_time_batch/fast_time_batch}")
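One possible refinement, offered as a sketch rather than as what was run for this PR: CUDA work is launched asynchronously, so a wall-clock timer can stop before the GPU has finished. The helper below takes an optional `sync` callable that is invoked before the clock starts and before it stops. The names `benchmark` and `sync` are hypothetical.

```python
import time


def benchmark(fn, iterations=10, warmup=10, sync=None):
    """Return the average wall-clock seconds per call of fn().

    `sync` is an optional zero-argument callable run before starting and
    before stopping the timer (hypothetical hook for this sketch).
    """
    if sync is None:
        sync = lambda: None  # nothing to wait for on CPU
    for _ in range(warmup):
        fn()
    sync()  # make sure warm-up work has finished before timing starts
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    sync()  # wait for pending work so it is included in the measurement
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    # Stand-in workload; replace with a call to the image processor.
    print(benchmark(lambda: sum(range(10_000))))
```

On a GPU, passing `sync=torch.cuda.synchronize` and wrapping the processor call in a lambda should give timings that include the completed kernels.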

CPU:

slow_time_one: 0.03414120674133301, fast_time_one: 0.007321953773498535, speedup: 4.662854723954348
slow_time_batch: 0.1300074815750122, fast_time_batch: 0.030838775634765624, speedup: 4.215714758417654

CUDA:

slow_time_one: 0.03233206272125244, fast_time_one: 0.0006550788879394531, speedup: 49.35598340369778
slow_time_batch: 0.1255408763885498, fast_time_batch: 0.0017477989196777344, speedup: 71.82798603153816
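As a quick sanity check, the reported speedups can be recomputed from the timings above. The snippet below is a small sketch that only uses the numbers printed in this description; the `timings` dictionary layout and the `speedup` helper are hypothetical conveniences.

```python
# (slow seconds, fast seconds) per call, copied from the results above
timings = {
    ("CPU", "one"): (0.03414120674133301, 0.007321953773498535),
    ("CPU", "batch"): (0.1300074815750122, 0.030838775634765624),
    ("CUDA", "one"): (0.03233206272125244, 0.0006550788879394531),
    ("CUDA", "batch"): (0.1255408763885498, 0.0017477989196777344),
}


def speedup(slow, fast):
    """Ratio of slow time to fast time; values above 1 mean the fast path wins."""
    return slow / fast


if __name__ == "__main__":
    for (device, mode), (slow, fast) in timings.items():
        print(f"{device} {mode}: {speedup(slow, fast):.1f}x")
```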

Would be great to merge soon as this image processor will also be used for InternVL!

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2025-02-13T22:09:16Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Just 1 comment, sorry that it took so long.

src/transformers/models/got_ocr2/image_processing_got_ocr2.py

…#36185)
* refactor image processor slow got ocr
* add working image processor fast
* fix fast image processor, update doc
* use one big loop for processing patches

yonigozlan added 2 commits February 13, 2025 17:38

refactor image processor slow got ocr

fa0527c

add working image processor fast

0d2d319

yonigozlan requested a review from ArthurZucker February 13, 2025 22:13

yonigozlan and others added 5 commits February 14, 2025 18:14

Merge branch 'main' into add-image-processor-fast-got-ocr

19e8560

Merge branch 'main' into add-image-processor-fast-got-ocr

1bcee27

Merge branch 'main' into add-image-processor-fast-got-ocr

f792f54

fix fast image processor, update doc

09aa6e4

Merge branch 'add-image-processor-fast-got-ocr' of https://github.com…

6813c09

…/yonigozlan/transformers into add-image-processor-fast-got-ocr

ArthurZucker approved these changes Feb 28, 2025

src/transformers/models/got_ocr2/image_processing_got_ocr2.py (resolved)

use one big loop for processing patches

51e98b2

yonigozlan merged commit 2c5d038 into huggingface:main Mar 1, 2025
23 checks passed
