
No speed advantage when using batches. #58

@Dario-Mantegazza

Description


I ran some tests using both detection + recognition on a set of 30 images and saw no speed improvement when using batches.
So I checked the code, and if I understand your implementation correctly,

tamil_ocr/ocr_tamil/ocr.py

Lines 527 to 536 in 71a91db

# To handle multiple images
if isinstance(image,list):
    text_list = []
    if self.detect:
        for img in image:
            temp = self.read_image_input(img)
            exported_regions,updated_prediction_result = self.craft_detect(temp)
            inter_text_list,conf_list = self.text_recognize_batch(exported_regions)
            final_result = self.output_formatter(inter_text_list,conf_list,updated_prediction_result)
            text_list.append(final_result)
you split the batch into single images, then pass each image to CRAFT, get the bounding boxes, and pass those to ParSeq.

I'm not an expert in ParSeq, but if it can already handle batches of bounding boxes, why not simply take all the bounding boxes from the whole batch and pass them to ParSeq as a single input?

To recap my suggestion: why not do something like the following?

bbs = []
for image in batch:
    bb_preds = craft(image)    # detect text regions in each image
    bbs.extend(bb_preds)       # collect the regions from the whole batch into one flat list
texts = parseq_read_batch(bbs)  # recognize all regions in a single call

This should be faster, since ParSeq is called only once per batch rather than once per image, at the cost of higher memory usage, which can be handled via the batch size parameter.
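For concreteness, here is a rough sketch of how this could look inside the current multi-image path, reusing the helpers from the snippet above (read_image_input, craft_detect, text_recognize_batch, output_formatter). I'm assuming exported_regions is a flat list of crops and that text_recognize_batch accepts an arbitrary number of them:

all_regions = []
per_image = []  # (num_regions, updated_prediction_result) for each image

for img in image:
    temp = self.read_image_input(img)
    exported_regions, updated_prediction_result = self.craft_detect(temp)
    all_regions.extend(exported_regions)
    per_image.append((len(exported_regions), updated_prediction_result))

# One recognition call for all regions from the whole batch
inter_text_list, conf_list = self.text_recognize_batch(all_regions)

# Split the flat results back into per-image chunks
text_list = []
offset = 0
for num_regions, updated_prediction_result in per_image:
    chunk_texts = inter_text_list[offset:offset + num_regions]
    chunk_confs = conf_list[offset:offset + num_regions]
    text_list.append(self.output_formatter(chunk_texts, chunk_confs, updated_prediction_result))
    offset += num_regions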

Obviously even better would be to do something like:

bbs=craft_batch(batch)
texts=parseq_batch(bbs)
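For that fully batched variant, detection itself would need a batched entry point, and as far as I can tell the wrapper in this repo currently runs CRAFT one image at a time. The following is only a rough illustration of the idea; craft_detect_batch, stack_with_padding, craft_net and postprocess_boxes are hypothetical names, not existing functions in this repo:

def craft_detect_batch(self, images):
    # Hypothetical sketch: resize/pad all images to a common shape, stack
    # them into a single tensor, run the CRAFT network in one forward pass,
    # then split the score maps back out and post-process boxes per image.
    tensors = [self.read_image_input(img) for img in images]
    batch_tensor = stack_with_padding(tensors)   # placeholder helper
    score_maps = self.craft_net(batch_tensor)    # one forward pass for the whole batch
    results = []
    for img, score_map in zip(images, score_maps):
        exported_regions, prediction_result = postprocess_boxes(img, score_map)  # placeholder
        results.append((exported_regions, prediction_result))
    return results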
