I did some tests using both detection and recognition on a set of 30 images, and I saw no speed improvement when using batches.
So I checked the code, and if I understand your implementation correctly,
Lines 527 to 536 in 71a91db:

```python
# To handle multiple images
if isinstance(image,list):
    text_list = []
    if self.detect:
        for img in image:
            temp = self.read_image_input(img)
            exported_regions,updated_prediction_result = self.craft_detect(temp)
            inter_text_list,conf_list = self.text_recognize_batch(exported_regions)
            final_result = self.output_formatter(inter_text_list,conf_list,updated_prediction_result)
            text_list.append(final_result)
```
I'm not an expert in Parseq, but if it can already handle batches of bounding boxes, why not simply take all the bounding boxes from the whole batch and pass them to parseq as a single input?
To recap, my suggestion is to do something like the following:

```python
bbs = []
for image in batch:
    bb_preds = craft(image)
    bbs.append(bb_preds)
texts = parseq_read_batch(bbs)
```
This should be faster, since parseq is called only once per batch rather than once per image. It does come with a larger memory cost, but that can be managed with the batch size parameter.
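To illustrate how the batch size parameter could keep the memory cost in check, here is a minimal sketch that feeds the collected crops to the recognizer in fixed-size chunks. The names `recognize_in_chunks`, `recognize`, `all_crops`, and `batch_size` are hypothetical and only stand in for whatever wraps the parseq forward pass in the actual code:

```python
# Sketch: cap peak memory by chunking the collected crops before recognition.
# `recognize` is a hypothetical callable wrapping the parseq forward pass and
# is assumed to return (texts, confidences) in input order.
def recognize_in_chunks(recognize, all_crops, batch_size=32):
    texts, confs = [], []
    for start in range(0, len(all_crops), batch_size):
        chunk = all_crops[start:start + batch_size]
        chunk_texts, chunk_confs = recognize(chunk)  # one forward pass per chunk
        texts.extend(chunk_texts)
        confs.extend(chunk_confs)
    return texts, confs
```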
Obviously even better would be to do something like:

```python
bbs = craft_batch(batch)
texts = parseq_batch(bbs)
```
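In terms of the existing methods, the first (detection still per image) variant would look roughly like the sketch below. This assumes that `text_recognize_batch` accepts a flat list of crops from several images and returns results in input order, and that `output_formatter` can be applied per image once the flat results are split back; both are assumptions on my side, not verified against the code:

```python
# Sketch of the suggestion against the existing API (assumptions noted above).
region_batches, detections = [], []
for img in image:  # `image` is the list of input images, as in the quoted code
    temp = self.read_image_input(img)
    exported_regions, updated_prediction_result = self.craft_detect(temp)
    region_batches.append(exported_regions)
    detections.append(updated_prediction_result)

# One recognizer call over all regions from the whole batch.
flat_regions = [r for regions in region_batches for r in regions]
flat_texts, flat_confs = self.text_recognize_batch(flat_regions)

# Split the flat results back per image and format as before.
text_list, offset = [], 0
for regions, prediction_result in zip(region_batches, detections):
    n = len(regions)
    texts = flat_texts[offset:offset + n]
    confs = flat_confs[offset:offset + n]
    text_list.append(self.output_formatter(texts, confs, prediction_result))
    offset += n
```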