Open
Description
In the current matching strategy, a point on a polyline is associated with the smallest bounding box that contains it.
docling-eval/docling_eval/dataset_builders/cvat_dataset_builder.py
Lines 225 to 230 in b507977
if box["l"] <= point[0] <= box["r"] and box["t"] <= point[1] <= box["b"]:
    current_area = (box["r"] - box["l"]) * (box["b"] - box["t"])
    if index == -1 or current_area < area:
        area = current_area
        index = i
        box_result = box
This approach works for certain link types, such as `to_footnote`, `to_value`, and `to_caption`. However, for links like `reading_order`, `merge`, or `group`, we expect the points to be associated with the outermost bounding boxes under certain conditions. For example, in the case of a `table`, the `reading_order` should be attached to the table's bounding box, not to the bounding box of an individual `table_row`.
To ensure the validation methods defined in PR #102 work as intended, the `find_box` function needs to be updated accordingly.
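
One possible direction, as a minimal sketch rather than the actual fix: select the containing box by link type, keeping the smallest box for links such as `to_footnote` and choosing the largest one for `reading_order`, `merge`, and `group`. The names `find_box_for_link`, `OUTERMOST_LINK_TYPES`, `contains`, and `box_area` are hypothetical and not part of the repository, and the sketch simplifies "under certain conditions" to "always" for those three link types. Boxes are assumed to be dicts with `l`, `t`, `r`, `b` keys, as in the snippet above.

```python
from typing import Optional, Sequence, Tuple

# Assumption: link types that should attach to the outermost containing box.
OUTERMOST_LINK_TYPES = {"reading_order", "merge", "group"}


def box_area(box: dict) -> float:
    """Area of a box given as a dict with l, t, r, b coordinates."""
    return (box["r"] - box["l"]) * (box["b"] - box["t"])


def contains(box: dict, point: Tuple[float, float]) -> bool:
    """True if the point lies inside the box or on its border."""
    return box["l"] <= point[0] <= box["r"] and box["t"] <= point[1] <= box["b"]


def find_box_for_link(
    boxes: Sequence[dict], point: Tuple[float, float], link_type: str
) -> Tuple[int, Optional[dict]]:
    """Hypothetical variant of find_box that takes the link type into account.

    Returns (index, box), or (-1, None) if no box contains the point.
    """
    candidates = [(i, b) for i, b in enumerate(boxes) if contains(b, point)]
    if not candidates:
        return -1, None
    # Largest box for outermost link types, smallest box otherwise.
    pick = max if link_type in OUTERMOST_LINK_TYPES else min
    return pick(candidates, key=lambda c: box_area(c[1]))


# Usage: a table box that encloses one of its rows.
table = {"l": 0, "t": 0, "r": 100, "b": 100}
table_row = {"l": 10, "t": 10, "r": 90, "b": 20}
index, box = find_box_for_link([table, table_row], (50, 15), "reading_order")
```

With these inputs, a `reading_order` point inside the row resolves to the table's box, while the same point on a `to_caption` link would still resolve to the row. Any extra conditions (for example, restricting the outermost rule to specific container labels) would need to be added on top of this.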