
Commit 6d4f896

authored
Add polygons and mask fields (#1868)
This pull request introduces polygon and mask annotations. It also extends the label annotation to allow a list of labels, which can be used alongside polygons or bounding boxes. The PR adds conversion between polygons and masks, as well as conversion between legacy polygon datasets and the new dataset class; however, I haven't implemented the legacy conversion for masks yet.

### Polygon and Mask Annotation Support

* Added `PolygonField` and `MaskField` classes to `fields.py`, enabling representation and conversion of polygon and mask data, including normalization, format specification, and serialization to Polars dataframes.
* Implemented `PolygonToMaskConverter` in `converters.py`, allowing conversion of polygon annotations to rasterized masks using OpenCV.

### Label Handling Improvements

* Enhanced `LabelField` to support both multi-label and list semantics via the new `is_list` property, and updated the serialization logic to accommodate these cases. The `label_field` factory now accepts `is_list` as a parameter. [[1]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7R265-R290) [[2]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7L289-R302) [[3]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7R311-R468)

### Converter Registration and Instantiation Refactor

* Refactored the annotation converter registry in `legacy.py` to store converter classes instead of instances, and introduced logic to instantiate converters based on dataset categories using a new `create_from_categories` class method. [[1]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L68-R84) [[2]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L79-R99) [[3]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L93-R117)
* Updated the `ForwardBboxAnnotationConverter` to use the new instantiation pattern, including conditional support for label categories and improved schema attribute handling.

### Legacy Compatibility and Imports

* Added `Polygon` to the legacy annotation imports and updated usage to reflect the new field and converter types for integration with legacy datasets. [[1]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L19-R19) [[2]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L28-R33)

---

- Polygon and mask annotation support: [[1]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7R311-R468) [[2]](diffhunk://#diff-3f9f0dd688c31e641cd286586053d2fc7d0f4a479a20c943739dbe081119a1a3R359-R489)
- Label handling improvements: [[1]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7R265-R290) [[2]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7L289-R302) [[3]](diffhunk://#diff-15c4d88a12c4710b96d3ee5bc5ceed34002cd9e746d88e66e5e8005d0644bfb7R311-R468)
- Converter registration and instantiation refactor: [[1]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L68-R84) [[2]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L79-R99) [[3]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L93-R117)
- ForwardBboxAnnotationConverter update: [src/datumaro/experimental/legacy.pyL122-R177](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L122-R177)
- Legacy compatibility and imports: [[1]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L19-R19) [[2]](diffhunk://#diff-aa35f06eaa7d35ff5ffa7077e181ed0b773549c22d42a92e78e076947f9b88f5L28-R33)

### Checklist

- [x] I have added tests to cover my changes or documented any manual tests.
- [x] I have added the description of my changes into [CHANGELOG](https://github.com/open-edge-platform/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the [documentation](https://github.com/open-edge-platform/datumaro/tree/develop/docs) accordingly
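The registry refactor described in the commit message (storing converter classes and instantiating them per dataset) can be pictured with a minimal sketch. Only the method name `create_from_categories` comes from the description; `LabelCategories`, `AnnotationConverter`, `BboxConverter`, `LabelOnlyConverter`, `REGISTRY`, and `instantiate_converters` are hypothetical stand-ins, and the real signatures in `legacy.py` may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelCategories:
    """Hypothetical stand-in for a dataset's label categories."""

    names: list[str]


class AnnotationConverter:
    """Hypothetical base class; the real one in legacy.py may differ."""

    @classmethod
    def create_from_categories(cls, categories: Optional[LabelCategories]):
        # Subclasses return an instance, or None when they do not apply.
        raise NotImplementedError


class BboxConverter(AnnotationConverter):
    def __init__(self, label_names: list[str]):
        self.label_names = label_names

    @classmethod
    def create_from_categories(cls, categories):
        # Label categories are optional for this converter.
        names = categories.names if categories is not None else []
        return cls(label_names=names)


class LabelOnlyConverter(AnnotationConverter):
    def __init__(self, label_names: list[str]):
        self.label_names = label_names

    @classmethod
    def create_from_categories(cls, categories):
        # This converter is skipped when the dataset has no label categories.
        if categories is None:
            return None
        return cls(label_names=categories.names)


# The registry holds classes, not instances, so each dataset gets fresh converters.
REGISTRY: list[type[AnnotationConverter]] = [BboxConverter, LabelOnlyConverter]


def instantiate_converters(categories: Optional[LabelCategories]):
    """Build one converter per registered class that applies to the dataset."""
    converters = []
    for converter_cls in REGISTRY:
        instance = converter_cls.create_from_categories(categories)
        if instance is not None:
            converters.append(instance)
    return converters
```

Keeping classes in the registry means dataset-specific state, such as label names, is bound at instantiation time rather than at registration time.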
2 parents 8ef1108 + e3abe57 commit 6d4f896

File tree

8 files changed

+1152
-87
lines changed

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ This release streamlines Datumaro by removing a number of lesser-used features,

 ### New features
 - Experimental dataset class
-  (<https://github.com/open-edge-platform/datumaro/pull/1807>, <https://github.com/open-edge-platform/datumaro/pull/1810>, <https://github.com/open-edge-platform/datumaro/pull/1811>, <https://github.com/open-edge-platform/datumaro/pull/1834>, <https://github.com/open-edge-platform/datumaro/pull/1858>, <https://github.com/open-edge-platform/datumaro/pull/1845>, <https://github.com/open-edge-platform/datumaro/pull/1863>)
+  (<https://github.com/open-edge-platform/datumaro/pull/1807>, <https://github.com/open-edge-platform/datumaro/pull/1810>, <https://github.com/open-edge-platform/datumaro/pull/1811>, <https://github.com/open-edge-platform/datumaro/pull/1834>, <https://github.com/open-edge-platform/datumaro/pull/1858>, <https://github.com/open-edge-platform/datumaro/pull/1845>, <https://github.com/open-edge-platform/datumaro/pull/1863>, <https://github.com/open-edge-platform/datumaro/pull/1868>)

 ### Enhancements
 - Mark several dependencies as optional

src/datumaro/experimental/converters.py

Lines changed: 142 additions & 1 deletion
@@ -11,12 +11,22 @@

 from typing import Any, Callable

+import cv2
 import numpy as np
 import polars as pl
 from PIL import Image

 from .converter_registry import AttributeSpec, Converter, converter
-from .fields import BBoxField, ImageField, ImagePathField
+from .fields import (
+    BBoxField,
+    ImageField,
+    ImageInfoField,
+    ImagePathField,
+    LabelField,
+    MaskField,
+    PolygonField,
+)
+from .type_registry import polars_to_numpy_dtype


 @converter
@@ -346,3 +356,134 @@ def op(x: pl.Expr, y: pl.Expr) -> pl.Expr:
         result_df = result_df.drop([temp_width_col, temp_height_col])

         return result_df
+
+
+@converter(lazy=True)
+class PolygonToMaskConverter(Converter):
+    """
+    Converts polygon annotations to rasterized masks.
+
+    Transforms polygon coordinates into binary or indexed masks using
+    OpenCV contour filling for efficient rasterization.
+    """
+
+    input_polygon: AttributeSpec[PolygonField]
+    input_labels: AttributeSpec[LabelField]
+    image_info: AttributeSpec[ImageInfoField]
+    output_mask: AttributeSpec[MaskField]
+
+    # Configuration options
+    background_index: int = 0  # Background value
+
+    def filter_output_spec(self) -> bool:
+        """
+        Configure mask output specification.
+
+        Returns:
+            True if the converter should be applied, False otherwise
+        """
+        # Configure output for mask format
+        self.output_mask = AttributeSpec(
+            name=self.output_mask.name,
+            field=MaskField(
+                semantic=self.input_polygon.field.semantic,
+                dtype=self.output_mask.field.dtype,
+            ),
+        )
+
+        return True
+
+    def convert(self, df: pl.DataFrame) -> pl.DataFrame:
+        """
+        Rasterize polygon coordinates into indexed masks.
+
+        Args:
+            df: DataFrame with polygon coordinates, labels, and image info
+
+        Returns:
+            DataFrame with mask data in output column
+        """
+        input_column_name = self.input_polygon.name
+        labels_column_name = self.input_labels.name
+        image_info_column_name = self.image_info.name
+        output_column_name = self.output_mask.name
+        output_shape_column_name = self.output_mask.name + "_shape"
+
+        def polygons_to_mask(
+            polygons_data: list, labels_data: list, img_info: dict
+        ) -> tuple[list[int], list[int]]:
+            """Rasterize polygons into indexed mask using OpenCV contour filling."""
+            # Extract image dimensions
+            image_width = img_info["width"]
+            image_height = img_info["height"]
+
+            # Initialize mask with background index
+            numpy_dtype = polars_to_numpy_dtype(self.output_mask.field.dtype)
+            mask = np.full(
+                shape=(image_height, image_width),
+                fill_value=self.background_index,
+                dtype=numpy_dtype,
+            )
+
+            # Rasterize each polygon
+            for i, polygon_data in enumerate(polygons_data):
+                coords = polygon_data.to_numpy()
+                class_index = labels_data[i]
+
+                # Denormalize coordinates if needed
+                if self.input_polygon.field.normalize:
+                    coords = coords.copy()
+                    coords[:, 0] *= image_width
+                    coords[:, 1] *= image_height
+
+                # Convert to OpenCV contour format
+                contour = coords.astype(np.int32)
+
+                # Fill polygon with class index
+                cv2.drawContours(
+                    mask,
+                    [contour],
+                    0,
+                    int(class_index),
+                    thickness=cv2.FILLED,
+                )
+
+            return mask.reshape(-1), [image_height, image_width]
+
+        # Apply conversion using map_batches
+        def apply_conversion_batch(batch_df: pl.DataFrame) -> pl.DataFrame:
+            """Apply polygon-to-mask conversion for a batch."""
+            batch_polygons = batch_df.struct["polygons"]
+            batch_labels = batch_df.struct["labels"]
+            batch_img_infos = batch_df.struct["img_info"]
+
+            results_batch_polygons = []
+            results_batch_shape = []
+            for polygons, labels, img_infos in zip(batch_polygons, batch_labels, batch_img_infos):
+                mask_data, shape_data = polygons_to_mask(polygons, labels, img_infos)
+                results_batch_polygons.append(pl.Series(mask_data))
+                results_batch_shape.append(shape_data)
+
+            return pl.struct(
+                pl.Series(results_batch_polygons).alias("mask"),
+                pl.Series(results_batch_shape, dtype=pl.List(pl.Int32)).alias("shape"),
+                eager=True,
+            )
+
+        mask_data = pl.struct(
+            [
+                pl.col(input_column_name).alias("polygons"),
+                pl.col(labels_column_name).alias("labels"),
+                pl.col(image_info_column_name).alias("img_info"),
+            ]
+        ).map_batches(
+            apply_conversion_batch,
+            return_dtype=pl.Struct({"mask": pl.List(pl.UInt8), "shape": pl.List(pl.Int32)}),
+        )
+
+        return df.with_columns(
+            [
+                mask_data.struct.field("mask").alias(output_column_name),
+                mask_data.struct.field("shape").alias(output_shape_column_name),
+            ]
+        )

src/datumaro/experimental/fields.py

Lines changed: 172 additions & 7 deletions
@@ -262,22 +262,32 @@ class LabelField(Field):
     semantic: Semantic
     dtype: Any
     multi_label: bool = False  # Flag to indicate if this field should handle multi-labels
+    is_list: bool = False
+
+    @property
+    def _pl_type(self) -> pl.DataType:
+        pl_type = self.dtype
+        if self.multi_label:
+            pl_type = pl.List(pl_type)
+        if self.is_list:
+            pl_type = pl.List(pl_type)
+        return pl_type

     def to_polars_schema(self, name: str) -> dict[str, pl.DataType]:
         """Generate schema based on whether this is single or multi-label."""
-        if self.multi_label:
-            return {name: pl.List(self.dtype)}
-        return {name: self.dtype}
+        return {name: self._pl_type}

     def to_polars(self, name: str, value: Any) -> dict[str, pl.Series]:
         """Convert label(s) to Polars format for single or multi-label cases."""
+        pl_type = self._pl_type
+
         if value is None:
-            return {name: pl.Series(name, [None], dtype=self.dtype)}
+            return {name: pl.Series(name, [None], dtype=pl_type)}

         if self.multi_label:
             return {name: pl.Series(name, [to_numpy(value)], dtype=pl.List(self.dtype))}

-        return {name: pl.Series(name, [value], dtype=self.dtype)}
+        return {name: pl.Series(name, [value], dtype=pl_type)}

     def from_polars(self, name: str, row_index: int, df: pl.DataFrame, target_type: type[T]) -> T:
         """Reconstruct label(s) from Polars data."""
@@ -286,7 +296,10 @@ def from_polars(self, name: str, row_index: int, df: pl.DataFrame, target_type:


 def label_field(
-    dtype: Any = pl.Int32(), semantic: Semantic = Semantic.Default, multi_label: bool = False
+    dtype: Any = pl.Int32(),
+    semantic: Semantic = Semantic.Default,
+    multi_label: bool = False,
+    is_list: bool = False,
 ) -> Any:
     """
     Create a LabelField instance with the specified parameters.
@@ -295,8 +308,160 @@ def label_field(
         dtype: Polars data type for label values (defaults to pl.Int32())
         semantic: Semantic tags describing the label purpose (optional)
         multi_label: Whether this field should handle multiple labels (defaults to False)
+        is_list: Whether this field should be treated as a list type (defaults to False)

     Returns:
         LabelField instance configured with the given parameters
     """
-    return LabelField(semantic=semantic, dtype=dtype, multi_label=multi_label)
+    return LabelField(semantic=semantic, dtype=dtype, multi_label=multi_label, is_list=is_list)
+
+
+def convert_numpy_object_array_to_series(data: np.ndarray) -> pl.Series:
+    """
+    Convert ragged numpy object arrays to Polars Series recursively.
+
+    Handles nested object arrays containing variable-length lists.
+
+    Example:
+        >>> import numpy as np
+        >>> ragged = np.array([
+        ...     np.array([1, 2, 3]),
+        ...     np.array([4, 5]),
+        ...     np.array([6, 7, 8, 9])
+        ... ], dtype=object)
+        >>> series = convert_numpy_object_array_to_series(ragged)
+        >>> print(series)
+        shape: (3,)
+        Series: '' [list[i64]]
+        [
+            [1, 2, 3]
+            [4, 5]
+            [6, 7, … 9]
+        ]
+
+        # Compare with direct conversion which results
+        # into an object Series instead of a list Series:
+        >>> direct = pl.Series(ragged)
+        >>> print(direct)
+        shape: (3,)
+        Series: '' [o][object]
+        [
+            [1 2 3]
+            [4 5]
+            [6 7 8 9]
+        ]
+    """
+    if data.dtype == object:
+        return pl.Series([convert_numpy_object_array_to_series(elem) for elem in data])
+    return pl.Series(data)
+
+
+@dataclass(frozen=True)
+class PolygonField(Field):
+    """
+    Represents a polygon field with format and normalization options.
+
+    Handles polygon data with support for different coordinate formats
+    and optional normalization to [0,1] range. Polygons are stored as
+    variable-length lists of coordinate pairs.
+
+    Attributes:
+        semantic: Semantic tags describing the polygon purpose
+        dtype: Polars data type for coordinate values
+        format: Coordinate format (e.g., "xy", "yx")
+        normalize: Whether coordinates are normalized to [0,1] range
+    """
+
+    semantic: Semantic
+    dtype: Any
+    format: str
+    normalize: bool
+
+    def to_polars_schema(self, name: str) -> dict[str, pl.DataType]:
+        """Generate schema for polygon as list of coordinate values."""
+        return {name: pl.List(pl.List(pl.Array(self.dtype, 2)))}
+
+    def to_polars(self, name: str, value: Any) -> dict[str, pl.Series]:
+        """Convert polygon tensor to Polars list format."""
+        numpy_value = to_numpy(value, self.dtype)
+
+        series = convert_numpy_object_array_to_series(numpy_value)
+
+        return {name: pl.Series(name, [series], dtype=pl.List(pl.List(pl.Array(self.dtype, 2))))}
+
+    def from_polars(self, name: str, row_index: int, df: pl.DataFrame, target_type: type[T]) -> T:
+        """Reconstruct polygon tensor from Polars data."""
+        polars_data = df[name][row_index]
+        return from_polars_data(polars_data, target_type)  # type: ignore
+
+
+def polygon_field(
+    dtype: Any,
+    format: str = "xy",
+    normalize: bool = False,
+    semantic: Semantic = Semantic.Default,
+) -> Any:
+    """
+    Create a PolygonField instance with the specified parameters.
+
+    Args:
+        dtype: Polars data type for coordinate values
+        format: Coordinate format (defaults to "xy")
+        normalize: Whether coordinates are normalized (defaults to False)
+        semantic: Semantic tags describing the polygon purpose (optional)
+
+    Returns:
+        PolygonField instance configured with the given parameters
+    """
+    return PolygonField(semantic=semantic, dtype=dtype, format=format, normalize=normalize)
+
+
+@dataclass(frozen=True)
+class MaskField(Field):
+    """
+    Represents a mask tensor field for binary or indexed segmentation masks.
+
+    Similar to TensorField but specialized for masks: handles single-channel
+    2D arrays with no color format specification. Uses uint8 as the default
+    data type suitable for binary masks, class masks, or instance masks.
+
+    Attributes:
+        semantic: Semantic tags describing the mask purpose
+        dtype: Polars data type for mask values (defaults to uint8)
+    """
+
+    semantic: Semantic
+    dtype: Any
+
+    def to_polars_schema(self, name: str) -> dict[str, pl.DataType]:
+        """Generate Polars schema with separate columns for data and shape."""
+        return {name: pl.List(self.dtype), name + "_shape": pl.List(pl.Int32())}
+
+    def to_polars(self, name: str, value: Any) -> dict[str, pl.Series]:
+        """Convert mask tensor to flattened data and shape information."""
+        numpy_value = to_numpy(value, self.dtype)
+        return {
+            name: pl.Series(name, [numpy_value.reshape(-1)]),
+            name + "_shape": pl.Series(name + "_shape", [numpy_value.shape]),
+        }
+
+    def from_polars(self, name: str, row_index: int, df: pl.DataFrame, target_type: type[T]) -> T:
+        """Reconstruct mask tensor from flattened data using stored shape."""
+        flat_data = df[name][row_index]
+        shape = df[name + "_shape"][row_index]
+        numpy_data = np.array(flat_data).reshape(shape)
+        return from_polars_data(numpy_data, target_type)  # type: ignore
+
+
+def mask_field(dtype: Any = pl.UInt8(), semantic: Semantic = Semantic.Default) -> Any:
+    """
+    Create a MaskField instance with the specified parameters.
+
+    Args:
+        dtype: Polars data type for mask values (defaults to pl.UInt8())
+        semantic: Semantic tags describing the mask purpose (optional)
+
+    Returns:
+        MaskField instance configured with the given parameters
+    """
+    return MaskField(semantic=semantic, dtype=dtype)
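`MaskField` stores each mask as a flat list plus a separate shape column. A minimal NumPy-only sketch of that round trip follows; `flatten_mask` and `restore_mask` are hypothetical helper names, and the Polars columns are replaced by plain Python values for illustration.

```python
import numpy as np


def flatten_mask(mask: np.ndarray) -> tuple[np.ndarray, list[int]]:
    """Split a 2D mask into flattened row-major data and its shape."""
    return mask.reshape(-1), list(mask.shape)


def restore_mask(flat, shape) -> np.ndarray:
    """Rebuild the 2D mask from flattened data using the stored shape."""
    return np.array(flat).reshape(shape)


mask = np.array([[0, 1, 1], [0, 2, 0]], dtype=np.uint8)
flat, shape = flatten_mask(mask)

# Going through a plain list mimics reading values back out of a list column.
restored = restore_mask(flat.tolist(), shape)
```

In this sketch the restored values match the original, but the dtype is whatever NumPy infers from the list; pass an explicit `dtype` to `np.array` if the original dtype must be preserved.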
