Commit a037b33

Remove the FeatureVector annotation type, various fixes, and cleanup (#1839)
This pull request removes support for the deprecated `FeatureVector` annotation type and related functionality, and adds missing plugin definitions to `specs.json`. It also updates documentation and test configurations to reflect these changes.

### Removal of FeatureVector annotation type and related functionality

* Removed the `FeatureVector` annotation class and the corresponding `feature_vector` entry from the `AnnotationType` enum in `src/datumaro/components/annotation.py`, along with its matcher and merger classes in `src/datumaro/components/annotations/matcher.py` and `src/datumaro/components/annotations/merger.py`.
* Removed references to `FeatureVectorMerger` from the merge logic in `src/datumaro/components/merge/intersect_merge.py`.
* Updated tests to remove checks for `feature_vector` statistics in `tests/unit/operations/test_statistics.py`.

### Plugin additions for new formats

* Added plugin definitions for Roboflow TFRecord and TensorFlow Detection API formats (DatasetBase, Importer, and Exporter) to `src/datumaro/plugins/specs.json`, including TensorFlow as an extra dependency.

### Documentation and compatibility updates

* Removed deprecated usage examples for project-based workflows from `docs/source/docs/data-formats/media_formats.md` and deleted the deprecated `src/datumaro/project.py` module.
* Cleaned up deprecated variable declarations from CLI and annotation modules.

### Test configuration updates

* Updated `tox.ini` to include TensorFlow and TFDS for Python 3.12 environments.

### Checklist

- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [ ] I have added the description of my changes into [CHANGELOG](https://github.com/open-edge-platform/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the [documentation](https://github.com/open-edge-platform/datumaro/tree/develop/docs) accordingly

### License

- [ ] I submit _my code changes_ under the same [MIT License](https://github.com/open-edge-platform/datumaro/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.
- [ ] I have updated the license header for each file (see an example below).

```python
# Copyright (C) 2025 Intel Corporation
#
# SPDX-License-Identifier: MIT
```
2 parents b20670f + acf7b4a commit a037b33


13 files changed

+58
-71
lines changed

docs/source/docs/data-formats/media_formats.md

Lines changed: 4 additions & 16 deletions
@@ -17,23 +17,11 @@ datum project import -f image_dir </path/to/directory/containing/images>
 
 or, if you work with Datumaro API:
 
-- for using with a project:
-
-```python
-from datumaro.project import Project
-
-project = Project.init('/path/to/project')
-project.import_source('source1', format='image_dir', url='/path/to/directory/containing/images')
-dataset = project.working_tree.make_dataset()
-```
-
-- for using as a dataset:
-
-```python
-from datumaro import Dataset
+```python
+from datumaro import Dataset
 
-dataset = Dataset.import_from('/path/to/directory/containing/images', 'image_dir')
-```
+dataset = Dataset.import_from('/path/to/directory/containing/images', 'image_dir')
+```
 
 This will search for images in the directory recursively and add
 them as dataset entries with names like `<subdir1>/<subsubdir1>/<image_name1>`.

setup.py

Lines changed: 0 additions & 4 deletions
@@ -92,10 +92,6 @@ def parse_requirements(filename=CORE_REQUIREMENTS_FILE):
         ],
     },
     cmdclass={"build_ext": build_ext},
-    package_data={
-        "datumaro.plugins.synthetic_data": ["background_colors.txt"],
-        "datumaro.plugins.openvino_plugin.samples": ["coco.class", "imagenet.class"],
-    },
     include_package_data=True,
     rust_extensions=[
         RustExtension(

src/datumaro/cli/__main__.py

Lines changed: 0 additions & 3 deletions
@@ -61,9 +61,6 @@ def _define_loglevel_option(parser):
     return parser
 
 
-deprecated = "[DEPRECATED, will be removed in 1.12]"
-
-
 # TODO: revisit during CLI refactoring
 def _get_known_contexts():
     return [

src/datumaro/cli/commands/__init__.py

Lines changed: 0 additions & 2 deletions
@@ -22,8 +22,6 @@
     "get_non_project_commands",
 ]
 
-deprecated = "[DEPRECATED, will be removed in 1.12]"
-
 
 
 def get_non_project_commands():
     return [

src/datumaro/components/annotation.py

Lines changed: 0 additions & 14 deletions
@@ -33,7 +33,6 @@
 
 from datumaro.components.media import Image
 from datumaro.util.attrs_util import default_if_none, not_empty
-from datumaro.util.deprecation import deprecated
 from datumaro.util.points_util import normalize_points
 
 
@@ -50,7 +49,6 @@ class AnnotationType(IntEnum):
     super_resolution_annotation = 9
     depth_annotation = 10
     ellipse = 11
-    feature_vector = 12
     tabular = 13
     rotated_bbox = 14
     cuboid_2d = 15
@@ -257,18 +255,6 @@ class Label(Annotation):
     label: int = field(converter=int)
 
 
-@deprecated(deprecated_version="1.11", removed_version="1.12")
-@attrs(eq=False, order=False)
-class FeatureVector(Annotation):
-    _type = AnnotationType.feature_vector
-    vector: np.ndarray = field(validator=attr.validators.instance_of(np.ndarray))
-
-    def __eq__(self, other):
-        if not isinstance(other, __class__):
-            return False
-        return np.array_equal(self.vector, other.vector)
-
-
 RgbColor = Tuple[int, int, int]
 
 Colormap = Dict[int, RgbColor]
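
The `AnnotationType` hunk deletes `feature_vector = 12` while `tabular = 13` and the later members keep their values, so the enum now has a gap at 12. The sketch below uses a trimmed stand-in enum (`AnnotationTypeSketch`, not Datumaro's class) and a hypothetical helper `parse_annotation_type` to illustrate what that may mean for code that reads stored integer type IDs: existing IDs still resolve to the same members, while 12 no longer resolves at all.

```python
from enum import IntEnum
from typing import Optional


class AnnotationTypeSketch(IntEnum):
    """Stand-in for the post-commit enum; only the members visible in the hunk."""

    ellipse = 11
    # 12 was feature_vector; the value is left unused rather than renumbered
    tabular = 13
    rotated_bbox = 14


def parse_annotation_type(value: int) -> Optional[AnnotationTypeSketch]:
    """Hypothetical helper: map a stored integer to a member, or None if unknown."""
    try:
        return AnnotationTypeSketch(value)
    except ValueError:
        # IntEnum raises ValueError for values without a member, e.g. 12
        return None


if __name__ == "__main__":
    for stored_id in (11, 12, 13):
        print(stored_id, parse_annotation_type(stored_id))
```

Whether Datumaro's own readers treat an unknown ID this way is not shown in this commit; the point is only that removing a member does not shift the values of its neighbours.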

src/datumaro/components/annotations/matcher.py

Lines changed: 0 additions & 8 deletions
@@ -33,7 +33,6 @@
     "CaptionsMatcher",
     "Cuboid3dMatcher",
     "ImageAnnotationMatcher",
-    "FeatureVectorMatcher",
     "Cuboid2DMatcher",
 ]
 
@@ -351,13 +350,6 @@ def match_annotations(self, sources):
         raise NotImplementedError()
 
 
-@attrs
-@attrs
-class FeatureVectorMatcher(AnnotationMatcher):
-    def match_annotations(self, sources):
-        raise NotImplementedError()
-
-
 @attrs
 class TabularMatcher(AnnotationMatcher):
     def match_annotations(self, sources):

src/datumaro/components/annotations/merger.py

Lines changed: 0 additions & 7 deletions
@@ -14,7 +14,6 @@
     CaptionsMatcher,
     Cuboid2DMatcher,
     Cuboid3dMatcher,
-    FeatureVectorMatcher,
     ImageAnnotationMatcher,
     LabelMatcher,
     LineMatcher,
@@ -39,7 +38,6 @@
     "Cuboid3dMerger",
     "ImageAnnotationMerger",
     "EllipseMerger",
-    "FeatureVectorMerger",
 ]
 
 
@@ -191,11 +189,6 @@ class EllipseMerger(_ShapeMerger, ShapeMatcher):
     pass
 
 
-@attrs
-class FeatureVectorMerger(AnnotationMerger, FeatureVectorMatcher):
-    pass
-
-
 @attrs
 class TabularMerger(AnnotationMerger, TabularMatcher):
     pass

src/datumaro/components/hl_ops/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -25,7 +25,6 @@
 from datumaro.components.transformer import Transform
 from datumaro.components.validator import TaskType, Validator
 from datumaro.util import parse_str_enum_value
-from datumaro.util.deprecation import deprecated
 from datumaro.util.scope import on_error_do, scoped
 
 if TYPE_CHECKING:
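
The import dropped here is the `deprecated` helper, which the removed `FeatureVector` class applied as `@deprecated(deprecated_version="1.11", removed_version="1.12")`. The implementation of `datumaro.util.deprecation.deprecated` is not part of this diff, so the following is only a minimal sketch of how a decorator with that call signature could be written, assuming it emits a `DeprecationWarning` on use. The decorated function `make_feature_vector` is hypothetical.

```python
import functools
import warnings


def deprecated(deprecated_version: str, removed_version: str):
    """Sketch of a deprecation decorator; not Datumaro's actual implementation."""

    def decorator(func):
        message = (
            f"{func.__name__} is deprecated since {deprecated_version} "
            f"and will be removed in {removed_version}"
        )

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller, not the wrapper
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(deprecated_version="1.11", removed_version="1.12")
def make_feature_vector(values):
    """Hypothetical deprecated helper used only to exercise the decorator."""
    return list(values)


if __name__ == "__main__":
    warnings.simplefilter("always")
    print(make_feature_vector((1, 2, 3)))
```

Once the announced removal version arrives, a commit like this one deletes both the decorated object and, where nothing else uses it, the import of the helper.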

src/datumaro/components/merge/intersect_merge.py

Lines changed: 0 additions & 3 deletions
@@ -22,7 +22,6 @@
     Cuboid2DMerger,
     Cuboid3dMerger,
     EllipseMerger,
-    FeatureVectorMerger,
     ImageAnnotationMerger,
     LabelMerger,
     LineMerger,
@@ -447,8 +446,6 @@ def _for_type(t, **kwargs):
                 return _make(ImageAnnotationMerger, **kwargs)
             elif t is AnnotationType.ellipse:
                 return _make(EllipseMerger, **kwargs)
-            elif t is AnnotationType.feature_vector:
-                return _make(FeatureVectorMerger, **kwargs)
             elif t is AnnotationType.tabular:
                 return _make(TabularMerger, **kwargs)
             elif t is AnnotationType.rotated_bbox:

src/datumaro/plugins/specs.json

Lines changed: 50 additions & 0 deletions
@@ -1172,6 +1172,27 @@
         "plugin_name": "roboflow_yolo_obb",
         "plugin_type": "DatasetBase"
     },
+    {
+        "import_path": "datumaro.plugins.data_formats.roboflow.base_tfrecord.RoboflowTfrecordBase",
+        "plugin_name": "roboflow_tfrecord",
+        "plugin_type": "DatasetBase",
+        "extra_deps": [
+            "tensorflow"
+        ]
+    },
+    {
+        "import_path": "datumaro.plugins.data_formats.roboflow.base_tfrecord.RoboflowTfrecordImporter",
+        "plugin_name": "roboflow_tfrecord",
+        "plugin_type": "Importer",
+        "extra_deps": [
+            "tensorflow"
+        ],
+        "metadata": {
+            "file_extensions": [
+                ".tfrecord"
+            ]
+        }
+    },
     {
         "import_path": "datumaro.plugins.data_formats.roboflow.importer.RoboflowCocoImporter",
         "plugin_name": "roboflow_coco",
@@ -1337,6 +1358,35 @@
             ]
         }
     },
+    {
+        "import_path": "datumaro.plugins.data_formats.tf_detection_api.base.TfDetectionApiBase",
+        "plugin_name": "tf_detection_api",
+        "plugin_type": "DatasetBase",
+        "extra_deps": [
+            "tensorflow"
+        ]
+    },
+    {
+        "import_path": "datumaro.plugins.data_formats.tf_detection_api.base.TfDetectionApiImporter",
+        "plugin_name": "tf_detection_api",
+        "plugin_type": "Importer",
+        "extra_deps": [
+            "tensorflow"
+        ],
+        "metadata": {
+            "file_extensions": [
+                ".tfrecord"
+            ]
+        }
+    },
+    {
+        "import_path": "datumaro.plugins.data_formats.tf_detection_api.exporter.TfDetectionApiExporter",
+        "plugin_name": "tf_detection_api",
+        "plugin_type": "Exporter",
+        "extra_deps": [
+            "tensorflow"
+        ]
+    },
     {
         "import_path": "datumaro.plugins.data_formats.vgg_face2.VggFace2Base",
         "plugin_name": "vgg_face2",

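The entries added to `specs.json` share a small schema: `import_path`, `plugin_name`, `plugin_type`, and the optional `extra_deps` and `metadata.file_extensions` keys. As a rough illustration of how such a list could be queried, the sketch below parses two entries copied from the hunk plus one made-up entry without optional keys. The helper names `plugins_requiring` and `importers_for_extension` are hypothetical and are not Datumaro APIs; how Datumaro itself consumes the file is not shown in this commit.

```python
import json

# Two entries copied from the diff, plus one hypothetical entry
# ("example_format") that has neither extra_deps nor metadata.
SPECS_JSON = """
[
    {
        "import_path": "datumaro.plugins.data_formats.roboflow.base_tfrecord.RoboflowTfrecordImporter",
        "plugin_name": "roboflow_tfrecord",
        "plugin_type": "Importer",
        "extra_deps": ["tensorflow"],
        "metadata": {"file_extensions": [".tfrecord"]}
    },
    {
        "import_path": "datumaro.plugins.data_formats.tf_detection_api.exporter.TfDetectionApiExporter",
        "plugin_name": "tf_detection_api",
        "plugin_type": "Exporter",
        "extra_deps": ["tensorflow"]
    },
    {
        "import_path": "example.plugins.ExampleImporter",
        "plugin_name": "example_format",
        "plugin_type": "Importer"
    }
]
"""


def plugins_requiring(specs, dependency):
    """Return sorted (plugin_name, plugin_type) pairs that list `dependency`."""
    return sorted(
        (entry["plugin_name"], entry["plugin_type"])
        for entry in specs
        if dependency in entry.get("extra_deps", [])
    )


def importers_for_extension(specs, extension):
    """Return names of Importer plugins that declare `extension`."""
    return [
        entry["plugin_name"]
        for entry in specs
        if entry["plugin_type"] == "Importer"
        and extension in entry.get("metadata", {}).get("file_extensions", [])
    ]


specs = json.loads(SPECS_JSON)

if __name__ == "__main__":
    print(plugins_requiring(specs, "tensorflow"))
    print(importers_for_extension(specs, ".tfrecord"))
```

Reading the optional keys with `.get()` keeps the lookup working for entries such as `roboflow_yolo_obb` in the hunk context, which carry only a name and a type.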