Releases: open-edge-platform/datumaro
Release v1.12.0
This release streamlines Datumaro by removing a number of lesser-used features, helping to simplify the tool and reduce its dependencies. These changes are part of an effort to keep Datumaro focused on its core strengths: dataset management and integration with machine learning frameworks. As part of this update, inference-related features have been removed. For inference tasks, we recommend using the OpenVINO model API. If you rely on a specific feature that is no longer available, you can still access it from the previous version of Datumaro.
Removed features
- CLI commands:
- API features:
- Model inference (#1831, #1825)
- Model-based transformations (#1826, #1839)
- Crypter (#1829)
- Synthetic dataset generation (#1815)
- Data exploration (#1814)
- BBox to mask using SAM (#1826)
- Telemetry (#1828)
- Anchor generation (#1832)
- Missing annotation detection (#1826)
- Model inference explanation (#1812)
- Near-duplicate removal (#1835)
- Pruning (#1813)
- Pseudo-labels (#1814)
- Project (#1824)
- Noisy label detection (#1833)
- Data shift analysis (#1827, #1892)
- Model inference
- SAM Docker image (#1830)
New features
- Experimental dataset class (#1807, #1810, #1811, #1834, #1858, #1845, #1863, #1868, #1876, #1877, #1879, #1881, #1891)
- New OpenVINO Accuracy Checker semantic segmentation format (#1893)
Enhancements
- Mark several dependencies as optional (#1849, #1862)
- Removal of unneeded dependencies (#1837)
- Documentation tidy-up (#1840)
- DCO introduction; readme, PR template, and contribution guide tidy-up (#1844, #1860, #1856, #1847)
- Fix code coverage upload to Codecov in the CI (#1861)
- Fix crashes with certain datasets in the compare command (#1892)
- Added Semgrep security scan in the CI (#1883)
Release v1.11.1
Release v1.11.0
This release includes a significant number of deprecations in the CLI and API.
This is a one-off action to remove unused features, as well as features
such as inference that do not fit well in Datumaro. We intend to remove those
features in Datumaro 1.12.0.
New features
- Convert Cuboid2D annotation to/from 3D data (#1639)
- Add label groups for hierarchical classification in ImageNet (#1645)
Enhancements
- Add non-strict mode to JsonPageMapper in rust API and enable it for COCO (#1753)
- Enhance 'id_from_image_name' transform to ensure each identifier is unique (#1635)
- Optimize path assignment to handle point cloud in JSON without images (#1643)
- Add documentation for framework conversion (#1659)
Bug fixes
- Fix assertion to compare hashkeys against expected value (#1641)
- Mark pyemd as optional since it does not support Python 3.12 (#1770)
Deprecations
- Added deprecation to the following CLI commands:
- explain, explore, generate, prune
- model: add, remove, run, info
- project: add, create, export, import, remove, checkout, commit, log, info, status
- source: import, add, remove
(#1792)
- Added deprecation notices to the following features that will soon be removed:
- Deprecation of the SAM Docker image (#1783)
- Deprecation of Project and related features (#1793)
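For readers unfamiliar with how a deprecation notice is typically surfaced in Python, the following is a minimal sketch using the standard `warnings` module. The decorator name `deprecated` and the stand-in function `explain_command` are hypothetical and are not Datumaro's actual implementation; only the target version (Datumaro 1.12.0) comes from the notes above.

```python
import functools
import warnings


def deprecated(removed_in: str):
    """Decorator that emits a DeprecationWarning each time the wrapped callable runs."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated and will be removed in {removed_in}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not at the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(removed_in="Datumaro 1.12.0")
def explain_command() -> str:
    # Hypothetical stand-in for a deprecated entry point; not a real Datumaro function.
    return "explained"
```

Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so running with `python -W default` (or an equivalent filter) makes such notices visible.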
Release v1.10.0
New features
- Add default position information to PointsCategories class (#1702)
- Support KITTI 3D format (#1619) (#1621)
- Add PseudoLabeling transform for unlabeled dataset (#1594)
Enhancements
- Raise an appropriate error when exporting a datumaro dataset if its subset name contains path separators (#1615)
- Update docs for transform plugins (#1599)
- Update ov ir model for explorer openvino launcher with CLIP ViT-L/14@336px model (#1603)
- Optimize path assignment to handle point cloud in JSON without images (#1643)
- Set TabularTransform to process clean transform in parallel (#1648)
- Add support for Python 3.12 (#1701)
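The subset-name check mentioned in the list above can be illustrated with a short sketch. This is an assumption about what such a validation might look like, not the code from #1615; the function name `validate_subset_name` and the choice of `ValueError` are hypothetical.

```python
import os


def validate_subset_name(subset: str) -> str:
    """Return the subset name unchanged, or raise if it contains a path separator."""
    # Check both slash styles so the result does not depend on the host OS.
    separators = {"/", "\\", os.sep}
    if os.altsep:
        separators.add(os.altsep)
    if any(sep in subset for sep in separators):
        raise ValueError(
            f"subset name {subset!r} contains a path separator "
            "and cannot be used as part of an export path"
        )
    return subset
```

Rejecting the name up front gives a clear error instead of silently writing files into an unexpected nested directory.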
Bug fixes
- Fix datumaro format to load visibility information from Points annotations (#1644)
Release v1.10.0rc1
What's Changed
- Update ov ir model for explorer openvino launcher with CLIP ViT-L/14@336px model by @sooahleex in #1603
- Fix datumaro keypoint loading by @jihyeonyi in #1644
- Update assets for explorer by @sooahleex in #1647
- Optimize path assignment to handle point cloud in JSON without images by @sooahleex in #1649
- Fix to get image of tabular data in FrameConverter by @sooahleex in #1650
- Set TabularTransform to process clean transform in parallel by @sooahleex in #1648
Full Changelog: v1.10.0rc0...v1.10.0rc1
Release v1.10.0rc0
What's Changed
- Fix bug where custom environment plugins were lost on dataset merge by @williamcorsel in #1582
- Bump github/codeql-action from 3.26.2 to 3.26.4 by @dependabot in #1589
- Support language dataset for DmTorchDataset by @sooahleex in #1588
- Revert "Support language dataset for DmTorchDataset" by @sooahleex in #1591
- Bump github/codeql-action from 3.26.4 to 3.26.6 by @dependabot in #1593
- Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.1 by @dependabot in #1600
- Mergeback 1.9.0 to develop by @sooahleex in #1604
- Revert "Mergeback 1.9.0 to develop" by @sooahleex in #1605
- Mergeback 1.9.0 to develop by @sooahleex in #1606
- Add Cuboid2D annotation by @itrushkin in #1601
- Bump pypa/gh-action-pypi-publish from 1.10.1 to 1.10.2 by @dependabot in #1617
- Handle path separators in the subset when exporting a datumaro dataset by @jihyeonyi in #1615
- Add PseudoLabeling transform for unlabeled dataset by @sooahleex in #1594
- Update docs for transform plugins by @sooahleex in #1599
- Support KITTI 3D format by @sooahleex in #1619
- Mergeback 1.9.1 to develop by @yunchu in #1623
- Support subset for KITTI 3D format by @sooahleex in #1621
New Contributors
- @williamcorsel made their first contribution in #1582
Full Changelog: v1.9.1...v1.10.0rc0
Release v1.9.1
What's Changed - Brief version
Enhancements
- Support multiple labels for kaggle format (#1607)
- Use DataFrame.map instead of DataFrame.applymap (#1613)
Bug fixes
- Fix StreamDataset merging when importing in eager mode (#1609)
What's Changed - Full version
- Support multiple labels for kaggle format by @sooahleex in #1607
- Update version to 1.9.1rc0 by @yunchu in #1611
- Fix merging of stream datasets by @itrushkin in #1609
- Use DataFrame.map instead of DataFrame.applymap by @sooahleex in #1613
- Update for release 1.9.1 by @yunchu in #1622
Full Changelog: v1.9.0...v1.9.1
Release 1.9.0
What's Changed - Brief version
New features
Enhancements
- Change _Shape to Shape and add comments for subclasses of Shape (#1568)
Bug fixes
- Fix KITTI-3D importer and exporter (#1596)
What's Changed - Full version
- Add hierarchical ImageNet-like dataset format by @itrushkin in #1528
- Verify w and h input multiplication overflow to rleEncode() by @yunchu in #1544
- Add dtype argument when calling media.data by @wonjuleee in #1546
- Mergeback 1.8.0 to develop by @yunchu in #1566
- Bump aquasecurity/trivy-action from 0.23.0 to 0.24.0 by @dependabot in #1561
- Bump orjson from 3.10.5 to 3.10.6 by @dependabot in #1553
- Bump github/codeql-action from 3.25.10 to 3.25.11 by @dependabot in #1552
- Bump ipython from 8.25.0 to 8.26.0 by @dependabot in #1551
- Bump github/codeql-action from 3.25.11 to 3.25.12 by @dependabot in #1567
- Add a new CLI command: datum format by @vinnamkim in #1570
- Change _Shape to Shape and add comments for subclasses of Shape by @sooahleex in #1568
- Bump ossf/scorecard-action from 2.3.3 to 2.4.0 by @dependabot in #1574
- Bump github/codeql-action from 3.25.12 to 3.25.15 by @dependabot in #1575
- Make ImageNet data format streamable by @itrushkin in #1571
- Make image dirs data format streamable by @itrushkin in #1576
- Bump orjson from 3.10.6 to 3.10.7 by @dependabot in #1581
- Bump github/codeql-action from 3.25.15 to 3.26.2 by @dependabot in #1584
- Update issue_assignment.yml by @vinnamkim in #1585
- Change common semantic segmentation dataset detection rule by @itrushkin in #1572
- Support language dataset for DmTorchDataset by @sooahleex in #1592
- Fix KITTI-3D importer and exporter by @wonjuleee in #1596
- Update tpp for 1.9.0 by @yunchu in #1597
- update for release 1.9.0 by @yunchu in #1602
Full Changelog: v1.8.0...v1.9.0
Release 1.8.0
What's Changed - Brief version
New features
Enhancements
- Set label name with parents to avoid duplicates for AstypeAnnotations (#1492)
- Pass Keyword Argument to TabularDataBase (#1522)
- Support hierarchical structure for ImageNet dataset format (#1528)
- Enable dtype argument when calling media.data (#1546)
Bug fixes
- Preserve end_frame information of a video when it is zero (#1541)
- Changed the Datumaro format to ensure exported videos have relative paths and to prevent the same video from being overwritten (#1547)
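The release notes do not describe the cause of the end_frame bug. One common way a zero value gets lost in Python is a truthiness check, and the sketch below illustrates that pattern only. The function names and the `total` parameter are hypothetical and do not reflect Datumaro's actual code.

```python
from typing import Optional


def frame_range_buggy(start_frame: int, end_frame: Optional[int], total: int) -> range:
    # Truthiness check: end_frame == 0 is treated the same as "not set".
    last = end_frame if end_frame else total - 1
    return range(start_frame, last + 1)


def frame_range_fixed(start_frame: int, end_frame: Optional[int], total: int) -> range:
    # Explicit None check: 0 is a valid last-frame index and is preserved.
    last = end_frame if end_frame is not None else total - 1
    return range(start_frame, last + 1)


print(len(frame_range_buggy(0, 0, 100)))  # 100: the zero end_frame was discarded
print(len(frame_range_fixed(0, 0, 100)))  # 1: only frame 0 is selected
```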
What's Changed - Full version
- Set label name with parents to avoid duplicates for AstypeAnnotations by @sooahleex in #1492
- Add TabularValidator by @sooahleex in #1498
- Add TblStats in Configurable Validator by @sooahleex in #1504
- Bump github/codeql-action from 3.25.4 to 3.25.6 by @dependabot in #1502
- Bump aquasecurity/trivy-action from 0.20.0 to 0.21.0 by @dependabot in #1506
- Bump ruff from 0.4.3 to 0.4.5 by @dependabot in #1505
- Bump pozil/auto-assign-issue from 1.14.0 to 2.0.0 by @dependabot in #1500
- Bump ossf/scorecard-action from 2.3.1 to 2.3.3 by @dependabot in #1496
- Bump ruff from 0.4.5 to 0.4.6 by @dependabot in #1512
- Bump github/codeql-action from 3.25.6 to 3.25.7 by @dependabot in #1516
- Bump ruff from 0.4.6 to 0.4.7 by @dependabot in #1517
- Doc update to replace --save-images is replaced with --save-media by @sooahleex in #1514
- Pass Keyword Argument to TabularDataBase by @sooahleex in #1522
- Add correct functionality for tabular data type by @sooahleex in #1513
- Add Clean Transform for tabular data type by @sooahleex in #1520
- Mergeback 1.7.0 to develop by @yunchu in #1538
- Bump aquasecurity/trivy-action from 0.21.0 to 0.23.0 by @dependabot in #1536
- Revert "Mergeback 1.7.0 to develop" by @yunchu in #1539
- Mergeback 1.7.0 to develop by @yunchu in #1540
- Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0 by @dependabot in #1535
- Bump ruff from 0.4.7 to 0.4.9 by @dependabot in #1532
- Bugfix when end_frame is zero by @jihyeonyi in #1541
- Bump github/codeql-action from 3.25.7 to 3.25.10 by @dependabot in #1531
- Bump opencv-python-headless from 4.9.0.80 to 4.10.0.84 by @dependabot in #1537
- Bump ruff from 0.4.9 to 0.4.10 by @dependabot in #1543
- Bump ipython from 8.24.0 to 8.25.0 by @dependabot in #1518
- Bump orjson from 3.10.3 to 3.10.5 by @dependabot in #1530
- Apply clean transform of updated annotations only for tabular annotation type by @sooahleex in #1533
- verify w and h input multiplication overflow to rleEncode() by @yunchu in #1548
- Video bug fix by @jihyeonyi in #1547
- Add notebook for data handling of kaggle dataset by @sooahleex in #1534
- Update pre-commit config to pin ruff dependency for nbqa-ruff by @yunchu in #1550
- Update for release 1.8.0rc0 by @yunchu in #1559
- Support hierarchical structure for ImageNet format by @itrushkin in #1562
- Fix typings in ImageNet format by @itrushkin in #1563
- Update tpp file by @yunchu in #1564
- Update version string to 1.8.0 by @yunchu in #1565
Full Changelog: v1.7.0...v1.8.0
Release 1.7.0
What's Changed - Brief Version
New features
- Support 'Video' media type in datumaro format (#1491)
- Add ann_types property for dataset (#1422, #1479)
- Add AnnotationType.rotated_bbox for oriented object detection (#1459)
- Add DOTA data format for oriented object detection task (#1475)
- Add AstypeAnnotations Transform (#1484)
- Enhance DatasetItem annotations for semantic segmentation model training use case (#1503)
Enhancements
- Fix ambiguous COCO format detector (#1442)
- Get target information for tabular dataset (#1471)
- Add ExtractedMask and update importers who can use it to use it (#1480)
- Improve PIL and COLOR_BGR context image decode performance (#1501)
- Improve get_area() of Polygon through Shoelace formula (#1507)
- Improve _Shape point converter (#1508)
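The Shoelace formula mentioned in the list above computes the area of a simple polygon directly from its vertex coordinates. The following is a standalone sketch of the formula, not the code from #1507. The function name `polygon_area` and the flat `[x0, y0, x1, y1, ...]` input layout are assumptions made for illustration.

```python
from typing import Sequence


def polygon_area(points: Sequence[float]) -> float:
    """Area of a simple polygon given as a flat list [x0, y0, x1, y1, ...]."""
    if len(points) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    xs = points[0::2]
    ys = points[1::2]
    n = len(xs)
    if n < 3:
        return 0.0
    # Shoelace formula: sum the cross products of consecutive vertices,
    # wrapping from the last vertex back to the first.
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    # The sign depends on vertex orientation, so take the absolute value.
    return abs(twice_area) / 2.0


# A 4 x 3 axis-aligned rectangle.
rectangle = [0, 0, 4, 0, 4, 3, 0, 3]
print(polygon_area(rectangle))  # 12.0
```

The formula needs only the vertex list, so no mask has to be rasterized to measure the area.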
Bug fixes
- Split the video directory into subsets to avoid overwriting (#1485)
What's Changed - Full Version
- Bump github/codeql-action from 3.24.9 to 3.24.10 by @dependabot in #1418
- Bump ipython from 8.22.2 to 8.23.0 by @dependabot in #1413
- Bump lxml from 5.2.0 to 5.2.1 by @dependabot in #1414
- Bump pozil/auto-assign-issue from 1.13.0 to 1.14.0 by @dependabot in #1417
- Add task type information when importing by @wonjuleee in #1422
- Bump black from 24.3.0 to 24.4.0 by @dependabot in #1433
- Bump ruff from 0.3.5 to 0.3.7 by @dependabot in #1434
- Bump orjson from 3.10.0 to 3.10.1 by @dependabot in #1441
- Bump github/codeql-action from 3.24.10 to 3.25.0 by @dependabot in #1440
- Mergeback releases/1.6.0 to develop by @yunchu in #1428
- Fix ambiguous coco format detector by @wonjuleee in #1442
- Bump github/codeql-action from 3.25.0 to 3.25.1 by @dependabot in #1451
- Update dependabot config to prevent redundant PR creation by @yunchu in #1455
- Add new annotation type RotatedBbox by @wonjuleee in #1459
- Bump ruff from 0.3.7 to 0.4.1 by @dependabot in #1464
- Bump actions/checkout from 3 to 4 by @dependabot in #1462
- Update dependabot config by @yunchu in #1469
- Bump black from 24.4.0 to 24.4.1 by @dependabot in #1473
- Bump github/codeql-action from 3.25.1 to 3.25.2 by @dependabot in #1470
- Get target information for tabular dataset by @sooahleex in #1471
- Support DOTA data format for oriented object detection task by @wonjuleee in #1475
- Mergeback 1.6.1rc4 to develop by @yunchu in #1478
- Bump ruff from 0.4.1 to 0.4.2 by @dependabot in #1476
- Bump github/codeql-action from 3.25.2 to 3.25.3 by @dependabot in #1477
- Bump black from 24.4.1 to 24.4.2 by @dependabot in #1482
- Bump ipython from 8.23.0 to 8.24.0 by @dependabot in #1481
- Add ExtractedMask and update importers who can use it to use it by @vinnamkim in #1480
- Support annotation types instead of task type by @wonjuleee in #1479
- Split video directory by subset in datumaro format. by @jihyeonyi in #1485
- Update stability tests by @yunchu in #1483
- Add AstypeAnnotations Transform by @sooahleex in #1484
- Bump ruff from 0.4.2 to 0.4.3 by @dependabot in #1490
- Download Kaggle datasets by @itrushkin in #1487
- Bump orjson from 3.10.1 to 3.10.3 by @dependabot in #1489
- Bump github/codeql-action from 3.25.3 to 3.25.4 by @dependabot in #1493
- Bump aquasecurity/trivy-action from 0.19.0 to 0.20.0 by @dependabot in #1494
- Update pillow constraint to >=10.3.0 by @yunchu in #1495
- Enabled support for 'Video' media type in the datumaro format by @jihyeonyi in #1491
- Improve PIL and COLOR_BGR context image decode performance by @vinnamkim in #1501
- Enhance DatasetItem annotations for semantic segmentation model training use case by @vinnamkim in #1503
- Improve get_area() for polygons through Shoelace formula by @wonjuleee in #1507
- Improve Shape point converter by @wonjuleee in #1508
- Update codeql workflow by @yunchu in #1515
- Update for release 1.7.0 by @yunchu in #1526
Full Changelog: v1.6.1...v1.7.0