`README.md` (1 addition, 1 deletion)
@@ -48,7 +48,7 @@ Additional dependencies for working with specific cloud services can be installed:
- `roboflow`: Dependencies for downloading datasets from Roboflow
- `mlflow`: Dependencies for working with MLFlow
-> \[!NOTE\]
+> [!NOTE]
> If some of the additional dependencies are required but not installed (_e.g._ attempting to use Google Cloud Storage without installing the `gcs` extra), then the missing dependencies will be installed automatically.
`luxonis_ml/data/README.md` (14 additions, 14 deletions)
@@ -4,7 +4,7 @@
LuxonisML Data is a library for creating and interacting with datasets in the LuxonisDataFormat (LDF).
-> \[!NOTE\]
+> [!NOTE]
> For hands-on examples of how to prepare and interact with `LuxonisML` datasets, check out [this guide](https://github.com/luxonis/ai-tutorials/tree/main/training#%EF%B8%8F-prepare-data-using-luxonis-ml).
The lifecycle of an LDF dataset is as follows:
@@ -77,7 +77,7 @@ You can create as many datasets as you want, each with a unique name.
Datasets can be stored locally or in one of the supported cloud storage providers.
-> \[!NOTE\]
+> [!NOTE]
> 📚 For a complete list of all parameters and methods of the `LuxonisDataset` class, see the [datasets README.md](datasets/README.md).
### Dataset Creation
@@ -92,10 +92,10 @@ dataset_name = "parking_lot"
dataset = LuxonisDataset(dataset_name)
```
-> \[!NOTE\]
+> [!NOTE]
> By default, the dataset will be created locally. For more information on creating a remote dataset, see [this section](datasets/README.md#creating-a-dataset-remotely).
-> \[!NOTE\]
+> [!NOTE]
> If there already is a dataset with the same name, it will be loaded instead of creating a new one.
100
100
> If you want to always create a new dataset, you can pass `delete_local=True` to the `LuxonisDataset` constructor.\
101
101
> For detailed information about how the luxonis-ml dataset is stored in both local and remote storage, please check the [datasets README.md](datasets/README.md#in-depth-explanation-of-luxonis-ml-dataset-storage).
@@ -254,7 +254,7 @@ Once you've defined your data source, pass it to the dataset's add method:
dataset.add(generator())
```
-> \[!NOTE\]
+> [!NOTE]
> The `add` method accepts any iterable, not only generators.
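As a plain-Python sketch, a record generator for `dataset.add()` might look like the following. The record keys used here (`file`, `annotation`, `class`) are illustrative assumptions, not the authoritative LDF schema; the `dataset.add(...)` call is shown only as a comment.

```python
# Hypothetical record generator for `dataset.add()`; the keys used here
# ("file", "annotation", "class") are illustrative, not the exact LDF schema.
def generator():
    for i in range(3):
        yield {
            "file": f"images/img_{i}.jpg",   # path to the image on disk
            "annotation": {"class": "car"},  # a simple classification label
        }

# `add` accepts any iterable of records, not only generators:
records = list(generator())
# dataset.add(records)  # or: dataset.add(generator())
```

Because `add` accepts any iterable, a list, tuple, or custom iterator class works equally well.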
### Defining Splits
@@ -291,7 +291,7 @@ Calling `make_splits` with no arguments will default to an 80/10/10 split.
In order for splits to be created, there must be some new data in the dataset. If no new data were added, calling `make_splits` will raise an error.
If you wish to delete old splits and create new ones using all the data, pass `redefine_splits=True` to the method call.
-> \[!NOTE\]
+> [!NOTE]
> There are no restrictions on the split names;
> however, in most cases one should stick to `"train"`, `"val"`, and `"test"`.
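The split ratios described above can be sketched as a plain dictionary. Passing such a dictionary to `make_splits` is an assumption based on the defaults described in the text; the calls are shown only as comments.

```python
# Hypothetical explicit splits; with no arguments, `make_splits`
# defaults to an 80/10/10 split as described above.
splits = {"train": 0.8, "val": 0.1, "test": 0.1}
assert abs(sum(splits.values()) - 1.0) < 1e-9  # ratios should sum to 1

# dataset.make_splits(splits)                # sketch of the call
# dataset.make_splits(redefine_splits=True)  # rebuild splits from all data
```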
@@ -338,8 +338,8 @@ The available commands are:
- `luxonis_ml data ls` - lists all datasets
- `luxonis_ml data info <dataset_name>` - prints information about the dataset
- `luxonis_ml data inspect <dataset_name>` - renders the data in the dataset on screen using `cv2`
- `luxonis_ml data health <dataset_name>` - checks the health of the dataset and logs and renders dataset statistics
- `luxonis_ml data sanitize <dataset_name>` - removes duplicate files and duplicate annotations from the dataset
- `luxonis_ml data delete <dataset_name>` - deletes the dataset
- `luxonis_ml data export <dataset_name>` - exports the dataset to a chosen format and directory
- `luxonis_ml data push <dataset_name>` - pushes a local dataset to remote storage
@@ -357,7 +357,7 @@ This guide covers the loading of datasets using the `LuxonisLoader` class.
The `LuxonisLoader` class can also take care of data augmentation; for more information, see [Augmentation](#augmentation).
-> \[!NOTE\]
+> [!NOTE]
> 📚 For a complete list of all parameters of the `LuxonisLoader` class, see the [loaders README.md](loaders/README.md).
### Dataset Loading
@@ -609,7 +609,7 @@ The directory can also be a zip file containing the dataset.
The `task_name` argument can be specified as a single string or as a dictionary. If a string is provided, it will be used as the task name for all records.
Alternatively, you can provide a dictionary that maps class names to task names for better dataset organization. See the example below.
-> \[!NOTE\]
+> [!NOTE]
> 📚 For a complete list of all parameters of the `LuxonisParser` class, see the [parsers README.md](parsers/README.md).
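As a sketch, a dictionary-valued `task_name` might look like this. The class and task names are made up for illustration, and the `LuxonisParser` call is shown only as a comment.

```python
# Hypothetical class-name -> task-name mapping for the `task_name` argument.
# Classes sharing a task name are grouped under the same task.
task_name = {
    "car": "vehicles",
    "truck": "vehicles",
    "pedestrian": "people",
}
# parser = LuxonisParser(dataset_dir, task_name=task_name)  # sketch
```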
@@ -664,7 +664,7 @@ A single class label for the entire image.
->\[!NOTE\]
+> [!NOTE]
> The `classification` task is always added to the dataset.
### Bounding Box
@@ -794,10 +794,10 @@ The `counts` field contains either a **compressed byte string** or an **uncompressed list**
->\[!NOTE\]
+> [!NOTE]
> The RLE format is not intended for regular use and is provided mainly to support datasets that may already be in this format.
->\[!NOTE\]
+> [!NOTE]
> Masks provided as numpy arrays are converted to RLE format internally.
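The uncompressed variant can be illustrated in plain Python. This sketch assumes the COCO RLE convention (runs alternate between zeros and ones, starting with zeros, in column-major pixel order); it is for illustration, not a claim about the library's internals.

```python
# Decode an uncompressed RLE "counts" list into a flat binary mask.
# Assumes the COCO convention: runs alternate 0/1, starting with zeros.
def rle_to_flat_mask(counts, height, width):
    mask, value = [], 0
    for run in counts:
        mask.extend([value] * run)  # emit `run` pixels of the current value
        value = 1 - value           # alternate between 0 and 1
    assert len(mask) == height * width, "counts must cover the whole image"
    return mask

# 2 zeros, 3 ones, 4 zeros -> a 3x3 mask flattened in column-major order
flat = rle_to_flat_mask([2, 3, 4], height=3, width=3)
```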
### Array
@@ -993,7 +993,7 @@ The following example demonstrates a simple augmentation pipeline:
->\[!NOTE\]
+> [!NOTE]
> The augmentations are **not** applied in order. Instead, an optimal order is determined based on the type of the augmentations to minimize the computational cost.
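Such a pipeline might be declared as in the sketch below. The `name`/`params` schema and the augmentation names are assumptions for illustration, not the documented configuration format.

```python
# Hypothetical augmentation configuration; the "name"/"params" keys and
# the augmentation names are assumed, not the documented schema.
augmentations = [
    {"name": "HorizontalFlip", "params": {"p": 0.5}},
    {"name": "RandomBrightnessContrast", "params": {"p": 0.2}},
]
# The loader is free to reorder these: the applied order is chosen to
# minimize computational cost, as the note above explains.
assert all({"name", "params"} <= set(a) for a in augmentations)
```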
[A remote dataset functions similarly to a local dataset](#in-depth-explanation-of-luxonis-ml-dataset-storage). When a remote dataset is created, the same folder structure appears locally, and the equivalent structure appears in the cloud. The media folder is empty locally but is filled with images on the remote storage, where filenames become UUIDs with the appropriate suffix.
-> \[!NOTE\]
+> [!NOTE]
> **IMPORTANT:** Be careful when creating a remote dataset with the same name as an already existing local dataset, because corruption of datasets may occur if not handled properly.
>
> Use `delete_local=True` and `delete_remote=True` to create a new dataset (deleting both local and remote storage) before calling `dataset.add()`, or use `dataset.push_to_cloud()` to push an existing local dataset to the cloud. To append data to an existing dataset using `dataset.add()`, keep `delete_local=False` and `delete_remote=False`. In that case, ensure both local and remote datasets are healthy. If the local dataset might be corrupted but the remote version is healthy, use `delete_local=True` and `delete_remote=False` so that the local dataset is deleted, while the remote stays intact.
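The scenarios above can be condensed into a small lookup table. The helper itself is hypothetical; only the flag names (`delete_local`, `delete_remote`) come from the text, and the constructor call is shown only as a comment.

```python
# Map the scenarios described above to LuxonisDataset constructor flags.
# The lookup is illustrative; the flags themselves are from the docs.
SCENARIO_FLAGS = {
    # wipe both copies and start fresh before dataset.add()
    "fresh_start":     {"delete_local": True,  "delete_remote": True},
    # append to healthy local and remote datasets
    "append":          {"delete_local": False, "delete_remote": False},
    # local copy may be corrupted, remote is healthy
    "local_corrupted": {"delete_local": True,  "delete_remote": False},
}

# e.g. LuxonisDataset("parking_lot", **SCENARIO_FLAGS["fresh_start"])  # sketch
flags = SCENARIO_FLAGS["local_corrupted"]
```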