Merged
31 commits
e06ea6f
multi input df support
JSabadin Jun 4, 2025
2df70ca
merge main into 'feat/multi-input-support' and resolve conflicts
JSabadin Jun 4, 2025
8779363
group_id support
JSabadin Jun 4, 2025
fd26f8f
merge main into 'feat/multi-input-support' and resolve conflicts
JSabadin Jun 4, 2025
bc9bb2c
multi-input support
JSabadin Jun 5, 2025
dd650cd
var renaming
JSabadin Jun 5, 2025
34092f9
add tests
JSabadin Jun 5, 2025
4a87116
fix docs
JSabadin Jun 5, 2025
ab6e87e
fix type-check
JSabadin Jun 5, 2025
f868fa0
fix type-check
JSabadin Jun 5, 2025
5621f19
fix type-check
JSabadin Jun 5, 2025
253ba53
fix type-check
JSabadin Jun 5, 2025
3e7827a
backward compatibility
JSabadin Jun 7, 2025
6021bef
fix type-check
JSabadin Jun 7, 2025
df71283
fix type-check
JSabadin Jun 7, 2025
9b7b4ea
fix type-check
JSabadin Jun 7, 2025
d90a16a
fix tests
JSabadin Jun 7, 2025
9867562
support for missing sources
JSabadin Jun 9, 2025
d13f6d3
minor fixes
JSabadin Jun 9, 2025
06b11db
minor fix
JSabadin Jun 11, 2025
1d22571
fix augs readme
JSabadin Jun 13, 2025
05ec2fc
fix ls bug
JSabadin Jun 13, 2025
dfc162d
fix bug
JSabadin Jun 13, 2025
d529da3
add docs for absolute paths
JSabadin Jun 13, 2025
ce5b4be
relative or absolute paths
JSabadin Jun 13, 2025
1f9c2a1
fix tests
JSabadin Jun 13, 2025
06d0511
merge 'main' into feat/multi-input-support and resolve conflicts
JSabadin Jul 17, 2025
979fddc
fix failing test
JSabadin Jul 18, 2025
6ae6a73
merge 'main' into feat/multi-input-support and resolve conflicts
JSabadin Jul 18, 2025
7f3a996
Merge branch 'main' into feat/multi-input-support
JSabadin Jul 18, 2025
9ad1ac7
Merge branch 'main' into feat/multi-input-support
JSabadin Jul 21, 2025
55 changes: 42 additions & 13 deletions luxonis_ml/data/README.md
@@ -106,16 +106,49 @@ After creating a dataset, the next step is to populate it with images and their

#### Data Format

Each data entry should be a dictionary with the following structure:
Each data entry should be a dictionary with one of the following structures, depending on whether you're using a single input or multiple inputs:

##### Single-Input Format

```python
{
    "file": str, # path to the image file
    "task_name": Optional[str], # task type for this annotation
    "task_name": Optional[str], # task for this annotation
    "annotation": Optional[dict] # annotation of the instance in the file
}
```

##### Multi-Input Format

```python
{
    "files": dict[str, str], # mapping from input source name to file path
    "task_name": Optional[str], # task for this annotation
    "annotation": Optional[dict] # annotation of the instance in the files
}
```

In the multi-input format, the keys in the `files` dictionary are arbitrary strings that describe the role or modality of the input (e.g. `img_rgb`, `img_ir`, or `depth`). These keys are later used to retrieve the corresponding images during data loading. For example:

```python
{
    "files": {
        "img_rgb": "path/to/rgb_image.png",
        "img_ir": "path/to/infrared_image.png"
    },
    "task_name": "detection",
    "annotation": {
        "class": "person",
        "boundingbox": {
            "x": 0.1,
            "y": 0.1,
            "w": 0.3,
            "h": 0.4
        }
    }
}
```
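
To populate a dataset with such entries, you can pass a generator to `LuxonisDataset.add`. The following is a minimal sketch: the dataset name `multi_input_example` and the file paths are placeholders, and `make_splits` is assumed to use its default split ratios.

```python
from luxonis_ml.data import LuxonisDataset, LuxonisLoader

def generator():
    # Each yielded entry groups all input sources for one sample.
    yield {
        "files": {
            "img_rgb": "path/to/rgb_image.png",
            "img_ir": "path/to/infrared_image.png",
        },
        "task_name": "detection",
        "annotation": {
            "class": "person",
            "boundingbox": {"x": 0.1, "y": 0.1, "w": 0.3, "h": 0.4},
        },
    }

dataset = LuxonisDataset("multi_input_example")  # placeholder name
dataset.add(generator())
dataset.make_splits()

loader = LuxonisLoader(dataset, view="train")
for img, labels in loader:
    # With multiple input sources, the loader yields a dictionary
    # mapping each source name to its image array.
    rgb, ir = img["img_rgb"], img["img_ir"]
```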

Luxonis Data Format supports **annotations optionally structured into different tasks** for improved organization. Tasks can be explicitly named or left unset; if none are specified, all annotations are grouped under a single `task_name`, which defaults to `""`. The [example below](#adding-data-with-a-generator-function) demonstrates this with instance keypoints and segmentation tasks.

The content of the `"annotation"` field depends on the task type and follows the [Annotation Format](#annotation-format) described later in this document.
Expand Down Expand Up @@ -558,33 +591,29 @@ The mask is a binary 2D numpy array.

#### Run-Length Encoding

The mask is described using the [Run-Length Encoding](https://en.wikipedia.org/wiki/Run-length_encoding) compression.
The mask is represented using [Run-Length Encoding (RLE)](https://en.wikipedia.org/wiki/Run-length_encoding), a lossless compression method that stores alternating counts of background and foreground pixels in **row-major order**, beginning from the top-left pixel. The first count always represents background pixels, even if that count is 0.

Run-length encoding compresses data by reducing the physical size
of a repeating string of characters.
This process involves converting the input data into a compressed format
by identifying and counting consecutive occurrences of each character.

The RLE is composed of the height and width of the mask image and the counts of the pixels belonging to the positive class.
The `counts` field contains either a **compressed byte string** or an **uncompressed list of integers**. We use the **COCO RLE format** via the `pycocotools` library to encode and decode masks.

```python
{
    # name of the class this mask belongs to
    "class": str,

    "segmentation":
    {
    "segmentation": {
        # height of the mask
        "height": int,

        # width of the mask
        "width": int,

        # counts of the pixels belonging to the positive class
        # run-length encoded pixel counts in row-major order,
        # starting with background. Can be a list[int] (uncompressed)
        # or a compressed byte string
        "counts": list[int] | bytes,
    },

}

```
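
For illustration, here is a minimal sketch of producing and consuming the compressed `counts` form with `pycocotools`; the mask shape and foreground region below are arbitrary placeholders.

```python
import numpy as np
from pycocotools import mask as mask_utils

# Build a toy binary mask with a rectangular foreground region.
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:200, 150:300] = 1

# `encode` requires a Fortran-contiguous uint8 array and returns a dict
# of the form {"size": [height, width], "counts": <compressed bytes>}.
rle = mask_utils.encode(np.asfortranarray(mask))

# `decode` inverts the encoding back to a binary numpy array.
decoded = mask_utils.decode(rle)
assert np.array_equal(mask, decoded)
```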

> [!NOTE]
120 changes: 78 additions & 42 deletions luxonis_ml/data/__main__.py
@@ -107,21 +107,24 @@ def print_info(dataset: LuxonisDataset) -> None:
task_table.add_row(", ".join(task_types))

splits = dataset.get_splits()
source_names = dataset.get_source_names()

@group()
def get_sizes_panel() -> Iterator[RenderableType]:
if splits is not None:
total_files = len(dataset)
for split, files in splits.items():
split_size = len(files)
total_groups = len(dataset) / len(source_names)
for split, group in splits.items():
split_size = len(group)
percentage = (
(split_size / total_files * 100) if total_files > 0 else 0
(split_size / total_groups * 100)
if total_groups > 0
else 0
)
yield f"[magenta b]{split}: [not b cyan]{split_size:,} [dim]({percentage:.1f}%)[/dim]"
else:
yield "[red]No splits found"
yield Rule()
yield f"[magenta b]Total: [not b cyan]{len(dataset)}"
yield f"[magenta b]Total: [not b cyan]{int(total_groups)}"

@group()
def get_panels() -> Iterator[RenderableType]:
@@ -188,11 +191,13 @@ def delete(
):
raise typer.Exit

dataset = LuxonisDataset(name, bucket_storage=bucket_storage)
dataset.delete_dataset(
dataset = LuxonisDataset(
name,
bucket_storage=bucket_storage,
delete_local=local,
delete_remote=remote,
)
dataset.delete_dataset(delete_local=local)

print(
f"Dataset '{name}' deleted from: "
@@ -343,7 +348,14 @@ def inspect(
)

if aug_config is not None:
h, w, _ = loader[0][0].shape
sample_img = loader[0][0]
img = (
next(iter(sample_img.values()))
if isinstance(sample_img, dict)
else sample_img
)
h, w = img.shape[:2]

loader.augmentations = loader._init_augmentations(
augmentation_engine="albumentations",
augmentation_config=aug_config,
@@ -357,13 +369,18 @@
raise ValueError(f"Dataset '{name}' is empty.")

classes = dataset.get_classes()
for image, labels in loader:
image = image.astype(np.uint8)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
prev_windows = set()

for img, labels in loader:
if isinstance(img, dict):
images_dict = img
else:
images_dict = {"image": img}

current_windows = set(images_dict.keys())
for stale_window in prev_windows - current_windows:
cv2.destroyWindow(stale_window)

h, w, _ = image.shape
new_h, new_w = int(h * size_multiplier), int(w * size_multiplier)
image = cv2.resize(image, (new_w, new_h))
instance_keys = [
"/boundingbox",
"/keypoints",
@@ -372,35 +389,54 @@
matched_instance_keys = [
k for k in labels if any(k.endswith(ik) for ik in instance_keys)
]
if per_instance and matched_instance_keys:
extra_keys = [k for k in labels if k not in matched_instance_keys]
if extra_keys:
print(
f"[yellow]Warning: Ignoring non-instance keys in labels: {extra_keys}[/yellow]"
)
n_instances = len(labels[matched_instance_keys[0]])
for i in range(n_instances):
instance_labels = {
k: np.expand_dims(v[i], axis=0)
for k, v in labels.items()
if k in matched_instance_keys and len(v) > i
}
instance_image = visualize(
image.copy(), instance_labels, classes, blend_all=blend_all
)
cv2.imshow("image", instance_image)
if cv2.waitKey() == ord("q"):
break
else:
if per_instance:
print(
"[yellow]Warning: Per-instance mode is not supported for this dataset. "
"Showing all labels in one window.[/yellow]"

for source_name, image in images_dict.items():
image = image.astype(np.uint8)
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
h, w = image.shape[:2]
new_h, new_w = int(h * size_multiplier), int(w * size_multiplier)
image = cv2.resize(image, (new_w, new_h))

if per_instance and matched_instance_keys:
extra_keys = [
k for k in labels if k not in matched_instance_keys
]
if extra_keys:
print(
f"[yellow]Warning: Ignoring non-instance keys in labels: {extra_keys}[/yellow]"
)
n_instances = len(labels[matched_instance_keys[0]])
for i in range(n_instances):
instance_labels = {
k: np.expand_dims(v[i], axis=0)
for k, v in labels.items()
if k in matched_instance_keys and len(v) > i
}
instance_image = visualize(
image.copy(),
source_name,
instance_labels,
classes,
blend_all=blend_all,
)
cv2.imshow(source_name, instance_image)
if cv2.waitKey() == ord("q"):
break
else:
if per_instance:
print(
"[yellow]Warning: Per-instance mode is not supported for this dataset. "
f"Showing all labels in one window for '{source_name}'.[/yellow]"
)
labeled_image = visualize(
image, source_name, labels, classes, blend_all=blend_all
)
image = visualize(image, labels, classes, blend_all=blend_all)
cv2.imshow("image", image)
if cv2.waitKey() == ord("q"):
break
cv2.imshow(source_name, labeled_image)

prev_windows = current_windows

if cv2.waitKey() == ord("q"):
break


@app.command()
67 changes: 66 additions & 1 deletion luxonis_ml/data/augmentations/README.md
@@ -2,7 +2,72 @@

## `AlbumentationsEngine`

The default engine used with `LuxonisLoader`. It is powered by the [Albumentations](https://albumentations.ai/) library and should be satisfactory for most use cases. Apart from the albumentations transformations, it also supports custom transformations registered in the `TRANSFORMATIONS` registry.
The default engine used with `LuxonisLoader`. It is powered by the [Albumentations](https://albumentations.ai/) library and should be satisfactory for most use cases. In addition to the built-in Albumentations transformations, it also supports custom transformations registered in the `TRANSFORMATIONS` registry.

### Creating and Registering a Custom Augmentation

The process of creating custom augmentations follows the same principles as described in the [Albumentations custom transform guide](https://albumentations.ai/docs/4-advanced-guides/creating-custom-transforms/#creating-custom-albumentations-transforms). You can subclass their base classes such as `DualTransform` or `ImageOnlyTransform`, depending on the target types you want to support.

The example below shows how to define, register, and use a custom transform:

```python
from typing import Any, Sequence

import numpy as np
from albumentations import DualTransform

from luxonis_ml.data import LuxonisDataset, LuxonisLoader
from luxonis_ml.data.augmentations.custom import TRANSFORMATIONS

class CustomTransform(DualTransform):
    def __init__(self, p: float = 1.0):
        super().__init__(p=p)

    def apply(self, image: np.ndarray, **_: Any) -> np.ndarray:
        return image

    def apply_to_mask(self, mask: np.ndarray, **_: Any) -> np.ndarray:
        return mask

    def apply_to_bboxes(self, bboxes: Sequence[Any], **_: Any) -> Sequence[Any]:
        return bboxes

    def apply_to_keypoints(self, keypoints: Sequence[Any], **_: Any) -> Sequence[Any]:
        return keypoints

# Register the transform
TRANSFORMATIONS.register(module=CustomTransform)

# Use it in the config
augmentation_config = [{
    "name": "CustomTransform",
    "params": {"p": 1},
}]

loader = LuxonisLoader(
    LuxonisDataset("coco_test"),
    augmentation_config=augmentation_config,
    view="train",
    height=640,
    width=640,
)

for data in loader:
    pass
```

### Examples of Custom Augmentations

- [`letterbox_resize.py`](./custom/letterbox_resize.py)
- [`symetric_keypoints_flip.py`](./custom/symetric_keypoints_flip.py)

### Batch-Level Augmentations

We also support **batch-level transformations**, built on top of the `BatchTransform` base class. These follow the same creation and registration pattern as standard custom transforms but operate on batches of data, allowing you to construct augmentations that combine multiple images and labels. A configuration sketch follows the examples below.

Examples:

- [`mosaic.py`](./custom/mosaic.py)
- [`mixup.py`](./custom/mixup.py)
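
Batch-level transforms are enabled through the same configuration format as single-image transforms. The sketch below is a hedged example: the parameter names (`out_width`, `out_height`) are assumptions and should be checked against the linked implementations.

```python
# Hypothetical config combining a batch-level mosaic with mixup;
# verify transform names and parameters against mosaic.py and mixup.py.
augmentation_config = [
    {"name": "Mosaic4", "params": {"p": 0.5, "out_width": 640, "out_height": 640}},
    {"name": "MixUp", "params": {"p": 0.5}},
]
```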

### Configuration Format
