`luxonis_ml/data/README.md` (42 additions, 13 deletions)

@@ -106,16 +106,49 @@ After creating a dataset, the next step is to populate it with images and their

#### Data Format

Each data entry should be a dictionary with one of the following structures, depending on whether you're using a single input or multiple inputs:

##### Single-Input Format

```python
{
    "file": str,                  # path to the image file
    "task_name": Optional[str],   # task for this annotation
    "annotation": Optional[dict]  # annotation of the instance in the file
}
```

##### Multi-Input Format

```python
{
    "files": dict[str, str],      # mapping from input source name to file path
    "task_name": Optional[str],   # task for this annotation
    "annotation": Optional[dict]  # annotation of the instance in the files
}
```

In the multi-input format, the keys in the `files` dictionary are arbitrary strings that describe the role or modality of the input (e.g., `img_rgb`, `img_ir`, `depth`, etc.). These keys are later used to retrieve the corresponding images during data loading.

```python
{
    "files": {
        "img_rgb": "path/to/rgb_image.png",
        "img_ir": "path/to/infrared_image.png"
    },
    "task_name": "detection",
    "annotation": {
        "class": "person",
        "boundingbox": {
            "x": 0.1,
            "y": 0.1,
            "w": 0.3,
            "h": 0.4
        }
    }
}
```
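Such entries are typically supplied through a generator, as described in [Adding Data with a Generator Function](#adding-data-with-a-generator-function). The sketch below shows one way this multi-input entry could be added to a dataset; the dataset name is illustrative and the file paths are placeholders:

```python
from luxonis_ml.data import LuxonisDataset


def generator():
    # Yield one entry per annotated instance, using the multi-input
    # structure shown above.
    yield {
        "files": {
            "img_rgb": "path/to/rgb_image.png",
            "img_ir": "path/to/infrared_image.png",
        },
        "task_name": "detection",
        "annotation": {
            "class": "person",
            "boundingbox": {"x": 0.1, "y": 0.1, "w": 0.3, "h": 0.4},
        },
    }


dataset = LuxonisDataset("multi_input_example")  # illustrative dataset name
dataset.add(generator())
dataset.make_splits()  # optionally create train/val/test splits
```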
Luxonis Data Format supports **annotations optionally structured into different tasks** for improved organization. Tasks can be explicitly named or left unset; if none are specified, all annotations will be grouped under a single `task_name`, which defaults to `""`. The [example below](#adding-data-with-a-generator-function) demonstrates this with instance keypoints and segmentation tasks.

The content of the `"annotation"` field depends on the task type and follows the [Annotation Format](#annotation-format) described later in this document.

@@ -558,33 +591,29 @@ The mask is a binary 2D numpy array.
#### Run-Length Encoding

The mask is represented using [Run-Length Encoding (RLE)](https://en.wikipedia.org/wiki/Run-length_encoding), a lossless compression method that stores alternating counts of background and foreground pixels in **row-major order**, beginning from the top-left pixel. The first count always represents background pixels, even if that count is 0.
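For illustration, a `2 × 2` mask `[[0, 1], [1, 1]]` read in row-major order gives `counts = [1, 3]` (one background pixel followed by three foreground pixels), while `[[1, 0], [0, 0]]`, whose top-left pixel is foreground, gives `counts = [0, 1, 3]`.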
The `counts` field contains either a **compressed byte string** or an **uncompressed list of integers**. We use the **COCO RLE format** via the `pycocotools` library to encode and decode masks.
```python
{
    # name of the class this mask belongs to
    "class": str,

    "segmentation": {
        # height of the mask
        "height": int,

        # width of the mask
        "width": int,

        # run-length encoded pixel counts in row-major order,
        # starting with background. Can be a list[int] (uncompressed)
        # or a compressed byte string
        "counts": list[int] | bytes
    }
}
```
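As a rough sketch of how such an entry could be produced, assuming the compressed byte-string form from `pycocotools` is passed through unchanged (the mask, its size, and the class name below are illustrative):

```python
import numpy as np
from pycocotools import mask as mask_utils

# Illustrative binary mask; in practice this comes from your labels.
binary_mask = np.zeros((480, 640), dtype=np.uint8)
binary_mask[100:200, 150:300] = 1

# pycocotools expects a Fortran-contiguous uint8 array and returns a dict
# with "size" ([height, width]) and compressed "counts" (a byte string).
rle = mask_utils.encode(np.asfortranarray(binary_mask))

annotation = {
    "class": "person",  # illustrative class name
    "segmentation": {
        "height": rle["size"][0],
        "width": rle["size"][1],
        "counts": rle["counts"],
    },
}

# Round-trip check: decoding restores the original mask.
assert np.array_equal(mask_utils.decode(rle), binary_mask)
```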
`luxonis_ml/data/augmentations/README.md` (66 additions, 1 deletion)

@@ -2,7 +2,72 @@
## `AlbumentationsEngine`

The default engine used with `LuxonisLoader`. It is powered by the [Albumentations](https://albumentations.ai/) library and should be satisfactory for most use cases. In addition to the built-in Albumentations transformations, it also supports custom transformations registered in the `TRANSFORMATIONS` registry.

### Creating and Registering a Custom Augmentation

The process of creating custom augmentations follows the same principles as described in the [Albumentations custom transform guide](https://albumentations.ai/docs/4-advanced-guides/creating-custom-transforms/#creating-custom-albumentations-transforms). You can subclass their base classes, such as `DualTransform`, `ImageOnlyTransform`, or others, depending on the target types you want to support.

The example below shows how to define, register, and use a custom transform:

```python
import numpy as np
from typing import Any, Sequence
from albumentations import DualTransform

from luxonis_ml.data import LuxonisDataset, LuxonisLoader
from luxonis_ml.data.augmentations.custom import TRANSFORMATIONS
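# NOTE: the remainder of this listing is a sketch of the pattern described
# above (define, register, use), not the original example; the registration
# call and the loader arguments are assumptions and may differ between versions.


class HorizontalFlipWithNoise(DualTransform):
    """Illustrative transform: flips the image and mask horizontally and adds
    Gaussian noise to the image. Datasets with other targets (e.g. bounding
    boxes or keypoints) would also need the corresponding `apply_to_*` methods.
    """

    def __init__(self, std: float = 5.0, p: float = 0.5):
        super().__init__(p=p)
        self.std = std

    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:
        flipped = np.ascontiguousarray(img[:, ::-1])
        noise = np.random.normal(0.0, self.std, flipped.shape)
        return np.clip(flipped + noise, 0, 255).astype(img.dtype)

    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:
        return np.ascontiguousarray(mask[:, ::-1])

    def get_transform_init_args_names(self) -> Sequence[str]:
        return ("std",)


# Register the transform so it can be referenced by name in the augmentation
# configuration (assumed registry API; adjust if your version differs).
TRANSFORMATIONS.register_module(module=HorizontalFlipWithNoise)

# Reference the registered transform by name when constructing the loader
# (the `augmentation_config` structure and the height/width arguments shown
# here are assumptions for this sketch).
dataset = LuxonisDataset("my_dataset")
loader = LuxonisLoader(
    dataset,
    height=256,
    width=256,
    augmentation_config=[
        {"name": "HorizontalFlipWithNoise", "params": {"p": 1.0, "std": 3.0}},
    ],
)
```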
We also support **batch-level transformations**, built on top of the `BatchTransform` base class. These follow the same creation and registration pattern as standard custom transforms but operate on batches of data. This allows you to construct augmentations that combine multiple images and labels.