Skip to content

Conversation

@FSAFTik
Copy link

@FSAFTik FSAFTik commented Oct 1, 2025

Purpose

Change is introduced to fix problem with Colinear points exception breaking the pipeline

Specification

Added guard form making zero area crops and edge pins

Dependencies & Potential Impact

Added changes to generate_script_content, what may influence pipelines using this script,
however it shouldn't interact with correct detections and only change the invalid ones to not break the pipeline run

Deployment Plan

None / not applicable

Testing & Validation

Tested on several pipelines using generate_script_content and different NN and didn't see any problems.
Also it was tested with manual change of detections to invalid ones and the pipeline was not breaking but continued its run

@FSAFTik FSAFTik requested a review from PetrNovota October 1, 2025 15:24
@FSAFTik FSAFTik self-assigned this Oct 1, 2025
@Copilot Copilot AI review requested due to automatic review settings October 1, 2025 15:24
@github-actions github-actions bot added the fix Fixing a bug label Oct 1, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a pipeline-breaking issue caused by colinear points exceptions and zero-area crops in detection processing. The solution introduces robust coordinate validation and normalization to ensure all generated crops have valid dimensions and stay within frame boundaries.

  • Replaces simple coordinate copying with comprehensive validation logic that handles edge cases
  • Adds pixel-aware minimum dimensions to prevent zero-area crops
  • Implements coordinate ordering, clamping, and padding with boundary checks

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@FSAFTik FSAFTik marked this pull request as draft October 1, 2025 15:27
@codecov-commenter
Copy link

codecov-commenter commented Oct 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.11%. Comparing base (314c5af) to head (97767e4).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #246      +/-   ##
==========================================
+ Coverage   53.46%   55.11%   +1.65%     
==========================================
  Files          91       91              
  Lines        5905     5913       +8     
==========================================
+ Hits         3157     3259     +102     
+ Misses       2748     2654      -94     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@FSAFTik FSAFTik marked this pull request as ready for review October 2, 2025 11:44
@FSAFTik FSAFTik requested review from klemen1999 and removed request for jkbmrz, kkeroo, klemen1999 and tersekmatija October 2, 2025 12:11
@kkeroo
Copy link
Collaborator

kkeroo commented Oct 15, 2025

Nice catch. Could you test the script with one example with the full pipeline? E.g. human-pose. Just make sure you have the dai-nodes version from your source.

@FSAFTik
Copy link
Author

FSAFTik commented Oct 15, 2025

Nice catch. Could you test the script with one example with the full pipeline? E.g. human-pose. Just make sure you have the dai-nodes version from your source.

I did test it with several pipelines, mainly gaze estimation and also with the pipeline it was primarily created for:
https://github.com/luxonis/oak-examples/tree/focused_vision_poc/performant_PoCs/focused-vision

@kkeroo
Copy link
Collaborator

kkeroo commented Oct 15, 2025

Ok, but gaze-estimation does not use generate_script_content.

@FSAFTik
Copy link
Author

FSAFTik commented Oct 15, 2025

Ok, but gaze-estimation does not use generate_script_content.

Oh, sry I did mean human pose just messed up when was writing the message

Copy link
Contributor

@aljazkonec1 aljazkonec1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went over the PR and looks good. But I am wondering why do we even want to crop a bbox that has 0 area? Do we have situations where this is useful? IMHO I would either filter such detections inside the generate_script_content function or better yet filter detections beforehand to keep a clean list of all detections.

@klemen1999
Copy link
Collaborator

klemen1999 commented Oct 15, 2025

IMHO I would either filter such detections inside the generate_script_content function or better yet filter detections beforehand to keep a clean list of all detections.

You can't really control how people will use this script node. So filtering before hand would be the "suggested" way to do it but if user doesn't filter then this script by itself shouldn't fail because of it.
I think we can add an optional parameter to the function though through which you control if you want to filter out empty bboxes or not (and default to filtering). Reason behind having the option to turn this off is if you for some reason use this in a pipeline where number of outputs has to match number of detections explictily so that the app logic works.

@aljazkonec1
Copy link
Contributor

You can't really control how people will use this script node. So filtering before hand would be the "suggested" way to do it but if user doesn't filter then this script by itself shouldn't fail because of it.

Yes I agree the node should not fail or crash the pipeline but we also dont want unexpected behaviour.

I think we can add an optional parameter to the function though through which you control if you want to filter out empty bboxes or not (and default to filtering). Reason behind having the option to turn this off is if you for some reason use this in a pipeline where number of outputs has to match number of detections explictily so that the app logic works.

In depthai the practice is that each node should have one specific task. If it is a config creator, it should only do that and if it gets some wierd / invalid inputs, it should either error out or skip invalid input and print an informative warning saying it dropped it. The benifits of this approach are:

  1. allows for fine grained control of each step of the pipeline. eg. if we add optional parameters for min area filtering, what happens if user wants to filter by max area or by label or by confidence? Adding all possible optional parameters to this node, we would just re-implement filter node. Looking at the bigger picture; adding optional filtering parameters to all other similar nodes would drastically complicate development, maintenance and usage of entire library.
  2. All logic is in one place, eg. if all filtering is in Filter node, we need to update just one node instead of finding all filtering logic across multiple node.
  3. Combining multiple nodes into node groups (ParsingNeuralNetwork is a node group) is easier as there are minimal overlapping parameters and updating the groups is simple as you can look at logic of only specific part.

All in all, in this situation I would make it such that for each detection, check if valid => create config, else => skip and print warning.

@klemen1999
Copy link
Collaborator

All in all, in this situation I would make it such that for each detection, check if valid => create config, else => skip and print warning.

But in that case you are actually doing the filtering and there is no way to disable this behaviour (other than writing your own "script node generator" function). Following the logic of "each node should have one responsibility" I would then leave as is right now, without filtering. And if you want to filter do it with "Filter" node before this script node.

@aljazkonec1
Copy link
Contributor

But in that case you are actually doing the filtering and there is no way to disable this behaviour (other than writing your own "script node generator" function).

No reason to disable the behaviour as it would just crash ImageManipNode.

Following the logic of "each node should have one responsibility" I would then leave as is right now, without filtering. And if you want to filter do it with "Filter" node before this script node.

I'm OK with this approach as well, but would still validate the detection and fail gracefully (eg error out with "Detection with bounding box area = 0 encountered, stopping. Please filter invalid detections beforehand." ). This might be even better as it makes it much easier to later sync the detections and the cropped images together as there is always the same number of cropped images produced per input ImgDetections message.

@klemen1999
Copy link
Collaborator

Made adjustments based on the discussion. New expected flow:

  • crop config script node will crash the pipeline if there is a zero-area detection encountered
    • might seem PITA at first glance (why not just pad or remove) but for actual apps this is better approach because zero-area detections should be properly delt with (e.g. removing them at some point)
  • We follow the paradigm "One node has one responsibility". That is why we don't ommit such detections in the config generator script node and instead raise ValueError.
  • ImgDetectionsFilter now supports filtering by min_area (normalized). This makes ImgDetectionFilter and config generator script node a good pair where we first put detections through the filter node and then make the configs from filtered list.

Downside of this approach:

  • If we have pipeline like 1st NN -> ImgDetectionsFilter -> config generator script node -> 2nd NN we'll do device->host->device data move because ImgDetectionsFilter is a HostNode. But I'd argue that if someone wants to optimize their pipeline for speed they can quite easily adapt the config script to drop or pad the detections and handle those separately.
    • Another approach could be to have multiple operating models for the script node generator e.g. "raise error" or "pad" or "filter out".

Thoughts? @aljazkonec1 @kkeroo @FSAFTik @PetrNovota

Copy link
Contributor

@aljazkonec1 aljazkonec1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left one comment to remove the label filter from config sender script node. Also left a second comment regarding Filter node as the node works with both dai-nodes message types and dai message types, while the config sender script only works with dai message types. We need to make this explicitly clear as sending ImgDetectionsExtended to the config creator will crash the pipeline.

Regarding the downside of this approach:
The additional sending from device->host when using filter node is only true if the NN is a supported YOLO type as then the parser is already implemented on device side. If the NN is anything besides YOLO, the NN output will be sent to host (because of parser) no matter what. The additional filter node will therefore not take any additional bandwidth.

resize_mode: str = "STRETCH",
padding: float = 0,
padding: float = 0.0,
valid_labels: Optional[List[int]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could remove label filer in the detection config generator as Filter node already handles this and would be in line with single node - single responsibility.

if area < self._min_area:
continue

filtered_detections.append(detection)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detection is of type dai.ImgDetection, nodes ImgDetectionExtended or dai.SpatialImgDetection. For a host node this is OK, but we need to be careful if using it with the config sender script as it only works with dai message types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix Fixing a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants