
Conversation

@tadeas0 tadeas0 commented Oct 24, 2025

Purpose

Create an ExtendedNeuralNetwork node that wraps ParsingNeuralNetwork and adds the following capabilities:

  • Automatic input resizing to the neural network input size
  • Remapping of detection coordinates from neural network output to input frame coordinates
  • Neural network output filtering based on a confidence threshold and labels (only supported for ImgDetectionsExtended and ImgDetections messages)
  • Input tiling

Specification

https://app.clickup.com/t/86c5mz8er

Dependencies & Potential Impact

None / not applicable

Deployment Plan

None / not applicable

Testing & Validation

Manually tested on RVC4. It currently does not work on RVC2 due to missing transformation bindings in the DAI Script node.

Example usage

extended_neural_network_example.py
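
Since the example file is attached rather than inlined, below is a rough, hypothetical sketch of how the node might be wired into a pipeline. The import path, the build() arguments, and the model slug are assumptions based on the capabilities listed above; the attached example is authoritative.

# Hypothetical usage sketch; see extended_neural_network_example.py for the real API.
import depthai as dai

from depthai_nodes.node import ExtendedNeuralNetwork  # import path is an assumption

with dai.Pipeline() as pipeline:
    camera = pipeline.create(dai.node.Camera).build()

    # The arguments below mirror the capabilities from the PR description
    # (automatic resizing, confidence/label filtering, tiling), but their exact
    # names and types are assumptions.
    nn = pipeline.create(ExtendedNeuralNetwork).build(
        camera,
        nn_source="luxonis/yolov6-nano:r2-coco-512x288",  # placeholder model slug
        confidence_threshold=0.5,
        labels_to_keep=["person"],
    )

    detections_queue = nn.out.createOutputQueue()
    pipeline.start()
    while pipeline.isRunning():
        detections = detections_queue.get()
        # Detection coordinates are already remapped to the input frame coordinates.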

@aljazkonec1 aljazkonec1 left a comment


Took a quick look and left some comments. Mostly looks good, just a couple of minor fixes. Thanks!

Should we add tests as well?

"""

IMG_FRAME_TYPES = {
    dai.Platform.RVC2: dai.ImgFrame.Type.BGR888p,
Contributor


No need to add the frame type as RVC2 is not supported either way.


Why is RVC2 not supported?


@aljazkonec1 aljazkonec1 Oct 29, 2025


There is a check for RVC2 here

Contributor Author


RVC2 will be supported in the future; we are just waiting on a DAI fix. I would like to keep this. It's used in two places to set proper frame types for ImageManip nodes, here and here. We would be re-adding this exact logic in the future anyway.

Contributor


General comment for the entire PR: use a logger to log the stages of the class, e.g. logger.info("Building TilingPipeline"), logger.info("Building Tiles patcher"), etc.

Collaborator


Agreed; check the parsers' implementation for reference.
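
A minimal sketch of what such stage logging could look like, using Python's standard logging module (the project may expose its own logger helper, as the parsers do; the setup and stage names below are assumptions):

import logging

logger = logging.getLogger(__name__)


class ExtendedNeuralNetwork:  # sketch only; the real node wraps ParsingNeuralNetwork
    def build(self, *args, **kwargs):
        logger.info("Building ExtendedNeuralNetwork")
        # ... resize / remap setup ...
        logger.info("Building TilingPipeline")
        # ... tiling setup ...
        logger.info("Building Tiles patcher")
        # ... patcher setup ...
        return self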

assert isinstance(
    nn_msg, self.SUPPORTED_MESSAGES
), f"Message type {type(nn_msg)} is not supported."
if nn_msg.getTimestamp() != img.getTimestamp():
Contributor


Could be > instead of !=, as the new NN message will have a larger timestamp. If the new message has a lower timestamp we would error out, as that means we got an older message from a previous frame.
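
A hedged sketch of the distinction being drawn here (the body of the original if is not visible in the excerpt, so the actions and the queue name below are assumptions, not the PR's actual implementation):

if nn_msg.getTimestamp() < img.getTimestamp():
    # An NN result older than the current frame means we received a stale message.
    raise RuntimeError("Received an NN message older than the current frame.")
while nn_msg.getTimestamp() > img.getTimestamp():
    # A newer NN result just means the frame lags behind; advance to the next frame.
    img = self._img_input.get()  # hypothetical input queue name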

    messages: List[GMessage],
) -> GMessage:
    if len(messages) == 0:
        raise ValueError("Not enough messages to merge")
Contributor


This is a valid state; it just means no detections / keypoints / ... were found in the frame. It should not error out, but rather return a message with an empty list of detections / keypoints / ...


@tadeas0 tadeas0 Oct 29, 2025


We can't infer the message type, and thus cannot determine what message type to return for empty lists.

Contributor


Maybe I'm missing something, but if there are no detections in a frame, will the node error out and crash the pipeline? Could we not just return an empty list [] in this case?
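
For concreteness, a minimal sketch of the behavior the reviewer is asking about (purely illustrative; the function name and the plain-list return are not from the PR, and the point above is that a typed empty message cannot be constructed here):

from typing import List


def merge(messages: List):  # hypothetical name, standing in for the PR's merge helper
    if len(messages) == 0:
        # Reviewer's proposal: an empty frame is a valid state, so forward an empty
        # result instead of raising. The open question is which concrete message type
        # (detections, keypoints, ...) that empty result should have.
        return []
    ...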

    return new_lines


def merte_predictions(predictions: List[Predictions]) -> Predictions:
Contributor


Typo.


tadeas0 commented Oct 29, 2025

Should we add tests as well?

Unit tests would require mocking the Script and ImageManip nodes. We can add integration tests; however, I would create a separate PR for those to get this merged ASAP, because it's blocking the Stage2NeuralNetwork node PR.

@tadeas0 tadeas0 requested a review from aljazkonec1 October 29, 2025 13:12
@PetrNovota

@tadeas0 FYI, I am testing the new 1st- and 2nd-stage nodes in the focused vision PoC and have encountered an issue, so we will likely not be merging either PR very quickly. I will let you know what the issue is.


Labels

enhancement (New feature or request)
