Skip to content

Conversation

@clinton-encord
Copy link
Contributor

Introduction and Explanation

Implement VideoSpace and ImageSpace.

JIRA

Link ticket(s)

Documentation

There should be enough internal documentation for a product owner to write customer-facing documentation or a separate PR linked if writing the customer documentation directly. Link all that are relevant below.

  • Internal: notion link
  • Customer docs PR: link
  • OpenAPI/SDK
    • Generated docs: link to example if possible
    • Command to generate: here

Tests

Make a quick statement and post any relevant links of CI / test results. If the testing infrastructure isn’t yet in-place, note that instead.

  • What are the critical unit tests?
  • Explain the Integration Tests such that it’s clear Correctness is satisfied. Link to test results if possible.

Known issues

If there are any known issues with the solution, make a statement about what they are and why they are Ok to leave unsolved for now. Make tickets for the known issues linked to the original ticket linked above

@github-actions
Copy link

github-actions bot commented Nov 24, 2025

Unit test report (Python 3.9.24, Pydantic 1.10.22)

357 tests   331 ✅  9s ⏱️
  1 suites    0 💤
  1 files     26 ❌

For more details on these failures, see this check.

Results for commit 7c2743d.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Nov 24, 2025

Unit test report (Python 3.9.24, Pydantic 2.12.3)

357 tests   331 ✅  10s ⏱️
  1 suites    0 💤
  1 files     26 ❌

For more details on these failures, see this check.

Results for commit 7c2743d.

♻️ This comment has been updated with latest results.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @clinton-encord, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the concept of 'Spaces' to the Encord platform, allowing for better organization and management of annotations within video and image data. It includes the implementation of VideoSpace and ImageSpace classes, along with corresponding changes to annotation objects and data structures to support space-specific labeling and attribute management. This enhancement enables more complex annotation workflows and improves the overall flexibility of the platform.

Highlights

  • New Enum: Introduced a SpaceType enum with VIDEO and IMAGE values to represent different space types.
  • Dynamic Answers: Modified Answer objects to include a spaceId for dynamic answers, enabling space-specific attribute management.
  • Space Management: Implemented VideoSpace and ImageSpace classes for managing objects and classifications within specific video or image spaces.
  • Annotation Handling: Enhanced ObjectInstance and ClassificationInstance to support assignment and removal from spaces, ensuring proper data association.
  • Coordinate Handling: Added coordinate conversion utilities and classes for various geometric shapes, improving annotation data processing.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant new feature, "Spaces", which allows for more complex data organization like video and image groups. The implementation is extensive, adding new classes for Space, VideoSpace, ImageSpace, and various Annotation types. The use of TypedDict for better type safety is a great improvement.

However, there are several areas that need attention:

  • There is some code duplication that should be refactored.
  • Some parts of the implementation seem incomplete, especially around parsing labels for ImageSpace.
  • There are potential bugs in how object instances are shared and filtered across different spaces.
  • Some leftover debug code needs to be cleaned up.

Overall, this is a good step forward, but the identified issues should be addressed to ensure correctness and maintainability.

Comment on lines +596 to +607
GeometricCoordinates = Union[
BoundingBoxCoordinates,
RotatableBoundingBoxCoordinates,
PointCoordinate,
PolygonCoordinates,
PolylineCoordinates,
SkeletonCoordinates,
BitmaskCoordinates,
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The GeometricCoordinates union type is missing PointCoordinate3D. This causes a type inconsistency in get_two_dimensional_coordinates_from_frame_object_label, which is typed to return GeometricCoordinates but can return a PointCoordinate3D instance. Please add PointCoordinate3D to the GeometricCoordinates union to fix this.

Suggested change
GeometricCoordinates = Union[
BoundingBoxCoordinates,
RotatableBoundingBoxCoordinates,
PointCoordinate,
PolygonCoordinates,
PolylineCoordinates,
SkeletonCoordinates,
BitmaskCoordinates,
]
GeometricCoordinates = Union[
BoundingBoxCoordinates,
RotatableBoundingBoxCoordinates,
PointCoordinate,
PointCoordinate3D,
PolygonCoordinates,
PolylineCoordinates,
SkeletonCoordinates,
BitmaskCoordinates,
]

Comment on lines 939 to 974
# Objects in space
# TODO: FIX THIS! How to properly get all object_instances
for space in self._space_map.values():
for object_ in space._objects_map.values():
# filter by ontology object
if not (
filter_ontology_object is None
or object_.ontology_item.feature_node_hash == filter_ontology_object.feature_node_hash
):
continue

# filter by frame
if filter_frames is None:
append = True
else:
append = False
for frame in filtered_frames_list:
hashes = self._frame_to_hashes.get(frame, set())
if object_.object_hash in hashes:
append = True
break

if append:
ret.append(object_)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic for filtering objects in spaces by frame seems incorrect. It uses self._frame_to_hashes, which is populated for objects on the main label row, not for objects within spaces. This will likely result in objects on spaces not being correctly filtered by frame. The TODO comment also indicates this is an issue. You should use a mechanism internal to the space to check if an object is on a given frame.

Comment on lines 2308 to 2408
def _parse_space_labels(
self,
spaces_info: dict[str, SpaceInfo],
object_answers: dict,
classification_answers: dict,
) -> None:
res: dict[str, Space] = dict()

for space_id, space_info in spaces_info.items():
if space_info["space_type"] == SpaceType.VIDEO:
video_space = self.get_space_by_id(space_id=space_id, type_=VideoSpace)
video_space._parse_space_dict(
space_info, object_answers=object_answers, classification_answers=classification_answers
)
res[space_id] = video_space
elif space_info["space_type"] == SpaceType.IMAGE:
image_space = ImageSpace(
space_id=space_id,
title=space_info["title"],
parent=self,
width=space_info["width"],
height=space_info["height"],
)
# image_space._parse_space_dict(
# space_info, object_answers=object_answers, classification_answers=classification_answers
# )
res[space_id] = image_space

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In _parse_space_labels, the logic for ImageSpace seems incomplete as the call to image_space._parse_space_dict is commented out. This will lead to labels on image spaces not being parsed. Also, a new ImageSpace is instantiated here, but it should be retrieved using get_space_by_id as the spaces are already initialized. Finally, the res dictionary is initialized but never used and can be removed.

    def _parse_space_labels(
        self,
        spaces_info: dict[str, SpaceInfo],
        object_answers: dict,
        classification_answers: dict,
    ) -> None:
        for space_id, space_info in spaces_info.items():
            if space_info["space_type"] == SpaceType.VIDEO:
                video_space = self.get_space_by_id(space_id=space_id, type_=VideoSpace)
                video_space._parse_space_dict(
                    space_info, object_answers=object_answers, classification_answers=classification_answers
                )
            elif space_info["space_type"] == SpaceType.IMAGE:
                image_space = self.get_space_by_id(space_id=space_id, type_=ImageSpace)
                image_space._parse_space_dict(
                    space_info, object_answers=object_answers, classification_answers=classification_answers
                )

def _parse_frame_label_dict(self, frame_label: LabelBlob, classification_answers: dict):
for frame_object_label in frame_label["objects"]:
object_hash = frame_object_label["objectHash"]
object = self.parent._space_objects_map.get(object_hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In _parse_frame_label_dict, you are trying to find an existing object instance from self.parent._space_objects_map. However, this map is never populated in LabelRowV2. A similar logic in video_space.py iterates through all spaces to find the object. This inconsistency should be resolved, and a proper mechanism to share object instances across spaces should be implemented, likely by populating _space_objects_map in LabelRowV2.

Comment on lines 693 to 736
def get_two_dimensional_coordinates_from_frame_object_label(
frame_object_label: FrameObject,
) -> GeometricCoordinates:
if frame_object_label["shape"] == Shape.BOUNDING_BOX:
return BoundingBoxCoordinates.from_dict(frame_object_label)
elif frame_object_label["shape"] == Shape.ROTATABLE_BOUNDING_BOX:
return RotatableBoundingBoxCoordinates.from_dict(frame_object_label)
elif frame_object_label["shape"] == Shape.POLYGON:
return PolygonCoordinates.from_dict(frame_object_label)
elif frame_object_label["shape"] == Shape.POINT:
coords = frame_object_label["point"]["0"]
if "x" in coords and "y" in coords and "z" in coords:
return PointCoordinate3D.from_dict(frame_object_label) # type: ignore
elif "x" in coords and "y" in coords:
return PointCoordinate.from_dict(frame_object_label) # type: ignore
else:
raise ValueError(f"Invalid point coordinates in {frame_object_label}")
elif frame_object_label["shape"] == Shape.POLYLINE:
return PolylineCoordinates.from_dict(frame_object_label)
elif "skeleton" in frame_object_label:

def _with_visibility_enum(point: dict):
if point.get(Visibility.INVISIBLE.value):
point["visibility"] = Visibility.INVISIBLE
elif point.get(Visibility.OCCLUDED.value):
point["visibility"] = Visibility.OCCLUDED
elif point.get(Visibility.SELF_OCCLUDED.value):
point["visibility"] = Visibility.SELF_OCCLUDED
elif point.get(Visibility.VISIBLE.value):
point["visibility"] = Visibility.VISIBLE
return point

values = [_with_visibility_enum(pnt) for pnt in frame_object_label["skeleton"].values()]
skeleton_frame_object_label = {
"name": frame_object_label["name"],
"values": values,
}
return SkeletonCoordinates.from_dict(skeleton_frame_object_label)
elif "bitmask" in frame_object_label:
return BitmaskCoordinates.from_dict(frame_object_label)
elif "cuboid" in frame_object_label:
raise NotImplementedError("Cuboid is not a two dimensional coordinate.")
else:
raise NotImplementedError(f"Getting coordinates for `{frame_object_label}` is not supported yet.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function get_two_dimensional_coordinates_from_frame_object_label duplicates most of its logic from get_coordinates_from_frame_object_label. This can be refactored to improve maintainability by calling get_coordinates_from_frame_object_label and then handling the cases specific to two-dimensional coordinates.

For example:

def get_two_dimensional_coordinates_from_frame_object_label(
    frame_object_label: FrameObject,
) -> GeometricCoordinates:
    coordinates = get_coordinates_from_frame_object_label(frame_object_label)
    if isinstance(coordinates, CuboidCoordinates):
        raise NotImplementedError("Cuboid is not a two dimensional coordinate.")
    
    # This would require ensuring GeometricCoordinates includes all possible return types.
    return coordinates

Additionally, the nested helper function _with_visibility_enum is duplicated in both functions and could be extracted to a module-level helper to be reused.

Comment on lines 153 to 159
class MyDict(TypedDict):
something: str


def hello(my_dict: MyDict) -> None:
yo = my_dict.get("something")
print(yo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The MyDict class and hello function appear to be leftover debugging or testing code. They should be removed before merging.

@github-actions
Copy link

github-actions bot commented Nov 24, 2025

SDK integration test report

285 tests  ±0   241 ✅  - 36   14m 28s ⏱️ - 3m 44s
  1 suites ±0     4 💤 ± 0 
  1 files   ±0    40 ❌ +36 

For more details on these failures, see this check.

Results for commit 7c2743d. ± Comparison against base commit dbabc87.

♻️ This comment has been updated with latest results.

@clinton-encord clinton-encord force-pushed the clinton/ed-1658/video-image-space branch from f82d178 to da48601 Compare November 26, 2025 20:22
@clinton-encord clinton-encord force-pushed the clinton/ed-1658/video-image-space branch from 2aeeb05 to 7c2743d Compare November 27, 2025 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants