
Conversation

@parikshit14

Before submitting a pull request (PR), please read the contributing guide.

Please fill out as much of this template as you can, but if you have any problems or questions, just leave a comment and we will help out :)

Description

What is this PR?

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?
A first step toward solving issue #54.

What does this PR do?

  • Created a base tracker class
  • Added an example folder with a Python script to run CoTracker
  • Added cotracker as a submodule to pull the required functionality from the official repo

References

Please reference any existing issues/PRs that relate to this PR.

How has this PR been tested?

Please explain how any new code has been tested, and how you have ensured that no existing functionality has changed.

Is this a breaking change?

If this PR breaks any existing functionality, please explain how and why.

Does this PR require an update to the documentation?

If any features have changed or been added, please explain how the
documentation has been updated.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

TODO:

  • add unit tests
  • update documentation

@kwokkenton

kwokkenton commented Mar 9, 2025

Great contribution and I can confirm that it works on the provided test video :)


I had to make a couple of changes to get it running locally and have documented them in my fork here.

Additionally, I was trying to see how to integrate it with the movement workflow, and it seems that the most straightforward thing is to load it as a 'pose' dataset directly.

# Output is in shape [n_batch=1, n_frames, n_keypoints, n_space=2]
trajectories = tracker.track(video_source, query_points, save_results=True)

# Movement needs arrays shaped (n_frames, n_space, n_keypoints, n_individuals)
# Assume n_batch = 1 = n_individuals
ds = load_poses.from_numpy(
    trajectories.numpy().transpose(1, 3, 2, 0), source_software="cotracker"
)
print(ds)

Out:

<xarray.Dataset> Size: 3kB
Dimensions:      (time: 50, space: 2, keypoints: 4, individuals: 1)
Coordinates:
  * time         (time) int64 400B 0 1 2 3 4 5 6 7 8 ... 42 43 44 45 46 47 48 49
  * space        (space) <U1 8B 'x' 'y'
  * keypoints    (keypoints) <U10 160B 'keypoint_0' ... 'keypoint_3'
  * individuals  (individuals) <U12 48B 'individual_0'
Data variables:
    position     (time, space, keypoints, individuals) float32 2kB 400.0 ... ...
    confidence   (time, keypoints, individuals) float32 800B nan nan ... nan nan
Attributes:
    fps:              None
    time_unit:        frames
    source_software:  cotracker
    source_file:      None
    ds_type:          poses

@parikshit14
Author

parikshit14 commented Mar 10, 2025

Added unit tests and can add more based on feedback. Moving this PR out of draft.

@parikshit14 parikshit14 marked this pull request as ready for review March 10, 2025 15:49
@parikshit14
Author

@sfmig, can you look at this PR and provide some initial comments?

Member

@sfmig sfmig left a comment


hi @parikshit14 , thanks for having a go at this!

I left you specific comments in the code, mainly two to highlight:

First, could you add cotracker as a dependency, instead of as a submodule? We don't expect to do development in the cotracker codebase, and it makes it easier for us to maintain as a dependency.

Second, I think cotracker, tapir and other "track-any-point" models that we may want to support in the future may live better under ethology/tap_models. This is mainly because there may be confusion with "traditional" trackers, whose aim is to link existing points across frames. The same would apply to the class names etc.
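For illustration, a shared interface under the suggested ethology/tap_models layout could look roughly like this. This is only a sketch: the class names, the `track` signature, and the placeholder subclass are all assumptions, not the PR's actual code.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseTAPModel(ABC):
    """Sketch of a common interface for track-any-point models
    (CoTracker, TAPIR, ...) under a hypothetical ethology/tap_models."""

    @abstractmethod
    def track(self, video_source: str, query_points) -> np.ndarray:
        """Return trajectories shaped (n_batch, n_frames, n_keypoints, 2)."""


class CoTrackerModel(BaseTAPModel):
    """Placeholder subclass; a real one would wrap the cotracker package."""

    def track(self, video_source, query_points):
        # Dummy trajectories with the expected shape: 1 batch, 2 frames.
        return np.zeros((1, 2, len(query_points), 2))


out = CoTrackerModel().track("video.mp4", [(0, 10.0, 20.0)])
print(out.shape)  # (1, 2, 1, 2)
```

Keeping the abstract base minimal would let a future TAPIR wrapper slot in without touching callers.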

Hope this helps!

@parikshit14
Author

Hi @sfmig, I have made the changes requested.

  • Updated the structure as mentioned.
  • Added cotracker as a dependency instead of a submodule.
  • Added another test that shows proper working, using a mock video as suggested.
  • Removed the example and transferred the gist of it into a test case.

TODO:

Ideally, for interoperability with our sister project movement, it would return the trajectories as a movement dataset. This can be done for example in a convert_to_movement_dataset method in this class.

This may be out of scope for this PR, but if you want to have a go it could be cool.

  • I'll work on returning the trajectory as a movement dataset; let me know if you want it addressed within this PR or in another one. I think it can be done here.
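The axis reordering such a conversion needs is the same one used in the earlier snippet. A minimal numpy-only sketch (the helper name and shapes are illustrative, not the PR's actual API):

```python
import numpy as np


def to_movement_axes(trajectories: np.ndarray) -> np.ndarray:
    """Reorder CoTracker output (n_batch, n_frames, n_keypoints, n_space)
    into movement's (n_frames, n_space, n_keypoints, n_individuals),
    treating the batch axis as the individuals axis."""
    return trajectories.transpose(1, 3, 2, 0)


# 1 batch, 50 frames, 4 keypoints, 2 spatial coordinates
traj = np.zeros((1, 50, 4, 2))
print(to_movement_axes(traj).shape)  # (50, 2, 4, 1)
# The result could then be passed to
# load_poses.from_numpy(..., source_software="cotracker").
```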

Let me know what you think. Thanks for the references, they were precise and helped speed up the modification.

@sfmig sfmig force-pushed the trackers/cotracker branch from 9499d65 to 82c7ec2 Compare March 20, 2025 18:39
@sfmig sfmig mentioned this pull request Mar 20, 2025
Member

@sfmig sfmig left a comment


Thanks for having a go @parikshit14, good job!

The tests are nicely done. I left some more specific comments in line with the code, but please have a look at the failing CI checks as well. The review also includes some suggestions on how this could be taken forward.

Hope this helps!

@codecov

codecov bot commented Apr 7, 2025

Codecov Report

Attention: Patch coverage is 91.02564% with 7 lines in your changes missing coverage. Please review.

Project coverage is 96.99%. Comparing base (b3ee710) to head (9b62fba).

Files with missing lines | Patch % | Lines
ethology/tap_models/track_any_point.py | 90.90% | 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #58      +/-   ##
==========================================
- Coverage   98.82%   96.99%   -1.83%     
==========================================
  Files           5        7       +2     
  Lines         255      333      +78     
==========================================
+ Hits          252      323      +71     
- Misses          3       10       +7     

@parikshit14
Author

parikshit14 commented Apr 26, 2025

I think we can conclude this PR. Other changes I have in mind are

  • changing the save_video method to plot trajectories ourselves instead of relying on CoTracker's visualizer, since we now support multiple keypoints as well
  • making the configuration customizable via a config.yaml file

but they feel out of the scope of this one.
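Plotting the trajectories without CoTracker's visualizer could look roughly like the sketch below. The function name, shapes, and random-walk demo data are all assumptions for illustration.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np


def plot_trajectories(trajectories: np.ndarray, out_path: str) -> Path:
    """Draw each keypoint's (x, y) path from output shaped
    (1, n_frames, n_keypoints, 2) and save it as a still image."""
    n_keypoints = trajectories.shape[2]
    fig, ax = plt.subplots()
    for k in range(n_keypoints):
        ax.plot(
            trajectories[0, :, k, 0],
            trajectories[0, :, k, 1],
            label=f"keypoint_{k}",
        )
    ax.invert_yaxis()  # image coordinates: y grows downwards
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
    return Path(out_path)


# Random-walk trajectories: 1 batch, 50 frames, 4 keypoints
rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(1, 50, 4, 2)), axis=1)
out_file = plot_trajectories(traj, "trajectories.png")
```

Overlaying the paths on video frames would be the next step, but a still plot already covers the multi-keypoint case.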

Member

@sfmig sfmig left a comment


Hi @parikshit14, thanks for the update.

I noticed the latest CI checks are failing - would you mind having a look?

I also had a quick look at the latest state and left some in-line comments.

I think you make a good point re visualisation in your last comment, and would actually like to know your views on the following suggestion:

I think we could do away with the visualisation bits actually. What I have been doing lately as a user in a similar setup is exporting the predictions as a movement pose dataset, which I can then export in a few formats and importantly, I can load in napari. This removes a lot of the faff involved in exporting custom videos and could make our code (and in particular this PR) much simpler. Would you like to have a go at this?

If so, I would recommend having a look at this guide first to get a feel for it. In movement, I am currently working on exporting the bboxes dataset to VIA format, which is also a format importable in napari, so that should help in this too.

Let me know thoughts. If you feel you don't have capacity to continue this work further, I understand and I would be happy to take it from here (your contributions will still be logged and acknowledged of course). I realise it may not feel the best to keep iterating on this PR, but unfortunately I think that is the nature of a package in such early stages, in which we are still figuring out how to best implement things and don't have much codebase to use as reference.

@parikshit14
Author

I think we could do away with the visualisation bits actually. What I have been doing lately as a user in a similar setup is exporting the predictions as a movement pose dataset, which I can then export in a few formats and importantly, I can load in napari. This removes a lot of the faff involved in exporting custom videos and could make our code (and in particular this PR) much simpler. Would you like to have a go at this?

If so, I would recommend having a look at this guide first to get a feel for it. In movement, I am currently working on exporting the bboxes dataset to VIA format, which is also a format importable in napari, so that should help in this too.

I am not sure if you had a look at my recent commits, but they address this exactly using the function convert_to_movement_dataset (I haven't checked with napari as of now).

Let me know thoughts. If you feel you don't have capacity to continue this work further, I understand and I would be happy to take it from here (your contributions will still be logged and acknowledged of course).

I'm disappointed that I made you feel this way, since even after the contribution period ended, I still made regular commits and waited for your reviews.

@sfmig
Member

sfmig commented May 20, 2025

Hi @parikshit

Some answers to your points below.

I am not sure if you had a look at my recent commits, but they address this exactly using the function convert_to_movement_dataset (I haven't checked with napari as of now).

Yes, I remember this came from a PR review comment and I am aware this is in the code. I should have been more clear about my suggestion: what I meant is that we can remove any video export utilities from the PR (such as save_video) and rely on napari support for visualisation instead (via movement for now). It would mean something like replacing save_video by a "save tracked output" function, that wraps a movement export function (maybe as a SLEAP file and/or as a DLC file). As long as we export the tracks in a movement-compatible format that is loadable in napari (which we are), it should be good.
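A hedged sketch of what such a "save tracked output" function could look like. The name `save_tracked_output` is hypothetical, and the `.npy` save here is only a stand-in for the intended movement export call (e.g. writing a DLC- or SLEAP-style file that napari can load):

```python
from pathlib import Path

import numpy as np


def save_tracked_output(trajectories: np.ndarray, out_path: str) -> Path:
    """Hypothetical replacement for save_video: persist the tracks
    instead of rendering a video. In the PR this would wrap a movement
    export function so the output loads straight into napari; here we
    just save the raw array as a stand-in."""
    path = Path(out_path)
    np.save(path, trajectories)
    return path


saved = save_tracked_output(np.zeros((1, 50, 4, 2)), "tracks.npy")
print(saved.name)  # tracks.npy
```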

I would address that as a next step, along with the pending CI checks that are currently failing. Another bit for improvement imo would be the out-of-memory error handling, trying to downscale the input image for example, but that requires a bit more thinking and we could address that in a future separate PR.

I'm disappointed that I made you feel this way, since even after the contribution period ended, I still made regular commits and waited for your reviews.

I'm sorry to hear that you are disappointed by my previous message. I wonder if it was misinterpreted. Just to clarify, with "capacity" I meant "time capacity", referring to the total number of hours a person can potentially work on something over a given period. I didn't mean it as a synonym for "ability". It is a slightly business-y term that we use quite a bit in the team but I realise it may be uncommon in other contexts.

I hope this helps, I want to be clear that I appreciate your efforts and interest in pushing this forward and have enjoyed working collaboratively on this.

@parikshit14
Author

Hi @sfmig, sorry for the delay. I'll work on it and push commits soon.

I'm sorry to hear that you are disappointed by my previous message. I wonder if it was misinterpreted. Just to clarify, with "capacity" I meant "time capacity", referring to the total number of hours a person can potentially work on something over a given period. I didn't mean it as a synonym for "ability". It is a slightly business-y term that we use quite a bit in the team but I realise it may be uncommon in other contexts.

What I meant was that I was disappointed that I gave you any hint that made you feel I wouldn't contribute to this PR anymore.
