-
Notifications
You must be signed in to change notification settings - Fork 0
Object detection example #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
jwmandeville
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and I tested and it works on my system as well.
A few suggestions:
- Add a link to the dataset download page: https://www.cis.upenn.edu/~jshi/ped_html/
- Change the dataset paths and checkpoint paths to variables
- Change the paths to something generic like
path/to/dataset/PennFudanPedinstead of your PC's location
Thanks Jack! Just made the changes. |
xt_training/runner.py
Outdated
| y_tmp = [] | ||
| for y_i in y: | ||
| if isinstance(y_i, dict): | ||
| y_i = {k: v.to(device) for k, v in y_i.items()} | ||
| else: | ||
| y_i.to(device) | ||
| y_tmp.append(y_i) | ||
| y = y_tmp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a comment describing what is happening here and what the use case is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added.
xt_training/runner.py
Outdated
| try: | ||
| y_pred = model(x) | ||
| loss_batch = loss_fn(y_pred, y) | ||
| # The output of Object Detection is a tuple of (loss, y_pred), and when | ||
| # not in training mode (model.training==False), loss is an empty dictionary. | ||
| if isinstance(y_pred, tuple): | ||
| y_pred = y_pred[-1] | ||
| except ValueError: | ||
| # Object Detection training process requires (x, y) as input. | ||
| loss, y_pred = model(x, y) | ||
| # During training, y_pred is an empty list with len==0. | ||
| # Assign a length to it to make the PooledMean metrics work. | ||
| y_pred = [None] | ||
| # The Object Detection model already calculated the loss, | ||
| # so loss_fn should be declared to work with it. | ||
| loss_batch = loss_fn(y_pred=y_pred, y=loss) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should rely on error handling to do non-erroneous control flow.
Also, in general, we shouldn't let the structure of a single object detection model impact the structure of our package so much. I think it would be better to modify the model so that it behaves normally and outputs only the predictions all the time and move the loss calculation to a separate loss function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I made some modification and discarded the error handling.
| elif isinstance(y, Iterable): | ||
| y = [y_i.to(device) for y_i in y] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this will error for the data you are expecting. If y_i is a dict, then it won't have a to method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better practice for object detection would be to put the to calls inside the forward method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion Tim. I wrote an ODInterface based on the SKInterface, and moved the to calls inside the forward method.
…on the scikit learn wrappers.
…on the scikit learn wrappers.
|
@timesler The modifications I made:
|
timesler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I still think this departs too much from our standard workflow in xt-training, because we are trying to bend over backwards to reproduce a specific tutorial, or to allow the use of the torchvision object detection structure.
In essence, object detection should have no problems conforming to the normal process:
- A dataset returns a tuple of x and y
- The model forward function accepts a single argument (x) and always returns a single variable (y_pred)
- Loss and metric functions accept y and y_pred and return scalar values
The way it's done in torchvision is different from the torchvision classification models but there's not good reason for that as far I know. I think we should build our example using a modified version of the model that doesn't do anything with loss inside the model itself.
| self.value_sum += self.latest_value.detach() * self.latest_num_samples | ||
| tmp = self.value_sum + self.latest_value.detach() * self.latest_num_samples | ||
| # self.value_sum += self.latest_value.detach() * self.latest_num_samples | ||
| self.value_sum = tmp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for this? I think it's equivalent
| """ | ||
| cls: The class that is goint to calculate the average precision. | ||
| iouthreshold: The IOU threshold to determine whether a detection is true/false positive. | ||
| method: The interpolation method to calculate the average precision. Can choose between | ||
| every point interpolation or eleven point interpolation. | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't follow our docstring style. I recommend installing the "Python Docstring Generator" extension in VSCode and configuring it to the correct style. That will make it really easy to add new docstrings with the correct format
| return newBoundingBox | ||
|
|
||
|
|
||
| class BoundingBoxes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is intended to be imported by users, we should add a docstring. If not, it should probably be named _BoundingBoxes to indicate it's private
This PR consists of 3 parts:
yused in object detection.Change type:
Checklist: