Skip to content

Conversation

@AngranLi
Copy link
Contributor

@AngranLi AngranLi commented May 31, 2021

This PR consists of 3 parts:

  1. The change of xt-training itself to make it work with the dictionary label yused in object detection.
  2. The new metric of average precision, which is based on the code in https://github.com/rafaelpadilla/Object-Detection-Metrics.
  3. The example in Jupyter Notebook. The structure and description is based on Jack's Image Classification example.

Change type:

  • Bug fix
  • Feature
  • Documentation

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code and added docstrings to all exposed functions and classes
  • I have made corresponding changes to the documentation
  • Any relevant changes to third party licenses have been updated in the README

jwmandeville
jwmandeville previously approved these changes Jun 2, 2021
Copy link
Contributor

@jwmandeville jwmandeville left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and I tested and it works on my system as well.

A few suggestions:

  • Add a link to the dataset download page: https://www.cis.upenn.edu/~jshi/ped_html/
  • Change the dataset paths and checkpoint paths to variables
  • Change the paths to something generic like path/to/dataset/PennFudanPed instead of your PC's location

@AngranLi
Copy link
Contributor Author

AngranLi commented Jun 2, 2021

Looks good and I tested and it works on my system as well.

A few suggestions:

  • Add a link to the dataset download page: https://www.cis.upenn.edu/~jshi/ped_html/
  • Change the dataset paths and checkpoint paths to variables
  • Change the paths to something generic like path/to/dataset/PennFudanPed instead of your PC's location

Thanks Jack! Just made the changes.

jwmandeville
jwmandeville previously approved these changes Jun 3, 2021
Comment on lines 168 to 175
y_tmp = []
for y_i in y:
if isinstance(y_i, dict):
y_i = {k: v.to(device) for k, v in y_i.items()}
else:
y_i.to(device)
y_tmp.append(y_i)
y = y_tmp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment describing what is happening here and what the use case is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

Comment on lines 177 to 192
try:
y_pred = model(x)
loss_batch = loss_fn(y_pred, y)
# The output of Object Detection is a tuple of (loss, y_pred), and when
# not in training mode (model.training==False), loss is an empty dictionary.
if isinstance(y_pred, tuple):
y_pred = y_pred[-1]
except ValueError:
# Object Detection training process requires (x, y) as input.
loss, y_pred = model(x, y)
# During training, y_pred is an empty list with len==0.
# Assign a length to it to make the PooledMean metrics work.
y_pred = [None]
# The Object Detection model already calculated the loss,
# so loss_fn should be declared to work with it.
loss_batch = loss_fn(y_pred=y_pred, y=loss)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should rely on error handling to do non-erroneous control flow.

Also, in general, we shouldn't let the structure of a single object detection model impact the structure of our package so much. I think it would be better to modify the model so that it behaves normally and outputs only the predictions all the time and move the loss calculation to a separate loss function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I made some modification and discarded the error handling.

Comment on lines 167 to 168
elif isinstance(y, Iterable):
y = [y_i.to(device) for y_i in y]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this will error for the data you are expecting. If y_i is a dict, then it won't have a to method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better practice for object detection would be to put the to calls inside the forward method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion Tim. I wrote an ODInterface based on the SKInterface, and moved the to calls inside the forward method.

@AngranLi AngranLi requested a review from timesler June 23, 2021 22:32
@AngranLi
Copy link
Contributor Author

@timesler The modifications I made:

  1. modified runner.py so the error handling is discarded
  2. add the object detection wrappers so I can move the to calls inside the forward method.

Copy link
Contributor

@timesler timesler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I still think this departs too much from our standard workflow in xt-training, because we are trying to bend over backwards to reproduce a specific tutorial, or to allow the use of the torchvision object detection structure.

In essence, object detection should have no problems conforming to the normal process:

  • A dataset returns a tuple of x and y
  • The model forward function accepts a single argument (x) and always returns a single variable (y_pred)
  • Loss and metric functions accept y and y_pred and return scalar values

The way it's done in torchvision is different from the torchvision classification models but there's not good reason for that as far I know. I think we should build our example using a modified version of the model that doesn't do anything with loss inside the model itself.

Comment on lines -200 to +206
self.value_sum += self.latest_value.detach() * self.latest_num_samples
tmp = self.value_sum + self.latest_value.detach() * self.latest_num_samples
# self.value_sum += self.latest_value.detach() * self.latest_num_samples
self.value_sum = tmp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason for this? I think it's equivalent

Comment on lines +311 to +316
"""
cls: The class that is goint to calculate the average precision.
iouthreshold: The IOU threshold to determine whether a detection is true/false positive.
method: The interpolation method to calculate the average precision. Can choose between
every point interpolation or eleven point interpolation.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't follow our docstring style. I recommend installing the "Python Docstring Generator" extension in VSCode and configuring it to the correct style. That will make it really easy to add new docstrings with the correct format

return newBoundingBox


class BoundingBoxes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is intended to be imported by users, we should add a docstring. If not, it should probably be named _BoundingBoxes to indicate it's private

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants