redid the YOLO object detection algorithm #176

Closed
wants to merge 37 commits into from


hiddentn
Contributor

@hiddentn hiddentn commented Jul 9, 2018

Redid the YOLO object detection algorithm:
simpler and easier to understand
no more memory leaks
performance boost (I think)

A normal usage example should look like this:

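// await requires an async context (an async function or an ES module)
const options = { url: '...' };
const yolo = new ml5.YOLO(options);
const loaded = await yolo.loadModel();
if (loaded) {
  // input can be an image, video, or canvas element
  const results = await yolo.detect(input);
}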

here's a working example(image) https://codepen.io/hiddentn/pen/NBjPKR
another one with a video element https://codepen.io/hiddentn/pen/jpmbRy

@mw108

mw108 commented Jul 22, 2018

./YOLO/index.js Line 12:

import { iou } from './utils';

Isn't this supposed to be import iou from './utils';, because iou is a default export? Otherwise iou is unknown: Uncaught (in promise) TypeError: (0 , p.iou) is not a function

./YOLO/index.js Line 175:

const boxClassProbMask = tf.greaterEqual(boxScores, ...);

Isn't this supposed to be tf.greaterEqual(boxScores1, ...);?
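
For readers unfamiliar with the helper under discussion: iou stands for intersection over union, the overlap measure used to decide whether two boxes cover the same object. The PR's ./utils module is not shown in this thread, so the following is only a minimal sketch; the [x1, y1, x2, y2] corner layout is an assumption. On the import question itself, a module that uses `export default iou` is consumed with `import iou from './utils'`, while `export { iou }` pairs with `import { iou } from './utils'`.

```javascript
// Hypothetical IoU helper. The [x1, y1, x2, y2] box layout is an assumption,
// not taken from the PR's ./utils module.
function iou(a, b) {
  const [ax1, ay1, ax2, ay2] = a;
  const [bx1, by1, bx2, by2] = b;

  // Width and height of the overlap, clamped to zero when the boxes are disjoint.
  const interW = Math.max(0, Math.min(ax2, bx2) - Math.max(ax1, bx1));
  const interH = Math.max(0, Math.min(ay2, by2) - Math.max(ay1, by1));
  const inter = interW * interH;

  const areaA = (ax2 - ax1) * (ay2 - ay1);
  const areaB = (bx2 - bx1) * (by2 - by1);
  const union = areaA + areaB - inter;

  // Guard against degenerate (zero-area) boxes.
  return union > 0 ? inter / union : 0;
}
```

Two identical boxes give 1 and two disjoint boxes give 0.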

@mw108

mw108 commented Jul 22, 2018

p5 images and video are not supported anymore?

@hiddentn
Contributor Author

hiddentn commented Jul 23, 2018

@mw108 thank you, I guess I missed those mistakes :)

Concerning p5 images & video, I am not really sure how to handle them. Anyone is welcome to join in and help me with a PR.

@hiddentn
Contributor Author

hiddentn commented Jul 23, 2018

here's a working example(image) https://codepen.io/hiddentn/pen/NBjPKR
another one with a video element https://codepen.io/hiddentn/pen/jpmbRy
another one with a webcam https://codepen.io/hiddentn/pen/OwmMPM

@mw108

mw108 commented Jul 24, 2018

@TheHidden1 Thanks for the quick fixes. :)

I fiddled around a bit with p5 and it turned out that supporting p5 images and webcam is actually pretty easy.

For images, you need to pass img.canvas to the yolo.detect() function.
Example: https://codepen.io/mw-108/pen/gjWGWX

And for webcam, you need to pass video.elt to the yolo.detect() function.
Example: https://codepen.io/mw-108/pen/XBRaQG
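
A small wrapper could hide that difference from callers. This is only a sketch: toDetectInput is a hypothetical name, and it relies on p5.Image exposing a .canvas property and p5 elements (such as the result of createCapture()) exposing the underlying DOM node as .elt.

```javascript
// Hypothetical helper: unwrap p5 objects into the DOM element that
// yolo.detect() is described as accepting in this thread.
function toDetectInput(input) {
  if (input && input.elt) {
    // p5 elements such as the result of createCapture() wrap a DOM node in .elt
    return input.elt;
  }
  if (input && input.canvas) {
    // p5.Image keeps its pixels on an internal canvas
    return input.canvas;
  }
  // Assume a plain image, video, or canvas element was passed in.
  return input;
}
```

With such a helper, a sketch could call yolo.detect(toDetectInput(img)) regardless of whether img is a p5 object or a DOM element.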

@cvalenzuela
Member

Thanks @TheHidden1 and @mw108! This is great work!
We will need to update the way the method is created. This has changed since you forked the library. See: https://github.com/ml5js/ml5-library/blob/master/src/YOLO/index.js#L146

@cvalenzuela
Member

Thanks for the hard work @TheHidden1! There are issues with the tests. Let's fix them and move to get this merged!

@shiffman
Member

shiffman commented Aug 8, 2018

For integration with p5 receiving the bounding box as x,y,w,h is probably ideal since those are the defaults with rect(). Thanks for your contributions!
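
p5's rect() treats x and y as the top-left corner in its default mode, so a detector that reports the box center (as a later comment in this thread says the refactor does) needs a small conversion first. A minimal sketch, with centerToCorner as a hypothetical helper name:

```javascript
// Hypothetical helper: convert a center-based [cx, cy, w, h] box into the
// top-left-based [x, y, w, h] that p5's rect() expects in its default mode.
function centerToCorner([cx, cy, w, h]) {
  return [cx - w / 2, cy - h / 2, w, h];
}
```

In a p5 sketch this could be used as rect(...centerToCorner(box)). Calling rectMode(CENTER) before drawing is an alternative that avoids the conversion.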

@cvalenzuela
Member

  • Following @shiffman, x,y,w,h makes sense!

  • For managing different YOLO versions, I think we can support two cases:

const yolo = new ml5.yolo('v2')

and also this:

const options = {
  version: '3',
  // other options
};
const yolo = new ml5.yolo(options)

@shiffman
Member

shiffman commented Aug 8, 2018

Aligning with our API though we would wrap in a function (avoiding new) like so? And YOLO all caps?

const yolo = ml5.YOLO('v2')
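
One way to support both call styles discussed above is to normalize the first argument before constructing anything. The sketch below is an assumption about how that could be done, not the library's implementation; normalizeYoloOptions and the 'v2' default are hypothetical.

```javascript
// Assumed default; the thread does not say which version should be the default.
const DEFAULT_VERSION = 'v2';

// Hypothetical helper: accept either a version string ('v2') or an options
// object ({ version: '3', ... }) and return one consistent options object.
function normalizeYoloOptions(versionOrOptions = {}) {
  const options = typeof versionOrOptions === 'string'
    ? { version: versionOrOptions }
    : { ...versionOrOptions };

  let version = options.version === undefined ? DEFAULT_VERSION : String(options.version);
  // Accept both '3' and 'v3', since both spellings appear in this thread.
  if (!version.startsWith('v')) {
    version = `v${version}`;
  }
  return { ...options, version };
}
```

A factory such as ml5.YOLO(...) could then pass the normalized object on to the class constructor, which keeps `new` out of user code.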

@cvalenzuela
Member

ups! yes, missed that!

@hiddentn
Contributor Author

hiddentn commented Aug 15, 2018

I think YOLOv2 is ready for a merge. It outputs [x, y, w, h] now (x, y being the coordinates of the center point, not the top-left point).
tiny-yolov2:
v2-test2
tiny-yolov3:
v3-test2

EDIT: this is with:

 { IOUThreshold: 0.4, filterBoxesThreshold: 0.01, classProbThreshold: 0.4 }

someone help with the test 😭 😭 😭 😭 😭 🤣
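
To give a feel for what IOUThreshold and classProbThreshold control, here is a sketch of greedy non-maximum suppression over center-based boxes. It is a common technique and not necessarily what the PR implements: the detection shape { box, score } and both function names are assumptions, classes are ignored for brevity, and filterBoxesThreshold is left out because the thread does not describe its role.

```javascript
// IoU for center-based [cx, cy, w, h] boxes (the output format described above).
function iouCenterBoxes(a, b) {
  const [ax, ay, aw, ah] = a;
  const [bx, by, bw, bh] = b;
  const interW = Math.max(0, Math.min(ax + aw / 2, bx + bw / 2) - Math.max(ax - aw / 2, bx - bw / 2));
  const interH = Math.max(0, Math.min(ay + ah / 2, by + bh / 2) - Math.max(ay - ah / 2, by - bh / 2));
  const inter = interW * interH;
  const union = aw * ah + bw * bh - inter;
  return union > 0 ? inter / union : 0;
}

// Hypothetical post-processing step. Each detection is assumed to look like
// { box: [cx, cy, w, h], score: number }.
function filterDetections(detections, { IOUThreshold = 0.4, classProbThreshold = 0.4 } = {}) {
  // 1. Drop low-confidence detections, then visit the rest from best to worst.
  const candidates = detections
    .filter((d) => d.score >= classProbThreshold)
    .sort((a, b) => b.score - a.score);

  // 2. Keep a candidate only if it does not overlap an already kept box too much.
  const kept = [];
  for (const candidate of candidates) {
    const overlapsKept = kept.some((k) => iouCenterBoxes(k.box, candidate.box) > IOUThreshold);
    if (!overlapsKept) {
      kept.push(candidate);
    }
  }
  return kept;
}
```

Lowering IOUThreshold suppresses more overlapping boxes, while raising classProbThreshold discards more low-confidence ones.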

@@ -25,14 +25,15 @@ describe('YOLO', () => {

Member


The test error is:

TypeError: Cannot read property 'filterBoxesThreshold' of undefined

Can you try defining the variable here first? Like this: https://github.com/ml5js/ml5-library/blob/master/src/ImageClassifier/index_test.js#L20

@joeyklee
Contributor

joeyklee commented Jan 24, 2019

Hello @TheHidden1
Thanks for all your work on this PR. As the ml5 crew is going back through the existing PRs after the busy last months, we're curious to know whether you're still interested in integrating your refactor of YOLO into ml5. If so, would you be able to update your PR and check for any breaking changes or issues?

I've tried to run your re-implementation and I get the following error at the .predict() function in YOLO. It seems that there's an issue with the incoming image data and for whatever reason the .predict() function isn't returning anything.

(in firefox)
screen shot 2019-01-24 at 12 47 08
(in chrome)
screen shot 2019-01-24 at 12 47 19

If I run the YOLO detection on a simple example as it is currently implemented in ml5.js v0.1.3, then this is what I see:

screen shot 2019-01-24 at 13 00 12

Let us know if you're interested to explore this further. Many thanks!

@hiddentn
Contributor Author

i will check it out right away

@hiddentn
Contributor Author

hiddentn commented Jan 27, 2019

It seems to be working for me (I think you missed yolo.load() in your example).
here is a working codepen : https://codepen.io/hiddentn/full/gqrBeO

screenshot 36
screenshot 37
screenshot 38

(I am not sure about the Firefox error, though.)

@shiffman
Member

Thanks for jumping back into this @TheHidden1! 🎉🎉🎉

@joeyklee happy to look at this together sometime this week if that would be helpful, perhaps merging pull requests could be a good activity for our Friday sessions! 🌈

@joeyklee
Contributor

@TheHidden1 - Thanks so much for the updates! @shiffman and I will have a look this week and let you know if/when we merge! Many thanks!

added support for YOLOv3 models & made some changes to the post-processing algorithm

now we can change the model input size on the fly
@hiddentn
Contributor Author

@shiffman @joeyklee I cleaned up some stuff and added support for YOLOv3 models:


  • I restructured the post-processing algorithm to be clearer & more efficient (I think). There is also a V2
    that uses tf.image.nonMaxSuppression(), but it seems to slow things down a bit (10 - 20 ms).

  • This is how the new config should look:

 // this is an example for the tiny YOLOv2 model
 let config = {
     version: 'v2', /* 'v2' || 'v3' */

     /*
       128 || 144 || 224 || 256 || 320 || 416 (or any multiple of 32, really)
       we can change this on the fly now, woohoo!
      */
     modelSize: 416,

     URL: '',

     /* inference parameters */
     IOUThreshold: 0.5,
     classProbThreshold: 0.4,

     /* this mask defines which anchors go to which layer
        e.g. for tiny YOLOv3 (has 2 output layers at different scales):
               masks: [ [3, 4, 5], [0, 1, 2] ],
               anchors: [ [10, 14], [23, 27], [37, 58], [81, 82], [135, 169], [344, 319] ],
     */
     masks: [ [0, 1, 2, 3, 4] ],
     anchors: [ [0.57273, 0.677385], [1.87446, 2.06253], [3.33843, 5.47434], [7.88282, 3.52778], [9.77052, 9.16828] ],

     /* class names array */
     classes: CLASS_NAMES_COCO,
 };
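
To make the masks comment in this config concrete, the sketch below resolves each output layer's mask into the anchor pairs it selects. anchorsPerLayer is a hypothetical helper written for illustration; the PR may resolve masks differently.

```javascript
// Hypothetical helper: for each output layer, look up the anchors its mask points at.
function anchorsPerLayer(masks, anchors) {
  return masks.map((mask) => mask.map((index) => {
    if (!Number.isInteger(index) || index < 0 || index >= anchors.length) {
      throw new RangeError(`mask index ${index} has no matching anchor`);
    }
    return anchors[index];
  }));
}

// The tiny YOLOv3 values quoted in the config comment above.
const tinyV3Masks = [[3, 4, 5], [0, 1, 2]];
const tinyV3Anchors = [[10, 14], [23, 27], [37, 58], [81, 82], [135, 169], [344, 319]];

// Layer 0 gets the three largest anchors, layer 1 the three smallest.
const tinyV3Layers = anchorsPerLayer(tinyV3Masks, tinyV3Anchors);
```

For the tiny YOLOv2 config, the single mask [0, 1, 2, 3, 4] simply assigns all five anchors to the one output layer.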
  • i did some benchmarks & here are some of the results :
    • Chrome : 71.0.3578.98 (Official Build) (64-bit)
    • CPU : Intel® Core™ i7-7700HQ Processor
    • GPU0 : Intel® HD Graphics 630
    • GPU1 : NVIDIA GeForce GTX 1050 Ti
    • Backend : webgl
ml5.js + tfjs 0.13

             CPU*                        GPU**
Size         Min      Avg      Max       Min      Avg      Max
416x416      522.99   538.13   556.40    98.89    101.68   114.79
320x320      325.30   332.64   351.39    58.69    61.40    71.79
224x224      186.10   191.98   207.30    33.90    36.58    89.29
128x128      83.55    78.39    104.00    15.60    16.95    23.90

tfjs 0.14.2

             CPU*                        GPU**
Size         Min      Avg      Max       Min      Avg      Max
416x416      464.39   476.80   525.49    83.30    87.24    126.70
320x320      275.59   283.80   296.49    54.99    56.86    67.00
224x224      140.79   144.96   150.40    33.89    36.27    45.40
128x128      76.89    79.98    92.90     20.20    22.40    32.00

* Intel® HD Graphics 630
** NVIDIA GeForce GTX 1050 Ti

here is an updated demo where you can try out all of this stuff : https://codepen.io/hiddentn/full/gqrBeO

Note: there is a big time difference when using detect() vs detectAsync() (+100 ms in inference time), so could anyone please test things out (there are detect and detect Sync buttons in the demo) and report their findings? Thank you.
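
For anyone who wants to report comparable numbers, a small timing harness along these lines could produce the same Min/Avg/Max figures as the benchmark tables. It is a sketch only: timeRuns and summarizeTimings are hypothetical names, and the measurement is meaningful only if the timed function resolves after the results have actually been read back.

```javascript
// Reduce a list of durations (in ms) to the Min/Avg/Max columns used above.
function summarizeTimings(samples) {
  if (samples.length === 0) {
    throw new Error('no samples to summarize');
  }
  const min = Math.min(...samples);
  const max = Math.max(...samples);
  const avg = samples.reduce((sum, t) => sum + t, 0) / samples.length;
  return { min, avg, max };
}

// Hypothetical harness: run an async function several times and time each run.
// The clock is injectable so the harness can be checked without real timers.
async function timeRuns(fn, runs = 10, now = () => performance.now()) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = now();
    await fn();
    samples.push(now() - start);
  }
  return summarizeTimings(samples);
}
```

In the demo this might be called as timeRuns(() => yolo.detect(video)) and again with detectAsync, ideally after a warm-up run so that one-time setup cost does not skew the average.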

@joeyklee
Contributor

joeyklee commented Feb 1, 2019

Hi @TheHidden1 Wow! This is wild. I didn't manage to have a deep look at your contributions yet, but I hope to do some checks with @shiffman soon. I just wanted to check in to let you know it's still very much on our minds! Thanks!

@joeyklee
Contributor

joeyklee commented Feb 22, 2019

Hi @TheHidden1 - apologies for the radio silence. @yining1023 and I had a deeper look into your code and we have the feeling that your current proposal diverges a bit too far from the current API structure. We recognize that we don't have good (or any?) documentation yet about the API conventions for ml5.js but that is high on our list.

We would like to propose the following:

  1. Please submit a simple working example of your refactor in action, preferably in p5.js and make a PR to the ml5-examples repo.
  2. Submit some documentation on your new structure so we can better understand how this works.
  3. If you're still interested to work on adapting your proposal closer to the current API structure of YOLO, you're welcome to do this and we can revisit the review.

Thanks again for all your work on this! cc/ @shiffman

@hiddentn
Contributor Author

hiddentn commented Feb 24, 2019

i'll try & do my best

@hiddentn
Contributor Author

hiddentn commented Mar 23, 2019

i added a small demo ml5js/ml5-examples#107

@hiddentn
Contributor Author

hiddentn commented Mar 23, 2019

you know what, let me start this again. this has become too messy

@hiddentn hiddentn closed this Mar 23, 2019

5 participants