29 thoughts on “ml5.js: What is a Convolutional Neural Network Part 2 – Max Pooling”

  1. Hey, Daniel, could you please talk about building an OCR solution for handwritten text in the wild (not isolated characters)? Or do you have any recommendations for where to learn about building a CNN for OCR?

  2. Won't the total number of pixels going into the fully connected layers be the same as in the original image (if you don't count the pixels lost in the convolution layer)? So how can a CNN be used with an image that has millions of pixels?

  3. For anyone interested in the subject, I strongly recommend Andrew Ng's course on Deep Neural Networks and Convolutional Neural Networks. He explains it very clearly, with a lot of programming exercises.

  4. Fabulous content, Dan! At 8:24, 'do I take the rgb values of the brightest pixel or do I just take the highest r, then highest g…' – Regular maxpool applies the max function per channel. If you are using maxpool on RGB, it is the highest r, then the highest g, and so forth, because each colour is in its own channel. Further into the CNN, on hidden layers, you most likely have more than 3 channels, possibly 7 or 31 or something else. Deeper in the CNN each channel effectively becomes a feature detector, and since maxpool applies the max function per channel, it is effectively passing through the highest feature signal for the patch. I haven't seen maxpool used on RGB. It's usually used after your RGB signal has been processed by at least 1 CNN layer and hence already converted into features, like you showed in your previous video where you had horizontal and vertical feature detectors! Great stuff, keep it up!

  5. The only other teacher I know of who covers these topics as well, although very differently, is Joel Grus. You'll no doubt enjoy his videos.
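
The per-channel behaviour described in comment 4 can be sketched in a few lines of plain Python. This is only an illustration, not code from the video or from ml5.js: the function name `max_pool_per_channel`, the `[row][col][channel]` layout, the 2x2 window with stride 2, and the sample pixel values are all assumptions made for the example.

```python
from typing import List

# Assumed layout: image[row][col][channel], e.g. channel 0 = r, 1 = g, 2 = b.
Image = List[List[List[float]]]


def max_pool_per_channel(image: Image, size: int = 2, stride: int = 2) -> Image:
    """Max pooling where the max is taken independently for each channel."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])

    pooled: Image = []
    # Slide the window over the image; partial windows at the edges are skipped.
    for top in range(0, height - size + 1, stride):
        out_row = []
        for left in range(0, width - size + 1, stride):
            out_pixel = []
            for c in range(channels):
                # Collect only channel c from every pixel inside the window.
                window = [
                    image[top + dy][left + dx][c]
                    for dy in range(size)
                    for dx in range(size)
                ]
                out_pixel.append(max(window))
            out_row.append(out_pixel)
        pooled.append(out_row)
    return pooled


# A made-up 2x2 RGB patch: four pixels, three channels each.
patch = [
    [[200, 10, 30], [50, 180, 20]],
    [[90, 40, 250], [10, 60, 70]],
]

print(max_pool_per_channel(patch))  # [[[200, 180, 250]]]
```

The output pixel (200, 180, 250) is not any single one of the four input pixels: the highest r, the highest g, and the highest b each come from a different pixel, which is the "highest r, then highest g" option from the question at 8:24. The same function works unchanged on hidden layers with more than 3 channels.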
