
Comprehensive Guide on Convolutional Neural Network

Last updated: Mar 10, 2022

What is a convolutional neural network?

A convolutional neural network, or CNN for short, is a variant of the standard artificial neural network that is often applied to image processing. Contrary to popular belief, standard neural networks can also handle images, but they do so by flattening the input array, such that a 28-pixel by 28-pixel two-dimensional array becomes a one-dimensional vector of size 784. This one-dimensional vector is then fed into the neural network for training.
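To make the flattening step concrete, here is a minimal NumPy sketch. The image is a hypothetical 28x28 array of dummy values, not real data, and the variable names are chosen for illustration only. It also shows that two vertically adjacent pixels end up 28 positions apart in the flattened vector.

```python
import numpy as np

# Hypothetical 28x28 grayscale image filled with dummy pixel values.
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28)

# Flattening concatenates the rows into one long vector.
flat = image.reshape(-1)

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)

# Positions of two vertically adjacent pixels, (0, 0) and (1, 0),
# inside the flattened vector.
idx_a = int(np.ravel_multi_index((0, 0), image.shape))
idx_b = int(np.ravel_multi_index((1, 0), image.shape))
print(idx_b - idx_a)  # 28
```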

The problem with this approach is that flattening discards positional information. For instance, pixels that are near one another tend to be related, while pixels that are far apart usually are not. A CNN preserves the shape instead: the input and the corresponding output of each layer are 3-dimensional arrays. The three dimensions are as follows:

  • height

  • width

  • color channel (3 for RGB and 1 for grayscale)
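As a quick illustration, such an image can be held in a NumPy array. The sketch below assumes a channels-first layout (channel, height, width), chosen only because it matches the (FN, C, FH, FW) filter notation used in this guide. Many libraries store images as (height, width, channel) instead, so the last lines show one way to convert.

```python
import numpy as np

# Dummy images in an assumed channels-first layout: (channel, height, width).
rgb_image = np.zeros((3, 28, 28))   # 3 color channels
gray_image = np.zeros((1, 28, 28))  # 1 channel for grayscale

print(rgb_image.ndim, rgb_image.shape)    # 3 (3, 28, 28)
print(gray_image.ndim, gray_image.shape)  # 3 (1, 28, 28)

# Converting from a height-width-channel array to channels-first.
hwc = np.zeros((28, 28, 3))
chw = hwc.transpose(2, 0, 1)
print(chw.shape)  # (3, 28, 28)
```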

Diagram of convolutional neural network

Some terminology:

  • The input and output of a convolution layer are called feature maps

  • The input of a convolution layer is called an input feature map

  • The output of a convolution layer is called an output feature map

Filters

Consider the following parameters:

  • $H$ is the height of the original input

  • $W$ is the width of the original input

  • $P$ is the padding

  • $FH$ is the height of the filter

  • $FW$ is the width of the filter

  • $S$ is the stride

The size of the output feature map, $(OH, OW)$, is then:

$$\begin{align*} OH&=\frac{H+2P-FH}{S}+1\\ OW&=\frac{W+2P-FW}{S}+1 \end{align*}$$
WARNING

You have to select the values of the parameters such that $OH$ and $OW$ are integers. Some deep learning frameworks silently round $OH$ and $OW$ (typically down) instead of throwing an error.
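The output-size formula can be checked with a small helper. This is a minimal sketch: the function name `conv_output_size` is hypothetical, and raising an error for non-integer results is a design choice made here to surface the issue in the warning, not the behavior of any particular framework.

```python
def conv_output_size(size, filter_size, padding=0, stride=1):
    """Output length along one axis: (size + 2*padding - filter_size) / stride + 1."""
    numerator = size + 2 * padding - filter_size
    if numerator < 0:
        raise ValueError("filter is larger than the padded input")
    if numerator % stride != 0:
        # The formula would give a non-integer output size.
        raise ValueError("parameters do not give an integer output size")
    return numerator // stride + 1


print(conv_output_size(28, 5))                       # 24
print(conv_output_size(28, 5, padding=2))            # 28
print(conv_output_size(7, 3, padding=0, stride=2))   # 3
```

The same function applies to both axes: pass $H$ and $FH$ to get $OH$, and $W$ and $FW$ to get $OW$.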

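With just one filter, we end up with two-dimensional output data of size (OH, OW). We could instead apply multiple filters, say FN of them, with each filter producing its own two-dimensional output.

The completed block of size (FN, OH, OW) is passed on to the next layer. As for the notation, we can define the shape of the filters like so, where FN is the number of filters and C is the number of input channels: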

(FN, C, FH, FW)

For instance, if the number of channels is 3 and we have fifty 5x5 filters, the shape would be as follows:

(50, 3, 5, 5)
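To see how filters of shape (FN, C, FH, FW) produce an output block of shape (FN, OH, OW), here is a deliberately naive loop-based sketch. The function `conv2d_naive` is hypothetical and written for readability, not speed. It assumes a channels-first input of shape (C, H, W), uses random dummy data, and computes the sliding dot product (cross-correlation) that deep learning libraries usually call convolution.

```python
import numpy as np


def conv2d_naive(x, filters, stride=1, padding=0):
    """Apply FN filters to one input of shape (C, H, W); returns (FN, OH, OW)."""
    c, h, w = x.shape
    fn, fc, fh, fw = filters.shape
    assert c == fc, "input and filters must have the same number of channels"

    oh = (h + 2 * padding - fh) // stride + 1
    ow = (w + 2 * padding - fw) // stride + 1

    # Zero-pad height and width only, never the channel axis.
    xp = np.pad(x, ((0, 0), (padding, padding), (padding, padding)))

    out = np.zeros((fn, oh, ow))
    for n in range(fn):
        for i in range(oh):
            for j in range(ow):
                top, left = i * stride, j * stride
                window = xp[:, top:top + fh, left:left + fw]
                # Multiply element-wise across all channels, then sum.
                out[n, i, j] = np.sum(window * filters[n])
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 28, 28))          # (C, H, W)
filters = rng.standard_normal((50, 3, 5, 5))  # (FN, C, FH, FW)

out = conv2d_naive(x, filters)
print(out.shape)  # (50, 24, 24)
```

Each filter spans all C input channels and collapses them into a single two-dimensional map, which is why the channel dimension of the output equals FN rather than C.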

Just like traditional neural networks, convolutional layers also add a bias term to their output.

The bias holds a single value per output channel, that is, one value per filter. Therefore, if the layer produces 3 output channels, the bias would be a vector of size 3 with possibly three different values.
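The bias addition can be sketched with NumPy broadcasting. The example assumes one bias value per output channel and a small dummy output block of shape (FN, OH, OW). The sizes and bias values are made up for illustration.

```python
import numpy as np

fn, oh, ow = 3, 4, 4
conv_out = np.zeros((fn, oh, ow))  # dummy convolution output

# One bias value per output channel.
bias = np.array([0.5, -1.0, 2.0])

# Reshape to (FN, 1, 1) so each value is broadcast over its whole OH x OW map.
out_with_bias = conv_out + bias.reshape(fn, 1, 1)

print(out_with_bias.shape)               # (3, 4, 4)
print(out_with_bias[:, 0, 0].tolist())   # [0.5, -1.0, 2.0]
```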

Just like for standard neural networks, we can apply batch processing to CNNs. The input data then becomes 4-dimensional, since we now process multiple inputs at the same time as a batch.
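As a rough sketch of batch processing, the example below assumes the batch is stored with shape (N, C, H, W), where N is the batch size. That layout is an assumption consistent with the filter notation used earlier, not something the text specifies. It uses NumPy's `sliding_window_view` and `einsum` with stride 1 and no padding, on random dummy data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
batch = rng.standard_normal((10, 3, 28, 28))   # (N, C, H, W)
filters = rng.standard_normal((50, 3, 5, 5))   # (FN, C, FH, FW)

# Extract every 5x5 window along height and width.
# Resulting shape: (N, C, OH, OW, FH, FW) for stride 1 and no padding.
windows = sliding_window_view(batch, (5, 5), axis=(2, 3))

# Sum over channels and filter positions for every sample and every filter.
out = np.einsum("nchwij,fcij->nfhw", windows, filters, optimize=True)

print(windows.shape)  # (10, 3, 24, 24, 5, 5)
print(out.shape)      # (10, 50, 24, 24)
```

The output is also 4-dimensional, with shape (N, FN, OH, OW): one (FN, OH, OW) block per input in the batch.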

Published by Isshin Inada