Convolutional Operations

Padding

regression-example

Why Padding is Needed

When applying convolution, the output image shrinks unless we pad it. This is a problem when building deep networks where spatial dimensions shrink after each convolution.

Without Padding:

Where:

  • : input size
  • : filter size
Example
regression-example

In this image:

  • : input size =
  • : filter size =

With Padding ():

Where:

  • : input size
  • : filter size
  • : padding size
Example
regression-example

In this image:

  • : input size =
  • : filter size =
  • : padding size =

Types of Padding

  • Valid Padding (no padding): Output is smaller.
  • Same Padding (zero padding): Output size equals input size.

Real-World Analogy

Imagine scanning a photo with a magnifying glass: without padding, you can’t examine the borders. Padding extends the image so that every pixel gets equal attention.






Strided Convolutions

What is Stride?

Stride is the number of pixels the filter moves at each step.

regression-example
  • Stride = 1: Normal convolution (moves 1 pixel at a time)
  • Stride = 2: Downsampling (moves 2 pixels at a time)

Output Size Formula

Where:

  • : input size
  • : filter size
  • : stride
  • : padding

Visual Example

If stride = 2, the filter skips every alternate pixel, effectively reducing the spatial size of the output.






Convolutions Over Volume

From 2D to 3D

regression-example

In RGB images, we have 3 channels: Red, Green, and Blue. Thus, a convolutional layer operates over 3D volumes.

regression-example

Input Dimensions:

  • : Height
  • : Width
  • : Channels (e.g., 3 for RGB)

Filter Dimensions:

  • Number of filters:

Output Volume:

  • Each filter creates a 2D activation map, stacked together to form the output volume.

Practical Example

Let’s say you have a (6, 6, 3) image, and you apply 2 filters of size (3, 3, 3):

regression-example
  • Output shape: (4, 4, 2) (assuming valid padding, stride=1)