Pooling arithmetic

In a neural network, pooling layers provide invariance to small translations of the input. The most common kind of pooling is max pooling, which consists in splitting the input in (usually non-overlapping) patches and outputting the maximum value of each patch. Other kinds of pooling exist, e.g., mean or average pooling, which all share the same idea of aggregating the input locally by applying a non-linearity to the content of some patches.

Since pooling does not involve zero padding, the relationship describing the general case is as follows:

For any \( i \), \( k \) and \( s \),

$$ \begin{equation*} o = \left\lfloor \frac{i - k}{s} \right\rfloor + 1. \end{equation*} $$