[ConvMixer] Patches Are All You Need?
Paper: https://openreview.net/pdf?id=TVHS5Y4dNvM
0) Motivation, Objectives and Related Works:
Motivation:
Objectives:
ConvMixer is very similar to the MLP-Mixer model, with the following key differences:
Instead of using fully-connected layers, it uses standard convolution layers.
Instead of LayerNorm (which is typical for ViTs and MLP-Mixers), it uses BatchNorm.
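To make the normalization difference concrete, here is a minimal pure-Python sketch. It is a simplification under stated assumptions, not the paper's code: the input is a flat [batch][channel] table (real convolutional BatchNorm also pools statistics over the spatial positions), the learnable scale/shift and running statistics are omitted, and the names `batch_norm`, `layer_norm` and `EPS` are illustrative only.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small stability constant (assumed value, not from the paper)


def _normalize(values):
    """Shift and scale a list of floats to zero mean and unit variance."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def batch_norm(x):
    """Normalize each channel across the batch. x is [batch][channel]."""
    columns = [_normalize(list(col)) for col in zip(*x)]
    return [list(row) for row in zip(*columns)]


def layer_norm(x):
    """Normalize each sample across its own channels. x is [batch][channel]."""
    return [_normalize(row) for row in x]


# Two samples, two channels with very different scales.
x = [[1.0, 10.0], [3.0, 30.0]]
bn = batch_norm(x)  # statistics come from the columns (per channel)
ln = layer_norm(x)  # statistics come from the rows (per sample)
```

On this input, BatchNorm gives both channels of a sample the same sign (the first sample is below the batch mean in both channels), while LayerNorm gives every sample the same low/high pattern, because each row is normalized on its own.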
ConvMixer uses two types of convolution layers:
(1) Depthwise convolutions, for mixing spatial locations of the images.
(2) Pointwise convolutions (which follow the depthwise convolutions), for mixing channel-wise information across the patches.
Another key point is the use of larger kernel sizes, which gives a larger receptive field.
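The two convolution types and the effect of kernel size can be sketched in pure Python as follows. This is an illustrative toy, not the paper's or the Keras example's implementation: it works on nested lists shaped [channel][row][col], uses zero "same" padding and stride 1, and leaves out biases, activations and normalization. The function names and the sample numbers are assumptions made for the example.

```python
def depthwise_conv(x, kernels):
    """Apply one k x k filter per channel, with zero "same" padding.

    x is [channel][row][col]; kernels is [channel][k][k] with odd k.
    Each output channel reads only its own input channel, so this step
    mixes spatial locations but never mixes channels.
    """
    out = []
    for plane, kernel in zip(x, kernels):
        h, w, k = len(plane), len(plane[0]), len(kernel)
        pad = k // 2
        result = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        r, c = i + di - pad, j + dj - pad
                        if 0 <= r < h and 0 <= c < w:  # outside = zero padding
                            acc += plane[r][c] * kernel[di][dj]
                result[i][j] = acc
        out.append(result)
    return out


def pointwise_conv(x, weights):
    """Apply a 1 x 1 convolution; weights is [out_channel][in_channel].

    Each output value reads a single location across all input channels,
    so this step mixes channels but never mixes spatial locations.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wt * x[c][i][j] for c, wt in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def receptive_field(kernel_size, depth):
    """Receptive field, in patches, of `depth` stacked stride-1 convolutions."""
    return 1 + depth * (kernel_size - 1)


# Toy input: channel 0 has a spike in the center, channel 1 in a corner.
x = [
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
box = [[1.0] * 3 for _ in range(3)]             # 3 x 3 kernel of ones
spatial = depthwise_conv(x, [box, box])         # spreads each spike in its own channel
mixed = pointwise_conv(spatial, [[1.0, 1.0]])   # sums the two channels per location
```

After the depthwise step the center spike covers the whole 3 x 3 grid of channel 0, while channel 1 is nonzero only near its corner; the channels only interact in the pointwise step. For the kernel-size point, `receptive_field(3, 4)` gives 9 while `receptive_field(9, 4)` gives 33, so with the same depth a larger kernel lets each output see many more patches.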
References:
https://keras.io/examples/vision/convmixer/