MLP-Mixer: An all-MLP Architecture for Vision

This post presents MLP-Mixer, an architecture that can replace convolution and self-attention in computer vision tasks. MLP-Mixer attains competitive scores on image classification benchmarks, with pre-training and inference costs comparable to state-of-the-art models.

MLP stands for multi-layer perceptron. MLP-Mixer was introduced by researchers at Google Brain, including members of the original ViT (Vision Transformer) team.

Paper: https://arxiv.org/pdf/2105.01601.pdf 

Code: https://github.com/google-research/vision_transformer