[VAN] Visual Attention Network
{Segmentation, Large Kernel Attention (LKA)}
Paper: https://arxiv.org/abs/2202.09741
Code: https://github.com/Visual-Attention-Network/VAN-Classification
1) Motivation, Objectives and Related Works:
Motivation:
Objectives:
Related Works:
Contribution:
2) Methodology:
Method 1:
LKA: We can decompose a large-kernel convolution into three components:
Spatial local convolution (conv over small neighborhoods) using a depth-wise convolution (DWConv).
Spatial long-range convolution (conv linking distant regions) using a depth-wise dilated convolution (DW-DConv).
Channel convolution / point-wise convolution (PWConv) (1×1 conv).
Decomposing a large-kernel convolution into DWConv, DW-DConv and PWConv
Attention = Conv1×1(DW-DConv(DWConv(x)))
Output = Attention ⊗ x
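The two formulas above can be sketched as a PyTorch module. A minimal sketch following the decomposition described: a K×K conv (K = 21 in the paper's default) is split into a 5×5 depth-wise conv, a 7×7 depth-wise conv with dilation 3, and a 1×1 conv; the result is used as an attention map via element-wise multiplication with the input. Kernel sizes and paddings here are assumptions based on that default.

```python
import torch
import torch.nn as nn

class LKA(nn.Module):
    """Large Kernel Attention: approximate a 21x21 conv with three cheap convs."""

    def __init__(self, dim: int):
        super().__init__()
        # Spatial local conv: 5x5 depth-wise (groups=dim => one filter per channel)
        self.dw_conv = nn.Conv2d(dim, dim, kernel_size=5, padding=2, groups=dim)
        # Spatial long-range conv: 7x7 depth-wise dilated conv, dilation 3
        # (effective receptive field 1 + (7-1)*3 = 19; padding 9 keeps spatial size)
        self.dw_d_conv = nn.Conv2d(dim, dim, kernel_size=7, padding=9,
                                   groups=dim, dilation=3)
        # Channel conv: 1x1 point-wise conv mixes information across channels
        self.pw_conv = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention = Conv1x1(DW-DConv(DWConv(x)))
        attn = self.pw_conv(self.dw_d_conv(self.dw_conv(x)))
        # Output = Attention ⊗ x (element-wise multiply, not matrix product)
        return attn * x
```

All three convs preserve the spatial resolution, so the output has the same shape as the input; the element-wise multiply is what distinguishes LKA from a plain convolutional block.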
The structure of different modules: (a) the proposed Large Kernel Attention (LKA); (b) non-attention module; (c) the self-attention module; (d) a stage of our Visual Attention Network (VAN). CFF means convolutional feed-forward network. The difference between (a) and (b) is the element-wise multiply. It is worth noting that (c) is designed for 1D sequences.
3) Experimental Results:
Experimental Results:
Ablations: