[VAN] Visual Attention Network
{Segmentation, Large Kernel Attention (LKA)}
Paper: https://arxiv.org/abs/2202.09741
Code: https://github.com/Visual-Attention-Network/VAN-Classification
1) Motivation, Objectives and Related Works:
Motivation:
Objectives:
Related Works:
Contribution:
2) Methodology:
Method 1:
LKA: We can decompose a large-kernel convolution into three components:
Spatial local convolution (conv over small neighborhoods) using a depth-wise convolution (DWConv).
Spatial long-range convolution (conv linking distant regions) using a depth-wise dilated convolution (DW-DConv).
Channel convolution / point-wise convolution (PWConv) (1×1 conv).
Decomposing a large-kernel convolution into DWConv, DW-DConv and PWConv
Attention = Conv1×1(DW-DConv(DWConv(x)))
Output = Attention ⊗ x
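The two formulas above can be sketched as a PyTorch module. A minimal sketch following the decomposition described: a K×K conv (K = 21 in the paper's default) is split into a 5×5 depth-wise conv, a 7×7 depth-wise conv with dilation 3, and a 1×1 conv; the result is used as an attention map via element-wise multiplication with the input. Kernel sizes and paddings here are assumptions based on that default.

```python
import torch
import torch.nn as nn

class LKA(nn.Module):
    """Large Kernel Attention: approximate a 21x21 conv with three cheap convs."""

    def __init__(self, dim: int):
        super().__init__()
        # Spatial local conv: 5x5 depth-wise (groups=dim => one filter per channel)
        self.dw_conv = nn.Conv2d(dim, dim, kernel_size=5, padding=2, groups=dim)
        # Spatial long-range conv: 7x7 depth-wise dilated conv, dilation 3
        # (effective receptive field 1 + (7-1)*3 = 19; padding 9 keeps spatial size)
        self.dw_d_conv = nn.Conv2d(dim, dim, kernel_size=7, padding=9,
                                   groups=dim, dilation=3)
        # Channel conv: 1x1 point-wise conv mixes information across channels
        self.pw_conv = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention = Conv1x1(DW-DConv(DWConv(x)))
        attn = self.pw_conv(self.dw_d_conv(self.dw_conv(x)))
        # Output = Attention ⊗ x (element-wise multiply, not matrix product)
        return attn * x
```

All three convs preserve the spatial resolution, so the output has the same shape as the input; the element-wise multiply is what distinguishes LKA from a plain convolutional block.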
The structure of different modules: (a) the proposed Large Kernel Attention (LKA); (b) non-attention module; (c) the self-attention module; (d) a stage of our Visual Attention Network (VAN). CFF means convolutional feed-forward network. The difference between (a) and (b) is the element-wise multiply. It is worth noting that (c) is designed for 1D sequences.
3) Experimental Results:
Experimental Results:
Ablations: