CVPR-2021 Papers Collection

in Some Related Research Fields.

1. Classification/ Backbone Enhancement

2. Object Detection

3. Segmentation

4. Transformer in Visual

5. Tracking

6. Anomaly/ Defect Detection

7. Data augmentation

8. GAN

9. Medical Imaging

10. Image Clustering

1. Classification/ Backbone Enhancement

- ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network [Paper] [Code]

- Involution: Inverting the Inherence of Convolution for Visual Recognition [Paper] [Code]

- Coordinate Attention for Efficient Mobile Network Design [Paper] [Code]

- Inception Convolution with Efficient Dilation Search [Paper] [Code]

- RepVGG: Making VGG-style ConvNets Great Again [Paper] [Code]

1.1 Fine-grained classification

- Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels [Paper]

- Differentiable Patch Selection for Image Recognition [Paper]

- Fine-grained Angular Contrastive Learning with Coarse Labels [Paper]

- Few-Shot Classification with Feature Map Reconstruction Networks [Paper]

- A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification [Paper]

1.2 Image Classification

- MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition [Paper]

- PML: Progressive Margin Loss for Long-tailed Age Classification [Paper]

- Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification [Paper]

- Capsule Network is Not More Robust than Convolutional Network [Paper]

- Model-Contrastive Federated Learning [Paper]

- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [Paper]

- Correlated Input-Dependent Label Noise in Large-Scale Image Classification [Paper]

1.3 Semi-supervised image classification

- SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification [Paper]

1.4 Long-tail visual recognition

- Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [Paper]

- Improving Calibration for Long-Tailed Recognition [Paper]

- Adversarial Robustness under Long-Tailed Distribution [Paper]

2. Object Detection

2.1 Object Detection on COCO

- VarifocalNet: An IoU-aware Dense Object Detector [Paper]

- You Only Look One-level Feature [Paper]

- Multiple Instance Active Learning for Object Detection [Paper] [Code]

- Positive-Unlabeled Data Purification in the Wild for Object Detection [Paper]

- Depth from Camera Motion and Object Detection [Paper]

- Towards Open World Object Detection [Paper] [Code]

- General Instance Distillation for Object Detection [Paper]

- Distilling Object Detectors via Decoupled Features [Paper]

- MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection [Paper]

- Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection [Paper]

- Sparse R-CNN: End-to-End Object Detection with Learnable Proposals [Paper]

- OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection [Paper] [Code]

- End-to-End Object Detection with Fully Convolutional Network [Paper]

- Robust and Accurate Object Detection via Adversarial Learning [Paper]

- I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors [Paper]

- OTA: Optimal Transport Assignment for Object Detection [Paper]

- Scale-aware Automatic Augmentation for Object Detection [Paper]

- A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection [Paper]

- Group Collaborative Learning for Co-Salient Object Detection [Paper]

- IQDet: Instance-wise Quality Distribution Sampling for Object Detection [Paper]

- Domain-Specific Suppression for Adaptive Object Detection [Paper]

2.2 Few-Shot Object Detection

- Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection [Paper]

- Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection [Paper]

- FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding [Paper]

- Generalized Few-Shot Object Detection without Forgetting [Paper]

2.3 Multi-target Detection

- There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge [Paper]

2.4 3D Target Detection

- Categorical Depth Distribution Network for Monocular 3D Object Detection [Paper]

- 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [Paper]

- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [Paper]

- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [Paper]

- MonoRUn: Monocular 3D Object Detection by Self-Supervised Reconstruction and Uncertainty Propagation [Paper]

- M3DSSD: Monocular 3D Single Stage Object Detector [Paper]

- GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection [Paper]

- LiDAR R-CNN: An Efficient and Universal 3D Object Detection [Paper]

- Exploring intermediate representation for monocular vehicle pose estimation [Paper]

- Delving into Localization Errors for Monocular 3D Object Detection [Paper]

- HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection [Paper]

- Objects are Different: Flexible Monocular 3D Object Detection [Paper]

- Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds [Paper]

- PointAugmenting: Cross-Modal Augmentation for 3D Object Detection [Paper]

- SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud [Paper]

2.5 Rotated object detection

- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [Paper]

2.6 Object localization

- Unveiling the Potential of Structure-Preserving for Weakly Supervised Object Localization [Paper]

2.7 Dense object detection

- Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection [Paper] [Code]

2.8 Salient object detection

- Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion [Paper]

- Weakly Supervised Video Salient Object Detection [Paper]

- Uncertainty-aware Joint Salient Object and Camouflaged Object Detection [Paper]

2.9 Semi-supervised/ Weakly supervised target detection

- Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection [Paper]

- Points as Queries: Weakly Semi-supervised Object Detection by Points [Paper]

- DAP: Detection-Aware Pre-training with Weak Supervision [Paper]

2.10 Long-tail target detection

- Adaptive Class Suppression Loss for Long-Tail Object Detection [Paper]

2.11 OOD Detection

- MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space [Paper]

- MOOD: Multi-level Out-of-distribution Detection [Paper]

3. Segmentation

- Information-Theoretic Segmentation by Inpainting Error Maximization [Paper]

- Simultaneously Localize, Segment and Rank the Camouflaged Objects [Paper]

- Capturing Omni-Range Context for Omnidirectional Segmentation [Paper]

- Boundary IoU: Improving Object-Centric Image Segmentation Evaluation [Paper]

- Locate then Segment: A Strong Pipeline for Referring Image Segmentation [Paper]

- InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [Paper]

- Omnimatte: Associating Objects and Their Effects in Video [Paper]

3.1 Panoptic/ Panorama Segmentation

- Fully Convolutional Networks for Panoptic Segmentation [Paper] [Code]

- Cross-View Regularization for Domain Adaptive Panoptic Segmentation [Paper]

- 4D Panoptic LiDAR Segmentation [Paper]

- Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation [Paper]

- Panoptic Segmentation Forecasting [Paper]

- Exemplar-Based Open-Set Panoptic Segmentation Network [Paper]

3.2 Instance segmentation

- Zero-Shot Instance Segmentation [Paper]

- Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [Paper]

- Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency [Paper]

- FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter [Paper]

- Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images [Paper]

- Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation [Paper]

- RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features [Paper]

- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [Paper]

- Incremental Few-Shot Instance Segmentation [Paper]

3.3 Semantic segmentation

- PLOP: Learning without Forgetting for Continual Semantic Segmentation [Paper]

- Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges [Paper]

- Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation [Paper]

- Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation [Paper]

- Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing [Paper]

- Learning Statistical Texture for Semantic Segmentation [Paper]

- MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation [Paper]

- Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations [Paper]

- Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion [Paper]

- Rethinking BiSeNet For Real-time Semantic Segmentation [Paper]

- BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation [Paper]

- Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation [Paper]

- Cross-Dataset Collaborative Learning for Semantic Segmentation [Paper]

- Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization [Paper]

- Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation [Paper]

- Source-Free Domain Adaptation for Semantic Segmentation [Paper]

- PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering [Paper]

- Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation [Paper]

- Progressive Semantic Segmentation [Paper]

- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [Paper]

- DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation [Paper] [Code]

- Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [Paper] [Code]

- Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation [Paper]

3.4 Scene understanding/scene analysis

- Exploring Data Efficient 3D Scene Understanding with Contrastive Scene Contexts [Paper]

- Monte Carlo Scene Search for 3D Scene Understanding [Paper]

- Bidirectional Projection Network for Cross Dimension Scene Understanding [Paper]

- RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening [Paper]

- CoCoNets: Continuous Contrastive 3D Scene Representations [Paper]

- Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis [Paper]

- SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences [Paper]

- Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation [Paper]

- Fully Convolutional Scene Graph Generation [Paper]

- Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation [Paper]

3.5 Matting

- Real-Time High Resolution Background Matting [Paper]

3.6 Action segmentation

- Global2Local: Efficient Structure Search for Video Action Segmentation [Paper]

- Temporal Action Segmentation from Timestamp Supervision [Paper]

- Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation [Paper]

- Action Shuffle Alternating Learning for Unsupervised Action Segmentation [Paper]

- Anchor-Constrained Viterbi for Set-Supervised Action Segmentation [Paper]

3.7 LiDAR segmentation

- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [Paper]

3.8 Video segmentation

- Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion [Paper]

- Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild [Paper]

- Efficient Regional Memory Network for Video Object Segmentation [Paper]

- Learning Position and Target Consistency for Memory-based Video Object Segmentation [Paper]

- Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps [Paper]

- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [Paper]

- SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation [Paper]

- Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation [Paper]

- Self-Guided and Cross-Guided Learning for Few-Shot Segmentation [Paper]

- Adaptive Prototype Learning and Allocation for Few-Shot Segmentation [Paper]

- Camouflaged Object Segmentation with Distraction Mining [Paper]

- Deep Video Matting via Spatio-Temporal Alignment and Aggregation [Paper]

- Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning [Paper]

4. Transformer in Visual

- Transformer Interpretability Beyond Attention Visualization [Paper] [Code]

- MIST: Multiple Instance Spatial Transformer Network [Paper]

- Variational Transformer Networks for Layout Generation [Paper]

4.1 Action recognition

- 3D Vision Transformers for Action Recognition [Paper]

4.2 Target Detection

- UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [Paper] [Code]

4.3 Image Processing

- Pre-Trained Image Processing Transformer [Paper]

4.4 Human-object interaction

- End-to-End Human Object Interaction Detection with HOI Transformer [Paper]

- HOTR: End-to-End Human-Object Interaction Detection with Transformers [Paper] [Code]

4.5 Image segmentation

- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [Paper]

- [VisTR] End-to-End Video Instance Segmentation with Transformers [Paper] [Code]

4.6 Tracking

- Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking [Paper]

- Transformer Tracking [Paper]

4.7 Action prediction

- Multimodal Motion Prediction with Stacked Transformers [Paper]

4.8 Self-attention mechanism

- Scaling Local Self-Attention For Parameter Efficient Visual Backbones [Paper]

4.9 Retrieval

- Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning [Paper]

4.10 Feature matching

- LoFTR: Detector-Free Local Feature Matching with Transformers [Paper]

4.11 Pose recognition

- Pose Recognition with Cascade Transformers [Paper]

4.12 Autonomous driving

- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [Paper]

5. Tracking

- Rotation Equivariant Siamese Networks for Tracking [Paper]

- Multiple Object Tracking with Correlation Learning [Paper]

- Graph Attention Tracking [Paper]

- LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search [Paper]

- Track, Check, Repeat: An EM Approach to Unsupervised Tracking [Paper]

5.1 Multi-target tracking

- Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking [Paper]

- Track to Detect and Segment: An Online Multi-Object Tracker [Paper]

- Multiple Object Tracking with Correlation Learning [Paper]

- Learning a Proposal Classifier for Multiple Object Tracking [Paper]

- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [Paper]

- Online Multiple Object Tracking with Cross-Task Synergy [Paper]

5.2 Visual-target tracking

- IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking [Paper]

- Learning to Track Instances without Video Annotations [Paper]

5.3 Single-target tracking

- Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark [Paper]

- SiamGAT: Graph Attention Tracking [Paper]

6. Anomaly/ Defect Detection

- Multiresolution Knowledge Distillation for Anomaly Detection [Paper]

- CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [Paper]

7. Data augmentation

- KeepAugment: A Simple Information-Preserving Data Augmentation [Paper]

8. GAN

- Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing [Paper]

- Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation [Paper]

- Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs [Paper]

- Image-to-image Translation via Hierarchical Style Disentanglement [Paper]

- Efficient Conditional GAN Transfer with Knowledge Propagation across Classes [Paper]

- Anycost GANs for Interactive Image Synthesis and Editing [Paper]

- TediGAN: Text-Guided Diverse Image Generation and Manipulation [Paper]

- Generative Hierarchical Features from Synthesizing Images [Paper]

- Teachers Do More Than Teach: Compressing Image-to-Image Models [Paper]

- PISE: Person Image Synthesis and Editing with Decoupled GAN [Paper]

- LOHO: Latent Optimization of Hairstyles via Orthogonalization [Paper]

- HumanGAN: A Generative Model of Humans Images [Paper]

- HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms [Paper] https://arxiv.org/abs/2011.11731

- DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network [Paper] https://arxiv.org/abs/2103.07893

- Training Generative Adversarial Networks in One Stage [Paper] https://arxiv.org/abs/2103.00430

- Closed-Form Factorization of Latent Semantics in GANs [Paper] https://arxiv.org/abs/2007.06600 [Code] https://github.com/genforce/sefa

- pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis [Paper]

- ReMix: Towards Image-to-Image Translation with Limited Data [Paper]

- Unsupervised Disentanglement of Linear-Encoded Facial Semantics [Paper]

- Content-Aware GAN Compression [Paper]

- Regularizing Generative Adversarial Networks under Limited Data [Paper]

- Where and What? Examining Interpretable Disentangled Representations [Paper]

- Few-shot Image Generation via Cross-domain Correspondence [Paper]

- DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [Paper]

- Surrogate Gradient Field for Latent Space Manipulation [Paper]

- StylePeople: A Generative Model of Fullbody Human Avatars [Paper]

- Ensembling with Deep Generative Views [Paper]

- Continuous Face Aging via Self-estimated Residual Age Embedding [Paper]

8.1 Image to image translation

- Memory-guided Unsupervised Image-to-image Translation [Paper]

- Image-to-image Translation via Hierarchical Style Disentanglement [Paper]

- CoMoGAN: continuous model-guided image-to-image translation [Paper]

- Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation [Paper]

8.2 Image editing

- StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing [Paper]

9. Medical Imaging

- Deep Learning for Chest X-ray Analysis: A Survey [Paper]

- 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management [Paper]

- Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies [Paper]

- Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization [Paper]

- Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning [Paper]

- DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images [Paper]

- Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles [Paper]

- XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations [Paper]

9.1 Medical image segmentation

- FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space [Paper] [Code]

- DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets [Paper]

- DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation [Paper]

- DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images [Paper]

- Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation [Paper]

9.2 Medical image synthesis

- Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net [Paper]

10. Image Clustering

- Improving Unsupervised Image Clustering With Robust Learning [Paper]

- Jigsaw Clustering for Unsupervised Visual Representation Learning [Paper]

References (update 25/05)
