01_Augmentation
VidAug, TorchVision Transforms, and Albumentations
Increase the variability of the input images so that the trained object detection model is more robust to images obtained from different environments.
Transformation.
Flip (horizontal, vertical).
Rotation.
Zoom.
Crop.
Convert to Gray-scale.
Use different compression settings.
Adjust Brightness.
Adjust Color.
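As a rough illustration of the flip and brightness operations listed above, here is a minimal pure-Python sketch operating on a grayscale image stored as a nested list of pixel values. In practice you would use TorchVision Transforms or Albumentations, which implement all of these operations efficiently on tensors and arrays.

```python
def horizontal_flip(image):
    """Flip an image left-to-right: rows stay, columns reverse."""
    return [list(reversed(row)) for row in image]

def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping to the 0-255 range."""
    return [[min(255, max(0, int(round(p * factor)))) for p in row]
            for row in image]

img = [[10, 20, 30],
       [40, 50, 60]]
flipped = horizontal_flip(img)          # [[30, 20, 10], [60, 50, 40]]
brighter = adjust_brightness(img, 1.5)  # every pixel scaled by 1.5
```

The other listed transforms (rotation, zoom, crop, grayscale, compression) follow the same pattern: a deterministic per-pixel or per-coordinate mapping, usually applied with randomized parameters during training.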
CutMix [35], Mixup [36], RandAugment [6], and Random Erasing
[TokenMix] J. Liu, B. Liu, H. Zhou, H. Li, Y. Liu, "TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers", arXiv preprint arXiv:2207.08409, 2022. [Code]
The World's Most Valuable Resource Is No Longer Oil, But Data
No, Data Is Not the New Oil
How much training data do you need?
One in Ten Rule
Power and Sample Size Distribution
Precision-Recall Versus Accuracy and the Role of Large Data Sets
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Learning Visual Features from Large Weakly Supervised Data
How Training Data Affect the Accuracy and Robustness of Neural Networks for Image Classification
How Many Images Do You Need to Train A Neural Network
Random Erasing Data Augmentation
SMOTE: synthetic minority over-sampling technique
Data Augmentation by Pairing Samples for Images Classification
mixup: Beyond Empirical Risk Minimization
Improved Regularization of Convolutional Neural Networks with Cutout
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
YOLOv4: Optimal Speed and Accuracy of Object Detection
Generative Adversarial Networks
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
AutoAugment: Learning Augmentation Policies from Data
Fast AutoAugment
KeepAugment: A Simple Information-PreservingData Augmentation Approach
RandAugment: Practical automated data augmentation with a reduced search space
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
Label Enhancement for Label Distribution Learning
J. Nalepa, M. Marcinkiewicz, M. Kawulok, "Data Augmentation for Brain-Tumor Segmentation: A Review", Frontiers in Computational Neuroscience, 2019
Photometric distortions: adjust the brightness, contrast, hue, saturation, and noise of an image.
Geometric distortions: add random scaling, cropping, flipping, and rotation.
Apply to input image:
Random erase [100] and CutOut [11]: randomly select a rectangular region in an image and fill it with a random value or zeros.
Hide-and-seek [69] and grid mask [6]: randomly or evenly select multiple rectangular regions in an image and replace them with zeros.
Apply to feature maps: DropOut [71], DropConnect [80], and DropBlock [16] methods.
MixUp [92]: multiplies two images by different coefficient ratios, superimposes them, and adjusts the label with the same ratios.
CutMix [91]: covers a cropped patch onto a rectangular region of another image and adjusts the label according to the size of the mixed area.
Style-transfer GANs can also be used for augmentation; this effectively reduces the texture bias learned by CNNs.
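The MixUp pixel-and-label arithmetic described above can be sketched as follows, treating images as flattened lists of floats and labels as one-hot vectors. The `lam` parameter is exposed here for determinism; the paper samples it from a Beta(α, α) distribution, with α = 0.2 shown below as one typical choice.

```python
import random

def mixup(img_a, label_a, img_b, label_b, lam=None):
    """Blend two images and their one-hot labels with one mixing coefficient.

    When `lam` is not given, it is sampled from a Beta(0.2, 0.2)
    distribution, as MixUp does; pass it explicitly for reproducibility.
    """
    if lam is None:
        lam = random.betavariate(0.2, 0.2)
    mixed_img = [lam * a + (1 - lam) * b for a, b in zip(img_a, img_b)]
    mixed_lbl = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_img, mixed_lbl

# lam = 0.5 blends both inputs equally: pixels and labels are averaged.
img, lbl = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.5)
```

CutMix replaces the per-pixel blend with a hard rectangular paste, and mixes the labels by the area ratio of the pasted region instead of `lam`.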
Geometric transformation: rotation, flipping, cropping, deformation, zooming, and other operations.
Color changes: blur, color jitter, erasing, filling, noise-based augmentation, Coarse Dropout, random erasing, and color perturbation.
SMOTE: artificially synthesizes new minority-class samples by interpolation; an implementation is available in the imbalanced-learn library.
SamplePairing: two images are randomly selected from the training set, processed by basic augmentation operations (such as random flips), and then averaged pixel-wise to form a new sample.
Mixup: randomly select two samples and take a random weighted sum of them; the corresponding labels are combined with the same weights.
Cutout: randomly cut out regions of the sample and fill them with zero pixel values; the classification label remains unchanged.
Cutmix: cut out a region but, instead of filling it with zeros, fill it with a patch cropped from another training image; the label is mixed in proportion to the areas.
Mosaic: mosaic augmentation stitches four images together into one; each image keeps its corresponding bounding boxes.
Ricap: Data Augmentation using Random Image Cropping and Patching for Deep CNNs [Paper]
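Cutout from the list above can be sketched in a few lines, assuming a 2-D grayscale image as a nested list. The `fill` value defaults to 0 as in the Cutout paper; swapping the filled values for a patch from another image (plus area-proportional label mixing) turns this into CutMix.

```python
import random

def cutout(image, size, fill=0, rng=random):
    """Zero out a random `size` x `size` square in a 2-D grayscale image.

    The classification label is left unchanged, as in Cutout; the input
    image is copied rather than modified in place.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    top = rng.randrange(max(1, h - size + 1))
    left = rng.randrange(max(1, w - size + 1))
    for r in range(top, min(h, top + size)):
        for c in range(left, min(w, left + size)):
            out[r][c] = fill
    return out
```

A Mosaic sketch follows the same spirit but is longer: resize four images, paste them into the four quadrants of a canvas, and shift each image's bounding boxes by its quadrant offset.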
Learn the distribution of the data through a model and randomly generate images consistent with the training-set distribution; representative methods are GAN [18] and DCGAN [19].
A GAN consists of two networks: a generator and a discriminator.
Alternatively, learn an augmentation policy suited to the current task through a model; representative methods include AutoAugment [20], Fast AutoAugment [21], KeepAugment [22], and RandAugment [23].
AutoAugment uses reinforcement learning to search for the best image-transformation policy from the data itself, learning different augmentation strategies for different tasks.
RandAugment directly enumerates K=14 common augmentation operations and randomly selects N of them for training, with N tuned according to the dataset and model size.
KeepAugment first uses a saliency map to detect the important regions of the original image, then preserves these informative regions during augmentation.
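The RandAugment selection step can be sketched as: sample N operations uniformly from a pool of K and apply them in sequence at a shared magnitude M. The three toy operations below act on a scalar stand-in for an image; they are placeholders for illustration, not the paper's actual list of 14 image transforms.

```python
import random

# Placeholder operations; real RandAugment uses 14 image ops
# (rotate, shear, color, posterize, ...) sharing one magnitude M.
OPS = {
    "identity": lambda x, m: x,
    "add": lambda x, m: x + m,
    "scale": lambda x, m: x * (1 + m / 10),
}

def rand_augment(x, n=2, m=1, rng=random):
    """Apply N operations chosen uniformly (with replacement) at magnitude M."""
    for _ in range(n):
        name = rng.choice(list(OPS))
        x = OPS[name](x, m)
    return x
```

The appeal is the tiny search space: only the two integers N and M need tuning, versus the full policy search of AutoAugment.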
Survey: https://journalofbigdata.springeropen.com/track/pdf/10.1186/s40537-019-0197-0.pdf
Recommended GitHub project: imgaug
Fast AutoAugment augmentation.
https://www.youtube.com/watch?v=F4lYLvL66gA&list=PL2Yggtk_pK69nyeIgJsjPN0traCIyMJ_f&index=19