Lê Phong Phú
  • About Me!
  • AI Expert Roadmap
    • PyTorch
      • PyTorch Fundamentals
        • 1. Introduction to PyTorch
        • 2. Introduction to Computer Vision with PyTorch
        • 3. Introduction to Natural Language Processing with PyTorch
        • 4. Introduction to Audio Classification with PyTorch
      • Intermediate DL with PyTorch
        • 1_TrainingRobustNN
        • 2_Image&CNN
        • 3_Sequences&RNN
        • 4_Multi-Input&Multi-Output
    • Machine Learning
      • 01_ML_General
      • 02_ML_Supervised Learning
      • 03_ML_Unsupervised Learning
    • Mamba
      • 00_Sequence Modelling, S4 and Mamba
    • Transformers (CV&NLP)
      • NLNet
      • 01_Pure Transformer
        • ViT
        • Segformer
      • 02_Hybrid Transformer
        • DETR
        • Deformable DETR
        • DINO (Detection)
      • 99_Unfilter
        • LG-Transformer
        • Image GPT
        • Points as Queries
        • VST
        • MAXViT
        • ViTMAE-Detect
        • MAGNETO
        • AIT
        • MTV
        • PiT
        • Swin
        • PVTv2
        • PVT
        • FAVOR+
        • T2T-ViT
        • CaiT
        • CCT
        • DeiT
        • SSA
        • SA3D
    • [NLP] Natural Language Processing
      • 01_[LLMs] Large Language Models
      • [MoEs] Mixture of Experts
      • LLM Techniques
      • Attention is All You Need
      • Positional Encoding
      • Tokenization
      • MICLe
    • [CV] Computer Vision
      • MLP-based Classification
        • MLP-Mixer
        • FNet
        • EANet
      • 01_[SL] Supervised Learning
        • 01_Classification
          • Convolution Variants
          • 1x1 Convolution
          • EfficientNetV2
          • ConvNeXtV2
        • 02_Detection
          • ConvMixer
          • SOLO
          • YOLOX
          • YOLOR
          • AugFPN
          • BoT_Cls
          • BoF_OD
          • YOLOv3
          • YOLOv4
          • YOLOv5
          • YOLOv6
          • YOLOv7
          • YOLOv8
          • YOLOv9
          • YOLO-NAS
          • TPH-YOLOv5
          • TPH-YOLOv5++
          • ViTDET
        • 03_Segmentation
          • Object Instance Survey 2022
          • 01_Instance Segmentation
          • 02_Semantic Segmentation
          • 03_Panoptic Segmentation
          • 04_3D Segmentation
          • 05_Unsupervised Segmentation
          • BMask RCNN
          • ISTR
          • Transfuse
        • 04_[IS] Interactive Segmentation
          • Interactive Segmentation Techniques
          • 02_3D Interactive Segmentation
          • 03_Video Object Segmentation
          • SAM
          • HA_SAM
          • CFR-ICL
          • MST
          • ECONet
          • SimpleClick
          • FocusCut
          • f-BRS
          • iSegformer
        • 05_Object Tracking
          • 00_ObjectTracking
          • Sort
          • DeepSort
          • FairMOT
          • ByteTrack
          • StrongSORT
          • Tracktor
          • JDE
          • CenterTrack
          • PermaTrack
          • TransTrack
          • TrackFormer
          • BoT-SORT
        • 06_Face Recognition
        • 07_Image Stitching
        • 08_Image Restoration
        • 06_Refinement
          • BPR
        • 10_Scene Understanding
          • CPNet
        • 11_Human Pose Estimation
          • 3D Human Pose
          • Human Pose
        • 12_[SR] Super Resolution
          • Bicubic++
        • 13_VideoPropagation
        • 14_Image Matting
        • 15_Knowledge Distillation
        • 16_Others
      • 02_[UL] Unsupervised Learning
        • 00_Unsupervised Learning
        • 02_Deep Clustering
          • 00_K_Clusters Decision
          • Deep Cluster
          • Cluster Fit
          • DEC
          • Improving Relational Regularized Autoencoders with Spherical Sliced Fused G
          • Taxonomy
          • DeepDPM
          • BCL
          • VaDE
          • t-SNE
          • Tree-SNE
        • 04_Diffusion Models
      • 03_[SSL] Self-Supervised Learning
        • 00_Self-Supervised Learning
        • 01_Contrastive Learning
          • CPC
          • DIM
          • CMC
          • AMDIM
          • SimCLR
          • MoCo
          • MoCov2
          • YADIM
          • VICReg
          • CSL
          • Towards Domain-Agnostic Contrastive Learning
          • Non-Parametric Instance Discrimination
          • Video Contrastive Learning with Global Context
          • SupCon
          • Barlow Twins
        • 02_Predictive Tasks
        • 03_Bootstrapping
          • BYOL
        • 04_Regularization
        • 05_Masked Image Models
          • Patch Localization
          • MAE
          • SimMIM
          • DINO
        • 06_Pretext Tasks
          • PIRL
        • 07_Clustering-based
          • SwAV
      • 04_Semi-Supervised Learning
        • Fully-/Semi-/Weakly-/ Learning
        • 01_Self-training
          • Pseudo-label
          • Noisy Student
        • 02_Consistency Regularization
          • Temporal Ensembling
          • Mean Teacher
          • VAT
          • UDA
        • 03_Hybrid Methods
          • MixUp
          • MixMatch
          • ReMixMatch
          • FixMatch
          • FixMatch (unmerge)
      • 05_Multi-learning Paradigm
        • 00_Multi-learning
        • 01_Multitask
        • Gradient Surgery
        • EtE Multi-task Learning with Attention
        • MTL for Dense Predictions
        • MTL using Uncertainty
        • Which Task learned together
        • GradNorm
        • OM-Net
        • 06_Multi-task Learning
      • 06_Generative Models
        • 00_Generative Models
        • 01_Autoencoders
          • AE vs Others
          • Sparse AE
          • Denoising AE
          • Contractive AE
          • Variational AE
          • DELG
        • 02_GAN
      • Graph Convolutional Networks
        • 00_Graph Convolutional Networks
      • Neural Radiance Fields (NeRFs)
      • Deep Belief Networks
    • Multimodal Models
    • Bag of Freebies - BOF
      • 01_Augmentation
        • Mosaic
        • Cut Out
        • Mix Up
      • 02_Loss Functions
        • 01_Classification Loss
        • 02_Segmentation Loss
        • 03_Object Detection Loss
        • 04_Self-Supervised Loss
        • 05_Interactive Segmentation Loss
      • 03_Optimizer
      • 04_Normalization
        • 00_Normalization
      • 05_Regularization
      • 06_Label Assignment
        • 00_Label Assignment
        • OTA
        • SimOTA
      • 07_Auxiliary Head
    • Bag of Specials - BoS
      • Feature Pyramid
        • RCNet
      • Receptive Field
      • Attention
        • 00_Attention Modules
        • SENet
        • CBAM
        • DANet
        • SDANet
        • AttaNet
        • HaloNets
        • GCNet
        • DeepSquare
        • LBAM
        • External-Attention
        • PCT
        • Residual Attention
        • DCANet
        • GANet
        • Triplet Attention
        • Lambda Networks
        • ACTION
        • VAN
        • SegNeXt
      • Local-/Global- Features
        • Unifying Nonlocal Blocks for Neural Networks
        • Local Features
        • Global Features
      • Activation Functions
        • SiLU dSiLU
      • Post-Processing
        • Soft-NMS
        • NMW
        • WBF
      • Sliding Window
      • Graph Networks
      • Feature Fusion/Integration
      • Data-Centric
    • Others
      • Selected Top-Conference Papers
        • AAAI2021_Papers
        • CVPR2021_Papers
        • ECCV2020_Papers
        • ICCV2021_Papers
        • ICML2022_Papers
      • Cheat Sheets
        • Pandas
      • Conference Schedule
  • Data Science
    • 03_DS_Discrete Distribution
    • Data Scientist Professional
      • 3. Statistical Experimentation Theory
      • 4. Statistical Experimentation in Python
      • 5. Model development in Python
      • 7. Data Management in SQL
    • Data...
    • ETL
    • Airflow
  • Cloud Computing
    • Azure Data Fundamental
    • Amazon Web Services
      • AWS - Cloud 101
      • AWS - Machine Learning Foundation (Lab)
        • 1. Introduction to MLF
        • 2. AI and ML
        • 3. ML Pipeline
        • 4. ML Tools and Services
        • 5. Wrapping it Up
      • AWS - Cloud Practitioner Essentials
      • AWS - GenAI
    • Google Cloud
    • IBM Watson
  • Big Data
    • PySpark
      • Introduction to PySpark
        • 1. Getting to know PySpark
        • 2. Manipulating Data
        • 3. Getting Started with ML Pipelines
        • 4. Model Tuning and Selection
      • Big Data Fundamentals with PySpark
        • 1. Introduction to BigData Analysis with Spark
        • 2. Programming in PySpark RDD’s
        • 3. PySpark SQL & DataFrames
        • 4. Machine Learning with PySpark MLlib
  • English
    • Reading
    • Listening
    • Speaking
      • Speaking_Part1
        • 1_Speaking Part 1
        • 2_Speaking Part 1
        • 3_Speaking Part 1
        • 4_Speaking Part 1
        • 5_Speaking Part 1
        • 6_Speaking Part 1
        • 7_Speaking Part 1
        • 8_Speaking Part 1
        • 9_Speaking Part 1
        • 10_Speaking Part 1
        • 11_Speaking Part 1
        • 12_Speaking Part 1
        • 13_Speaking Part 1
        • 14_Speaking Part 1
        • 15_Speaking Part 1
        • 16_Speaking Part 1
        • 17_Speaking Part 1
        • 18_Speaking Part 1
        • 19_Speaking Part 1
        • 20_Speaking Part 1
        • 21_Speaking Part 1
        • 22_Speaking Part 1
        • 23_Speaking Part 1
      • Speaking_Part2
        • 1_Speaking Part 2
        • 2_Speaking Part 2
        • 3_Speaking Part 2
        • 4_Speaking Part 2
        • 5_Speaking Part 2
        • 6_Speaking Part 2
        • 7_Speaking Part 2
        • 8_Speaking Part 2
        • 9_Speaking Part 2
        • 10_Speaking Part 2
        • 11_Speaking Part 2
        • 12_Speaking Part 2
        • 13_Speaking Part 2
        • 14_Speaking Part 2
        • 15_Speaking Part 2
        • 16_Speaking Part 2
        • 17_Speaking Part 2
        • 18_Speaking Part 2
        • 19_Speaking Part 2
        • 20_Speaking Part 2
        • People
        • Places
          • Visited House
        • Events
        • Activities
          • Interesting Job
        • Things
      • Speaking_Part3
        • Advertisements
        • Outdoor Activities
        • Navigation and Exploration
        • Fast Food
        • Air Pollution
        • Free Time
        • Interesting Movie
        • Gifts
        • Independence in Children
        • Noisy
        • Complain
        • T-shirts
        • Value of Money
        • Restaurant
        • Global
        • Relaxation
        • Special Places
      • Mixed-Test
        • 01_Mix_Language
    • Writing
      • Writing_Task1
        • Paraphrase
        • Overview Sentence
        • Grammar
        • Charts
          • Line - Average Monthly Temperatures
          • Line - Fuels
          • Line - Birth Rate
          • Line - River Water
          • Line - U.S Energy
          • Line - Areas of Crime
          • Line - Renewable Energy
          • Line - Overseas Visitors
          • Chart - People ate in the UK
          • Chart - Music Event Attendance
          • Chart - Wind Energy
          • Chart - Children Attend Sports in Australia
          • Chart - Weekly Hours in Australia
          • Chart - Films released vs Tickets sold
          • Chart - Average Retirement Age
        • Process
        • Maps
          • Library Ground
        • Table
        • Multiple Graphs
          • Life Expectancy
      • Writing_Task2
        • Opinion Essay
          • Higher Salary
          • Goal of Schools
          • Local History
          • Retirement Age
          • Happy Society
          • Food Necessary
          • Pay for more Art
          • Eradicate Poverty
          • Team Activities
          • Wild Animals and Birds
        • Discussion Essay
          • Sports
          • Make Money
          • Crime punished
          • Equipment for Student
          • Keep a Gun
        • Advantages and Disadvantages Essay
          • Live Away
          • Transform to Farms
        • Problem-Solution Essay
          • Extreme Sports
          • Spend Time Away From Families
      • Complex Sentence
      • If, Wish, Hope
    • Synonym Common Mistakes
    • Phrasal Verbs
    • TOEIC 990
  • Interview
    • Deep Learning Questions
      • C1_Mathematical Foundation
      • C2_Fundamentals of ML
      • C3_Fundamentals of DL
      • C4_Classic Network
      • C5_CNN
      • C6_RNN
      • C7_Target Detection
      • C8_Image Segmentation
      • C9_Reinforcement Learning
      • C10_Migration Learning
      • C13_Optimization Algorithm
      • C14_Super Parameter Adjustment
      • C15_Heterogeneous Computing
    • Data Science Questions
  • Courses (Uni and Mooc)
    • AI Open Courses
    • DS Certificates
    • IBM Gen AI Engineering Professional Certificate
      • 10. Generative AI and LLMs: Architecture and Data Preparation
      • 11. Gen AI Foundational Models for NLP & Language Understanding
      • 12. Gen AI Language Modeling with Transformers
        • Module 1 - Fundamental Concepts of Transformer Architecture
        • Module 2 - Advanced Concepts of Transformer Architecture
      • 13. Generative AI Engineering and Fine-Tuning Transformers
      • 14. Generative AI Advanced Fine-Tuning for LLMs
      • 15. Fundamentals of AI Agents using RAG and Langchain
        • Module 1 - RAG Framework
        • Module 2 - Prompt Engineering and LangChain
      • 16. Project: Generative AI Applications with RAG and LangChain
    • Data Science Foundations: Data Structures and Algorithms Specialization
    • Flask - AI Applications
      • 1. Packaging Concepts
      • 2. Web App Deployment
      • 3. Creating AI Application
        • Sentiment Analysis
        • Emotion Detector
      • Deploy Deep Learning Models using Flask
    • Docker, Kubernetes & OpenShift
      • 1. Containers and Containerization
      • 2. Kubernetes Basics
      • 3. Managing Applications with Kubernetes
      • 4. The Kubernetes Ecosystem
      • 5. Final Assignments
    • Data Structures
      • 1. Introduction to DS&A
    • Algorithms
      • QE - Algorithms
      • Sorting Algorithms
        • Binary Search
        • Insertion Sort
        • Merge Sort
        • Quick sort
        • Heap sort
      • Divide and Conquer
      • Greedy Algorithm
      • Dynamic Programming
    • Operating System
      • QE - Operating System
      • 00_Operating System
    • CS231n Deep Learning for Computer Vision
      • 13. Self-Supervised Learning
    • CS480 Introduction to Machine Learning
      • 19. Attention and Transformer Networks
    • CS330 Multi-task and Meta Learning
      • 1. What is Multi-task Learning
    • Processing the Environment
      • Attention
    • Open VINO
    • Metaverse
      • 00_Metaverse
      • Spark AR
  • Research Projects
    • PPE Detection
      • Few-shot Data Sampling
    • Multiple Object Tracking
      • In-place Augmentation
    • Deep Clustering
      • Metrics
    • Defect Detection
      • 01_Defect_Improvement
      • Dataset: MVTec
      • Mixed supervision for surface-defect detection:
      • Practical Defect Detection
      • (Survey) Fabric Defect Detection
      • (Summary) Fabric Defect Detection
    • Medical Images
      • 01_Lung_Improvement
      • SANet
      • AnaXNet
      • 3D_EtoE Lung Cancer Screening
      • Semantics-enriched Representation
      • Attend And Compare
      • Recent Works
      • Kaggle_Medical Images
  • AI Engineer
  • Financial Investment
    • 01_TPTrading
    • 02_BCTC
    • 03_Demand Side Platform (DSP)
    • 04_Business Models
    • Trading
      • 01_Technical Analysis
      • 02_Mentality
      • 03_Support and Resistance
  • Books
    • AI Books
    • Books
      • Persuasion IQ
      • Communication Skills
      • 48 Hours a Day
      • Maslow's Pyramid
      • MBTI
      • Tư Duy Ngược
    • Audio Books
  • Project Management
    • PM Methods
      • Agile
      • Scrum
      • Kanban
    • Foundations of PM
      • Module 1
      • Module 2
      • Module 3
      • Module 4
    • Project Initiation: Starting a Successful Project
      • Module 1
      • Module 2
      • Module 3
      • Module 4
    • Project Planning: Putting It All Together
      • Module 1
    • Project Execution: Running the Project
    • Agile Project Management
    • Capstone: Applying Project Management in the Real World
  • Administrator
          • PVTv2
          • PVT
          • FAVOR+
          • T2T-ViT
          • CaiT
          • CCT
          • DeiT
          • SSA
          • SA3D
      • [NLP] Natural Language Processing
        • 01_[LLMs] Large Language Models
        • [MoEs] Mixture of Experts
        • LLM Techniques
        • Attention is All You Need
        • Positional Encoding
        • Tokenization
        • MICLe
      • [CV] Computer Vision
        • MLP-based Classification
          • MLP-Mixer
          • FNet
          • EANet
        • 01_[SL] Supervised Learning
          • 01_Classification
            • Convolution Variants
            • 1x1 Convolution
            • EfficientNetV2
            • ConvNeXtV2
          • 02_Detection
            • ConvMixer
            • SOLO
            • YOLOX
            • YOLOR
            • AugFPN
            • BoT_Cls
            • BoF_OD
            • YOLOv3
            • YOLOv4
            • YOLOv5
            • YOLOv6
            • YOLOv7
            • YOLOv8
            • YOLOv9
            • YOLO-NAS
            • TPH-YOLOv5
            • TPH-YOLOv5++
            • ViTDET
          • 03_Segmentation
            • Object Instance Survey 2022
            • 01_Instance Segmentation
            • 02_Semantic Segmentation
            • 03_Panoptic Segmentation
            • 04_3D Segmentation
            • 05_Unsupervised Segmentation
            • BMask RCNN
            • ISTR
            • Transfuse
          • 04_[IS] Interactive Segmentation
            • Interactive Segmentation Techniques
            • 02_3D Interactive Segmentation
            • 03_Video Object Segmentation
            • SAM
            • HA_SAM
            • CFR-ICL
            • MST
            • ECONet
            • SimpleClick
            • FocusCut
            • f-BRS
            • iSegformer
          • 05_Object Tracking
            • 00_ObjectTracking
            • Sort
            • DeepSort
            • FairMOT
            • ByteTrack
            • StrongSORT
            • Tracktor
            • JDE
            • CenterTrack
            • PermaTrack
            • TransTrack
            • TrackFormer
            • BoT-SORT
          • 06_Face Recognition
          • 07_Image Stitching
          • 08_Image Restoration
          • 06_Refinement
            • BPR
          • 10_Scene Understanding
            • CPNet
          • 11_Human Pose Estimation
            • 3D Human Pose
            • Human Pose
          • 12_[SR] Super Resolution
            • Bicubic++
          • 13_VideoPropagation
          • 14_Image Mating
          • 15_Knowledge Distillation
          • 16_Others
        • 02_[UL] Unsupervised Learning
          • 00_Unsupervised Learning
          • 02_Deep Clustering
            • 00_K_Clusters Decision
            • Deep Cluster
            • Cluster Fit
            • DEC
            • Improving Relational Regularized Autoencoders with Spherical Sliced Fused G
            • Taxanomy
            • DeepDPM
            • BCL
            • VaDE
            • t-SNE
            • Tree-SNE
          • 04_Diffusional Models
        • 03_[SSL] Self-Supervised Learning
          • 00_Self-Supervised Learning
          • 01_Contrastive Learning
            • CPC
            • DIM
            • CMC
            • AMDIM
            • SimCLR
            • MoCo
            • MoCov2
            • YADIM
            • VICReg
            • CSL
            • Towards Domain-Agnostic Contrastive Learning
            • Non-Parametric Instance Discrimination
            • Video Contrastive Learning with Global Context
            • SupCon
            • Barlow Twin
          • 02_Predictive Tasks
          • 03_Bootstrapping
            • BYOL
          • 04_Regularization
          • 05_Masked Image Models
            • Patch Localization
            • MAE
            • SimMIM
            • DINO
          • 06_Pretext Tasks
            • PIRL
          • 07_Clustering-based
            • SwAV
        • 04_Semi-Supervised Learning
          • Fully-/Semi-/Weakly-/ Learning
          • 01_Self-training
            • Pseudo-label
            • Noisy Student
          • 02_Consistency Regularization
            • Temporal Ensembling
            • Mean Teacher
            • VAT
            • UDA
          • 03_Hybrid Methods
            • MixUp
            • MixMatch
            • ReMixMatch
            • FixMatch
            • FixMatch (unmerge)
        • 05_Multi-learning Paradigm
          • 00_Multi-learning
          • 01_Multitask
          • Gradient Surgery
          • EtE Multi-task Learning with Attention
          • MTL for Dense Predictions
          • MTL using Uncertainty
          • Which Task learned together
          • GradNorm
          • OM-Net
          • 06_Multi-task Learning
        • 06_Generative Models
          • 00_Generative Models
          • 01_Autoencoders
            • AE vs Others
            • Sparse AE
            • Denoising AE
            • Contractive AE
            • Variational AE
            • DELG
          • 02_GAN
        • Graph Convolutional Networks
          • 00_Graph Convolutional Networks
        • Neural Radiance Fields (NeRFs)
        • Deep Belief Networks
      • Multimodal Models
      • Bag of Freebies - BOF
        • 01_Augmentation
          • Mosaic
          • Cut Out
          • Mix Up
        • 02_Loss Functions
          • 01_Classification Loss
          • 02_Segmentation Loss
          • 03_Object Detection Loss
          • 04_Self-Supervised Loss
          • 05_Interactive Segmentation Loss
        • 03_Optimizer
        • 04_Normalization
          • 00_Normalization
        • 05_Regularization
        • 06_Label Assignment
          • 00_Label Assignment
          • OTA
          • SimOTA
        • 07_Auxiliary Head
      • Bag of Specials - BoS
        • Feature Pyramid
          • RCNet
        • Receptive Field
        • Attention
          • 00_Attention Modules
          • SENet
          • CBAM
          • DANet
          • SDANet
          • AttaNet
          • HaloNets
          • GCNet
          • DeepSquare
          • LBAM
          • External-Attention
          • PCT
          • Residual Attention
          • DCANet
          • GANet
          • Triplet Attention
          • Lambda Networks
          • ACTION
          • VAN
          • SegNeXt
        • Local-/Global- Features
          • Unifying Nonlocal Blocks for Neural Networks
          • Local Features
          • Global Features
        • Activation Functions
          • SiLU dSiLU
        • Post-Processing
          • Soft-NMS
          • NMW
          • WBF
        • Sliding Window
        • Graph Networks
        • Feature Fusion/Integration
        • Data-Centric
      • Others
        • Selected Top-Conference Papers
          • AAAI2021_Papers
          • CVPR2021_Papers
          • ECCV2020_Papers
          • ICCV2021_Papers
          • ICLM2022_Papers
        • Cheat Sheets
          • Pandas
        • Conference Schedule
    • Data Science
      • 03_DS_Discrete Distribution
      • Data Scientist Professional
        • 3. Statistical Experimentation Theory
        • 4. Statistical Experimentation in Python
        • 5. Model development in Python
        • 7. Data Management in SQL
      • Data...
      • ETL
      • Airflow
    • Cloud Computing
      • Azure Data Fundamental
      • Amazon Web Services
        • AWS - Cloud 101
        • AWS - Machine Learning Foundation (Lab)
          • 1. Introduction to MLF
          • 2. AI and ML
          • 3. ML Pipeline
          • 4. ML Tools and Services
          • 5. Wrapping it Up
        • AWS - Cloud Practitioner Essentials
        • AWS - GenAI
      • Google Cloud
      • IBM Watson
    • Big Data
      • PySpark
        • Introduction to PySpark
          • 1. Getting to know PySpark
          • 2. Manipulating Data
          • 3. Getting Started with ML Pipelines
          • 4. Model Tuning and Selection
        • Big Data Fundamentals with PySpark
          • 1. Introduction to BigData Analysis with Spark
          • 2. Programming in PySpark RDD’s
          • 3. PySpark SQL & DataFrames
          • 4. Machine Learning with PySpark MLlib
    • English
      • Reading
      • Listening
      • Speaking
        • Speaking_Part1
          • 1_Speaking Part 1
          • 2_Speaking Part 1
          • 3_Speaking Part 1
          • 4_Speaking Part 1
          • 5_Speaking Part 1
          • 6_Speaking Part 1
          • 7_Speaking Part 1
          • 8_Speaking Part 1
          • 9_Speaking Part 1
          • 10_Speaking Part 1
          • 11_Speaking Part 1
          • 12_Speaking Part 1
          • 13_Speaking Part 1
          • 14_Speaking Part 1
          • 15_Speaking Part 1
          • 16_Speaking Part 1
          • 17_Speaking Part 1
          • 18_Speaking Part 1
          • 19_Speaking Part 1
          • 20_Speaking Part 1
          • 21_Speaking Part 1
          • 22_Speaking Part 1
          • 23_Speaking Part 1
        • Speaking_Part2
          • 1_Speaking Part 2
          • 2_Speaking Part 2
          • 3_Speaking Part 2
          • 4_Speaking Part 2
          • 5_Speaking Part 2
          • 6_Speaking Part 2
          • 7_Speaking Part 2
          • 8_Speaking Part 2
          • 9_Speaking Part 2
          • 10_Speaking Part 2
          • 11_Speaking Part 2
          • 12_Speaking Part 2
          • 13_Speaking Part 2
          • 14_Speaking Part 2
          • 15_Speaking Part 2
          • 16_Speaking Part 2
          • 17_Speaking Part 2
          • 18_Speaking Part 2
          • 19_Speaking Part 2
          • 20_Speaking Part 2
          • People
          • Places
            • Visited House
          • Events
          • Activities
            • Interesting Job
          • Things
        • Speaking_Part3
          • Advertisements
          • Outdoor Activities
          • Navigation and Exploration
          • Fast Food
          • Air Pollution
          • Free Time
          • Interesting Movie
          • Gifts
          • Independence in Children
          • Noisy
          • Complain
          • T-shirts
          • Value of Money
          • Restaurant
          • Global
          • Relaxation
          • Special Places
        • Mixed-Test
          • 01_Mix_Language
      • Writing
        • Writing_Task1
          • Paraphrase
          • Overview Sentence
          • Grammar
          • Charts
            • Line - Average Montly Temperatures
            • Line - Fuels
            • Line - Birth Rate
            • Line - River Water
            • Line - U.S Energy
            • Line - Areas of Crime
            • Line - Renewable Energy
            • Line - Oversea Visitors
            • Chart - People ate in the UK
            • Chart - Music Event Attendance
            • Chart - Wind Energy
            • Chart - Children Attend Sports in Australia
            • Chart - Weekly Hours in Australia
            • Chart - Films released vs Tickets sold
            • Chart - Average Retirement Age
          • Process
          • Maps
            • Library Ground
          • Table
          • Multiple Graphs
            • Life Expectancy
        • Writing_Task2
          • Opinion Essay
            • Higher Salary
            • Goal of Schools
            • Local History
            • Retirement Age
            • Happy Society
            • Food Necessary
            • Pay for more Art
            • Eradicate Poverty
            • Team Activities
            • Wild Animals and Birds
          • Discussion Essay
            • Sports
            • Make Money
            • Crime punished
            • Equipment for Student
            • Keep a Gun
          • Advantages and Disadvantages Essay
            • Live Away
            • Transform to Farms
          • Problem-Solution Essay
            • Extreme Sports
            • Spend Time Away From Families
        • Complex Sentence
        • If, Wish, Hope
      • Synonym Common Mistakes
      • Phrasal Verbs
      • TOEIC 990
    • Interview
      • Deep Learning Questions
        • C1_Mathematical Foundation
        • C2_Fundamentals of ML
        • C3_Fundamentals of DL
        • C4_Classic Network
        • C5_CNN
        • C6_RNN
        • C7_Target Detection
        • C8_Image Segmentation
        • C9_Reinforcement Learning
        • C10_Migration Learning
        • C13_Optimization Algorithm
        • C14_Super Parameter Adjustment
        • C15_Hetorogeneous Computing
      • Data Science Questions
    • Courses (Uni and Mooc)
      • AI Open Courses
      • DS Certificates
      • IBM Gen AI Engineering Professional Certificate
        • 10. Generative AI and LLMs: Architecture and Data Preparation
        • 11. Gen AI Foundational Models for NLP & Language Understanding
        • 12. Gen AI Language Modeling with Transformers
          • Module 1 - Fundamental Concepts of Transformer Architecture
          • Module 2 - Advanced Concepts of Transformer Architecture
        • 13. Generative AI Engineering and Fine-Tuning Transformers
        • 14. Generative AI Advanced Fine-Tuning for LLMs
        • 15. Fundamentals of AI Agents using RAG and Langchain
          • Module 1 - RAG Framework
          • Module 2 - Prompt Engineering and LangChain
        • 16. Project: Generative AI Applications with RAG and LangChain
      • Data Science Foundations: Data Structures and Algorithms Specialization
      • Flask - AI Applications
        • 1. Packaging Concepts
        • 2. Web App Deployment
        • 3. Creating AI Application
          • Sentiment Analysis
          • Emotion Detector
        • Deploy Deep Learning Models using Flask
      • Docker, Kubernetes & OpenShift
        • 1. Containers and Containerization
        • 2. Kubernetes Basics
        • 3. Managing Applications with Kubernetes
        • 4. The Kubernetes Ecosystem
        • 5. Final Assignments
      • Data Structures
        • 1. Introduction to DS&A
      • Algorithms
        • QE - Algorithms
        • Sorting Algorithms
          • Binary Search
          • Insertion Sort
          • Merge Sort
          • Quick sort
          • Heap sort
        • Divide and Conquer
        • Greedy Algorithm
        • Dynamic Programming
      • Operating System
        • QE - Operating System
        • 00_Operating System
      • CS231n Deep Learning for Computer Vision
        • 13. Self-Supervised Learning
      • CS480 Introduction to Machine Learning
        • 19. Attention and Transformer Networks
      • CS330 Multi-task and Meta Learning
        • 1. What is Multi-task Learning
      • Processing the Environment
        • Attention
      • Open VINO
      • Metaverse
        • 00_Metaverse
        • Spark AR
    • Research Projects
      • PPE Detection
        • Few-shot Data Sampling
      • Multiple Object Tracking
        • In-place Augmentation
      • Deep Clustering
        • Metrics
      • Defect Detection
        • 01_Defect_Improvement
        • Dataset: MVTec
        • Mixed supervision for surface-defect detection:
        • Practical Defect Detection
        • (Survey) Fabric Defect Detection
        • (Summary) Fabric Defect Detection
      • Medical Images
        • 01_Lung_Improvement
        • SANet
        • AnaXNet
        • 3D_EtoE Lung Cancer Screening
        • Semantics-enriched Representation
        • Attend And Compare
        • Recent Works
        • Kaggle_Medical Images
    • AI Engineer
    • Financial Invesment
      • 01_TPTrading
      • 02_BCTC
      • 03_Demand Side Platform (DSP)
      • 04_Business Models
      • Trading
        • 01_Technical Analysis
        • 02_Mentality
        • 03_Support and Resistance
    • Books
      • AI Books
      • Books
        • Persuasion IQ
        • Communication Skills
        • 48 Hours a Day
        • Maslow's Pyramid
        • MBTI
        • Tư Duy Ngược
      • Audio Books
    • Project Management
      • PM Methods
        • Agile
        • Scrum
        • Kanban
      • Foundations of PM
        • Module 1
        • Module 2
        • Module 3
        • Module 4
      • Project Initiation: Starting a Successul Projet
        • Module 1
        • Module 2
        • Module 3
        • Module 4
      • Project Planning: Putting It All Together
        • Module 1
      • Project Execution: Running the Project
      • Agile Project Management
      • Capstone: Applying Project Management in the Real World
    • Administrator

Non-local Network

Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He


Paper: https://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Non-Local_Neural_Networks_CVPR_2018_paper.html 

Code: https://github.com/facebookresearch/video-nonlocal-net?utm_source=catalyzex.com

Motivation, Objectives and Related Works

Motivation

  • Both convolutional and recurrent operations are building blocks that process one local neighbourhood at a time. 

Objectives

  • Present non-local operations as a generic family of building blocks for capturing long-range dependencies.

  • Non-local operation computes the response at a position as a weighted sum of the features at all positions. 

  • This building block can be plugged into many computer vision architectures. 

Related Works

Non-local Image Processing

    1. Non-local means [4] 

      • A classical filtering algorithm that computes a weighted mean of all pixels in an image. 

      • It allows distant pixels to contribute to the filtered response at a location based on patch appearance similarity. 

    2. BM3D (block-matching 3D) [10]

      • Implements the non-local filtering idea.

      • Performs filtering on a group of similar, but non-local, patches. 

Graphical Models

    1. Conditional random fields (CRF) [29, 28] ==> a graphical model that captures long-range dependencies. 

      • A CRF can be exploited to post-process semantic segmentation predictions of a network [9]. 

      • The iterative mean-field inference of CRF can be turned into a recurrent network and trained [56, 42, 8, 18, 34]. 

Feedforward Modelling for Sequences

    1. Using feedforward (i.e., non-recurrent) networks for modelling sequences in speech and language [36, 54, 15]. 

      • Long-term dependencies are captured by the large receptive fields contributed by very deep 1-D convolutions. 

      • These feedforward models are amenable to parallelised implementations and can be more efficient than widely used recurrent models. 

Self-attention

    1. A self-attention module [49] computes the response at a position in a sequence (e.g., a sentence) by attending to all positions and taking their weighted average in an embedding space. 

Interaction Networks

    1. Interaction Networks (IN) [2, 52] model physical systems, operating on graphs of objects involved in pairwise interactions. 

      • Hoshen [24] presented the more efficient Vertex Attention IN (VAIN) in the context of multi-agent predictive modelling. 

      • Relation Networks [40] compute a function on the feature embeddings at all pairs of positions in their input. 

Video Classification Architectures

    1. A natural solution to video classification is to combine the success of CNNs for images and RNNs for sequences [55, 11]. 

    2. Feedforward models are achieved by 3D convolutions (C3D) [26, 48] in spacetime, and the 3D filters can be formed by “inflating” [13, 7] pre-trained 2D filters.

    3. Optical flow [45] and trajectories [50, 51] can be helpful. 

Non-local Means [4 - Phu Read]

  1. Definition:

    • The NLM algorithm aims to reduce noise in images while preserving image details and textures. 

    • Unlike traditional "local" filters (e.g., Gaussian, median), which average pixels within a small neighborhood around the target pixel, NLM considers similarities between patches of pixels throughout the entire image.

  2. How It Works:

    • Patch Comparison: For a target pixel, the NLM algorithm defines a small patch around that pixel. It then searches the entire image for other patches that are similar to the target patch.

    • Weighted Averaging: The value of the target pixel is replaced with a weighted average of pixels from similar patches. The weights are determined by how similar the patches are to the target patch. More similar patches get higher weights.

  3. Python:

    • skimage.restoration.denoise_nl_means (scikit-image)

    • cv2.fastNlMeansDenoising (OpenCV)

  4. Formula:

    • Given an image u, the denoised value at a pixel p is calculated as:

$$NL[u](p) = \frac{1}{C(p)} \sum_{q} w(p, q)\, u(q)$$

with

$$C(p) = \sum_{q} w(p, q)$$

    • C(p): normalizing factor.

    • w(p,q): the weighting function of pixels p and q. It can be based on a weighted Euclidean distance; a common choice is $w(p, q) = \exp\left(-|B(q) - B(p)|^2 / h^2\right)$.

    • B(p): the average value of the pixels around pixel p.

    • h: a parameter that controls how quickly the weight decays as the Euclidean distance increases.
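
To make the weighting concrete, here is a minimal NumPy sketch of the formula above. It is a naive version that compares every pixel with every other pixel, so it only suits tiny images. The function names (`local_mean`, `nl_means`), the Gaussian weight on local means, and the default `h` and `radius` values are illustrative assumptions, not the implementation used by scikit-image or OpenCV.

```python
import numpy as np

def local_mean(img, radius=1):
    """B(p): mean of the (2*radius+1)^2 window around each pixel (edge-padded)."""
    padded = np.pad(img, radius, mode="edge")
    rows, cols = img.shape
    size = 2 * radius + 1
    acc = np.zeros(img.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            acc += padded[dy:dy + rows, dx:dx + cols]
    return acc / (size * size)

def nl_means(img, h=0.1, radius=1):
    """Naive non-local means: each pixel becomes a weighted mean of ALL pixels q,
    with w(p, q) = exp(-|B(q) - B(p)|^2 / h^2) (assumed Gaussian weight)."""
    img = np.asarray(img, dtype=np.float64)
    b = local_mean(img, radius).ravel()      # B(p) for every pixel, shape (N,)
    u = img.ravel()                          # pixel values, shape (N,)
    diff = b[:, None] - b[None, :]           # B(p) - B(q), shape (N, N)
    w = np.exp(-(diff ** 2) / (h ** 2))      # w(p, q)
    c = w.sum(axis=1)                        # C(p) = sum_q w(p, q)
    return ((w @ u) / c).reshape(img.shape)

# Toy example: a vertical step edge corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = nl_means(noisy, h=0.2)
```

Pixels on the same side of the edge have similar local means, so they receive weights close to 1 and average each other's noise away, while pixels across the edge receive weights close to 0 and the edge stays sharp.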

Model

Idea

  • Non-local modules enhance deep neural networks by directly computing relationships between distant positions within an image or video, capturing long-range dependencies that convolutional layers might miss.

Steps

Non-local Operation 

    1. In a neural network, a generic non-local operation is defined as:

$$y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j) \qquad (1)$$

    • i: index of an output position (in space, time, or spacetime) whose response is to be computed.

    • j: index that enumerates all possible positions. 

    • x: input signal (image, sequence, video; often their features) 

    • y: output signal of the same size as x. 

    • f: A pairwise function computes a scalar (representing a relationship such as affinity) between i and all j. 

    • g: The unary function computes a representation of the input signal at the position j. 

    • C(x): normalizing factor.
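
The definitions above translate almost line by line into code. The sketch below evaluates y_i = (1 / C(x)) Σ_j f(x_i, x_j) g(x_j) with explicit loops over positions. The function name `non_local_op`, the flattened (N, C) layout, and passing the normalizer as a callable `c` are assumptions made for illustration; a practical implementation would use matrix multiplication instead of loops.

```python
import numpy as np

def non_local_op(x, f, g, c):
    """Generic non-local operation.

    x: (N, C) array, one feature vector per position (space/time flattened).
    f: pairwise function, f(x_i, x_j) -> scalar affinity.
    g: unary function, g(x_j) -> representation of position j.
    c: maps the row of affinities f(x_i, x_:) to the normalizing factor C(x).
    """
    n = x.shape[0]
    gx = np.stack([g(x[j]) for j in range(n)])          # g(x_j) for all j
    y = np.zeros_like(gx)
    for i in range(n):                                  # output position i
        weights = np.array([f(x[i], x[j]) for j in range(n)])
        y[i] = (weights @ gx) / c(weights)              # weighted sum over all j
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))

# Gaussian pairwise function with identity g, normalized by the sum of f.
y_gauss = non_local_op(x, lambda xi, xj: np.exp(xi @ xj), lambda xj: xj, np.sum)

# Constant pairwise function normalized by N: every output is the global mean.
y_mean = non_local_op(x, lambda xi, xj: 1.0, lambda xj: xj, len)
```

Because every output position sums over all j, the response at i can depend on any other position regardless of distance, which is what distinguishes this from a convolution with a fixed local window.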

Instantiations

  • $g(x_j) = W_g x_j$, where $W_g$ is a weight matrix to be learned. 

  • Choices for the pairwise function f:

Gaussian

$$f(x_i, x_j) = e^{x_i^T x_j} \qquad (2)$$

  • Here $x_i^T x_j$ is dot-product similarity. 

  • Euclidean distance as used in [4, 47] is also applicable, but dot product is more implementation-friendly in modern deep learning platforms. 

with

$$C(x) = \sum_{\forall j} f(x_i, x_j)$$

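Embedded Gaussian

$$f(x_i, x_j) = e^{\theta(x_i)^T \phi(x_j)} \qquad (3)$$

  • Here $\theta(x_i) = W_\theta x_i$ (query) and $\phi(x_j) = W_\phi x_j$ (key) are two embeddings. 

with

$$C(x) = \sum_{\forall j} f(x_i, x_j)$$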

Dot Product

$$f(x_i, x_j) = \theta(x_i)^T \phi(x_j) \qquad (4)$$

  • Here $\theta(x_i) = W_\theta x_i$ and $\phi(x_j) = W_\phi x_j$ are two embeddings. 

  • $C(x) = N$, where N is the number of positions in x, rather than the sum of f.

Concatenation

$$f(x_i, x_j) = \mathrm{ReLU}\left(w_f^T [\theta(x_i), \phi(x_j)]\right) \qquad (5)$$

  • Concatenation is used by the pairwise function in Relation Networks [40] for visual reasoning.

  • Here [· , ·] denotes concatenation and $w_f$ is a weight vector that projects the concatenated vector to a scalar. 

  • $C(x) = N$. 

  • In this case, we adopt ReLU [35] in f.
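
The four choices of f can be compared side by side. The NumPy sketch below computes the normalized affinity matrix f(x_i, x_j) / C(x) for each variant on random features. The sizes, the random weights, and the helper names are illustrative assumptions; features are stored as rows, so the embeddings are written as `x @ W` rather than `W x`.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, C_emb = 6, 8, 4                         # positions, channels, embedding size (arbitrary)
x = 0.5 * rng.standard_normal((N, C))
W_theta = 0.1 * rng.standard_normal((C, C_emb))
W_phi = 0.1 * rng.standard_normal((C, C_emb))
w_f = 0.1 * rng.standard_normal(2 * C_emb)

theta, phi = x @ W_theta, x @ W_phi           # theta(x_i) and phi(x_j) for all positions

def gaussian(x):
    f = np.exp(x @ x.T)                       # f = exp(x_i^T x_j)
    return f / f.sum(axis=1, keepdims=True)   # C(x) = sum_j f

def embedded_gaussian(theta, phi):
    f = np.exp(theta @ phi.T)                 # f = exp(theta(x_i)^T phi(x_j))
    return f / f.sum(axis=1, keepdims=True)   # C(x) = sum_j f

def dot_product(theta, phi):
    return (theta @ phi.T) / theta.shape[0]   # C(x) = N

def concatenation(theta, phi, w_f):
    n = theta.shape[0]
    pairs = np.concatenate(
        [np.repeat(theta[:, None, :], n, axis=1),   # theta(x_i) repeated over j
         np.repeat(phi[None, :, :], n, axis=0)],    # phi(x_j) repeated over i
        axis=-1)                                    # (N, N, 2 * C_emb)
    return np.maximum(pairs @ w_f, 0.0) / n         # ReLU, then C(x) = N

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

A_gauss = gaussian(x)
A_embed = embedded_gaussian(theta, phi)
A_dot = dot_product(theta, phi)
A_concat = concatenation(theta, phi, w_f)
```

For the embedded Gaussian, dividing by the row sum is exactly a softmax over j, which is why this variant coincides with the self-attention form of [49]. The dot-product and concatenation variants have no softmax and are only scaled by N.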

Non-local Block

  • Wrap the non-local operation in Eq.(1) into a non-local block:

$$z_i = W_z y_i + x_i \qquad (6)$$

  • where $y_i$ is given in Eq.(1) and “+$x_i$” denotes a residual connection [21]. 

  • The residual connection allows us to insert a new non-local block into any pre-trained model, without breaking its initial behavior (e.g., if Wz is initialized as zero). 

  • The pairwise computation in Eq.(2), (3), or (4) can be simply done by matrix multiplication as shown in Figure 2; the concatenation version in (5) is straightforward. 

  • Lightweight: The pairwise computation of a non-local block is lightweight when it is used in high-level, sub-sampled feature maps (e.g., T = 4, H = W = 14 or 7). The pairwise computation as done by matrix multiplication is comparable to a typical convolutional layer in standard networks.

  • Implementation of Non-local Blocks: 

    • We set the number of channels represented by Wg, Wθ, and Wφ to be half of the number of channels in x, following the bottleneck design of [21]. 

    • The weight matrix Wz in Eq.(6) computes a position-wise embedding on yi, matching the number of channels to that of x. 

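    • A subsampling trick can be used to further reduce computation: φ and g are computed on a subsampled version of x (e.g., by pooling), which reduces the number of pairwise computations. 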
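
Putting the pieces together, the following is a forward-only NumPy sketch of an embedded-Gaussian non-local block on a (T, H, W, C) feature map. It assumes the 1×1×1 convolutions can be written as per-position matrix multiplications on the flattened map and omits biases, BatchNorm, and the subsampling trick. The class name `NonLocalBlock` and the random initialization are illustrative, not taken from the released code.

```python
import numpy as np

class NonLocalBlock:
    """Embedded-Gaussian non-local block (forward pass only, NumPy sketch)."""

    def __init__(self, channels, rng):
        inner = channels // 2                           # bottleneck: half the channels of x
        scale = 1.0 / np.sqrt(channels)
        self.W_theta = scale * rng.standard_normal((channels, inner))
        self.W_phi = scale * rng.standard_normal((channels, inner))
        self.W_g = scale * rng.standard_normal((channels, inner))
        self.W_z = np.zeros((inner, channels))          # zero init -> block starts as identity

    def __call__(self, x):
        shape = x.shape
        flat = x.reshape(-1, shape[-1])                 # (N, C) with N = T * H * W
        theta = flat @ self.W_theta                     # (N, C/2)
        phi = flat @ self.W_phi                         # (N, C/2)
        g = flat @ self.W_g                             # (N, C/2)
        logits = theta @ phi.T                          # (N, N) pairwise affinities
        logits -= logits.max(axis=1, keepdims=True)     # stability shift, softmax unchanged
        attn = np.exp(logits)
        attn /= attn.sum(axis=1, keepdims=True)         # f / C(x), i.e. softmax over j
        y = attn @ g                                    # Eq.(1): weighted sum over all j
        z = y @ self.W_z + flat                         # Eq.(6): z_i = W_z y_i + x_i
        return z.reshape(shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 14, 14, 16))                # T = 4, H = W = 14, C = 16
block = NonLocalBlock(channels=16, rng=rng)
out = block(x)                                          # equals x while W_z is zero
```

With T = 4 and H = W = 14 there are N = 4 × 14 × 14 = 784 positions, so the affinity matrix is 784 × 784. This is the size regime in which the pairwise computation stays comparable to an ordinary convolutional layer.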

Architecture

  • Baseline ResNet-50 C2D model for video: 

Training Strategy

  1. Training. 

    • Pre-trained on ImageNet [39]. 

    • Fine-tune using 32-frame input clips. 

    • These clips are formed by randomly cropping out 64 consecutive frames from the original full-length video and then dropping every other frame. 

    • The spatial size is 224×224 pixels, randomly cropped from a scaled video whose shorter side is randomly sampled in [256, 320] pixels, following [46]. 

    • Train on an 8-GPU machine and each GPU has 8 clips in a mini-batch (so in total with a mini-batch size of 64 clips). 

    • 400k iterations in total, starting with a learning rate of 0.01 and reducing it by a factor of 10 at every 150k iterations (see also Figure 4). 

    • Momentum of 0.9 and a weight decay of 0.0001. 

    • Dropout [22] after the global pooling layer, with a dropout ratio of 0.5. 

    • Fine-tune the models with BatchNorm (BN) [25] enabled when it is applied, which reduces overfitting. 

    • Add a BN layer right after the last 1×1×1 layer that represents Wz. 

  2. Inference. 

    • Spatial: Perform spatially fully convolutional inference on videos whose shorter side is rescaled to 256. 

    • Temporal: Sample 10 clips evenly from a full-length video and compute the softmax scores on them individually. 

    • The final prediction is the averaged softmax scores of all clips.
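
The clip sampling, learning-rate schedule, and test-time averaging described above can be sketched as small helper functions. All function names are hypothetical, videos are assumed to have at least 64 frames, and the exact way the 10 test clips are spaced is an assumption (the notes only say they are sampled evenly).

```python
import numpy as np

def sample_clip_indices(num_frames, rng, span=64, stride=2):
    """Training clip: `span` consecutive frames at a random offset,
    then every other frame dropped (32 frames for span=64, stride=2)."""
    start = rng.integers(0, num_frames - span + 1)      # upper bound is exclusive
    return np.arange(start, start + span, stride)

def learning_rate(iteration, base_lr=0.01, step=150_000, gamma=0.1):
    """Step schedule: divide the learning rate by 10 every 150k iterations."""
    return base_lr * gamma ** (iteration // step)

def evenly_spaced_starts(num_frames, num_clips=10, span=64):
    """Start frames of the test clips, spread evenly over the video (assumed spacing)."""
    return np.linspace(0, num_frames - span, num_clips).astype(int)

def video_prediction(clip_logits):
    """Average the per-clip softmax scores; clip_logits is (num_clips, num_classes)."""
    shifted = clip_logits - clip_logits.max(axis=1, keepdims=True)
    scores = np.exp(shifted)
    scores /= scores.sum(axis=1, keepdims=True)
    return scores.mean(axis=0)

rng = np.random.default_rng(0)
idx = sample_clip_indices(num_frames=300, rng=rng)
starts = evenly_spaced_starts(num_frames=300)
probs = video_prediction(rng.standard_normal((10, 400)))   # 10 clips, 400 classes (dummy logits)
```

Averaging softmax scores rather than raw logits keeps each clip's contribution on the same probability scale before the final class is chosen.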

Experimental Results

Dataset

  • Kinetics and Charades Datasets

Metrics

  • Top-1 accuracy.

  • AP_box, AP_mask.

Experimental Results

Ablations

Key Takeaways

  • Both optical flow and trajectories are off-the-shelf modules that may find long-range, non-local dependency. 

References

  • Buades, Antoni, Bartomeu Coll, and J.-M. Morel. "A non-local algorithm for image denoising." 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Vol. 2. IEEE, 2005.

  • https://viblo.asia/p/paper-explained-non-local-neural-networks-EvbLbx3Z4nk 

  • https://colab.research.google.com/drive/1xB-bkB1DmpttoukIddfOTihFKvILnZth?usp=sharing 


About Me:

  • Phone: +84 946 937 937 (Phu)

  • Email: [email protected]  or  [email protected] 

  • Facebook: https://www.facebook.com/phu210.vn/

  • Page: https://www.lephongphu.works/home-page 

"I have gathered information from various sources on the internet, and it is now available for your perusal.

Thank you so much for coming here!"
