Lê Phong Phú
  • About Me!
  • AI Expert Roadmap
    • PyTorch
      • PyTorch Fundamentals
        • 1. Introduction to PyTorch
        • 2. Introduction to Computer Vision with PyTorch
        • 3. Introduction to Natural Language Processing with PyTorch
        • 4. Introduction to Audio Classification with PyTorch
      • Intermediate DL with PyTorch
        • 1_TrainingRobustNN
        • 2_Image&CNN
        • 3_Sequences&RNN
        • 4_Multi-Input&Multi-Output
    • Machine Learning
      • 01_ML_General
      • 02_ML_Supervised Learning
      • 03_ML_Unsupervised Learning
    • Mamba
      • 00_Sequence Modelling, S4 and Mamba
    • Transformers (CV&NLP)
      • NLNet
      • 01_Pure Transformer
        • ViT
        • Segformer
      • 02_Hybrid Transformer
        • DETR
        • Deformable DETR
        • DINO (Detection)
      • 99_Unfilter
        • LG-Transformer
        • Image GPT
        • Points as Queries
        • VST
        • MaxViT
        • ViTMAE-Detect
        • MAGNETO
        • AIT
        • MTV
        • PiT
        • Swin
        • PVTv2
        • PVT
        • FAVOR+
        • T2T-ViT
        • CaiT
        • CCT
        • DeiT
        • SSA
        • SA3D
    • [NLP] Natural Language Processing
      • 01_[LLMs] Large Language Models
      • [MoEs] Mixture of Experts
      • LLM Techniques
      • Attention is All You Need
      • Positional Encoding
      • Tokenization
      • MICLe
    • [CV] Computer Vision
      • MLP-based Classification
        • MLP-Mixer
        • FNet
        • EANet
      • 01_[SL] Supervised Learning
        • 01_Classification
          • Convolution Variants
          • 1x1 Convolution
          • EfficientNetV2
          • ConvNeXtV2
        • 02_Detection
          • ConvMixer
          • SOLO
          • YOLOX
          • YOLOR
          • AugFPN
          • BoT_Cls
          • BoF_OD
          • YOLOv3
          • YOLOv4
          • YOLOv5
          • YOLOv6
          • YOLOv7
          • YOLOv8
          • YOLOv9
          • YOLO-NAS
          • TPH-YOLOv5
          • TPH-YOLOv5++
          • ViTDet
        • 03_Segmentation
          • Object Instance Survey 2022
          • 01_Instance Segmentation
          • 02_Semantic Segmentation
          • 03_Panoptic Segmentation
          • 04_3D Segmentation
          • 05_Unsupervised Segmentation
          • BMask RCNN
          • ISTR
          • TransFuse
        • 04_[IS] Interactive Segmentation
          • Interactive Segmentation Techniques
          • 02_3D Interactive Segmentation
          • 03_Video Object Segmentation
          • SAM
          • HA_SAM
          • CFR-ICL
          • MST
          • ECONet
          • SimpleClick
          • FocusCut
          • f-BRS
          • iSegformer
        • 05_Object Tracking
          • 00_ObjectTracking
          • SORT
          • DeepSORT
          • FairMOT
          • ByteTrack
          • StrongSORT
          • Tracktor
          • JDE
          • CenterTrack
          • PermaTrack
          • TransTrack
          • TrackFormer
          • BoT-SORT
        • 06_Face Recognition
        • 07_Image Stitching
        • 08_Image Restoration
        • 09_Refinement
          • BPR
        • 10_Scene Understanding
          • CPNet
        • 11_Human Pose Estimation
          • 3D Human Pose
          • Human Pose
        • 12_[SR] Super Resolution
          • Bicubic++
        • 13_VideoPropagation
        • 14_Image Matting
        • 15_Knowledge Distillation
        • 16_Others
      • 02_[UL] Unsupervised Learning
        • 00_Unsupervised Learning
        • 02_Deep Clustering
          • 00_K_Clusters Decision
          • Deep Cluster
          • Cluster Fit
          • DEC
          • Improving Relational Regularized Autoencoders with Spherical Sliced Fused G
          • Taxonomy
          • DeepDPM
          • BCL
          • VaDE
          • t-SNE
          • Tree-SNE
        • 04_Diffusion Models
      • 03_[SSL] Self-Supervised Learning
        • 00_Self-Supervised Learning
        • 01_Contrastive Learning
          • CPC
          • DIM
          • CMC
          • AMDIM
          • SimCLR
          • MoCo
          • MoCov2
          • YADIM
          • VICReg
          • CSL
          • Towards Domain-Agnostic Contrastive Learning
          • Non-Parametric Instance Discrimination
          • Video Contrastive Learning with Global Context
          • SupCon
          • Barlow Twins
        • 02_Predictive Tasks
        • 03_Bootstrapping
          • BYOL
        • 04_Regularization
        • 05_Masked Image Models
          • Patch Localization
          • MAE
          • SimMIM
          • DINO
        • 06_Pretext Tasks
          • PIRL
        • 07_Clustering-based
          • SwAV
      • 04_Semi-Supervised Learning
        • Fully-/Semi-/Weakly-Supervised Learning
        • 01_Self-training
          • Pseudo-label
          • Noisy Student
        • 02_Consistency Regularization
          • Temporal Ensembling
          • Mean Teacher
          • VAT
          • UDA
        • 03_Hybrid Methods
          • MixUp
          • MixMatch
          • ReMixMatch
          • FixMatch
          • FixMatch (unmerge)
      • 05_Multi-learning Paradigm
        • 00_Multi-learning
        • 01_Multitask
        • Gradient Surgery
        • End-to-End Multi-task Learning with Attention
        • MTL for Dense Predictions
        • MTL using Uncertainty
        • Which Tasks Should Be Learned Together
        • GradNorm
        • OM-Net
        • 06_Multi-task Learning
      • 06_Generative Models
        • 00_Generative Models
        • 01_Autoencoders
          • AE vs Others
          • Sparse AE
          • Denoising AE
          • Contractive AE
          • Variational AE
          • DELG
        • 02_GAN
      • Graph Convolutional Networks
        • 00_Graph Convolutional Networks
      • Neural Radiance Fields (NeRFs)
      • Deep Belief Networks
    • Multimodal Models
    • Bag of Freebies - BoF
      • 01_Augmentation
        • Mosaic
        • Cut Out
        • Mix Up
      • 02_Loss Functions
        • 01_Classification Loss
        • 02_Segmentation Loss
        • 03_Object Detection Loss
        • 04_Self-Supervised Loss
        • 05_Interactive Segmentation Loss
      • 03_Optimizer
      • 04_Normalization
        • 00_Normalization
      • 05_Regularization
      • 06_Label Assignment
        • 00_Label Assignment
        • OTA
        • SimOTA
      • 07_Auxiliary Head
    • Bag of Specials - BoS
      • Feature Pyramid
        • RCNet
      • Receptive Field
      • Attention
        • 00_Attention Modules
        • SENet
        • CBAM
        • DANet
        • SDANet
        • AttaNet
        • HaloNets
        • GCNet
        • DeepSquare
        • LBAM
        • External-Attention
        • PCT
        • Residual Attention
        • DCANet
        • GANet
        • Triplet Attention
        • Lambda Networks
        • ACTION
        • VAN
        • SegNeXt
      • Local-/Global- Features
        • Unifying Nonlocal Blocks for Neural Networks
        • Local Features
        • Global Features
      • Activation Functions
        • SiLU dSiLU
      • Post-Processing
        • Soft-NMS
        • NMW
        • WBF
      • Sliding Window
      • Graph Networks
      • Feature Fusion/Integration
      • Data-Centric
    • Others
      • Selected Top-Conference Papers
        • AAAI2021_Papers
        • CVPR2021_Papers
        • ECCV2020_Papers
        • ICCV2021_Papers
        • ICML2022_Papers
      • Cheat Sheets
        • Pandas
      • Conference Schedule
  • Data Science
    • 03_DS_Discrete Distribution
    • Data Scientist Professional
      • 3. Statistical Experimentation Theory
      • 4. Statistical Experimentation in Python
      • 5. Model development in Python
      • 7. Data Management in SQL
    • Data...
    • ETL
    • Airflow
  • Cloud Computing
    • Azure Data Fundamentals
    • Amazon Web Services
      • AWS - Cloud 101
      • AWS - Machine Learning Foundation (Lab)
        • 1. Introduction to MLF
        • 2. AI and ML
        • 3. ML Pipeline
        • 4. ML Tools and Services
        • 5. Wrapping it Up
      • AWS - Cloud Practitioner Essentials
      • AWS - GenAI
    • Google Cloud
    • IBM Watson
  • Big Data
    • PySpark
      • Introduction to PySpark
        • 1. Getting to know PySpark
        • 2. Manipulating Data
        • 3. Getting Started with ML Pipelines
        • 4. Model Tuning and Selection
      • Big Data Fundamentals with PySpark
        • 1. Introduction to BigData Analysis with Spark
        • 2. Programming in PySpark RDDs
        • 3. PySpark SQL & DataFrames
        • 4. Machine Learning with PySpark MLlib
  • English
    • Reading
    • Listening
    • Speaking
      • Speaking_Part1
        • 1_Speaking Part 1
        • 2_Speaking Part 1
        • 3_Speaking Part 1
        • 4_Speaking Part 1
        • 5_Speaking Part 1
        • 6_Speaking Part 1
        • 7_Speaking Part 1
        • 8_Speaking Part 1
        • 9_Speaking Part 1
        • 10_Speaking Part 1
        • 11_Speaking Part 1
        • 12_Speaking Part 1
        • 13_Speaking Part 1
        • 14_Speaking Part 1
        • 15_Speaking Part 1
        • 16_Speaking Part 1
        • 17_Speaking Part 1
        • 18_Speaking Part 1
        • 19_Speaking Part 1
        • 20_Speaking Part 1
        • 21_Speaking Part 1
        • 22_Speaking Part 1
        • 23_Speaking Part 1
      • Speaking_Part2
        • 1_Speaking Part 2
        • 2_Speaking Part 2
        • 3_Speaking Part 2
        • 4_Speaking Part 2
        • 5_Speaking Part 2
        • 6_Speaking Part 2
        • 7_Speaking Part 2
        • 8_Speaking Part 2
        • 9_Speaking Part 2
        • 10_Speaking Part 2
        • 11_Speaking Part 2
        • 12_Speaking Part 2
        • 13_Speaking Part 2
        • 14_Speaking Part 2
        • 15_Speaking Part 2
        • 16_Speaking Part 2
        • 17_Speaking Part 2
        • 18_Speaking Part 2
        • 19_Speaking Part 2
        • 20_Speaking Part 2
        • People
        • Places
          • Visited House
        • Events
        • Activities
          • Interesting Job
        • Things
      • Speaking_Part3
        • Advertisements
        • Outdoor Activities
        • Navigation and Exploration
        • Fast Food
        • Air Pollution
        • Free Time
        • Interesting Movie
        • Gifts
        • Independence in Children
        • Noise
        • Complaining
        • T-shirts
        • Value of Money
        • Restaurant
        • Global
        • Relaxation
        • Special Places
      • Mixed-Test
        • 01_Mix_Language
    • Writing
      • Writing_Task1
        • Paraphrase
        • Overview Sentence
        • Grammar
        • Charts
          • Line - Average Monthly Temperatures
          • Line - Fuels
          • Line - Birth Rate
          • Line - River Water
          • Line - U.S. Energy
          • Line - Areas of Crime
          • Line - Renewable Energy
          • Line - Overseas Visitors
          • Chart - People ate in the UK
          • Chart - Music Event Attendance
          • Chart - Wind Energy
          • Chart - Children Attending Sports in Australia
          • Chart - Weekly Hours in Australia
          • Chart - Films released vs Tickets sold
          • Chart - Average Retirement Age
        • Process
        • Maps
          • Library Ground
        • Table
        • Multiple Graphs
          • Life Expectancy
      • Writing_Task2
        • Opinion Essay
          • Higher Salary
          • Goal of Schools
          • Local History
          • Retirement Age
          • Happy Society
          • Food Necessary
          • Pay for more Art
          • Eradicate Poverty
          • Team Activities
          • Wild Animals and Birds
        • Discussion Essay
          • Sports
          • Make Money
          • Crime punished
          • Equipment for Student
          • Keep a Gun
        • Advantages and Disadvantages Essay
          • Live Away
          • Transform to Farms
        • Problem-Solution Essay
          • Extreme Sports
          • Spend Time Away From Families
      • Complex Sentence
      • If, Wish, Hope
    • Synonym Common Mistakes
    • Phrasal Verbs
    • TOEIC 990
  • Interview
    • Deep Learning Questions
      • C1_Mathematical Foundation
      • C2_Fundamentals of ML
      • C3_Fundamentals of DL
      • C4_Classic Network
      • C5_CNN
      • C6_RNN
      • C7_Target Detection
      • C8_Image Segmentation
      • C9_Reinforcement Learning
      • C10_Transfer Learning
      • C13_Optimization Algorithm
      • C14_Hyperparameter Adjustment
      • C15_Heterogeneous Computing
    • Data Science Questions
  • Courses (Uni and Mooc)
    • AI Open Courses
    • DS Certificates
    • IBM Gen AI Engineering Professional Certificate
      • 10. Generative AI and LLMs: Architecture and Data Preparation
      • 11. Gen AI Foundational Models for NLP & Language Understanding
      • 12. Gen AI Language Modeling with Transformers
        • Module 1 - Fundamental Concepts of Transformer Architecture
        • Module 2 - Advanced Concepts of Transformer Architecture
      • 13. Generative AI Engineering and Fine-Tuning Transformers
      • 14. Generative AI Advanced Fine-Tuning for LLMs
      • 15. Fundamentals of AI Agents using RAG and Langchain
        • Module 1 - RAG Framework
        • Module 2 - Prompt Engineering and LangChain
      • 16. Project: Generative AI Applications with RAG and LangChain
    • Data Science Foundations: Data Structures and Algorithms Specialization
    • Flask - AI Applications
      • 1. Packaging Concepts
      • 2. Web App Deployment
      • 3. Creating AI Application
        • Sentiment Analysis
        • Emotion Detector
      • Deploy Deep Learning Models using Flask
    • Docker, Kubernetes & OpenShift
      • 1. Containers and Containerization
      • 2. Kubernetes Basics
      • 3. Managing Applications with Kubernetes
      • 4. The Kubernetes Ecosystem
      • 5. Final Assignments
    • Data Structures
      • 1. Introduction to DS&A
    • Algorithms
      • QE - Algorithms
      • Sorting Algorithms
        • Binary Search
        • Insertion Sort
        • Merge Sort
        • Quick sort
        • Heap sort
      • Divide and Conquer
      • Greedy Algorithm
      • Dynamic Programming
    • Operating System
      • QE - Operating System
      • 00_Operating System
    • CS231n Deep Learning for Computer Vision
      • 13. Self-Supervised Learning
    • CS480 Introduction to Machine Learning
      • 19. Attention and Transformer Networks
    • CS330 Multi-task and Meta Learning
      • 1. What is Multi-task Learning
    • Processing the Environment
      • Attention
    • Open VINO
    • Metaverse
      • 00_Metaverse
      • Spark AR
  • Research Projects
    • PPE Detection
      • Few-shot Data Sampling
    • Multiple Object Tracking
      • In-place Augmentation
    • Deep Clustering
      • Metrics
    • Defect Detection
      • 01_Defect_Improvement
      • Dataset: MVTec
      • Mixed Supervision for Surface-Defect Detection
      • Practical Defect Detection
      • (Survey) Fabric Defect Detection
      • (Summary) Fabric Defect Detection
    • Medical Images
      • 01_Lung_Improvement
      • SANet
      • AnaXNet
      • 3D_EtoE Lung Cancer Screening
      • Semantics-enriched Representation
      • Attend And Compare
      • Recent Works
      • Kaggle_Medical Images
  • AI Engineer
  • Financial Investment
    • 01_TPTrading
    • 02_BCTC
    • 03_Demand Side Platform (DSP)
    • 04_Business Models
    • Trading
      • 01_Technical Analysis
      • 02_Mentality
      • 03_Support and Resistance
  • Books
    • AI Books
    • Books
      • Persuasion IQ
      • Communication Skills
      • 48 Hours a Day
      • Maslow's Pyramid
      • MBTI
      • Tư Duy Ngược
    • Audio Books
  • Project Management
    • PM Methods
      • Agile
      • Scrum
      • Kanban
    • Foundations of PM
      • Module 1
      • Module 2
      • Module 3
      • Module 4
    • Project Initiation: Starting a Successful Project
      • Module 1
      • Module 2
      • Module 3
      • Module 4
    • Project Planning: Putting It All Together
      • Module 1
    • Project Execution: Running the Project
    • Agile Project Management
    • Capstone: Applying Project Management in the Real World
  • Administrator
          • PVTv2
          • PVT
          • FAVOR+
          • T2T-ViT
          • CaiT
          • CCT
          • DeiT
          • SSA
          • SA3D
      • [NLP] Natural Language Processing
        • 01_[LLMs] Large Language Models
        • [MoEs] Mixture of Experts
        • LLM Techniques
        • Attention is All You Need
        • Positional Encoding
        • Tokenization
        • MICLe
      • [CV] Computer Vision
        • MLP-based Classification
          • MLP-Mixer
          • FNet
          • EANet
        • 01_[SL] Supervised Learning
          • 01_Classification
            • Convolution Variants
            • 1x1 Convolution
            • EfficientNetV2
            • ConvNeXtV2
          • 02_Detection
            • ConvMixer
            • SOLO
            • YOLOX
            • YOLOR
            • AugFPN
            • BoT_Cls
            • BoF_OD
            • YOLOv3
            • YOLOv4
            • YOLOv5
            • YOLOv6
            • YOLOv7
            • YOLOv8
            • YOLOv9
            • YOLO-NAS
            • TPH-YOLOv5
            • TPH-YOLOv5++
            • ViTDET
          • 03_Segmentation
            • Object Instance Survey 2022
            • 01_Instance Segmentation
            • 02_Semantic Segmentation
            • 03_Panoptic Segmentation
            • 04_3D Segmentation
            • 05_Unsupervised Segmentation
            • BMask RCNN
            • ISTR
            • Transfuse
          • 04_[IS] Interactive Segmentation
            • Interactive Segmentation Techniques
            • 02_3D Interactive Segmentation
            • 03_Video Object Segmentation
            • SAM
            • HA_SAM
            • CFR-ICL
            • MST
            • ECONet
            • SimpleClick
            • FocusCut
            • f-BRS
            • iSegformer
          • 05_Object Tracking
            • 00_ObjectTracking
            • Sort
            • DeepSort
            • FairMOT
            • ByteTrack
            • StrongSORT
            • Tracktor
            • JDE
            • CenterTrack
            • PermaTrack
            • TransTrack
            • TrackFormer
            • BoT-SORT
          • 06_Face Recognition
          • 07_Image Stitching
          • 08_Image Restoration
          • 06_Refinement
            • BPR
          • 10_Scene Understanding
            • CPNet
          • 11_Human Pose Estimation
            • 3D Human Pose
            • Human Pose
          • 12_[SR] Super Resolution
            • Bicubic++
          • 13_VideoPropagation
          • 14_Image Mating
          • 15_Knowledge Distillation
          • 16_Others
        • 02_[UL] Unsupervised Learning
          • 00_Unsupervised Learning
          • 02_Deep Clustering
            • 00_K_Clusters Decision
            • Deep Cluster
            • Cluster Fit
            • DEC
            • Improving Relational Regularized Autoencoders with Spherical Sliced Fused G
            • Taxanomy
            • DeepDPM
            • BCL
            • VaDE
            • t-SNE
            • Tree-SNE
          • 04_Diffusional Models
        • 03_[SSL] Self-Supervised Learning
          • 00_Self-Supervised Learning
          • 01_Contrastive Learning
            • CPC
            • DIM
            • CMC
            • AMDIM
            • SimCLR
            • MoCo
            • MoCov2
            • YADIM
            • VICReg
            • CSL
            • Towards Domain-Agnostic Contrastive Learning
            • Non-Parametric Instance Discrimination
            • Video Contrastive Learning with Global Context
            • SupCon
            • Barlow Twin
          • 02_Predictive Tasks
          • 03_Bootstrapping
            • BYOL
          • 04_Regularization
          • 05_Masked Image Models
            • Patch Localization
            • MAE
            • SimMIM
            • DINO
          • 06_Pretext Tasks
            • PIRL
          • 07_Clustering-based
            • SwAV
        • 04_Semi-Supervised Learning
          • Fully-/Semi-/Weakly-/ Learning
          • 01_Self-training
            • Pseudo-label
            • Noisy Student
          • 02_Consistency Regularization
            • Temporal Ensembling
            • Mean Teacher
            • VAT
            • UDA
          • 03_Hybrid Methods
            • MixUp
            • MixMatch
            • ReMixMatch
            • FixMatch
            • FixMatch (unmerge)
        • 05_Multi-learning Paradigm
          • 00_Multi-learning
          • 01_Multitask
          • Gradient Surgery
          • EtE Multi-task Learning with Attention
          • MTL for Dense Predictions
          • MTL using Uncertainty
          • Which Task learned together
          • GradNorm
          • OM-Net
          • 06_Multi-task Learning
        • 06_Generative Models
          • 00_Generative Models
          • 01_Autoencoders
            • AE vs Others
            • Sparse AE
            • Denoising AE
            • Contractive AE
            • Variational AE
            • DELG
          • 02_GAN
        • Graph Convolutional Networks
          • 00_Graph Convolutional Networks
        • Neural Radiance Fields (NeRFs)
        • Deep Belief Networks
      • Multimodal Models
      • Bag of Freebies - BOF
        • 01_Augmentation
          • Mosaic
          • Cut Out
          • Mix Up
        • 02_Loss Functions
          • 01_Classification Loss
          • 02_Segmentation Loss
          • 03_Object Detection Loss
          • 04_Self-Supervised Loss
          • 05_Interactive Segmentation Loss
        • 03_Optimizer
        • 04_Normalization
          • 00_Normalization
        • 05_Regularization
        • 06_Label Assignment
          • 00_Label Assignment
          • OTA
          • SimOTA
        • 07_Auxiliary Head
      • Bag of Specials - BoS
        • Feature Pyramid
          • RCNet
        • Receptive Field
        • Attention
          • 00_Attention Modules
          • SENet
          • CBAM
          • DANet
          • SDANet
          • AttaNet
          • HaloNets
          • GCNet
          • DeepSquare
          • LBAM
          • External-Attention
          • PCT
          • Residual Attention
          • DCANet
          • GANet
          • Triplet Attention
          • Lambda Networks
          • ACTION
          • VAN
          • SegNeXt
        • Local-/Global- Features
          • Unifying Nonlocal Blocks for Neural Networks
          • Local Features
          • Global Features
        • Activation Functions
          • SiLU dSiLU
        • Post-Processing
          • Soft-NMS
          • NMW
          • WBF
        • Sliding Window
        • Graph Networks
        • Feature Fusion/Integration
        • Data-Centric
      • Others
        • Selected Top-Conference Papers
          • AAAI2021_Papers
          • CVPR2021_Papers
          • ECCV2020_Papers
          • ICCV2021_Papers
          • ICLM2022_Papers
        • Cheat Sheets
          • Pandas
        • Conference Schedule
    • Data Science
      • 03_DS_Discrete Distribution
      • Data Scientist Professional
        • 3. Statistical Experimentation Theory
        • 4. Statistical Experimentation in Python
        • 5. Model development in Python
        • 7. Data Management in SQL
      • Data...
      • ETL
      • Airflow
    • Cloud Computing
      • Azure Data Fundamental
      • Amazon Web Services
        • AWS - Cloud 101
        • AWS - Machine Learning Foundation (Lab)
          • 1. Introduction to MLF
          • 2. AI and ML
          • 3. ML Pipeline
          • 4. ML Tools and Services
          • 5. Wrapping it Up
        • AWS - Cloud Practitioner Essentials
        • AWS - GenAI
      • Google Cloud
      • IBM Watson
    • Big Data
      • PySpark
        • Introduction to PySpark
          • 1. Getting to know PySpark
          • 2. Manipulating Data
          • 3. Getting Started with ML Pipelines
          • 4. Model Tuning and Selection
        • Big Data Fundamentals with PySpark
          • 1. Introduction to BigData Analysis with Spark
          • 2. Programming in PySpark RDD’s
          • 3. PySpark SQL & DataFrames
          • 4. Machine Learning with PySpark MLlib
    • English
      • Reading
      • Listening
      • Speaking
        • Speaking_Part1
          • 1_Speaking Part 1
          • 2_Speaking Part 1
          • 3_Speaking Part 1
          • 4_Speaking Part 1
          • 5_Speaking Part 1
          • 6_Speaking Part 1
          • 7_Speaking Part 1
          • 8_Speaking Part 1
          • 9_Speaking Part 1
          • 10_Speaking Part 1
          • 11_Speaking Part 1
          • 12_Speaking Part 1
          • 13_Speaking Part 1
          • 14_Speaking Part 1
          • 15_Speaking Part 1
          • 16_Speaking Part 1
          • 17_Speaking Part 1
          • 18_Speaking Part 1
          • 19_Speaking Part 1
          • 20_Speaking Part 1
          • 21_Speaking Part 1
          • 22_Speaking Part 1
          • 23_Speaking Part 1
        • Speaking_Part2
          • 1_Speaking Part 2
          • 2_Speaking Part 2
          • 3_Speaking Part 2
          • 4_Speaking Part 2
          • 5_Speaking Part 2
          • 6_Speaking Part 2
          • 7_Speaking Part 2
          • 8_Speaking Part 2
          • 9_Speaking Part 2
          • 10_Speaking Part 2
          • 11_Speaking Part 2
          • 12_Speaking Part 2
          • 13_Speaking Part 2
          • 14_Speaking Part 2
          • 15_Speaking Part 2
          • 16_Speaking Part 2
          • 17_Speaking Part 2
          • 18_Speaking Part 2
          • 19_Speaking Part 2
          • 20_Speaking Part 2
          • People
          • Places
            • Visited House
          • Events
          • Activities
            • Interesting Job
          • Things
        • Speaking_Part3
          • Advertisements
          • Outdoor Activities
          • Navigation and Exploration
          • Fast Food
          • Air Pollution
          • Free Time
          • Interesting Movie
          • Gifts
          • Independence in Children
          • Noisy
          • Complain
          • T-shirts
          • Value of Money
          • Restaurant
          • Global
          • Relaxation
          • Special Places
        • Mixed-Test
          • 01_Mix_Language
      • Writing
        • Writing_Task1
          • Paraphrase
          • Overview Sentence
          • Grammar
          • Charts
            • Line - Average Montly Temperatures
            • Line - Fuels
            • Line - Birth Rate
            • Line - River Water
            • Line - U.S Energy
            • Line - Areas of Crime
            • Line - Renewable Energy
            • Line - Oversea Visitors
            • Chart - People ate in the UK
            • Chart - Music Event Attendance
            • Chart - Wind Energy
            • Chart - Children Attend Sports in Australia
            • Chart - Weekly Hours in Australia
            • Chart - Films released vs Tickets sold
            • Chart - Average Retirement Age
          • Process
          • Maps
            • Library Ground
          • Table
          • Multiple Graphs
            • Life Expectancy
        • Writing_Task2
          • Opinion Essay
            • Higher Salary
            • Goal of Schools
            • Local History
            • Retirement Age
            • Happy Society
            • Food Necessary
            • Pay for more Art
            • Eradicate Poverty
            • Team Activities
            • Wild Animals and Birds
          • Discussion Essay
            • Sports
            • Make Money
            • Crime punished
            • Equipment for Student
            • Keep a Gun
          • Advantages and Disadvantages Essay
            • Live Away
            • Transform to Farms
          • Problem-Solution Essay
            • Extreme Sports
            • Spend Time Away From Families
        • Complex Sentence
        • If, Wish, Hope
      • Synonym Common Mistakes
      • Phrasal Verbs
      • TOEIC 990
    • Interview
      • Deep Learning Questions
        • C1_Mathematical Foundation
        • C2_Fundamentals of ML
        • C3_Fundamentals of DL
        • C4_Classic Network
        • C5_CNN
        • C6_RNN
        • C7_Target Detection
        • C8_Image Segmentation
        • C9_Reinforcement Learning
        • C10_Migration Learning
        • C13_Optimization Algorithm
        • C14_Super Parameter Adjustment
        • C15_Hetorogeneous Computing
      • Data Science Questions
    • Courses (Uni and Mooc)
      • AI Open Courses
      • DS Certificates
      • IBM Gen AI Engineering Professional Certificate
        • 10. Generative AI and LLMs: Architecture and Data Preparation
        • 11. Gen AI Foundational Models for NLP & Language Understanding
        • 12. Gen AI Language Modeling with Transformers
          • Module 1 - Fundamental Concepts of Transformer Architecture
          • Module 2 - Advanced Concepts of Transformer Architecture
        • 13. Generative AI Engineering and Fine-Tuning Transformers
        • 14. Generative AI Advanced Fine-Tuning for LLMs
        • 15. Fundamentals of AI Agents using RAG and Langchain
          • Module 1 - RAG Framework
          • Module 2 - Prompt Engineering and LangChain
        • 16. Project: Generative AI Applications with RAG and LangChain
      • Data Science Foundations: Data Structures and Algorithms Specialization
      • Flask - AI Applications
        • 1. Packaging Concepts
        • 2. Web App Deployment
        • 3. Creating AI Application
          • Sentiment Analysis
          • Emotion Detector
        • Deploy Deep Learning Models using Flask
      • Docker, Kubernetes & OpenShift
        • 1. Containers and Containerization
        • 2. Kubernetes Basics
        • 3. Managing Applications with Kubernetes
        • 4. The Kubernetes Ecosystem
        • 5. Final Assignments
      • Data Structures
        • 1. Introduction to DS&A
      • Algorithms
        • QE - Algorithms
        • Sorting Algorithms
          • Binary Search
          • Insertion Sort
          • Merge Sort
          • Quick sort
          • Heap sort
        • Divide and Conquer
        • Greedy Algorithm
        • Dynamic Programming
      • Operating System
        • QE - Operating System
        • 00_Operating System
      • CS231n Deep Learning for Computer Vision
        • 13. Self-Supervised Learning
      • CS480 Introduction to Machine Learning
        • 19. Attention and Transformer Networks
      • CS330 Multi-task and Meta Learning
        • 1. What is Multi-task Learning
      • Processing the Environment
        • Attention
      • Open VINO
      • Metaverse
        • 00_Metaverse
        • Spark AR
    • Research Projects
      • PPE Detection
        • Few-shot Data Sampling
      • Multiple Object Tracking
        • In-place Augmentation
      • Deep Clustering
        • Metrics
      • Defect Detection
        • 01_Defect_Improvement
        • Dataset: MVTec
        • Mixed supervision for surface-defect detection:
        • Practical Defect Detection
        • (Survey) Fabric Defect Detection
        • (Summary) Fabric Defect Detection
      • Medical Images
        • 01_Lung_Improvement
        • SANet
        • AnaXNet
        • 3D_EtoE Lung Cancer Screening
        • Semantics-enriched Representation
        • Attend And Compare
        • Recent Works
        • Kaggle_Medical Images
    • AI Engineer
    • Financial Invesment
      • 01_TPTrading
      • 02_BCTC
      • 03_Demand Side Platform (DSP)
      • 04_Business Models
      • Trading
        • 01_Technical Analysis
        • 02_Mentality
        • 03_Support and Resistance
    • Books
      • AI Books
      • Books
        • Persuasion IQ
        • Communication Skills
        • 48 Hours a Day
        • Maslow's Pyramid
        • MBTI
        • Tư Duy Ngược
      • Audio Books
    • Project Management
      • PM Methods
        • Agile
        • Scrum
        • Kanban
      • Foundations of PM
        • Module 1
        • Module 2
        • Module 3
        • Module 4
      • Project Initiation: Starting a Successul Projet
        • Module 1
        • Module 2
        • Module 3
        • Module 4
      • Project Planning: Putting It All Together
        • Module 1
      • Project Execution: Running the Project
      • Agile Project Management
      • Capstone: Applying Project Management in the Real World
    • Administrator

DeepDPM: Deep Clustering With an Unknown Number of Clusters

{Non-parametric, Dirichlet Process Gaussian Mixture Model (DPGMM), Cluster Convex}

Paper: https://arxiv.org/pdf/2203.14309.pdf

Code: 

  • https://github.com/BGU-CS-VIL/DeepDPM

  • https://www.catalyzex.com/code/BGU-CS-VIL/DeepDPM

Review: https://medium.com/syncedreview/meet-deepdpm-no-predefined-number-of-clusters-needed-for-deep-clustering-tasks-e7c635039013

Author's comment: https://www.reddit.com/r/MachineLearning/comments/tv9fuv/r_deepdpm_deep_clustering_with_an_unknown_number/

0) Motivation, Object and Related works:

Credit: [Link] 

Motivation:

  • Most deep-clustering methods are parametric: they require a predefined and fixed number of clusters, denoted K. 

  • K-selection is computationally expensive.

Objectives:

  • DeepDPM removes the need to predefine the number of clusters, inferring K instead.

    1. Combines the benefits of DL and the Dirichlet Process Mixture (DPM).

      • Uses splits and merges of clusters to change K, together with a dynamic architecture that accommodates such changes. 

      • Uses a new loss function, motivated by the expectation–maximization algorithm in Bayesian Gaussian mixture models (EM-GMM), to enable a novel amortized inference.

    2. Can be incorporated in deep pipelines that rely on clustering (e.g., for feature learning). 

    3. Is differentiable during most of the training and thus supports gradient propagation through it. 

    4. Handles class imbalance gracefully and scales well to large datasets. 

  • First to report the performance of such a method on ImageNet. 

  • Demonstrate the importance of inferring K, especially on imbalanced datasets. 

Introduction:

  • Clustering problem:

    1. no class labels. 

    2. no number of classes K.

    3. no relative sizes (i.e., the class weights).

  • Classical clustering = non-parametric methods (methods that find K) 

  • Deep clustering methods = non-parametric or parametric ones (methods that require a known K) 

  • The ability to infer the latent K:

    1. Without a good estimate of K, parametric methods might suffer in performance. (Figure 1)

    2. Changing K during training has positive optimization-related implications; 

    3. Using model selection to find K: run a parametric method numerous times with different K values over a wide range, then choose the “best” K via an unsupervised criterion. ==> costly.

    4. K itself may be a sought-after quantity of importance.

  • Bayesian nonparametric (BNP) mixture models, exemplified by the Dirichlet Process Mixture (DPM) model, offer an elegant, data-adaptive, and mathematically-principled solution for clustering when K is unknown. 
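The costly model-selection loop described in point 3 can be sketched as follows; the toy `make_blobs` data, the candidate-K range, and the silhouette criterion are illustrative stand-ins, not the paper's setup:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy 2-D data standing in for real embeddings.
X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

# Model selection: one full training run of the parametric method
# per candidate K, then pick the K with the best unsupervised score.
scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)  # each candidate costs a full run -- hence "costly"
```

Note that the cost grows linearly in the number of candidate K values, and for deep parametric methods each "run" is an entire training.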

Figure 1. Mean clustering accuracy of 3 runs (± std. dev.) on ImageNet50. Ground Truth K is 50. 

  • Parametric methods such as K-means, DCN++ (an improved variant of [71]) and SCAN [64], require knowing K. When given a poor estimate of K, they deteriorate in performance in a balanced dataset (a) and even more so in an imbalanced dataset (b). 

  • The proposed DeepDPM does not require knowing K (it infers its value; e.g., K = 55.3 ± 1.53 in (a) and 46.3 ± 2.52 in (b)) and yet yields comparable results.

Contribution:

  1. A deep clustering method that infers the number of clusters. 

  2. A novel loss that enables a new amortized inference in mixture models. 

  3. A demonstration of the importance, in deep clustering, of inferring K. 

  4. Outperforms existing non-parametric clustering methods and is the first to report results of a deep non-parametric clustering method on a dataset as large as ImageNet.

Conclusion:

  1. Limitations:

    • If DeepDPM’s input features are poor, it would struggle to recover. 

    • If K is known and the dataset is balanced, parametric methods (e.g., SCAN) may be a slightly better choice. 

  2. Future work: 

    • Adapting DeepDPM to streaming data (e.g., similarly to [20]) or hierarchical settings [7,19,61]. 

    • Our results may improve given a more sophisticated framework for building split proposals (e.g., see [67]). 

  3. Broader impact: 

    • Inspire the deep-clustering community to adopt the non-parametric approach.

    • Raise awareness of issues with the parametric one. 

    • The non-parametric approach also has a positive environmental impact: it reduces resource usage. 

Related Work:

  1. Parametric Deep Clustering:

1) Two-step approaches: Clustering is performed on features extracted in a pretext task. 

      • McConville et al. [47] run K-means on the embeddings, transformed by UMAP [48], of a pre-trained Auto-encoder (AE). 

      • SCAN (reaching SOTA results) uses unsupervised pretrained feature extractors (e.g., MoCo [13] and SimCLR [12]).

        • Being parametric, it depends on having an estimate of K and deteriorates in performance when that estimate is too inaccurate. 

        • It also assumes uniform class weights (i.e., a balanced dataset), which is often unrealistic in purely-unsupervised cases. 

2) End-to-end deep methods: jointly learn features and clustering, possibly by alternation. 

      • [40, 68, 70–72] use an AE or a Variational AE (VAE) with an additional clustering loss.

      • DCN [71] runs K-means on the embeddings of a pre-trained AE, and retrains it with a loss consisting of a reconstruction term and a clustering-based term.

      • [5, 6] use convolutional neural nets to alternately learn features and clustering. 

  2. Non-parametric Classical Clustering:

    • Ex: BNP clustering and, more specifically, the DPM model [1, 24]. 

    • Although many works rely on BNP clustering [4, 9, 14, 25–28, 30, 32, 33, 38, 39, 41, 44–46, 49, 53–59, 62], it has yet to become a mainstream choice, partly due to the lack of efficient large-scale inference tools. 

    • Milestones:

      • The highly-effective DPM sampler [21] (a modern and scalable implementation of the DPM sampler from [10]).

      • The scalable streaming DPM inference [20]. 

      • Variational DPM inference [3, 31, 34, 36, 42]. 

    • DBSCAN [23] (a non-Bayesian method) is density-based and groups together closely-packed points (it has efficient implementations, but is sensitive to hyper-parameters that are hard to tune).

  3. Non-parametric Deep Clustering:

    • Adaptively find K [11, 52, 66, 74]. 

      • Use offline DPM inference to generate pseudo-labels for fine-tuning a deep belief network [11] or an AE [66] (similarly to the parametric methods in [5, 6, 71]). 

      • [66] and [11] rely on slow DPM samplers and thus do not scale to large datasets. 

      • AdapVAE [74] uses a DPM prior for a VAE + ELBO minimization.

      • In DCC [52], feature learning and clustering are performed simultaneously as in [74]; in addition, a nearest-neighbor graph groups points that are close in the latent space of an AE. 

    • [47] and [65] use an AE and t-SNE [63] to find K. 

    • In [22], a deep net is simultaneously trained on a family of losses instead of a single one.

    • [60] and [50] do not assume a known K, where [60] focuses on clustering faces and [50] on generating posterior samples of cluster labels for any new dataset. 

    • [2] iteratively forms clusters by sequentially examining each sample against the members of existing clusters. 

    • Although [73] relies on a BNP mixture, their method (and code) still uses a fixed K.

2) Methods:

Inspired by [10] 

2.1 Preliminaries: DPGMM-based Clustering

  • Charles E Antoniak. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics, 1974.

  • Thomas S Ferguson. A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1973.
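For intuition about DPGMM-based clustering, an off-the-shelf (truncated, variational) implementation is available in scikit-learn; a minimal sketch on toy data (the hyper-parameter values here are illustrative, not the paper's inference procedure):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import BayesianGaussianMixture

# Toy data with a "latent" K of 3.
X, _ = make_blobs(n_samples=600, centers=3, cluster_std=0.6, random_state=1)

# Truncated DP mixture: n_components is only an upper bound; the
# stick-breaking prior pushes superfluous components toward zero weight.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,  # small alpha favors fewer clusters
    random_state=1,
).fit(X)

labels = dpgmm.predict(X)
# Effective K = number of components that actually received data.
effective_k = np.unique(labels).size
```

The key property mirrored here is that K is not fixed in advance: the model is given only an upper bound and infers how many components to use.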

2.2 The Proposed Method: DeepDPM

  • DeepDPM has two main parts:

    1. A clustering net - generates soft cluster assignments for each input data point.

    2. K sub-clustering nets (one for each cluster k, k ∈ {1, . . . , K}) - take the previously generated soft cluster assignments as inputs and generate soft sub-cluster assignments, which will later be used to support split and merge decisions to dynamically adapt to and change the number of clusters.
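A minimal sketch of this two-part structure, with single random linear layers standing in for the real networks (all shapes, names, and the softmax parameterization are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 8, 16, 3          # points, feature dim, current number of clusters
X = rng.normal(size=(N, D))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# (1) Clustering net: soft cluster assignments r[n, k] for every point
#     (a single linear layer stands in for the real net).
W_cluster = rng.normal(size=(D, K))
r = softmax(X @ W_cluster)            # shape (N, K), rows sum to 1

# (2) K sub-clustering nets: cluster k's net produces 2-way soft
#     sub-cluster assignments; these sub-clusters later support the
#     split/merge proposals that dynamically change K.
W_sub = rng.normal(size=(K, D, 2))
r_sub = np.stack([softmax(X @ W_sub[k]) for k in range(K)], axis=1)  # (N, K, 2)
```

When a split is accepted, one sub-clustering net's two sub-clusters become two new top-level clusters and K grows; a merge does the reverse.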

3) Results:

Metrics: 

  • Clustering accuracy (ACC); 

  • Normalized Mutual Information (NMI); 

  • Adjusted Rand Index (ARI). 

  • Silhouette Score
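NMI and ARI are available directly in scikit-learn; clustering accuracy additionally needs the best one-to-one matching between predicted and ground-truth labels, commonly computed with the Hungarian algorithm. A sketch (the helper name is ours, not from the paper):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """ACC: accuracy under the best one-to-one label matching."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                       # co-occurrence counts
    rows, cols = linear_sum_assignment(-cost)  # maximize matched counts
    return cost[rows, cols].sum() / y_true.size

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 2]    # a pure relabeling of the ground truth

acc = clustering_accuracy(y_true, y_pred)
nmi = normalized_mutual_info_score(y_true, y_pred)
ari = adjusted_rand_score(y_true, y_pred)
```

All three are invariant to label permutation, which is why the toy prediction above (a relabeling) scores perfectly. The silhouette score, by contrast, needs no ground-truth labels at all.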

Overall:

  1. uniformly achieves the best performance, reaching SOTA levels. 

  2. robust to both class imbalance and the initial K value.

  3. eliminates the need to repeatedly train deep parametric methods for model selection.

  4. reduces resource usage.


  1. Comparing with Classical Methods.

    • We compared DeepDPM with classical parametric methods (K-means; GMM) and nonparametric ones (DBSCAN [23], moVB [34]; the SOTA DPM sampler from [21]). For feature extraction, we performed the process suggested in [47]. We performed the evaluation on the MNIST, USPS, and Fashion-MNIST datasets, as well as their imbalanced versions (the latter are defined in the Supmat). All the methods used the same (and fixed) data embeddings as input, and the parametric ones were given the GT K, giving them an unfair advantage. Table 1 shows that DeepDPM almost uniformly dominates across all datasets and metrics, and its performance gain only increases in the imbalanced cases. It is also observable that, compared with the parametric methods, the nonparametric ones (ours included) are less affected by the imbalance. Moreover, Table 2 shows that among the nonparametric methods, DeepDPM’s inferred K is the closest to the GT K (see Supmat for similar results in the imbalanced case).

  2. Comparing with Deep Nonparametric Methods.

    • As there exist very few deep nonparametric methods, and some of them reported results only on extremely-small toy datasets [11, 66] (e.g., one of them stated they could not process even MNIST’s train dataset as it was too large for them), we compared DeepDPM with DCC [52] and AdapVAE [74], the only unsupervised deep nonparametric methods that can at least handle the MNIST [18], USPS [35], and STL-10 [15] datasets. As both those methods jointly learn features and clustering, and to show the flexible nature of DeepDPM, we demonstrate its integration with two feature-extraction techniques (described in § 4.5): an end-to-end pipeline (for MNIST and REUTERS-10k [43]) and a two-step approach using features pretrained by MoCo [13] (for STL-10). Unfortunately, we could not run AdapVAE’s published code, and thus resorted to including the results reported by them. For DCC, using their code we could reproduce their results only on MNIST, so we compare with both the results we managed to obtain using their code and the ones reported by them. Due to these reproducibility issues, we could compare with those methods only on the original (i.e., balanced) datasets. Table 3 shows that DeepDPM outperforms both DCC and AdapVAE. Note we could not find other unsupervised deep nonparametric methods (let alone with available code) that scale to even these fairly-small datasets.

  3. Clustering the Entire ImageNet Dataset.

    • On ImageNet, we obtained the following results: ACC: 0.25, NMI: 0.65, ARI: 0.14. Our method was initialized with K = 200 and converged to 707 clusters (GT = 1000). These are the first results on ImageNet reported for deep nonparametric clustering. Figure 3 shows examples of images clustered together.

  4. The Value of Deep Nonparametric Methods

    • When Parametric Methods Break. We study the effect of not knowing K on parametric methods, with and without class imbalance. We evaluate each method with a wide range of different K values on ImageNet-50. The latter, curated in [64], consists of 50 randomly-selected classes of ImageNet [17]. To generate an imbalanced version of it, we sampled a normalized nonuniform histogram from a uniform distribution over the 50-dimensional probability simplex (i.e., all histograms were equally probable) and then sampled examples from the 50 classes in proportions according to that nonuniform histogram. We compared with 3 parametric methods: 1) K-means; 2) the SOTA method SCAN [64]; 3) an improved version of DCN [71], self-coined DCN++, where instead of training an AE on the raw data, we trained it on top of the embeddings SCAN uses (MoCo [13]) where, following [64], we froze those embeddings during training. For DeepDPM, we used the same features.

    • Since SCAN requires large amounts of memory (e.g., we could only run it on 2 RTX-3090 GPU cards with 24GB memory each, whereas for DeepDPM a single RTX-2080, or even a GTX-1080, with 8GB sufficed), and due to resource constraints, we were limited in how many K values we could run SCAN with and in how many times each experiment could run (this high computational cost is one of the problems with model selection in parametric methods). Thus, we collected the results of the parametric methods with K values ranging from 5 to 350. For both the balanced and imbalanced cases, we initialized DeepDPM with K = 10. Figure 1 summarizes the ACC results (see Supp. Mat. for ARI/NMI). As the K value used by the parametric methods diverges from the GT (i.e., K = 50), their results deteriorate. Unsurprisingly, when using the GT K, or a value sufficiently close to it, the parametric methods outperform our nonparametric one, confirming our claim that having a good estimate of K is important for good clustering. Figure 1a, however, shows that even with fairly moderate deviations from the GT K, DeepDPM's result (0.66 ± .01) surpasses the leading parametric method. Moreover, Figure 1 shows that the parametric SCAN is sensitive to class imbalance; e.g., in Figure 1b, SCAN performs best when K = 30, suggesting it benefits from ignoring many small classes. In contrast, DeepDPM (scoring 0.60 ± .01) is fairly robust to these changes and is comparable to SCAN even when the latter was given the GT K. In addition, Table 4 shows the performance of other nonparametric methods (3 runs on the same features as ours: MoCo+AE). We include DeepDPM's results with alternation (between clustering and feature learning) and without it (i.e., holding the features frozen and training DeepDPM only once). Table 5 compares the K values found by the nonparametric methods: DeepDPM inferred a K value close to the GT in both the balanced and imbalanced cases.
In the imbalanced case, moVB found a slightly better K, but its clustering results (see Table 4) were worse. For the parametric methods, Table 5 also shows the K value with the best silhouette score. The unsupervised silhouette metric is commonly used for model selection (NMI/ACC/ARI are supervised, hence inapplicable for model selection). As Table 5 shows, DeepDPM yielded a more accurate K than that approach.
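The silhouette-based model selection used for the parametric baselines can be sketched like this. It is a minimal illustration, assuming scikit-learn is available and using toy 2-D blobs in place of the real embeddings; the candidate-K range is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Toy data: 3 well-separated Gaussian blobs stand in for the real embeddings.
X = np.concatenate([rng.normal(loc=c, scale=0.3, size=(100, 2))
                    for c in ([0, 0], [5, 5], [0, 5])])

# Run the parametric method once per candidate K and keep the K with the
# highest silhouette score -- the unsupervised criterion, usable when
# labels (and hence NMI/ACC/ARI) are unavailable.
candidate_ks = range(2, 8)
scores = {k: silhouette_score(
              X, KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X))
          for k in candidate_ks}
best_k = max(scores, key=scores.get)
```

Note that this requires one full training run per candidate K, which is precisely the computational cost the paper argues a nonparametric method avoids.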

    • Running Times. Our running time is comparable to a single run of SCAN (the SOTA deep parametric method); e.g., on ImageNet-50, SCAN (with 2 NVIDIA 3090 GPUs) trains for ∼8 [hr] while ours (with 1 weaker NVIDIA 2080 GPU) takes ∼11 [hr]. However, training SCAN multiple times with a different K each time (as needed for model selection) took more than 3 days. Thus, DeepDPM's value and positive environmental impact are clear.

  2. Ablation Study and Robustness to the Initial K

    • Table 6 quantifies the performance gains due to the different parts of DeepDPM through an ablation study done on Fashion-MNIST (in the setting described earlier). It shows the effect of disabling splits, merges, and both; e.g., merges help even when initializing with K = 3. In fact, the large moves made by splits/merges help even when K_init = 10. Also, replacing the subclustering nets with K-means (using K = 2) degrades performance. Likewise, either turning off the priors when computing the cluster parameters, or using an isotropic loss instead of Lcl, hurts performance and (while not shown here) often destabilizes the optimization. Finally, Figure 4 demonstrates, on three different datasets, DeepDPM's robustness to the initial K.
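The K-means baseline for subclustering mentioned in the ablation can be sketched as follows. This is only an illustration of the ablated variant (running K-means with K = 2 inside a single cluster to propose a split), on toy data and assuming scikit-learn; it is not the learned subclustering network itself:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Toy "cluster": two sub-populations that a split proposal should separate.
cluster_points = np.concatenate([rng.normal(-2.0, 0.5, size=(80, 2)),
                                 rng.normal(+2.0, 0.5, size=(80, 2))])

# Ablated baseline: obtain subcluster assignments by running K-means with
# K = 2 inside the cluster, instead of using the subclustering network.
sub_labels = KMeans(n_clusters=2, n_init=10,
                    random_state=0).fit_predict(cluster_points)

# Each point now carries a binary subcluster assignment that a split
# proposal could evaluate (e.g., via a split acceptance ratio).
sizes = np.bincount(sub_labels)
```

Per the ablation, swapping the learned subclustering nets for this hard K-means assignment is what degrades performance.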



About Me:

  • Phone: +84 946 937 937 (Phu)

  • Email: [email protected]  or  [email protected] 

  • Facebook: https://www.facebook.com/phu210.vn/

  • Page: https://www.lephongphu.works/home-page 

"I have gathered information from various sources on the internet, and it is now available for your perusal.

Thank you so much for coming here!"
