1) ClusterFit Pipeline:
ClusterFit follows two steps. One is the cluster step, and the other is the fit step:
1. Cluster: Feature Clustering
Take a pretrained network and use it to extract features from a set of images.
The network can be any kind of pretrained network.
K-means clustering is then run on these features, so each image is assigned to a cluster, and that cluster ID becomes its pseudo label.
2. Fit: Predict Cluster Assignment
For this step, we train a network from scratch to predict the pseudo labels of images.
These pseudo labels are what we obtained in the first step through clustering.
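
The two steps above can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the "images" are random flat vectors, the pretrained network is replaced by a fixed random projection with ReLU (the hypothetical `extract_features`), k-means is a plain NumPy loop, and the network trained from scratch is a single softmax layer (the hypothetical `fit_from_scratch`). All names, sizes, and hyperparameters are made up for illustration.

```python
import numpy as np

def extract_features(images, w_pre):
    """Stand-in for the pretrained network: fixed projection + ReLU (assumption)."""
    return np.maximum(images @ w_pre, 0.0)

def kmeans(features, k, n_iters=20, seed=0):
    """Plain k-means; returns one cluster ID (pseudo label) per row."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid: (n, k).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels

def fit_from_scratch(images, pseudo_labels, k, lr=0.1, n_epochs=200, seed=0):
    """Train a fresh softmax classifier on raw inputs to predict cluster IDs."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=(images.shape[1], k))
    b = np.zeros(k)
    onehot = np.eye(k)[pseudo_labels]
    for _ in range(n_epochs):
        logits = images @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(images)  # gradient of mean cross-entropy
        w -= lr * images.T @ grad
        b -= lr * grad.sum(axis=0)
    return w, b

# Toy run: 300 "images" of 16 values each, 4 clusters.
rng = np.random.default_rng(0)
images = rng.normal(size=(300, 16))
w_pre = rng.normal(size=(16, 8))  # weights of the stand-in pretrained network

features = extract_features(images, w_pre)                  # step 1: features
pseudo_labels = kmeans(features, k=4)                       # step 1: cluster
w_cf, b_cf = fit_from_scratch(images, pseudo_labels, k=4)   # step 2: fit

train_acc = ((images @ w_cf + b_cf).argmax(axis=1) == pseudo_labels).mean()
print("agreement with pseudo labels:", train_acc)
```

Note that the second model only ever sees the raw inputs and the cluster IDs, never the features themselves. In a real setting both stand-ins would be deep networks and the number of clusters would be far larger.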
A standard pretrain-and-transfer pipeline first pretrains a network and then evaluates it on downstream tasks, as shown in the first row of Fig. 5. ClusterFit performs the pretraining on a dataset D_cf to get the pretrained network N_pre. N_pre is then applied to D_cf to extract features, which are clustered to generate pseudo labels. We then learn a new network N_cf from scratch on this data. Finally, N_cf is used for all downstream tasks.
References:
https://atcold.github.io/pytorch-Deep-Learning/en/week10/10-2/