Unsupervised Learning

Unsupervised learning is an active field of research and has always been a challenge in deep learning.
Finding out meaningful patterns from large datasets without the presence of labels is extremely helpful for many applications.
Performing unsupervised clustering is equivalent to building a classifier without using labeled samples. Where we are given a set of feature vectors without labels, we attempt to group them into natural clusters. [Link]

Papers

Survey

Classical Clustering

A. K. Jain, M. N. Murty, and P. J. Flynn, "Data Clustering: A Review", ACM computing surveys (CSUR), 1999.
Overview of clustering algorithms, 2001.
Evaluation and comparison of clustering algorithms in analyzing ES cell gene expression data, 2002.
R. Xu, and D. Wunsch, "Survey of Clustering Algorithms", IEEE Transactions on neural networks, 2005.
Survey of Clustering Algorithms in Data Mining, 2007.
Comparisons between Data Clustering Algorithms, 2008.
Review and comparison between clustering algorithms with duplicate entities detection purpose, 2012.
A comparative study of data clustering algorithms, 2013.
Survey on clustering methods: Towards fuzzy clustering for big data, 2014.
D. Xu, and Y. Tian, "A Comprehensive Survey of Clustering Algorithms", Annals of Data Science, 2015.
A. C. Benabdellah, A. Benghabrit, and I. Bouhaddou, "A survey of clustering algorithms for an industrial context", Procedia computer science, 2019.
A. E. Ezugwu, A. M. Ikotun, O. O. Oyelade, L. Abualigah, J. O. Agushaka, C. I. Eke, and A. A. Akinyelu, "A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects", Engineering Applications of Artificial Intelligence, Vol. 110, 2022.

Deep Clustering

E. Aljalbout, V. Golkov, Y. Siddiqui, M. Strobel, and D. Cremers, "Clustering with Deep Learning: Taxonomy and New Methods", in arXiv:1801.07648, 2018.
E. Min, X. Guo, Q. Liu, G. Zhang, J. Cui, and J. Long, "A Survey of Clustering With Deep Learning: From the Perspective of Network Architecture", IEEE Access, 2018.
A. I. Károly, R .Fullér, and P. Galambos, "Unsupervised clustering for deep learning: A tutorial survey", Acta Polytechnica Hungarica, 2018.
S. Zhou, H. Xu, Z. Zheng, J. Chen, J. Bu, J. Wu, X. Wang, "A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions", arXiv preprint arXiv:2206.07579, 2022. [Code]

Classical Clustering

K Decision

Kaufman, Leonard, and Peter Rousseeuw. 1990. Finding Groups in Data: An Introduction to Cluster Analysis [Paper]
Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 63(2), 411–423 (2001) [Paper]
Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2009), Springer

Partitioning-based (Centroid -based)

[K-means] J. A. Hartigan and M. A. Wong, ‘‘Algorithm AS 136: A K-means Clustering Algorithm,’’ J. Roy. Stat. Soc. C, Appl. Stat., vol. 28, no. 1, pp. 100–108, 1979.
[K-means++] D. Arthur, and S. Vassilvitskii, “K-means++: The Advantages of Careful Seeding”, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, 2007. (Select centers using the furthest point algorithm)
[K-Modes]
[K-Medoids] L. Kaufman; and P. J. Rousseeuw, "Partitioning Around Medoids (Program PAM)", Wiley Series in Probability and Statistics, Hoboken, NJ, USA: John Wiley & Sons, Inc., pp. 68–125, doi:10.1002/9780470316801.ch2, ISBN 978-0-470-31680-1, retrieved 2021-06-13, 1990.
[PAM] Finding groups in data: an introduction to cluster analysis, 2008
[CLARANS]
[CLARA] Finding Groups in Data: an Introduction to Cluster Analysis, 1990
[FCM] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm", Computers & geosciences, 1984. (fuzzifier and membership values).

Hierarchical-based

[WARD] J. H. Ward Jr, "Hierarchical Grouping to Optimize an Objective Function", Journal of the American statistical association, 1963.
[BIRCH] T. Zhang, R. Ramakrishnan, M. Livny, "BIRCH: A new data clustering algorithm and its applications", Journal of Data Mining and Knowledge Discovery, Vol. 1, Num. 2, 1997. (balanced iterative reducing and clustering using hierarchies)
[CURE] S. Guha, R. Rastogi, and K. Shim, "CURE: An efficient clustering algorithm for large databases", ACM SIGMOID record, 1998. (clustering using representatives)
[ROCK] ROCK: a robust clustering algorithm for categorical attributes (robust clustering using links)
[Chameleon] Chameleon: hierarchical clustering using dynamic modeling
[Echidna]
[DIANA] (divisive analysis)
[MONA]
[HDBSCAN] L. McInnes, J. Healy, and S. Astels, "HDBSCAN: Hierarchical density-based clustering", J. Open Source Softw., 2017.

Tree-based (KdTree)

A. Moore, "Very Fast EM-based Mixture Model Clustering using Multiresolution kd-trees", Advances in Neural information processing systems, 1998.
[Incremental Mixture] K. Blekas, and A. Likas, "Incremental Mixture Learning for Clustering Discrete Data", Hellenic Conference on Artificial Intelligence, 2004
[Clustree] L Zappia, A Oshlack, "Clustering trees: a visualization for evaluating clusterings at multiple resolutions", Gigascience, 2018. [Code]
[MergeTree] A Hulot, J Chiquet, F Jaffrezic, G Rigaill, "Fast tree aggregation for consensus hierarchical clustering", BMC bioinformatics, 2020. (O(nqlog(n)))
[Tree-SNE] I. Robinson, and E. Pierce-Hoffman, "Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE", arXiv preprint arXiv:2002.05687, 2020. [Code]
[KMT] P. Tavallali, P. Tavallali, M. Singhal, "K-means tree: an optimal clustering tree for unsupervised learning", Journal of Supercomputing, Vol. 77, Issue 5, pp. 5239-5266, 2021.

Density-based

[DBSCAN] M. Ester, H. P. Kriegel, J. Sander, X. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise". In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96) (AAAI Press), pp. 226–231, 1996.
[OPTICS] M. Ankerst, M. M. Breunig, H. P. Kriegel, J. Sander, "OPTICS: Ordering points to identify the clustering structure", ACM SIGMOD international conference on Management of data (ACM Press), pp. 49–60, 1999.
[DBCLASD]
[DENCLUE] An efficient approach to clustering in large multimedia databases with noise
[DBSCANv2]E Schubert, J Sander, M Ester, HP Kriegel, X Xu, "DBSCAN revisited, revisited: why and how you should (still) use DBSCAN", In ACM Transactions on Database Systems (TODS), 42(3), 19, 2017.
[SNG-DBSCAN] H. Jiang, J. Jang, and J. Lacki, "Faster DBSCAN via subsampled similarity queries", In Advances in Neural Information Processing Systems (NIPS), 2020.
[DBSCAN++] J. Jang, and H. Jiang, "DBSCAN++: Towards fast and scalable density clustering", International conference on machine learning (ICLM), 2019.

Grid-based

[Wave-Cluster] G. Sheikholeslami, S. Chatterjee, and A. Zhang, "Wavecluster: A multi-resolution clustering approach for very large spatial databases", VLDB, 1998
[STING] STING: a statistical information grid approach to spatial datamining. (Statistical Information Grid Approach)
[CLIQUE] Automatic subspace clustering of high-dimensional data for data mining applications. (Clustering in Quest) (density-based + grid-based clustering)
[OPTIGRID] Optimal grid-clustering: Towards breaking the curse of dimensionality in high-dimensional clustering

Graph Distance-based

[Spectra Clustering] A. Ng, M. Jordan, Y. Weiss, "On spectral clustering: Analysis and an algorithm", In Advances in Neural Information Processing Systems 15 (NIPS 2002), pp. 849–856, 2002.
A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007). doi:10.1007/s11222-007- 9033-z.
[Affinity Propagation] D. Dueck, "Affinity propagation: clustering data by passing messages", 2009.
[Adaptive AP] K. Wang, J. Zhang, D. Li, X. Zhang, and T. Guo, "Adaptive affinity propagation clustering", arXiv preprint arXiv:0805.1096, 2008.

Model-based (Distribution)

[EM] Dempster AP, Laird NM, Rubin DB, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society: Series B 39: 1–38, 1977.
[COBWEB] Knowledge acquisition via incremental conceptual clustering.
[CLASSIT]
[SOMs] The self-organizing map
[GMM] D. A. Reynolds, ‘‘Gaussian Mixture Models’’ in Encyclopedia of Biometrics. Springer, pp. 827–832, 2015. (Mahalanobis distance to centers)

Others

A new shared nearest neighbor clustering algorithm and its applications.
Improving clustering performance using independent component analysis and unsupervised feature learning.
Infinite ensemble for image clustering. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, in print. ACM 1745-1754 (2016).
Learning a task-specific deep architecture for clustering. In Proceedings of the 2016 SIAM International Conference on Data Mining, in print. SIAM 369-377 (2016). doi:10.1137/1.9781611974348.42.
T. Kohonen, ‘‘The self-organizing map,’’ Neurocomputing, vol. 21, nos. 1–3, pp. 1–6, 1998.
[moVB] MC Hughes, E Sudderth, "Memoized Online Variational Inference for Dirichlet Process Mixture Models", Part of Advances in Neural Information Processing Systems 26 (NIPS), 2013.
Trigeorgis, G., Bousmalis, K., Zafeiriou, S., and Schuller, B. (2014). A deep semi-nmf model for learning hidden representations. In Proceedings of the International Conference on Machine Learning (ICML), pages 1692–1700.
[DPM sampler] O Dinari, A Yu, O Freifeld, J Fisher, "Distributed MCMC Inference in Dirichlet Process Mixture Models Using Julia", In 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019.
[Bisecting K-means] https://www.ijeter.everscience.org/Manuscripts/Volume-4/Issue-8/Vol-4-issue-8-M-23.pdf (hierarchical+centroid based)
[Mean shift] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002), 2002.
Ding, C. and He, X. (2004). "k-Means clustering via principal component analysis". In Proceedings of the International Conference on Machine Learning (ICML), page 29. ACM.
[ICA BSS] Independent Component Analysis Blind Source Separation (, Gultepe and Makrehchi), 2018.
[AC-PIC] Agglomerative Clustering via Path Integral (, Zhang et al.), 2013.
Zhang, W., Zhao, D. & Wang, X. Agglomerative clustering via maximum incremental path integral. Pattern Recognition 46, 3056–3065 (2013). doi:10.1016/j.patcog.2013.04.013.
GACluster -- Graph Agglomerative Clustering

Deep Clustering

ResNet Autoencoders for Unsupervised Feature Learning From High-Dimensional Data: Deep Models Resistant to Performance Degradation

K Decision

[BCL] C31_Video Face Clustering with Unknown Number of Clusters_ICCV_2019 [Paper] [Personal Summary]
Y. Wang, Z. Shi, X. Guo, X. Liu, E. Zhu, and J. Yin. "Deep embedding for determining the number of clusters". In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

Deep Clustering

[RIM] Discriminative Clustering by Regularized Information Maximization
[DEC] J Xie, R Girshick, A Farhadi, "Unsupervised deep embedding for clustering analysis", Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:478-487, 2016.
[PARTY] X. Peng, S. Xiao, J. Feng, W. Y. Yau, Z. Yim, "Deep Subspace Clustering with Sparsity Prior", Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), 2016.
[JULE] J. Yang, D. Parikh, and D. Batra, “Joint unsupervised learning of deep representations and image clusters,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[IDEC] X. Guo, L. Gao, X. Liu, and J. Yin, "Improved Deep Embedded Clustering with Local Structure Preservation", In Proceedings of IJCAI, IJCAI ’17, pages 1753–1759, 2017.
[UGTAC] V. Premachandran, A. L. Yuille, "Unsupervised Learning Using Generative Adversarial Training And Clustering", Under Review at ICLR, 2017.
[UMMC] D. Chen, J. Lv, Y. Zhang, "Unsupervised Multi-Manifold Clustering by Learning Deep Representation", In Workshops at the AAAI Conference on Artificial Intelligence, pages 385–391, 2017.
[DCN] B. Yang, X. Fu, N. D. Sidiropoulos, M. Hong, "Towards k-means-friendly spaces: Simultaneous deep learning and clustering", In Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3861-3870, 2017.
[VaDE] Z. Jiang, Y. Zheng, H. Tan, B. Tang, and H. Zhou, “Variational deep embedding: An unsupervised and generative approach to clustering,” in IJCAI, 2017, pp. 1965–1972, 2017 [Code]
[DEPICT] K. Ghasedi Dizaji, A. Herandi, C. Deng, W. Cai, and H. Huang, “Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 5736–5745, 2017.
[DAC] J. Chang, L. Wang, G. Meng, S. Xiang, and C. Pan, “Deep adaptive image clustering,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 5879–5887, 2017.
X Peng, J Feng, J Lu, WY Yau, Z Yi, "Cascade subspace clustering", Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[SR-k-means] M. Jabi, M. Pedersoli, A. Mitiche, and I. B. Ayed, “Deep clustering: On the link between discriminative models and k-means,” arXiv preprint arXiv:1810.04246, 2018.
[DBC] F. Li, H. Qiao, and B. Zhang, “Discriminatively boosted image clustering with fully convolutional auto-encoders,” Pattern Recognition, vol. 83, pp. 161–173, 2018.
[DMJC] B Lin, Y Xie, Y Qu, C Li, X Liang, "Jointly Deep Multi-View Learning for Clustering Analysis", arXiv preprint arXiv:1808.06220, 2018
[ClusterGAN] S. Mukherjee, H. Asnani, E. Lin, and S. Kannan, “ClusterGan : Latent space clustering in generative adversarial networks,” The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019.
Y Ren, K Hu, X Dai, L Pan, SCH Hoi, Z Xu, "Semi-supervised deep embedded clustering", Neurocomputing, 2019
[ASPC-DA] X. Guo, X. Liu, E. Zhu, X. Zhu, M. Li, X. Xu, and J. Yin, “Adaptive self-paced deep clustering with data augmentation,” IEEE Transactions on Knowledge and Data Engineering, pp. 1–1, 2019.
[N2D] R. McConville, R. Santos-Rodriguez, R. J. Piechocki, and I. Craddock, "N2d:(not too) deep clustering via clustering the local manifold of an autoencoded embedding", In 25th International Conference on Pattern Recognition (ICPR), 2020.
"C3: Cross-instance guided Contrastive Clustering"
[MICE] T. W. Tsai, C. Li, and J. Zhu, "MICE: Mixture of Contrastive Experts for Unsupervised Image Clustering", In International Conference on Learning Representations, 2020.
[DKM] M. M. Fard, T. Thonet, E. Gaussier, "Deep k-means: Jointly clustering with k-means and learning representations", Pattern Recognition Letters, 2020.
[SCAN] W. V. Gansbeke, S. Vandenhende, S. Georgoulis, M. Proesmans, and L. V. Gool, “SCAN: Learning to classify images without labels,” in ECCV, pp. 268–285, 2020. [Code]
[SPICE] "SPICE: Semantic Pseudo-Labeling for Image Clustering"
"Learning Representation for Clustering via Prototype Scattering and Positive Sampling"
"Representation Learning for Clustering via Building Consensus"
[DEKC] W. Guo, K. Lin, and W. Ye, "Deep Embedded K-Means Clustering", in International Conference on Data Mining Workshops (ICDMW), 2021.
[Deep DPM] M. Ronen, S. E. Finder, O. Freifeld, "DeepDPM: Deep Clustering With an Unknown Number of Clusters", in CVPR, 2022 [Code]
"Local Aggregation for Unsupervised Learning of Visual Embeddings"
[CCNN] Hsu, C.-C. and Lin, C.-W. (2017). Cnn-based joint clustering and representation learning with feature drift compensation for large-scale image data. arXiv preprint arXiv:1705.07091.
B. Yang, X. Fu, N. D. Sidiropoulos, and M. Hong, “Towards kmeans-friendly spaces: Simultaneous deep learning and clustering,” in ICML, vol. 70, 2017, pp. 3861–3870.
K. Tian, S. Zhou, and J. Guan, “DeepCluster: A general clustering framework based on deep learning,” in Machine Learning and Knowledge Discovery in Databases, 2017, pp. 809–825.
J. Zhang, C.-G. Li, C. You, X. Qi, H. Zhang, J. Guo, and Z. Lin, “Self-supervised convolutional subspace clustering network,” in CVPR, 2019.
N. Dilokthanakul, P. A. M. Mediano, M. Garnelo, M. C. H. Lee, H. Salimbeni, K. Arulkumaran, and M. Shanahan, “Deep unsupervised clustering with Gaussian mixture variational autoencoders,” ArXiv, vol. abs/1611.02648, 2017.
P. Zhou, Y. Hou, and J. Feng, “Deep adversarial subspace clustering,” in CVPR, 2018.

J. Chang, L. Wang, G. Meng, S. Xiang, and C. Pan, “Deep adaptive image clustering,” in ICCV, 2017, pp. 5880–5888.
X. Ji, J. F. Henriques, and A. Vedaldi, “Invariant information clustering for unsupervised image classification and segmentation,” in ICCV, 2019.
J. Wu, K. Long, F. Wang, C. Qian, C. Li, Z. Lin, and H. Zha, “Deep comprehensive correlation mining for image clustering,” in ICCV, 2019.
C. Niu, J. Zhang, G. Wang, and J. Liang, “GATCluster: Selfsupervised Gaussian-attention network for image clustering,” in ECCV, 2020, pp. 735–751.
C61_Y. Li, P. Hu, Z. Liu, D. Peng, J. T. Zhou, and X. Peng, “Contrastive clustering,” in AAAI, 2021.
Y. Tao, K. Takagi, and K. Nakata, “Clustering-friendly representation learning via instance discrimination and feature decorrelation,” in ICLR, 2021.
[DeepCluster] M. Caron, P. Bojanowski, A. Joulin, and M. Douze, “Deep clustering for unsupervised learning of visual features,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 132–149, 2018.
Y. M. Asano, C. Rupprecht, and A. Vedaldi, “Self-labelling via simultaneous clustering and representation learning,” in ICLR, 2020.
M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” in NeurIPS, vol. 26, 2013.
M. Caron, I. Misra, J. Mairal, P. Goyal, P. Bojanowski, and A. Joulin, “Unsupervised learning of visual features by contrasting cluster assignments,” in NeurIPS, 2020.

Saito, S. and Tan, R. T. (2017). Neural clustering: Concatenating layers for better projections.
[CCNN] (Hsu and Lin, 2017) Hsu, C.-C. and Lin, C.-W. (2017). Cnn-based joint clustering and representation learning with feature drift compensation for large-scale image data. arXiv preprint arXiv:1705.07091.
[IMSAT] (Hu et al., 2017) Hu, W., Miyato, T., Tokui, S., Matsumoto, E., and Sugiyama, M. (2017). Learning discrete representations via information maximizing self augmented training. arXiv preprint arXiv:1702.08720.
[SCCNN] (Lukic et al., 2016) Lukic, Y., Vogt, C., Durr, O., and Stadelmann, T. (2016). Speaker identification and clustering using convolutional neural networks. In International Workshop on Machine Learning for Signal Processing (MLSP), pages 1–6. IEEE.
"Deep clustering with convolutional autoencoders", In International Conference on Neural Information Processing, in print. Springer 373–382 (2017).
Auto-encoder based data clustering. In Iberoamerican Congress on Pattern Recognition, in print. Springer 117-124 (2013). doi:10.1007/978-3-642-41822-8_15.
[DCSS] M. Sadeghi, and N. Armanfard, "Deep Clustering with Self-supervision using Pairwise Data Similarities", 2021.

[IEC] Infinite Ensemble Clustering (, Liu et al.), 2016.
[AEC] Autoencoder-based Clustering (, Song et al.), 2013.
[NMF-D] NMF with Deep learning model (, Trigeorgis et al.), 2014.
[TSC-D] Task-specific Deep Architecture for Clustering (, Wang et al.), 2016.
[DCEC] Deep Convolutional Embedded Clustering (, Guo et al.), 2017.
Shaham, U. et al. SpectralNet: spectral clustering using deep neural networks. arXiv preprint arXiv:1801.01587 (2018).
Hierarchical Clustering With Hard-Batch Triplet Loss for Person Re-Identification_CVPR_2020
Deep Clustering With Consensus Representations
Image Clustering via Deep Embedded Dimensionality Reduction and Probability-Based Triplet Loss

Tree-based

[DeepECT] D Mautz, C Plant, C Böhm, "DeepECT: The Deep Embedded Cluster Tree", Data Science and Engineering, 2020.
[DTAE] Q. Garrido, S. Damrich, A. Jäger, D. Cerletti, M. Claassen, L. Najman, and F. Hamprecht, "Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder", In arXiv preprint arXiv:2102.05892, 2021. [Code]
S. Lee, J. Jung, I. Park, K. Park, and D. S. Kim, "A deep learning and similarity-based hierarchical clustering approach for pathological stage prediction of papillary renal cell carcinoma", Computational and Structural Biotechnology Journal, Vol. 18, pp. 2639-2646, 2020.
[DSSC-EAGF] X. Jiang, P. Qian, Y. Jiang, Y. Gu, and A.Chen, "Deep self-supervised clustering with embedding adjacent graph features", Systems Science & Control Engineering, 2022.

Non-parametric Deep Clustering

[DCC] D. M. Steinberg, O. Pizarro, and S. B. Williams. "Hierarchical Bayesian models for unsupervised scene understanding". CVIU, 2015.
[DNB] Z. Wang, Y. Ni, B. Jing, D. Wang, H. Zhang, and E. Xing. "DNB: A joint learning framework for deep Bayesian non-parametric clustering". IEEE Transactions on Neural Networks and Learning Systems, 2021.
[AdapVAE] T. Zhao, Z. Wang, A. Masoomi, and J. G. Dy. "Streaming adaptive non-parametric variational autoencoder". arXiv:1906.03288, 2019.
A.Dosovitskiy and J. Djolonga. "You only train once: Loss-conditional training of deep networks" In ICLR, 2020
C. Avgerinos, V. Solachidis, N. Vretos,and P. Daras. "Non-parametric clustering using deep neural networks". IEEE Access, 2020.
X. Yang, Y. Yan, K. Huang, and R. Zhang. "Vsbdvm: an end-to-end Bayesian non-parametric generalization of deep variational mixture model". In ICDM, 2019.

Transformer-based Clustering

Clusformer: A Transformer based Clustering Approach to Unsupervised Large-scale Face and Visual Landmark Recognition_CVPR_2021 [Link]
N Wang, G Gan, P Zhang, S Zhang, J Wei, Q. Liu, X. Jiang, "Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding", In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), Vol. 1, pp. 2390–2402, 2022. (NLP)
Cluster-former: Clustering-based sparse transformer for question answering
ClusterFormer: Neural Clustering Attention for Efficient and Effective Transformer

Dirichlet Process Mixture (DPM) - based models

[NMMC] (Chen, 2015) Chen, G. (2015). Deep learning with nonparametric clustering. arXiv preprint arXiv:1501.03084, 2015.
T. Zhao, Z. Wang, A. Masoomi, and J. G. Dy, "Streaming adaptive nonparametric variational autoencoder". In arXiv:1906.03288, 2019.

Others

Liu, H., Shao, M., Li, S., and Fu, Y. (2016). Infinite ensemble for image clustering. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1745–1754.
Song, C., Liu, F., Huang, Y., Wang, L., and Tan, T. (2013). Autoencoder-based data clustering. In Iberoamerican Congress on Pattern Recognition (CIARP), pages 117–124. Springer.
Deep hierarchical clustering with Dirichlet Forest Prior
ClusterNet: Deep Hierarchical Cluster Network with Rigorously Rotation-Invariant Representation for Point Cloud Analysis.
Objective-Based Hierarchical Clustering of Deep Embedding Vectors.
Deep hierarchical embedding for simultaneous modeling ofGPCR proteins in a unifed metric space.
Visual Exploration of Relationships andStructure in Low-Dimensional Embeddings
Why do tree-based models still outperform deep learning on tabular data?

Applications

H Hadipour, C Liu, R Davis, ST Cardona, P Hu, "Deep clustering of small molecules at large-scale via variational autoencoder embedding and K-means", BMC bioinformatics, 2022.
An Explicit Local and Global Representation Disentanglement Framework with Applications in Deep Clustering and Unsupervised Object Detection

Dimensionality Reduction

Feature Selection + Elimination:

Missing values ratio
Low-variance filter
High-correlation filter
Random Forest
Backwards-feature elimination
Forward Feature Selection

Linear method:

[Factor analysis]
[CCA] - Canonical Correlation Analysis
[ICA] - Independent Component Analysis
[LDA] - Linear Discriminatory Analysis
[SVD] - Singular Value Composition
[NMF] - Nonnegative Matrix Factorization
[PCA] - Principal Component Analysis
[PGD]

Manifold - Non-linear method:

[Isomap] - Isometric Feature Mapping
[SNE] Stochastic neighbor embedding
[t-SNE] - t-Distributed Stochastic Neighbor
[LLE] - Locally linear embedding
[HLLE] - Hessian Eigenmapping
[Spectral Embedding]
[MDS] - Multidimensional scaling (preserve pairwise distance)
[Sparse Sub-space] E. Elhamifar, R. Vidal, "Sparse Subspace Clustering: Algorithm, Theory, and Applications", IEEE Transactions on Pattern Analysis and Machine Intelligence (tPAMI), pp. 2765 - 2781, 2013.
[Scalable Sparse Sub-space] C. You, D. Robinson, R. Vidal, "Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3918-3927, 2016.
[UMAP] L. McInnes, J. Healy, and J. Melville. "UMAP: Uniform manifold approximation and projection for dimension reduction", in arXiv:1802.03426, 2018. [Code]

t-SNE Variants

L.J.P. van der Maaten, Accelerating t-SNE using Tree-Based Algorithms, Journal of Machine Learning Research 15(Oct):3221-3245, 2014.
L.J.P. van der Maaten and G.E. Hinton. Visualizing Non-Metric Similarities in Multiple Maps. Machine Learning 87(1):33-55, 2012.
L.J.P. van der Maaten. Learning a Parametric Embedding by Preserving Local Structure. In Proceedings of the Twelfth International Conference on Artificial Intelligence & Statistics (AI-STATS), JMLR W&CP 5:384-391, 2009.
L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.
The art of using t-SNE for single-cell transcriptomics. Nature Communications 10, 5416 (2019). doi:10.1038/s41467-019-13056-x.
Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, in print. arXiv:1902.05804 (2019).
Efficient algorithms for t-distributed stochastic neighborhood embedding. arXiv preprint arXiv:1712.09005 (2017).
Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat. Methods 16, 243 (2019). doi:10.1038/s41592-018-0308-4
Clustering with t-SNE, provably. SIAM J. Math. Data Sci. 1, 313-332 (2019). doi: 10.1137/18M1216134
Oskolkov, N. How to tune hyperparameters of tSNE [blog post]. Towards Data Science. 18 July 2019. [Accessed 9 February 2020] https://towardsdatascience.com/how-to-tune-hyperparameters-of-tsne-7c0596a18868.

NLP

[ConAE] Z. Liu, H. Zhang, C. Xiong, Z. Liu, Y. Gu, and X. Li, "Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder", in arXiv preprint arXiv:2205.03284, 2022.

Video Scene Understanding

"SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition", ICLM 2023. [Link]

Functions

Distance Metrics

S. Xiang, F. Nie, C. Zhang, "Learning a Mahalanobis distance metric for data clustering and classification", Pattern recognition, 2008

Non-clustering losses

[Controlling cluster size distribution loss]
[Locality-preserving loss]
[Self-Augmentation loss] Hu, W., Miyato, T., Tokui, S., Matsumoto, E., and Sugiyama, M. (2017). Learning discrete representations via information maximizing self augmented training. arXiv preprint arXiv:1702.08720
[Reconstruction loss] Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11:3371–3408.

[Dimensional Reduction] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.
[Denoising] Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11:3371–3408.

Clustering loss

[k-Means loss] Yang, B., Fu, X., Sidiropoulos, N. D., and Hong, M. (2016a). Towards k-means-friendly spaces: Simultaneous deep learning and clustering. arXiv preprint arXiv:1610.04794.
[t-SNE] van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605.
[DEC] J Xie, R Girshick, A Farhadi, "Unsupervised deep embedding for clustering analysis", Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:478-487, 2016. (Cluster assignment hardening loss)
[DEPICT] K. Ghasedi Dizaji, A. Herandi, C. Deng, W. Cai, and H. Huang, “Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 5736–5745, 2017. (Balanced assignments loss)
[DEN] P. Huang, Y. Huang, W. Wang, L. Wang, "Deep embedding network for clustering", In International Conference on Pattern Recognition (ICPR), pages 1532–1537, 2014. (Locality-preserving loss)
[SC] A. Ng, M. Jordan, Y. Weiss, "On spectral clustering: Analysis and an algorithm", In Advances in Neural Information Processing Systems (NIPS), pages 849–856, 2002. (Group sparsity loss)
[Cluster Classification loss] Hsu, C.-C. and Lin, C.-W. (2017). Cnn-based joint clustering and representation learning with feature drift compensation for large-scale image data. arXiv preprint arXiv:1705.07091.
[Agglomerative clustering loss] Yang, J., Parikh, D., and Batra, D. (2016b). Joint unsupervised learning of deep representations and image clusters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5147–5156.

Training Technique

Self-training (Nigam & Ghani, 2000).

References

Classical Clustering

Deep Clustering

K_Cluster Decision

- https://www.datanovia.com/en/lessons/determining-the-optimal-number-of-clusters-3-must-know-methods/
- https://towardsdatascience.com/k-means-clustering-and-the-gap-statistics-4c5d414acd29

Unsupervised Learning

Overview

Papers

Survey

Classical Clustering

K Decision

Partitioning-based (Centroid -based)

Hierarchical-based

Tree-based (KdTree)

Density-based

Grid-based

Graph Distance-based

Model-based (Distribution)

Others

Deep Clustering

K Decision

Deep Clustering

Tree-based

Non-parametric Deep Clustering

Transformer-based Clustering

Dirichlet Process Mixture (DPM) - based models

Others

Applications

Dimensionality Reduction

Feature Selection + Elimination:

t-SNE Variants

NLP

Video Scene Understanding

Functions

Distance Metrics

Non-clustering losses

Clustering loss

Training Technique

References

Classical Clustering

Deep Clustering

K_Cluster Decision

Dimensionality Reduction:

Validation

About Me: