0) Motivation, Objectives and Related Work:
Motivation:
t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology.
Objectives: Tree-SNE + Alpha-clustering
Building on recent advances in speeding up t-SNE and obtaining finer-grained structure, we combine the two to create tree-SNE, a hierarchical clustering and visualization algorithm based on stacked one-dimensional t-SNE embeddings.
Tree-SNE allows for visualization and elucidation of high-dimensional hierarchical structures by creating t-SNE embeddings with increasingly heavy tails to reveal increasingly fine-grained structures, and then stacking these embeddings to create a tree-like structure.
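The link between tail weight and α can be illustrated with a small sketch. It assumes the heavy-tailed kernel family k(d) = (1 + d^2/α)^(-α) from the heavy-tailed t-SNE work that tree-SNE builds on. These notes do not state the formula, so both the formula and the function name heavy_tailed_kernel are illustrative assumptions. Under this form, α = 1 gives the Cauchy kernel of standard t-SNE, and smaller α gives heavier tails.

```python
def heavy_tailed_kernel(d, alpha):
    """Similarity of two embedded points at distance d.

    Assumed form (not stated in these notes): (1 + d^2 / alpha) ** (-alpha).
    alpha = 1 is the Cauchy kernel of standard t-SNE; alpha < 1 has heavier tails.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return (1.0 + d * d / alpha) ** (-alpha)


if __name__ == "__main__":
    # At a fixed large distance, the similarity decays more slowly as alpha shrinks.
    for alpha in (1.0, 0.5, 0.1):
        print(alpha, heavy_tailed_kernel(10.0, alpha))
```

Because distant points keep more similarity under a heavier tail, the embedding has to push them further apart, which is one way to read why smaller α exposes finer-grained clusters.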
We also introduce alpha-clustering, which recommends the optimal cluster assignment, without prior knowledge of the number of clusters, based on cluster stability across multiple scales.
We then run spectral clustering on each one-dimensional embedding, computationally determining the number of distinct clusters in the embedding. The number of clusters will increase as α decreases.
We define the alpha-clustering of the data to be the cluster assignment that is stable across the largest range of α values, and we then demonstrate that alpha-clustering is competitive with state-of-the-art unsupervised clustering algorithms on several datasets.
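The selection rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes each level of the tree has already been reduced to a pair (α, number of clusters), it uses the cluster count as a proxy for the cluster assignment, and it measures "range" as the plain difference in α. The notes do not specify how the range is measured, and the name alpha_clustering and the sample values are hypothetical.

```python
def alpha_clustering(levels):
    """Pick the cluster count that persists over the widest span of alpha.

    levels: list of (alpha, n_clusters) pairs ordered by decreasing alpha.
    Returns (n_clusters, alpha_start, alpha_end) for the widest stable run.
    Assumption: stability is measured as alpha_start - alpha_end (linear scale).
    """
    if not levels:
        raise ValueError("levels must not be empty")
    best = None
    i = 0
    while i < len(levels):
        # Extend the run while the cluster count stays the same.
        j = i
        while j + 1 < len(levels) and levels[j + 1][1] == levels[i][1]:
            j += 1
        span = levels[i][0] - levels[j][0]
        if best is None or span > best[0]:
            best = (span, levels[i][1], levels[i][0], levels[j][0])
        i = j + 1
    return best[1:]


# Hypothetical cluster counts per level, for illustration only.
example_levels = [(1.0, 2), (0.8, 3), (0.6, 3), (0.4, 3), (0.3, 5), (0.2, 6)]
print(alpha_clustering(example_levels))
```

In this made-up example the three-cluster assignment persists from α = 0.8 down to α = 0.4, so it would be the one recommended. Measuring the span on a different scale, or by the number of levels, could change the choice.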
1) Tree-SNE:
2) Alpha-Clustering:
References: