AnaXNet: Anatomy Aware Multi-label Finding
Classification in Chest X-ray
Keywords: Graph Convolutional Networks, Multi-Label Chest X-ray, image classification, Graph Representation
0. Motivation, Objective and Related Works:
Motivation:
Radiologists typically examine the anatomical regions of a chest X-ray as well as the overall image before making a decision.
However, most existing deep learning models only look at the entire X-ray image for classification, failing to utilize important anatomical information.
Objectives:
In this paper, we propose a novel multi-label chest X-ray classification model that accurately classifies the image findings and also localizes them to their correct anatomical regions. Specifically, our model consists of two modules: the detection module and the anatomical dependency module. The latter utilizes graph convolutional networks, which enable our model to learn not only the label dependencies but also the relationships between the anatomical regions in the chest X-ray. We further introduce a method to efficiently create an adjacency matrix for the anatomical regions using the correlation of the labels across the different regions. Detailed experiments and analysis of our results show the effectiveness of our method compared to current state-of-the-art multi-label chest X-ray image classification methods, while also providing accurate location information.
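To make the adjacency-matrix construction concrete, the sketch below builds a region-to-region matrix from how often findings co-occur across anatomical regions in a labeled training set. It is only an illustration of the idea stated above: the tensor layout, normalization, and threshold are assumptions, not the exact formulation used in the paper.

import numpy as np

def region_adjacency(labels, threshold=0.1):
    """Build a region-region adjacency matrix from label co-occurrence.

    labels: binary array of shape (num_images, num_regions, num_findings),
            where labels[n, r, f] = 1 if finding f is present in region r
            of image n. Shapes, normalization, and threshold are illustrative.
    """
    num_images, num_regions, num_findings = labels.shape
    adj = np.zeros((num_regions, num_regions), dtype=np.float32)

    for i in range(num_regions):
        for j in range(num_regions):
            # Count images where the same finding appears in both regions i and j.
            co = np.logical_and(labels[:, i, :], labels[:, j, :]).any(axis=1).sum()
            # Normalize by how often region i contains any finding at all.
            occ = labels[:, i, :].any(axis=1).sum()
            adj[i, j] = co / occ if occ > 0 else 0.0

    # Binarize weak correlations to reduce noise (threshold value is a guess).
    adj = (adj >= threshold).astype(np.float32)
    np.fill_diagonal(adj, 1.0)  # keep self-connections
    return adj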
Interpreting a radiology imaging exam is a complex reasoning task in which radiologists integrate patient history and image features from different anatomical locations to generate the most likely diagnoses. Convolutional Neural Networks (CNNs) have been widely applied in earlier works on automatic Chest X-ray (CXR) interpretation, one of the most commonly requested medical imaging modalities. Many of these works have framed the problem either as a multi-label abnormality classification problem [18,27], an abnormality detection and localization problem [5,22,29], or an image-to-text report generation problem [14,28]. However, these models fail to capture inter-dependencies between features or labels. Leveraging such contextual information, which encodes relational information among pathologies, is crucial to improving interpretability and reasoning in clinical diagnosis.
To this end, Graph Neural Networks (GNNs) have surfaced as a viable solution for modeling disease co-occurrence across images. GNNs learn representations of the nodes based on the graph structure and have been widely explored, from graph embedding methods [7,23] and generative models [25,32] to attention-based or recurrent models [15,24], among others. For a comprehensive review of model architectures, we refer the reader to a recent survey [31]. In particular, Graph Convolutional Networks (GCNs) [13] utilize graph convolution operations to learn representations by aggregating information from the neighborhood of a node, and have been successfully applied to CXR image classification. For example, the multi-relational ImageGCN model learns image representations that leverage additional information from related images [17], while CheXGCN and DD-GCN incorporate label co-occurrence GCN modules to capture the correlations between labels [2,16]. To mitigate the noise originating from background regions in related images, recent work utilizes attention mechanisms [1,34] or auxiliary tasks such as lung segmentation [6,3]. However, none of these works model the correlations among anatomical regions and findings, e.g., by outputting the anatomical location semantics for each finding.
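For readers unfamiliar with the graph convolution operation referenced above, the following minimal PyTorch layer shows the standard aggregation of Kipf and Welling [13], H' = ReLU(A_hat H W), where A_hat is the symmetrically normalized adjacency with self-loops. It is a generic sketch of the operation, not code from any of the cited systems.

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution in the style of [13]: H' = ReLU(A_hat @ H @ W),
    where A_hat is the normalized adjacency with self-loops.
    A minimal sketch; dimensions are illustrative."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        # h:   (num_nodes, in_dim) node features, one node per anatomical region
        # adj: (num_nodes, num_nodes) adjacency matrix
        a = adj + torch.eye(adj.size(0), device=adj.device)   # add self-loops
        deg_inv_sqrt = a.sum(dim=1).clamp(min=1e-6).pow(-0.5)
        a_hat = deg_inv_sqrt.unsqueeze(1) * a * deg_inv_sqrt.unsqueeze(0)
        return torch.relu(a_hat @ self.linear(h))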
We propose a novel model that captures the dependencies between the anatomical regions of a chest X-ray for classification of the pathological findings, termed Anatomy-aware X-ray Network (AnaXNet). We first extract the features of the anatomical regions using an object detection model. We develop a method to accurately capture the correlations between the various anatomical regions and learn their dependencies with a GCN model. Finally, we combine the localized region features via attention weights computed with a non-local operation [26] that resembles self-attention.
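The sketch below strings these pieces together and illustrates one plausible reading of the pipeline: detector ROI features are refined by the GCNLayer defined in the previous sketch, weighted by a non-local (self-attention-like) operation in the spirit of [26], and classified per region. Layer sizes, the fusion by concatenation, and the default number of findings are assumptions rather than the authors' implementation.

import torch
import torch.nn as nn

class AnaXNetSketch(nn.Module):
    """Illustrative sketch of the pipeline described above: detector ROI
    features -> GCN over anatomical regions -> non-local (self-attention
    style) weighting -> per-region multi-label classification. Hyperparameters
    and the exact fusion are assumptions, not the authors' released code."""

    def __init__(self, feat_dim=1024, hidden_dim=512, num_findings=9):
        super().__init__()
        self.gcn1 = GCNLayer(feat_dim, hidden_dim)
        self.gcn2 = GCNLayer(hidden_dim, feat_dim)
        self.theta = nn.Linear(feat_dim, hidden_dim)   # query projection
        self.phi = nn.Linear(feat_dim, hidden_dim)     # key projection
        self.classifier = nn.Linear(2 * feat_dim, num_findings)

    def forward(self, roi_feats, adj):
        # roi_feats: (num_regions, feat_dim) features from the detection module
        g = self.gcn2(self.gcn1(roi_feats, adj), adj)  # region dependencies

        # Non-local operation: attention weights between all region pairs.
        attn = torch.softmax(self.theta(roi_feats) @ self.phi(roi_feats).t(), dim=-1)
        context = attn @ g

        # Concatenate local ROI features with context and classify per region.
        logits = self.classifier(torch.cat([roi_feats, context], dim=-1))
        return torch.sigmoid(logits)  # (num_regions, num_findings)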
The main contributions of this paper are summarized as follows: 1) we propose a novel multi-label CXR findings classification framework that integrates both global and local anatomical visual features and outputs accurate localization of clinically relevant anatomical regional levels for CXR findings, 2) we propose a method to automatically learn the correlation between the findings and the anatomical regions and 3) we conduct in-depth experimental analysis to demonstrate that our proposed AnaXNet model outperforms previous baselines and state-of-the-art models.
1. AnaXNet:
Fig. 1: Model overview. We extract anatomical regions of interest (ROIs) and their corresponding features, feed their vectors to a Graph Convolutional Network that learns their inter-dependencies, and combine the output with an attention mechanism, to perform the final classification with a dense layer. Note that throughout the paper, we use the term bounding box instead of ROI.
Fig. 2: Examples of the results obtained by our two best models. The overall chest X-ray image is shown alongside two anatomical regions. The predictions from the best-performing models are compared against the ground-truth labels.