1 Introduction

Hematolymphoid cancer is a primary cancer of the blood, bone marrow and lymphoid organs, and it is associated with a high mortality rate. Distinguishing healthy cells from tumor-affected ones is very challenging, since no standard method has yet been proposed for separating these cells into biological subtypes when diagnosing hematolymphoid tumors. Histopathological diagnosis of cancer depends on the topological structure and phenotyping of histological entities such as nuclei, tissue regions and cells. Characterizing cellular morphology through image-level, nuclei-level and entity-based analysis is becoming popular, but it remains a challenge for complex structures [1]. Graph representations built from nodes and edges help to analyze the network of cancer cells in more depth and to detect tumor areas more efficiently. Each nucleus in the original image is represented by a node, and edges represent cellular interactions that define the similarity between nodes.

1.1 Lymphoid Neoplastic Cells

This work focuses on three lymphoid neoplasms associated with hematolymphoid cancer: chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL), accelerated CLL/SLL (aCLL), and Richter transformation - diffuse large B-cell lymphoma (RT-DLBL), described below [1,2,3]:

  1. Chronic Lymphocytic Leukemia (CLL) is a low-grade B-lymphoid neoplasm that originates in lymphocytes (white blood cells) in the bone marrow and then spreads into the blood. This type of lymphoma grows slowly, and affected patients have mild symptoms in the initial stages.

  2. Accelerated Chronic Lymphocytic Leukemia (aCLL) is the aggressive variant of chronic lymphocytic leukemia (CLL). Its diagnosis is more challenging when working with biopsy specimens.

  3. Richter Transformation - Diffuse Large B-cell Lymphoma (RT-DLBL) is a rare, fast-growing variant that arises in lymphocytes. It becomes more aggressive as patients age.

In this paper, we present a framework based mainly on graph theory for understanding the Tumor Micro Environment (TME) in hematolymphoid cancer, using three neoplastic lymphomas: CLL, aCLL and RT. The rest of the paper is organized as follows: Sect. 2 reviews existing graph-based methods, Sect. 3 describes the proposed research methodology, Sect. 4 presents the implementation details, results, comparison and analysis, and Sect. 5 concludes the work and outlines future prospects.

2 Literature Review

Most graph-based approaches in digital pathology have been applied to the diagnosis of breast cancer. Understanding the Tumor Micro Environment through graph representations and entity-based analysis of cells, nuclei and tissues has not been explored much.

HACT-NET [10] handled the hierarchical structure and shape of tissue in the Tumor Micro Environment on the basis of histological entities such as nuclei, tissues and cells. The major limitation of this study is that it is so far restricted to breast cancer; it has yet to be extended to other cancer types and other imaging modalities. The Graph-Based Spatial Model [11] constructed topological tumor graphs to diagnose stromal phenotypes in melanoma. Its main limitation was the limited access to clinical cohorts of patients who had received immunotherapy.

Spatial Analysis [12] surveys methods for studying the spatial heterogeneity of cellular patterns in the Tumor Micro Environment. The survey concludes that automated, quantitative spatial analysis is required for large clinical cancer datasets that exhibit complex spatial patterns. Graph-based analysis [16] learns the complex relationships between cancer cells and components of the Tumor Micro Environment for diagnosing breast tumors, using a graph approach and mathematical morphology.

Cell Spatial Graph [9] decoded cellular and clonal phenotypes in cellular morphology and characterized spatial architectures. It used local and global graphs to profile the orchestration and interaction of cellular components, and it achieved a better understanding of intratumoral heterogeneity in digital pathology. However, the scheme was unable to handle complex architectures in critical neoplastic cases and did not integrate image-level and nuclei-level analysis. In this paper, we improve on [9] by integrating nuclei-level information.

3 Research Methodology

In this paper, we present a framework based on graph theory for the diagnosis of hematolymphoid cancer using three neoplastic lymphomas. This section describes the dataset and gives a brief description of the methods used (Fig. 1).

Fig. 1. Illustration of the proposed framework for the diagnosis of lymphocytic pathology images. (a) Cell feature extraction for local graphs on the basis of intensity, morphology and region. (b) Identification of healthy and tumor-affected local cells using Fuzzy C-Means clustering. (c) Super cell and local graph construction using a superpixel algorithm - Gaussian Mixture Model. (d) Cell graph construction and cancer diagnosis using Graph Neural Networks.

3.1 Proposed Framework

The proposed framework used in this research for the diagnosis of lymphocytic pathology images can be explained as follows:

  1. Cell feature extraction for local graphs on the basis of intensity, morphology and region. In this step, 24 features were extracted for each neoplastic cell. These features capture the morphological, regional and intensity-based patterns of the nuclei. Intensity-based features relate to mean, deviation, range and boundary. Morphological features relate to the shape and structure of nuclei in pathology slides, such as circularity, elliptical deviation, orientation, axis length, perimeter and equivalent diameter. Regional features relate to boundary saliency, mass displacement, solidity and weighted centroid. The Laplacian score method was used to eliminate redundant features. Nuclei were segmented in each image using multiple-pass adaptive voting and then overlaid on the original images. Finally, 10 features were selected for local graph construction.

  2. Identification of healthy and tumor-affected local cells using Fuzzy C-Means clustering. In this step, cells were clustered into tumor-affected and healthy cells for each neoplastic cell type, and a cell classifier model was built based on cell types. Results are illustrated in Fig. 2 in Sect. 4.

  3. Super cell and local graph construction using a superpixel algorithm - Gaussian Mixture Model. In this step, superpixel segmentation was performed at four scales: 8\(\,\times \,\)8, 14\(\,\times \,\)14, 20\(\,\times \,\)20 and 26\(\,\times \,\)26. At the finer scales (8\(\,\times \,\)8 and 14\(\,\times \,\)14), the highest number of superpixels was generated in CLL compared with the other neoplastic lymphomas. The segmentation results were overlaid on the original images based on the generated superpixels. The model then pools the supercell features and generates supercells, on the basis of which local graphs are constructed. Labelled super cells were generated and overlaid on the original images. Fuzzy C-Means clustering was then applied again to distinguish between healthy and tumor-affected global cells. The number of superpixels generated for each cell type was recorded. Results are illustrated in Fig. 3 in Sect. 4.

  4. Cell graph construction and cancer diagnosis using Graph Neural Networks. For global graph construction, the Python library Histocartography was used to analyze image-level and nuclei-level representations through entity-graph based analysis. This step is divided into three sub-steps:

    • (i) Constructing the cell graph from pathology slides - nodes, edges, features per node and patch-level nuclei were identified using a pretrained HoverNet model. The model generated the highest number of nodes and edges for CLL. To characterize the nuclei, global node features were extracted using ResNet with a patch size of 72 and a resize size of 224. To analyze intra- and inter-tumoral heterogeneity in nuclei through edges, K-Nearest Neighbor graphs were constructed with k=5. The resulting graph properties were the number of nodes (CLL-5543, aCLL-3579 and RT-2376), the number of edges (CLL-27715, aCLL-17645 and RT-11880) and the number of features extracted per node (514). Results are illustrated in Fig. 4 in Sect. 4.

    • (ii) Classification and analysis of cell graphs - To classify the tumor areas in the neoplastic cells RT, aCLL and CLL using cell graphs, a Graph Neural Network (GNN) was trained with a node dimension of 514 and 3 classes. The model assigned the highest relative node importance to CLL compared with aCLL and RT. To analyze the graph representation, a modified version of GraphGradCAM was integrated with the Graph Neural Network for feature attribution. Node importance was extracted for the neoplastic cells on the basis of the constructed cell graphs. Results are illustrated in Fig. 4 in Sect. 4.

    • (iii) Analysis of results - To understand the shape and size of nuclei and tumor cells in the images, a quantitative nuclei-level analysis was conducted together with a patch-level analysis for each cell type, assessing the importance values generated by the model for area, contrast and crowdedness.

    In the quantitative analysis, pathological facts were observed and importance scores were evaluated by the model for the neoplastic cells. The following pathological facts were observed:

    (i) “aCLL is bigger in size than CLL” - aCLL, an aggressive form of neoplastic lymphoma, is bigger in size than CLL, as reflected in the importance value (0.8033) of the Area feature.

    (ii) “RT is more solid in shape as compared to CLL that has gas bubbles” - as reflected in the importance value (1.3161) of the GLCM contrast feature.

    (iii) “RT is faster and grows rapidly as compared to aCLL” - as reflected in the importance value (1.2123) of the crowdedness feature.

    Image-level and nuclei-level analysis of the neoplastic cells was conducted to understand patch-level nuclei in the neoplastic lymphomas: the 20 most important nuclei were visualized first, followed by randomly selected nuclei. Results are illustrated in Fig. 4 in Sect. 4.
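
The K-Nearest Neighbor graph construction in sub-step (i) can be sketched as follows. This is a minimal plain-Python illustration, not the Histocartography implementation: the function name `knn_edges` and the toy centroids are hypothetical, and the sketch assumes directed edges from each nucleus to its k nearest neighbours under Euclidean distance. Under that assumption the edge count is k times the node count, which agrees with the reported CLL (5543 × 5 = 27715) and RT (2376 × 5 = 11880) figures.

```python
import math


def knn_edges(centroids, k=5):
    """Return directed edges (i, j) linking each nucleus i to its k nearest neighbours j."""
    edges = []
    for i, p in enumerate(centroids):
        # Euclidean distance from nucleus i to every other nucleus.
        dists = [(math.dist(p, q), j) for j, q in enumerate(centroids) if j != i]
        dists.sort()
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Hypothetical nuclei centroids (x, y) in pixels; real ones would come from nuclei detection.
centroids = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (10, 10)]
edges = knn_edges(centroids, k=2)
print(len(edges))  # k * number of nodes
```

The brute-force search is quadratic in the number of nuclei; for thousands of nuclei per slide a spatial index would normally replace the inner loop.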

3.2 Dataset Description

A dataset of digital pathology slides from The University of Texas MD Anderson Cancer Center (UTMDACC) was used in this research. It contains 20 digital pathology slides for each of the three neoplasms (CLL, aCLL and RT-DLBL), all from cases of hematolymphoid cancer. Each slide is associated with one patient.

3.3 Description of Methods Used in This Framework

Fuzzy C-Means clustering algorithm (FCM) - a soft clustering method that assigns each data point a likelihood-like degree of membership in each cluster rather than a single hard label. This algorithm is used to separate healthy cells from tumor-affected cells in CLL, aCLL and RT.
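
As an illustration of how soft memberships arise, the sketch below implements the standard fuzzy c-means updates in plain Python. It is a simplified example under stated assumptions, not the code used in this study: the function name `fuzzy_c_means`, the two-feature toy cells and the choice of fuzzifier m = 2 are all hypothetical.

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for c in range(n_clusters):
            w = [u[i][c] ** m for i in range(len(points))]
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Memberships follow from the relative distances to all centers.
        shift = 0.0
        for i, p in enumerate(points):
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = [1.0 / sum((d[c] / d[k]) ** (2.0 / (m - 1.0))
                                 for k in range(n_clusters))
                       for c in range(n_clusters)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


# Hypothetical per-cell feature vectors (e.g. two normalised features per cell).
cells = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centers, memberships = fuzzy_c_means(cells, n_clusters=2, m=2.0)
labels = [max(range(2), key=lambda c: row[c]) for row in memberships]
print(labels)
```

The fuzzifier `m` controls how soft the assignment is: values close to 1 approach hard clustering, while larger values spread membership more evenly across clusters.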

Gaussian Mixture Model (GMM) - a superpixel algorithm used to segment images into cellular regions and structural patterns. Its main properties are that it can handle pixels that are not identically distributed and that it uses eigendecomposition to define the covariance matrix [5]. The algorithm segments the images into superpixels, which are then used to form supercells, on the basis of which local graphs are constructed.
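
The superpixel method of [5] is more involved than can be shown here, but the mixture fitting it builds on can be illustrated with a small example. The sketch below runs expectation-maximisation for a two-component, one-dimensional Gaussian mixture on pixel intensities. It is a deliberately simplified stand-in and not the superpixel algorithm itself; the function names and the toy intensities are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # initialise the components at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk,
                1e-6,
            )
    return weights, means, variances


# Hypothetical grey-level intensities: dark nuclei pixels and bright background pixels.
intensities = [30, 35, 40, 32, 38, 200, 210, 205, 195, 215]
weights, means, variances = fit_gmm_1d(intensities)
print([round(mu) for mu in means])
```

Each pixel can then be assigned to the component with the larger responsibility, which is the basic idea behind using a mixture model for segmentation.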

Graph Neural Network - a type of artificial neural network that operates on graph-structured data. This architecture is well suited to analyzing interactions between cells, since cellular data does not lie on a regular grid. Here, nodes represent nuclei and edges represent interactions between cells.
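
To make the node and edge roles concrete, the sketch below shows one message-passing step with mean aggregation followed by a mean readout over nodes. It is an assumption-laden toy example rather than the architecture trained in this study: the function names, the identity weight matrices and the 2-dimensional node features are hypothetical (the framework itself uses 514 features per node).

```python
def message_passing_step(features, edges, w_self, w_neigh):
    """One layer: h_i' = ReLU(W_self h_i + W_neigh * mean of neighbour features)."""
    n, dim = len(features), len(features[0])
    neighbours = {i: [] for i in range(n)}
    for i, j in edges:
        neighbours[i].append(j)

    def matvec(w, v):
        # Multiply a weight matrix (list of rows) by a vector.
        return [sum(row[c] * v[c] for c in range(len(v))) for row in w]

    out = []
    for i in range(n):
        nbrs = neighbours[i]
        if nbrs:
            agg = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        else:
            agg = [0.0] * dim  # isolated node: no neighbour message
        h = [a + b for a, b in zip(matvec(w_self, features[i]), matvec(w_neigh, agg))]
        out.append([max(0.0, v) for v in h])  # ReLU
    return out


def mean_readout(features):
    """Graph-level embedding: the average of all node embeddings."""
    n = len(features)
    return [sum(f[d] for f in features) / n for d in range(len(features[0]))]


# Hypothetical graph: three nuclei with 2-D features, connected 0-1 and 1-2 in both directions.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = message_passing_step(features, edges, identity, identity)
graph_embedding = mean_readout(hidden)
print(hidden)           # [[1.0, 1.0], [1.0, 1.5], [1.0, 2.0]]
print(graph_embedding)  # [1.0, 1.5]
```

In a trained model the weight matrices are learned, several such layers are stacked, and the graph-level embedding is passed to a classifier over the three classes.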

4 Results and Discussion

In this section, we present the results of the proposed framework for diagnosing cancer using digital pathology and compare them with the Hierarchical Graph Modeling study of hematolymphoid cancer. Figure 2 shows the Fuzzy C-Means clustering results. Figure 3 shows the generated superpixels and the super cells constructed with local graphs. Figure 4 shows cell graph construction with node importance, along with image-level and nuclei-level visualization of the 20 most important nuclei.

Fig. 2. FCM clustering results of healthy (represented by blue color) versus tumor-affected local cells (represented by green color) in CLL, aCLL and RT. (Color figure online)

Fig. 3. (i) Gaussian Mixture Model - superpixel segmentation results at four scales (8\(\,\times \,\)8, 14\(\,\times \,\)14, 20\(\,\times \,\)20 and 26\(\,\times \,\)26) for CLL, aCLL and RT cells, with superpixel labelling shown in red. Number of superpixels generated at each scale: complex labelling in CLL with 20255 (8\(\,\times \,\)8), 8254 (14\(\,\times \,\)14), 3340 (20\(\,\times \,\)20) and 1788 (26\(\,\times \,\)26); moderate in aCLL with 17818 (8\(\,\times \,\)8), 6532 (14\(\,\times \,\)14), 3361 (20\(\,\times \,\)20) and 1907 (26\(\,\times \,\)26); and low in RT with 17035 (8\(\,\times \,\)8), 6033 (14\(\,\times \,\)14), 3187 (20\(\,\times \,\)20) and 1796 (26\(\,\times \,\)26). At the two finest scales, the highest number of superpixels was generated in CLL compared with the other neoplastic lymphomas. (ii) Super cells constructed from the segmented results with local graph construction (CLL, aCLL, RT) - moderate variation in CLL as compared to complex variation in aCLL and RT. (iii) Labelled super cells constructed with local graph construction (CLL, aCLL, RT) - better graphs constructed in aCLL and RT as compared to CLL. (iv) FCM clustering results of healthy (represented by blue color) versus tumor-affected super cells (represented by green color) in CLL, aCLL and RT.

Fig. 4. (i) Nuclei detection (CLL, aCLL, RT) - represented by purple circles with a black outline. (ii) KNN graph construction (CLL, aCLL, RT) - graphs drawn in blue with yellow labelling. (iii) Cell graph construction with node importance, represented by blue color (CLL, aCLL, RT) - high variation in CLL, moderate variation in RT and least variation in aCLL. (iv) Quantitative analysis of importance values - pathological facts were observed and importance scores were evaluated by the model for neoplastic cells. Image-level and nuclei-level analysis (CLL, aCLL, RT) - the 20 most important nuclei were visualized first, followed by randomly selected nuclei. (Color figure online)

4.1 Discussion and Comparison of This Study with Related Existing Hierarchical Graph Modeling Study

The results of the proposed framework are compared with those of the Hierarchical Graph Modeling study [22] and three other graph methods. The comparative analysis is provided in Table 1. The Hierarchical Graph Modeling study proposed a multi-scale framework for accurately examining tissue in lymphoid neoplasms, and it used the same lymphoid neoplastic cells as this study.

In that study, hierarchical phenotyping was integrated with graph modeling to characterize the spatial architecture of the TME using digital pathology [87]. The approach was able to decode the cellular and clonal hierarchy in the Tumor Micro Environment (TME). It also extracted both local and global cellular interactions and their intratumoral heterogeneity. Its major limitation was that it could not capture neoplastic cell characteristics in critical cases.

Compared with the Hierarchical Graph Modeling results, the fuzzy algorithm allowed the level of fuzziness to be controlled through the fuzziness parameter, which helped to better classify cell types at the local and global level in the TME. The Hierarchical Graph Modeling study instead used spectral clustering, which failed to capture cell characteristics at cluster boundaries. The Gaussian Mixture Model allowed better segmentation of tumor-affected areas; superpixels were then generated and local graphs were constructed. For global graph analysis, we first detected patch-level nuclei at the global level to measure intra- and inter-tumoral heterogeneity and obtained the graph properties. Then, using the constructed cell graphs and the Graph Neural Network, we measured the relative node importance for each cell.

Hierarchical graph modeling lacks integration of image-level and nuclei-level analysis. The model was also unable to capture complex spatial patterns in critical neoplastic cases. In the Hierarchical Graph Modeling study, Delaunay triangulation was used for global graph construction, which captured only edge and node information and not nuclei-level details at the global level. The three sub-step workflow in the fourth step of our framework addresses this problem by detecting patch-level nuclei at the global level.

Quantitative analysis of pathological facts helped to observe the shape and structure of the tumor through integrated image-level and nuclei-level analysis. The results show the effectiveness of the proposed approach for understanding and characterizing cellular morphology and spatial architecture at both the global and the local level. The integration of image-level and nuclei-level analysis has allowed better graph representations of pathology slides for the diagnosis of hematolymphoid cancer.

Table 1. Comparison with other graph based methods

5 Conclusion and Future Work

Using local and global graphs for cancer diagnosis with neoplastic lymphoid cells proved to be an efficient method for characterizing spatial architecture and gaining a better understanding of cellular morphology. Integrating this concept with Graph Neural Networks allowed a more detailed analysis of pathological images at the image and nuclei level. Image-level analysis helps to identify complex microscopic structural patterns around tumor cells, and nuclei-level analysis gives a deeper understanding of cellular architectures. Entity-based analysis of these graph representations makes it possible to extract deeper biological insights from tissue images. The hierarchical graph approach provides better results for cancer diagnosis, based on both pixel-based and entity-graph based classification of tumors in neoplastic cells. In future work, this approach can be applied to other cancer types, cells and tissues. We will also explore multiple instance learning (MIL) algorithms to investigate the tumor micro-environment [29]. The approach can also be implemented with other cancer-related imaging modalities apart from digital pathology.