This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE SYSTEMS JOURNAL


Automated Screening System for Acute Myelogenous Leukemia Detection in Blood Microscopic Images Sos Agaian, Senior Member, IEEE, Monica Madhukar, and Anthony T. Chronopoulos, Senior Member, IEEE

Abstract—Acute myelogenous leukemia (AML) is a subtype of acute leukemia, which is prevalent among adults. The average age of a person with AML is 65 years. The need for automation of leukemia detection arises since current methods involve manual examination of the blood smear as the first step toward diagnosis. This is time-consuming, and its accuracy depends on the operator’s ability. In this paper, a simple technique that automatically detects and segments AML in blood smears is presented. The proposed method differs from others in: 1) the simplicity of the developed approach; 2) classification of complete blood smear images as opposed to subimages; and 3) use of these algorithms to segment and detect nucleated cells. Computer simulation involved the following tests: comparing the impact of Hausdorff dimension on the system before and after the influence of local binary pattern, comparing the performance of the proposed algorithms on subimages and whole images, and comparing the results of some of the existing systems with the proposed system. Eighty microscopic blood images were tested, and the proposed framework managed to obtain 98% accuracy for the localization of the lymphoblast cells and to separate it from the subimages and complete images. Index Terms—Acute myelogenous leukemia (AML), classification, feature extraction, segmentation.

I. INTRODUCTION

WHITE blood cells (WBCs), or leukocytes, play a major role in the diagnosis of different diseases; as a result, extracting information about them is valuable for hematologists [50]. The term leukemia comes from the Greek words “leukos,” meaning “white,” and “haima,” meaning “blood.” It refers to cancer of the blood or the bone marrow (where blood cells are produced). The diagnosis of leukemia is based on the fact that the white cell count is increased with immature blast cells (lymphoid or myeloid), whereas neutrophils and platelets are decreased [41]. Therefore, hematologists routinely examine blood smears under a microscope for proper identification and classification of blast

Manuscript received October 22, 2012; revised February 7, 2014; accepted February 15, 2014. This work was supported by the U.S. National Science Foundation under Grant HRD-0932339 given to The University of Texas at San Antonio. S. Agaian is with the Department of Electrical and Computer Engineering, College of Engineering, University of Texas at San Antonio, San Antonio, TX 78249 USA (e-mail: [email protected]). M. Madhukar is with Intrinsic Imaging LLC, San Antonio, TX 78229 USA. A. T. Chronopoulos is with the Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249 USA (e-mail: anthony. [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSYST.2014.2308452

cells [40]. The presence of an excess number of blast cells in peripheral blood is a significant symptom of leukemia. Leukemia is broadly classified as: 1) acute leukemia (which progresses quickly); and 2) chronic leukemia (which progresses slowly) [11]. Acute myelogenous leukemia (AML) is a heterogeneous clonal disorder of haemopoietic progenitor cells (“blasts”), which lose the ability to differentiate normally and to respond to normal regulators of proliferation. In the absence of treatment, this loss typically leads to fatal infection, bleeding, or organ infiltration within a year of diagnosis. AML is confirmed when the marrow contains more than 30% blasts. In this paper, only AML is considered. AML is a fast-growing cancer of the blood and bone marrow. It is fatal if left untreated, due to its rapid spread into the bloodstream and other vital organs [1]. Furthermore, AML is the most common myeloid leukemia, with a prevalence of 38 cases per 100 000 rising to 179 cases per 100 000 adults aged 65 years and older [49]. AML also makes up 15–20% of childhood leukemia; roughly 60% of cases occur in people aged younger than 20 years. About 500 children and adolescents in the U.S. are affected by AML each year [46], [49]. Survival in childhood acute lymphoblastic leukaemia is approaching 90%, but treatment in infants (i.e., children younger than 12 months) and adults needs improvement [46]. Early diagnosis of the disease is fundamental for the recovery of patients, particularly in the case of children [1]. AML is often difficult to diagnose since its precise cause is still unknown. In addition, the symptoms of the disease, such as fever, weakness, tiredness, or aches in bones or joints, are very similar to those of flu or other common diseases [1].
If the described symptoms are present, blood tests, such as a full blood count, renal function and electrolytes, and liver enzyme and blood count, have to be done [1]. Since there is no staging for AML, the choice of treatment varies among chemotherapy, radiation therapy, bone marrow transplant, and biological therapy [46]. Fig. 1 shows six different images, three depicting myeloblasts from AML patients and three depicting healthy cells from non-AML patients. Diagnosis by manual examination of the blood smear greatly depends on the operator’s ability and fatigue level. Diagnostic confusion also occurs because other disorders imitate similar signs [13]. Moreover, the identification task is usually difficult due to the variety of features, and the often unclear images can cause vital indicators of the form of leukemia being observed to be missed. Furthermore, due

1932-8184 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 1. Images from ASH. (a)–(c) Myeloblasts from AML patients. (d)–(f) Healthy cells from non-AML patients.

to the complex nature of blood smear images and variation in slide preparation techniques, much work has to be done to meet real clinical demands. Thus, these factors can lead to a wrong diagnosis. In the past, digital image processing techniques have helped to analyze the cells, leading to more accurate, standard, and remote disease diagnosis systems. However, there are a few complications in extracting the data from WBCs due to the wide variation of cells in shape, size, edge, and position [50]. Moreover, since illumination is unbalanced, the image contrast between cell boundaries and the background varies depending on the conditions during the capturing process [52]. Additionally, the cost of leukemia treatment can be overwhelming. The average cost of just one round of chemotherapy is $15 000. Unfortunately, many patients require several rounds of chemotherapy to recover. The National Cancer Institute’s investment in leukemia research increased from $223.5 million in fiscal year (FY) 2006 to $239.7 million in FY 2010. The proposed system is intended as an effective tool that assists physicians in decision-making. Many attempts have been made in the past to construct systems that aid in acute leukemia segmentation and classification [1]–[16], [18]–[22], [30], [34]–[36], [58], [59]. There are four main categories of segmentation techniques: thresholding techniques, boundary-based segmentation, region-based segmentation, and hybrid techniques that combine boundary and region criteria [42]. When it comes to peripheral blood or bone marrow smears, region-based or edge-based schemes are the most popular [43]. A proper combination of both boundary and region information may give better results than those obtained by either method on its own [42]. From a survey conducted by Ilea and Whelan [45] on color image segmentation algorithms, it was concluded that color images yield more reliable segmentations than gray-level images.
Many segmentation algorithms have been presented in the literature, including [17], [33], and [57], where Otsu segmentation and automated histogram thresholding were employed to segment WBCs from the blood smear image. The work in [9] employed a contour signature to identify the irregularities in the nucleus boundary. The work in [8] employed selective filtering to segment leukocytes from the other blood components. The work in [36] employed the hue, saturation, and value (where hue represents color, saturation indicates the range of gray in the color


space, and value is the brightness of the color and varies with color saturation) color space and the expectation–maximization algorithm (which consists of two steps, i.e., expectation and maximization steps) to identify the cytoplasm and nucleus of the WBCs. A watershed segmentation algorithm to segment the nucleus from the surrounding cytoplasm of cervical cancer images was proposed by Nallaperumal and Krishnaveni [10]. The work in [44] presented an unsupervised color segmentation to bring out the WBCs from acute leukemia images. A common drawback observed in these systems is that they classify only subimages. The goal of this paper is to implement a fully automated classifier system for AML. The constructed system is applied to complete blood smear images containing multiple nuclei. Two new features, namely cell energy and Hausdorff dimension (HD), have been used. The result is then compared with the results of the existing models. This paper is structured as follows. Section II gives a process overview of the proposed model. Sections III and IV describe the image processing methods used to perform enhancement and segmentation. Section V builds the database of feature vectors, which are used by the classification component of the system. Sections VI and VII present the experimental results of the classifier system based on the features extracted. Section VIII contains conclusions and future work.

II. PROCESS OVERVIEW

The proposed system processes the images step by step. Fig. 2 depicts the system overview, which shows the sequence of steps to be followed for efficient classification of acute leukemia. The first step involves preprocessing the complete images to overcome any background nonuniformity due to irregular illumination. Preprocessing also includes color correlation, where RGB images are converted to L∗a∗b∗ color space images. This step ensures perceptual uniformity.
This step is followed by k-means clustering to bring out the nucleus of each cell. Segmentation is followed by feature extraction, based on which classification and validation are performed.

III. PREPROCESSING

Image Acquisition: For AML, we accessed the online image bank of leukemia cells of the American Society of Hematology (ASH). The ASH image bank is a web-based image library that offers comprehensive and growing collections of images relating to a wide range of hematology categories. It provides high-quality images captured using different microscopes at different resolutions. Our database for AML comprised 80 images: 40 from AML patients and 40 from non-AML patients. The resolution used for our classification was 184 × 138 pixels.

CIELAB Color Features and Color Correlation: The images generated by digital microscopes are usually in RGB color space, which is difficult to segment. In practice, the blood cells and image background vary greatly with respect to color and intensity. This can be caused by multiple factors such as camera settings, varying illumination, and aging stain. In order to make the cell segmentation robust with respect to these


Fig. 2. AML classifier system overview.

Fig. 3. RGB to CIELAB conversion and segmentation example.

variations, an adaptive procedure is used: the RGB input image is converted into the CIELAB or, more correctly, the CIE L∗a∗b∗ color space [38], [39]. There are several reasons for this choice. First, it reduces the memory requirement and improves the computational time. Second, the perceptual difference between colors is proportional to the Cartesian distance in the CIELAB color space; therefore, the color difference between two samples can be calculated using the Euclidean distance. Third, it has only two color components (a∗ and b∗), and it is designed to approximate human vision: the L∗ component closely matches human perception of lightness and can be used to adjust the lightness contrast. Finally, the a∗ and b∗ components can be used to make accurate color balance corrections. In other words, the L∗a∗b∗ color space has dimension L∗, which represents the lightness of the color; dimension a∗, which represents its position between red/magenta and green; and dimension b∗, which represents its position between yellow and blue. Due to its perceptual uniformity, L∗a∗b∗ produces a visually proportional change for a change of the same amount in color value. This ensures that every minute difference in the color value is noticed visually. Device independence is an added advantage of the L∗a∗b∗ color space. Fig. 3 presents the result of the RGB to CIELAB color conversion and segmentation procedure.

IV. NUCLEI SEGMENTATION

The goal of image segmentation is to extract important information from an input image. It plays a key role since the efficiency of subsequent feature extraction and classification relies greatly on the correct identification of the myeloblasts. Many algorithms for segmentation have been developed for gray-level images [51], [54]. Segmentation in this system is performed for extracting the nuclei of the leukocytes using color-based clustering. Cluster analysis is the formal study of methods and algorithms for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. k-means, which is one of the most popular unsupervised learning algorithms and is also a simple clustering algorithm, was first published in 1955. k-means is still widely used. This speaks to the difficulty of designing a general-purpose clustering algorithm and the ill-posed nature of the clustering problem [49], [53]. In this paper, we chose clusters corresponding to nucleus (high saturation), background (high luminance and low saturation), and other cells (e.g., erythrocytes and leukocyte cytoplasm). Here, every pixel is assigned to one of these classes using the properties of the cluster center.

k-Means Clustering Algorithm: The k-means algorithm requires three user-specified parameters: the number of clusters k, the cluster initialization, and the distance metric. A k-means clustering procedure is used to assign every pixel to one of the clusters. Each pixel of an object is classified into one of the k clusters based on the corresponding a∗ and b∗ values in the L∗a∗b∗ color space. That is, each pixel is assigned to a cluster by calculating the Euclidean distance between the pixel and each color indicator.
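The color conversion and clustering steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes 8-bit sRGB input and a D65 reference white (the paper does not state either), and the function names `srgb_to_lab`, `kmeans`, and `cluster_pixels` as well as the evenly spaced initialization are hypothetical choices made for the example.

```python
import math


def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white assumed)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ, each channel normalised by the D65 white point.
    x = (0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl) / 0.95047
    y = (0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl) / 1.00000
    z = (0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl) / 1.08883

    def f(t):
        d = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > d ** 3 else t / (3 * d * d) + 4.0 / 29.0

    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def kmeans(points, k, iterations=20):
    """Plain k-means with Euclidean distance on a list of equal-length tuples."""
    # Deterministic initialisation (an assumption): k evenly spaced samples.
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers


def cluster_pixels(rgb_pixels, k=3):
    """Cluster pixels on their (a*, b*) values only, ignoring L*."""
    ab = [srgb_to_lab(*px)[1:] for px in rgb_pixels]
    return kmeans(ab, k)
```

With k = 3, the three labels would play the roles of nucleus, background, and other cells; which label corresponds to the nucleus still has to be decided from the cluster centers.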
Each pixel of the entire image is labeled with a particular color depending on the minimum distance from each indicator. We consider only the cluster that contains the blue nucleus, which is required for the feature extraction. While performing k-means segmentation of complete images, it was observed that, in some of the segmented images, only the edges of the nuclei were obtained as opposed to the whole nuclei. This shortcoming was overcome by employing morphological filtering [1]. An image is partitioned into several regions depending on the features to be extracted. Employing morphological filtering ensures that perceptibility


Fig. 5. Superimposing of the nucleus with a grid of squares for box count measure.

Fig. 4. Feature set developed for the proposed system, comprising shape, color, and texture features and HD.

and visibility of these regions improve. The following operations were performed in order to obtain the desired outcome: 1) edge enhancement using the Sobel operator, to enhance the borders of the membranes and the cells (this helps in segmenting grouped cells and in subsequent edge detection [24], [25]); 2) Canny edge detection, to obtain outputs with continuous edges, in general [26], [37], [42], [60]; 3) dilation, to better connect the separated points of the membrane (it gives a good outline of the perimeter of the nuclei [35]; in this paper, we use a 2 × 2 structuring element for dilation); and 4) hole filling, to fill internal holes of the connected element having the largest area. Once these operations are performed, the texture- and shape-based features are extracted from the whole images.

V. FEATURE EXTRACTION

Feature extraction in image processing is a technique of redefining a large set of redundant data into a set of features of reduced dimension. Feature selection greatly influences the classifier performance; therefore, a correct choice of features is a crucial step. In order to construct an effective feature set, several published articles were studied, and their feature selection methodology was observed. It was noted that certain features were widely used as they gave a good classification. We implemented these features on whole images in our system. Those features were considered to boost the classifier performance. Fig. 4 gives the set of features chosen to classify the image database.

HD: Fractals have been used in medicine and science in the past for various quantitative measurements [27], [28]. The fractal dimension D is a statistical quantity that gives an indication of how completely a fractal appears to fill space. There are many specific definitions of fractal dimension.
The most important theoretical fractal dimensions are the Rényi dimension, the HD, and the packing dimension. Practically, the box-counting dimension is widely used, partly due to its ease of implementation. In a box-counting algorithm, the number of boxes covering the point set is a power-law function of the box size. The fractal dimension is estimated as the exponent of such a power law. All fractal dimensions are real numbers that characterize the fractalness (texture/roughness) of the objects. Myeloblasts can be differentiated using the perimeter roughness of the nucleus. HD is therefore considered an essential feature in our proposed system. The procedure for HD measurement using the box-counting method is as follows: 1) a binary image is obtained from the gray-level image of the blood sample; 2) an edge detection technique is employed to trace out the nucleus boundaries; 3) the edges are superimposed by a grid of squares; 4) the HD may then be defined as follows:

$$\mathrm{HD} = \frac{\log(R)}{\log(R(s))} \tag{1}$$

where R is the number of squares in the superimposed grid, and R(s) is the number of occupied squares or boxes (box count). A higher HD signifies a higher degree of roughness. Fig. 5 illustrates the given algorithm. It shows how the nucleus from a noncancer cell is superimposed with a grid of squares to perform suitable box counting. The finer the grid gets, the more accurately the shape is approximated. Table I depicts the results of HD on subimages and complete images. It can be clearly observed that, for subimages, there is only a marginal difference in the HD value, whereas there is a distinct difference for complete images. Thus, HD turned out to be a crucial feature in our system, particularly since we considered whole images of the blood sample. In the whole images, the number of nuclei in the field of view was much higher for a cancerous case than for a noncancerous case. This resulted in a steep difference in box count between the two cases and thereby proved to be an effective feature.

LBP: The concept of local binary pattern (LBP) was introduced for texture classification [31], [47], [48]. This approach has many advantages. For example, the LBP texture features have the following characteristics: 1) they are robust against illumination changes; 2) they are very fast to compute; 3) they do not require many parameters to be set; 4) they are local


TABLE I RESULTS OF HD

Fig. 6. Circles.

features; 5) they are invariant with respect to monotonic grayscale transformations and scaling; and 6) they have performed very well in many computer vision and image retrieval applications. The LBP method has been shown to outperform many existing methods, including linear discriminant analysis and principal component analysis. In order to deal with textures at different scales, the LBP operator was later extended to use neighborhoods of different sizes. Defining the local neighborhood as a set of sampling points evenly spaced on a circle centered at the pixel to be labeled allows any radius and number of sampling points. When a sampling point does not fall in the center of a pixel, bilinear interpolation is employed. In the LBP method, each pixel is replaced by a binary pattern that is derived from the pixel’s neighborhood. Each grayscale pixel P of an image is used as the center of a circle with radius R = 1 or 2 (radius R is usually kept very small). M represents the number of sample points that are taken uniformly from the contour of the circle. If needed, these points are interpolated from adjacent pixels. Each grayscale pixel P is compared with these sample points one by one. If the center point P is larger than the current neighborhood sample point I, the result is a binary zero; otherwise, the result is a binary one. When this operation is performed, for example, clockwise from a certain starting point, the result is a binary pattern of length M. This operation is illustrated in Fig. 6.
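The comparison rule above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the eight sample points are taken directly from the 3 × 3 pixel neighborhood (so the diagonal samples are not interpolated onto the radius-1 circle), the pattern is read clockwise starting at the top-left neighbor, and border pixels are skipped. The function name `lbp_3x3` is hypothetical.

```python
def lbp_3x3(image):
    """Return LBP codes for the interior pixels of a 2-D grayscale image.

    `image` is a list of equal-length rows of numbers. The result has two
    fewer rows and columns than the input because borders are skipped.
    """
    # Clockwise neighbour offsets (row, col), starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row_codes = []
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for dr, dc in offsets:
                # Bit is 0 when the center is larger than the sample, else 1.
                bit = 0 if center > image[r + dr][c + dc] else 1
                code = (code << 1) | bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

Each code lies between 0 and 255; a histogram of the codes, or the code image itself, can then serve as the texture description.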

Fig. 7. LBP operator example.
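Since the HD is computed on the edge image by box counting, a short sketch of that step may be useful here. It is an illustration only: it uses the common multi-scale estimator (the least-squares slope of log N(s) against log(1/s), where N(s) is the number of occupied boxes of side s) rather than the single-grid ratio of (1), and the function names and the default box sizes are assumptions made for the example.

```python
import math


def box_count(binary, size):
    """Count size x size boxes that contain at least one nonzero pixel."""
    height, width = len(binary), len(binary[0])
    occupied = 0
    for top in range(0, height, size):
        for left in range(0, width, size):
            if any(binary[r][c]
                   for r in range(top, min(top + size, height))
                   for c in range(left, min(left + size, width))):
                occupied += 1
    return occupied


def box_counting_dimension(binary, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of a binary image (list of rows)."""
    counts = [box_count(binary, s) for s in sizes]
    if min(counts) == 0:
        raise ValueError("image contains no foreground pixels")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(n) for n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of ys against xs.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

A filled region gives an estimate close to 2 and a straight line an estimate close to 1; rougher boundaries fall in between.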

For our database of images, an (8, 1) circular neighborhood was used. The segmented images were extracted using k-means clustering, and then the LBP operator was applied to them before calculating the HD (see Fig. 7). Two sets of values were extracted: first, the HD of the 80 images without applying LBP, and second, the HD of the images after applying LBP. When comparing these two data sets, it was observed that the LBP operator enhanced the overall performance by a very high margin. Additionally, the following features have also been chosen in our classification system: shape, gray-level cooccurrence matrix (GLCM), and color features [2]–[8]. The choice of features was justified by extensive computer simulations in order to identify the ones that yielded maximum discrimination capability, thus achieving the optimal diagnostic performance. The diagnostic performance of some new individual features selected in this paper will be analyzed in the following.

Shape features. One of the shape features that has proven to be a good measure for classifying AML by shape is compactness [2]–[8]. The shape of the nucleus, according to haematologists, is an essential feature for discrimination


TABLE II SHAPE FEATURES WITH THEIR ILLUSTRATIONS

of myeloblasts. Region- and boundary-based shape features are extracted for shape analysis of the nucleus. All the features are extracted from the binary-equivalent image of the nucleus where the nucleus region is represented by the nonzero pixels. Table II displays the difference in the values of the shape features for a pair of cancer and noncancer nuclei. GLCM features [23], [24]. Texture is defined as a function of the spatial variation in pixel intensities [1], [10]. The GLCM and associated texture feature calculations are image analysis techniques. Gray-level pixel distribution can be described by second-order statistics such as the probability of two pixels having particular gray levels at particular spatial relationships. This information can be depicted in 2-D gray-level cooccurrence matrices, which can be computed for various distances and orientations. In order to use information contained in the GLCM, Haralick [23] defined some statistical measures to extract

Fig. 8. Cell energy plot.

Fig. 9. Standard deviation plot.

textural characteristics. Some of these features are the following. 1) Energy: Also known as uniformity (or angular second moment, ASM), it is a measure of the homogeneity of an image. 2) Contrast: The contrast feature is a difference moment of the regional cooccurrence matrix and is a measure of the contrast or the amount of local variations present in an image. 3) Entropy: This parameter measures the disorder of an image. When the image is not texturally uniform, entropy is very large. 4) Correlation: The correlation feature is a measure of regional-pattern linear dependence in the image.

Color features. In addition to the aforementioned features, we have used the following color-based feature. 1) Cell Energy: Also known as the measure of uniformity, it is computed on the different Lab image components. We define the feature “δ” to be

$$\delta = \left( \sum_{i} \sum_{j} P^{2}(i,j) + \left(\sqrt{-1}\right) \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}{n-1}} \right) \tag{2}$$

where $\bar{x} = \sum_{i=1}^{n} x_i / n$, P(i, j) represents the element in the ith row and jth column of the normalized GLCM, and $\sum_{i} \sum_{j} P^{2}(i,j)$ represents the ASM. Fig. 8 shows the plot that indicates the margin by which this feature differentiates cancerous from noncancerous images. Fig. 9 depicts the standard deviation plot for the given database of images.
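The four GLCM statistics listed above can be sketched as follows. This is a minimal illustration under assumptions not stated in the paper: the image is already quantized to integer gray levels 0..levels-1, a single non-symmetric offset (one pixel to the right by default) is used, and the entropy uses a base-2 logarithm. The function names `glcm` and `texture_features` are hypothetical.

```python
import math


def glcm(image, levels, dr=0, dc=1):
    """Normalised, non-symmetric GLCM for one pixel offset (dr, dc).

    Assumes the image is larger than the offset so at least one pair exists.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Energy (ASM), contrast, entropy, and correlation of a normalised GLCM."""
    idx = range(len(p))
    cells = [(i, j, p[i][j]) for i in idx for j in idx]
    energy = sum(v * v for _, _, v in cells)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mean_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mean_j) ** 2 * v for _, j, v in cells)
    if var_i > 0 and var_j > 0:
        cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        correlation = float("nan")  # undefined for a constant image
    return {"energy": energy, "contrast": contrast,
            "entropy": entropy, "correlation": correlation}
```

In practice, matrices for several distances and orientations are usually computed and their features averaged or concatenated.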


VI. COMPUTER SIMULATION

The selection of a classification technique is a challenging problem because an appropriate choice given the available data can significantly improve accuracy. There are plenty of statistical techniques that aim at solving binary classification tasks. In this paper, we use a support vector machine (SVM) for constructing a decision surface in the feature space that bisects the two categories, i.e., cancerous and noncancerous, and maximizes the margin of separation between the two classes of points. The SVM is a promising nonlinear nonparametric classification technique, which has already shown good results in medical diagnostics, optical character recognition, electric load forecasting, and other fields [31], [56]. Moreover, the SVM is a powerful state-of-the-art algorithm with strong theoretical foundations based on the Vapnik–Chervonenkis theory and with strong regularization properties. Regularization refers to the generalization of the model to new data. Much of the initial success of SVMs was attributed to the so-called kernel trick, wherein training data are implicitly mapped to a high-dimensional feature space, and a margin-maximizing linear classifier is learned in this mapped space [31]. An SVM is primarily a two-class classifier. It can be either linear or nonlinear. In this paper, we choose a linear two-class SVM classifier because it is not computationally expensive, it does not employ the kernel trick explicitly, and it achieves, in general, a good performance [31], [55], [56]. Following the classification using the SVM, a statistical method called cross-validation is used for evaluating and comparing learning algorithms. Cross-validation is a technique for judging how the results of a statistical analysis will generalize to an independent data set.
It is mainly used in settings where the goal is prediction and where one wants to estimate how precisely a predictive model will perform in practice. Three kinds of validation techniques have been used: K-fold, holdout, and leave-one-out [29]. Hold-out: The data are split into two nonoverlapped parts: one for testing and the other for training. The test data are held out and not looked at during training. K-fold: In K-fold cross-validation, the data are first partitioned into K equally (or nearly equally) sized sets or folds. Leave-one-out: Leave-one-out cross-validation is a special case of K-fold cross-validation where K represents the number of instances in the data. Performance evaluation: In order to ensure the effectiveness of the presented system, we employed certain measures based on which decisions were made. Precision, specificity, sensitivity, and f-measure: These are all defined in relation to the possible outcomes of the classifier system. When attempting to classify a specimen, there are four possibilities: true positives (TP), when cancer cells are correctly identified; false positives, when noncancer cells are identified as cancerous; true negatives, when noncancer cells are correctly identified; and false negatives, when cancer cells are identified as noncancerous. Table III summarizes these parameters.
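The interplay between the linear SVM and K-fold validation described in this section can be sketched as follows. This is an illustrative stand-in rather than the classifier used in the paper: the SVM is trained by plain batch subgradient descent on the regularized hinge loss, the folds are contiguous and unshuffled, and the function names and hyperparameters (`lam`, `lr`, `epochs`) are assumptions chosen for the example.

```python
def k_fold_splits(n_samples, k):
    """Yield (train, test) index lists for k contiguous, nearly equal folds.

    No shuffling is done here; data ordered by class should be shuffled first.
    """
    start = 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss.

    `labels` must be +1 or -1. Returns the weight vector and the bias.
    """
    dim, n = len(samples[0]), len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # sample lies inside the margin
                for d in range(dim):
                    grad_w[d] -= y * x[d] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def cross_validate(samples, labels, k):
    """Fraction of correctly classified test samples over k folds."""
    correct = 0
    for train, test in k_fold_splits(len(samples), k):
        w, b = train_linear_svm([samples[i] for i in train],
                                [labels[i] for i in train])
        correct += sum(predict(w, b, samples[i]) == labels[i] for i in test)
    return correct / len(samples)
```

Calling `cross_validate` with k equal to the number of samples gives leave-one-out validation, and taking a single (train, test) pair from `k_fold_splits` corresponds to a hold-out split.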


TABLE III PARAMETERS FOR PERFORMANCE EVALUATION
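For reference, the measures named in this section can be computed from the four outcome counts as sketched below. The formulas are the standard textbook definitions and are assumed, not confirmed, to match the entries of Table III; the function name `evaluation_measures` is hypothetical, and the sketch assumes that no denominator is zero.

```python
def evaluation_measures(tp, fp, tn, fn):
    """Standard confusion-matrix measures from the four outcome counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

For instance, hypothetical counts of 38 true positives, 1 false positive, 39 true negatives, and 2 false negatives (chosen for illustration, not taken from the paper) give a sensitivity of 38/40 and a specificity of 39/40.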

Fig. 10. Performance of HD with and without employing LBP.

VII. EXPERIMENTAL RESULTS

The proposed technique has been applied to peripheral blood smear images obtained from two places, as aforementioned. To evaluate the proposed method, the four measures defined in Section VI (precision, specificity, sensitivity, and f-measure) were used. A microscopic blood image of size 184 × 138 is considered for evaluation. The superiority of the scheme is demonstrated with the help of an experiment. Feature extraction with and without the LBP operator presented very interesting results. The system constructed without the LBP operator gave an efficiency of 93.5%. All three validation methods were incorporated into our system. Applying LBP before computing the HD, in particular, increased the classifier performance by 4%. By employing LBP, the edges of the nuclei of the myeloblasts were extracted in a very pronounced manner. This effective edge detection enhanced the HD, as the box count for AML images was much higher than the box count for non-AML images. To see the impact of HD in the feature set, the classifier was run with HD as the only feature. This was done twice, once with the LBP operator and once without it. All the parameters for evaluation were extracted for both sets. The results are illustrated in Fig. 10. It was observed that, when LBP was not employed, the HD performance was only around 70%, whereas when LBP was employed, the percentage rose to 93%. This clearly shows the influence of the LBP operator on the system. In order to see the effectiveness of the developed algorithms, a trial was run comparing the system’s performance on subimages and whole images. The obtained results further corroborated the impact of the LBP operator.


Fig. 11. Overall classifier performance with and without LBP code.

Fig. 12. Accuracy of existing models versus proposed system while employing subimages and similar processing techniques.

Fig. 11 illustrates the overall classifier performance for different validation techniques on both subimages and whole images. The presented system not only enables classification of whole images but also performs better on subimages than some of the existing systems. A set of four existing systems that employ the same color correlation, segmentation, and classification techniques as the proposed system was taken into consideration. The experimental results compared: 1) the proposed system and the existing systems; 2) the performance of HD on the presented system before and after the influence of LBP; and 3) the impact of LBP on the performance of the overall algorithm on subimages and complete images. The analysis demonstrates that the developed algorithm achieves an accuracy of 98%, thereby providing an effective and reliable means of classifying AML. The algorithms, which were originally developed to segment whole images, were also applied to subimages, and the results showed that they still achieve higher accuracy than other systems employing similar techniques. Thus, the adopted system gives excellent classification of both subimages and complete images. Fig. 12 depicts the study that showcases the algorithms and the accuracy obtained by each of the systems.

VIII. CONCLUSION AND FUTURE WORK

This paper has reported the design, development, and evaluation of an automated screening system for AML in blood microscopic images. It uses 80 high-quality images of size 184 × 138 obtained from the American Society of Hematology [32]. The presented system performs automated processing, including color correlation, segmentation of the nucleated cells, and effective validation and classification. A feature set exploiting the shape, color, and texture parameters of a cell is constructed to obtain all the information required to perform efficient classification. The HD computed after applying the LBP operator proved to be a promising feature for this analysis. Furthermore, a color feature called cell energy was introduced, and results show that this feature provides a good demarcation between cancer and noncancer cells. Further research will focus on collecting more samples to yield better performance and on building an overall system for cancer classification.

ACKNOWLEDGMENT

The authors would like to thank the American Society of Hematology for providing a high-quality image database.

REFERENCES

[1] F. Scotti, “Automatic morphological analysis for acute leukemia identification in peripheral blood microscope images,” in Proc. CIMSA, 2005, pp. 96–101.
[2] V. Piuri and F. Scotti, “Morphological classification of blood leucocytes by microscope images,” in Proc. CIMSA, 2004, pp. 103–108.
[3] M. Subrajeet, D. Patra, and S. Satpathy, “Automated leukemia detection in blood microscopic images using statistical texture analysis,” in Proc. Int. Conf. Commun., Comput. Security, 2011, pp. 184–187.
[4] H. Ramoser, V. Laurain, H. Bischof, and R. Ecker, “Leukocyte segmentation and classification in blood-smear images,” in Proc. IEEE EMBS, 2006, pp. 3371–3374.
[5] C. Reta, L. Altamirano, J. A. Gonzalez, R. Diaz, and J. S. Guichard, “Segmentation of bone marrow cell images for morphological classification of acute leukemia,” in Proc. 23rd FLAIRS, 2010, pp. 86–91.
[6] G. Ongun, U. Halici, K. Leblebicioglu, V. Atalay, M. Beksac, and S. Beksac, “Feature extraction and classification of blood cells for an automated differential blood count system,” in Proc. IJCNN, 2001, vol. 4, pp. 2461–2466.
[7] S. Mohapatra and D. Patra, “Automated leukemia detection using Hausdorff dimension in blood microscopic images,” in Proc. Int. Conf. Emerg. Trends Robot Commun. Technol., 2010, pp. 64–68.
[8] S. Mohapatra, S. Samanta, D. Patra, and S. Satpathi, “Fuzzy based blood image segmentation for automated leukemia detection,” in Proc. ICDeCom, 2011, pp. 1–5.
[9] S. Mohapatra, D. Patra, and S. Satpathi, “Image analysis of blood microscopic images for acute leukemia detection,” in Proc. IECR, 2010, pp. 215–219.
[10] S. Mohapatra, D. Patra, and S. Satpathi, “Automated cell nucleus segmentation and acute leukemia detection in blood microscopic images,” in Proc. ICSMB, 2010, pp. 49–54.
[11] MedlinePlus: Leukemia. National Institutes of Health. [Online]. Available: http://www.nlm.nih.gov/medlineplus/ency/article/001299.htm
[12] J. N. Jameson, L. K. Dennis, T. R. Harrison, E. Braunwald, A. S. Fauci, S. L. Hauser, and D. L. Longo, “Harrison’s principles of internal medicine,” JAMA, vol. 308, no. 17, pp. 1813–1814, Nov. 2012.
[13] S. Serbouti, A. Duhamel, H. Harms, U. Gunzer, J. Mary, and R. Beuscart, “Image segmentation and classification methods to detect leukemias,” in Proc. Int. Conf. IEEE Eng. Med. Biol. Soc., 1991, pp. 260–261.
[14] D. Foran, D. Comaniciu, P. Meer, and L. A. Goodell, “Computer-assisted discrimination among malignant lymphomas and leukemia using immunophenotyping, intelligent image repositories, and telemicroscopy,” IEEE Trans. Inf. Technol. Biomed., vol. 4, no. 4, pp. 265–273, Dec. 2000.
[15] K. S. Kim, P. K. Kim, J. J. Song, and Y. C. Park, “Analyzing blood cell image to distinguish its abnormalities,” in Proc. ACM Int. Conf. Multim., 2002, pp. 395–397.
[16] Q. Liao and Y. Deng, “An accurate segmentation method for white blood cell images,” in Proc. IEEE Int. Symp. Biomed. Imaging, Atlanta, GA, USA, 2002, pp. 245–248.
[17] S. Suri, S. Setarehdan, and S. Singh, Advanced Algorithmic Approaches to Medical Image Segmentation: State-of-the-Art Application in Cardiology, Neurology, Mammography and Pathology. Berlin, Germany: Springer-Verlag, 2001, pp. 541–558.
[18] N. Sinha and A. Ramakrishnan, “Automation of differential blood count,” in Proc. Conf. Convergent Technol. Asia-Pac. Region, 2003, vol. 2, pp. 547–551.
[19] W. Shitong, K. F. L. Chung, and F. Duan, “Applying the improved fuzzy cellular neural network IFCNN to white blood cell detection,” Neurocomputing, vol. 70, no. 7–9, pp. 1348–1359, Mar. 2007.
[20] M. Oberholzer, M. Ostreicher, H. Christen, and M. Bruhlmann, “Methods in quantitative image analysis,” Histochem. Cell Biol., vol. 105, no. 5, pp. 333–355, May 1996.
[21] B. Nilsson and A. Heyden, “Model-based segmentation of leukocytes clusters,” in Proc. Int. Conf. Pattern Recognit., 2002, vol. 1, pp. 727–730.
[22] P. Bamford and B. Lovell, “Method for accurate unsupervised cell nucleus segmentation,” in Proc. Eng. Med. Biol. Soc. Conf., 2001, vol. 3, pp. 2704–2708.
[23] R. M. Haralick, “Statistical and structural approaches to texture,” Proc. IEEE, vol. 67, no. 5, pp. 786–804, May 1979.
[24] S. Agaian and A. Almuntashri, “A new edge detection algorithm in image processing based on LIP-ratio approach,” in Proc. SPIE, Feb. 8, 2010, vol. 7532, pp. 753204-1–753204-12.
[25] S. Agaian, K. Panetta, S. Nercessian, and E. Danahy, “Boolean derivatives with application to edge detection for imaging systems,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 40, no. 2, pp. 371–382, Apr. 2010.
[26] K. Panetta, S. Agaian, S. Nercessian, and A. Almuntashri, “Shape-dependent canny edge detector,” Opt. Eng., vol. 50, no. 8, pp. 087008-1–087008-12, Aug. 2011.
[27] T. Milne, “Measuring the fractal geometry of landscapes,” Appl. Math. Comput., vol. 27, no. 1, pp. 67–79, Jul. 1988.
[28] C. Lopez and S. Agaian, “A new set of wavelet- and fractals-based features for Gleason grading of prostate cancer histopathology images,” in Proc. SPIE, Image Process., Algorithms Syst. XI, 2013, vol. 8655, pp. 865516-1–865516-12.
[29] P. Refaeilzadeh, L. Tang, and H. Liu, “Cross-validation,” in Encyclopedia of Database Systems (EDBS), L. Liu and M. T. Özsu, Eds. New York, NY, USA: Springer-Verlag, 2009, pp. 51–58.
[30] C. E. Pedreira, L. Marcini, M. G. Land, and E. S. Costa, “New decision support tool for treatment intensity choice in childhood acute Lymphoblastic Leukemia,” IEEE Trans. Inf. Technol. Biomed., vol. 13, no. 3, pp. 284–290, May 2009.
[31] C. Lopez and S. Agaian, “Iterative local color normalization using fuzzy image clustering,” in Proc. SPIE, Mobile Multim./Image Process., Security, Appl., 2013, vol. 8755, pp. 875518-1–875518-12.
[32] ASH Image Bank: American Society of Hematology. [Online]. Available: http://imagebank.hematology.org/Default.aspx
[33] F. Scotti, “Robust segmentation and measurement techniques of white cells in blood microscope images,” in Proc. IEEE Conf. Instrum. Meas. Technol., 2006, pp. 43–48.
[34] A. Pentland, “Fractal based description of natural scenes,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 6, pp. 661–674, Nov. 1984.
[35] S. Agaian, J. Astola, and K. Egiazarian, Binary Polynomial Transforms and Nonlinear Digital Filters. New York, NY, USA: Marcel Dekker, 1995.
[36] N. Sinha and A. G. Ramakrishnan, “Blood cell segmentation using EM algorithm,” in Proc. 3rd Indian Conf. Comput. Vis., Graph., 2002, pp. 445–450.
[37] A. Bordes, S. Ertekin, J. Weston, and L. Bottou, “Fast kernel classifiers with online and active learning,” J. Mach. Learn. Res., vol. 6, pp. 1579–1619, Sep. 2005.
[38] E. Wharton, K. Panetta, and S. Agaian, “Logarithmic edge detection with applications,” J. Comput., vol. 3, no. 9, pp. 11–19, Sep. 2008.
[39] J. Hu, J. Deng, and M. Sui, “Color space conversion model from CMYK to LAB based on prism,” in Proc. IEEE GRC, 2009, pp. 235–238.
[40] Hunter Labs. Hunter lab color scale, Hunter Associates Lab., Reston, VA, USA, Insight on Color 8 9 (August 1–15, 1996). [Online]. Available: http://www.hunterlab.com/appnotes/an08_96a.pdf
[41] C. Haworth, A. Hepplestone, P. Jones, R. Campbell, D. Evans, and M. Palmer, “Routine bone marrow examination in the management of acute lymphoblastic leukaemia of childhood,” J. Clin. Pathol., vol. 34, no. 5, pp. 483–485, May 1981.
[42] M. Sezgin and B. Sankur, “Survey over image thresholding techniques and quantitative performance evaluation,” J. Electron. Imaging, vol. 13, no. 1, pp. 146–165, Jan. 2004.
[43] K. Nallaperumal and K. Krishnaveni, “Watershed segmentation of cervical images using multiscale morphological gradient and HSI color space,” Int. J. Imaging Sci. Eng., vol. 2, no. 2, pp. 212–216, Apr. 2008.
[44] R. Rangayyan, Biomedical Image Analysis. Series Title: Biomedical Engineering. Boca Raton, FL, USA: CRC Press, Dec. 2004.
[45] R. Adollah, M. Mashor, N. Nasir, H. Rosline, H. Mahsin, and H. Adilah, “Blood cell image segmentation: A review,” in Proc. IFMBE. Berlin, Germany: Springer-Verlag, 2008, ch. 39, pp. 141–144.
[46] A. Nasir, M. Mashor, and H. Rosline, “Unsupervised colour segmentation of white blood cell for Acute leukaemia images,” in Proc. IEEE IST, 2011, pp. 142–145.
[47] D. Ilea and P. Whelan, “Image segmentation based on the integration of colour-texture descriptors—A review,” Pattern Recognit., vol. 44, no. 10/11, pp. 2479–2501, Oct./Nov. 2011.
[48] H. Inaba, M. Greaves, and C. Mullighan, “Acute lymphoblastic leukaemia,” Lancet, vol. 381, no. 9881, pp. 1943–1955, Jun. 2013.
[49] O. Lahdenoja, “Local binary pattern feature vector extraction with CNN,” in Proc. 9th Int. Workshop Cellular Neural Netw. Appl., 2005, pp. 202–205.
[50] D. Mandal, K. Panetta, and S. Agaian, “Face recognition based on logarithmic local binary patterns,” in Proc. SPIE, Image Process., Algorithms Syst. XI, 2013, vol. 8655, pp. 865514-1–865514-12.
[51] Acute Myeloid Leukemia, National Cancer Institute, Bethesda, MD, USA, 2006.
[52] F. Sadeghian, Z. Seman, A. Ramli, B. Kahar, and M. Saripan, “A framework for white blood cell segmentation in microscopic blood images using digital image processing,” Biol. Procedures Online, vol. 11, no. 1, pp. 196–206, Jun. 2009.
[53] A. Jain, “Data clustering: 50 years beyond K-means,” Pattern Recognit. Lett., vol. 31, no. 8, pp. 651–666, Jun. 2010.
[54] S. Nercessian, K. Panetta, and S. Agaian, “A non-reference measure for objective edge map evaluation,” in Proc. IEEE Int. Conf. Syst., Man Cybern., 2009, pp. 4563–4568.
[55] A. Almuntashri, E. Finol, and S. Agaian, “Automatic lumen segmentation in CT and PC-MR images of abdominal aortic aneurysm,” in Proc. IEEE Int. Conf. Syst., Man Cybern., 2012, pp. 2891–2896.
[56] A. Agarwal, O. Chapelle, M. Dudik, and J. Langford, “A reliable effective terascale linear learning system,” arXiv:1110.4198, 2011, preprint, Cornell University Library.
[57] C. C. Chang and C. J. Lin, “LIBSVM: A library for support vector machines,” ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, p. 27, Apr. 2011.
[58] R. D. Labati, V. Piuri, and F. Scotti, “ALL-IDB: The acute lymphoblastic leukemia image database for image processing,” in Proc. IEEE ICIP, Brussels, Belgium, Sep. 11–14, 2011, pp. 2045–2048.
[59] J. Tang and S. Agaian, “Computer-aided cancer detection and diagnosis: Recent advances,” in Proc. SPIE. Bellingham, WA, USA: SPIE, 2014, p. 341.
[60] E. Wharton, K. Panetta, and S. Agaian, “Logarithmic edge detection with applications,” in Proc. IEEE ISIC, 2007, pp. 3346–3351.

Sos Agaian (M’98–SM’00) received the M.S. degree (summa cum laude) in mathematics and mechanics from Yerevan State University, Yerevan, Armenia, and the Ph.D. degree in mathematics and physics and the D.E.Sc. degree from the Russian Academy of Sciences, Moscow, Russia. He is currently the Peter T. Flawn Distinguished Professor with the Department of Electrical and Computer Engineering, College of Engineering, The University of Texas at San Antonio, San Antonio, TX, USA. He is the author of more than 550 scientific papers and seven books, and is a holder of 19 patents. His current research interests include signal/image processing systems, bioinformatics, image–video measurements, cancer imaging, information security, computer vision, mobile imaging, and secure communications. Dr. Agaian is a Fellow of the International Society for Optics and Photonics Engineers (SPIE), the Society for Imaging Science and Technology (IS&T), and the American Association for the Advancement of Science. He currently serves as an Associate Editor for several publications, including the Journal of Real-Time Imaging, the Journal of Electronic Imaging, and the IEEE SYSTEMS JOURNAL.


Monica Madhukar received the M.S. degree in electrical engineering from The University of Texas at San Antonio, San Antonio, TX, USA, in 2012. She is currently an Imaging Research Associate with Intrinsic Imaging LLC, San Antonio. Her current research interests include signal/image processing systems and cancer imaging.

Anthony T. Chronopoulos (M’87–SM’98) received the Ph.D. degree in computer science from the University of Illinois at Urbana–Champaign, Urbana, IL, USA, in 1987. He is currently a Professor with the Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX, USA. He is the author of 50 journal papers and 66 peer-reviewed conference papers. His current research interests include distributed computing, grid and cloud computing, high-performance computing, and scientific computing. Dr. Chronopoulos has been awarded 15 federal/state government research grants.
