Medical Image Annotation using Bag of Features Representation

Angel Alfonso Cruz Roa

National University of Colombia, School of Medicine

Master’s Thesis

Medical Image Annotation using Bag of Features Representation by

Angel Alfonso Cruz Roa

Submitted to the School of Medicine of the National University of Colombia, in partial fulfillment of the requirements for the degree of

Master of Science in Biomedical Engineering

Under the guidance of Fabio A. González

Bogotá D.C., May 17, 2009

Part I

Preliminaries


Chapter 1

Introduction

This project aims to find a new strategy to automatically annotate the visual content of large collections of medical images. These images comprise different modalities (X-rays, MRI, ultrasound, histology, microscopy, etc.) and are structurally variable with respect to their content (biological variability), the acquisition method (noise and variability in the positioning of the source or object during capture) and pathological circumstances. In the last decade, hospitals have accumulated hundreds of thousands of such images to be stored, transmitted and managed. Medical image databases have become key components in teaching, in the development of diagnostic support systems, and in the research and study of diseases. These issues are central to the strategy proposed here, which is intended to strengthen medical training at the undergraduate and postgraduate levels, as well as the quality of medical practice, through the use of diagnostic support systems. The proposed approach draws on novel techniques from image processing and machine learning that are currently being explored by the group. For image representation, we propose to use global features of the visual information based on color, texture and shape [?], together with the local image structure under a recently explored approach called the bag of features [?]. The annotation of visual concepts will be carried out using supervised [4] and semi-supervised [?] learning techniques. The latter addresses the problem of exploiting the available knowledge when only a subset of the images is annotated with concepts, in order to annotate the remaining raw records automatically. Special emphasis will be placed on kernel methods [?], one of the main approaches in the machine learning context.

1.1

Goals

The main goal of this research was to design a computational strategy to find and retrieve, from a repository, histopathology images using an example query image. The following is the description of the general purposes of this research:


• To implement different schemes of representation of visual words and assess their performance on a representative set of radiological images of the IRMA project.

• To adapt or propose a methodology for visual vocabulary extraction in medical imaging.

• To adapt or propose one or more similarity measures between images represented by bags of visual words.

• To propose a machine learning method for automatic annotation of medical images represented as visual bags of words.

• To evaluate the performance of the system for automatic annotation on real databases of radiological images.

1.2

Main Contributions


Chapter 2

Problem Statement

2.1

Scope

The well-known semantic gap between visual information and high-level concepts has been widely discussed in the medical imaging domain, and the objective of various previous works has been to reduce this gap. In this domain the gap is much wider than in natural scene and object images because of the heterogeneity of the images and the expert, structured nature of medical knowledge. The visual content of images is heterogeneous in general, and in the medical domain even more so, due to the particularities of the acquisition techniques and of the anatomy of the patient or biological region. Furthermore, medical knowledge is highly specialized, and its levels of expertise have become increasingly sophisticated with each medical specialty. These levels range from anatomical regions to diagnostic methods and pathology, and they are usually organized in structured hierarchies or ontologies over a semantic network of concepts, such as dictionaries, thesauri and specialized domain vocabularies (UMLS, MeSH, FMA, RadLex, etc.). The problem addressed here is how to perform automatic annotation of radiological images from the visual information of a collection represented as bags of features, using images from the collection already annotated with concepts from a specific radiology terminology or ontology.

2.2

Previous work

In medical image classification, strategies from computer science and machine learning have been used. Antonie et al. in 2001 [17] used artificial neural networks and association rule mining for the identification of tumors in mammography images, with results above 70% accuracy. Categorization and automatic annotation of medical images by modality refers to the association of one or more images with the same acquisition technique, such as X-rays, MRI, ultrasound, microscopy, photography, medical diagrams, etc. Mojsilovic and Gomes [18] in 2002 explored the automatic categorization of medical images by modality from visual information. Their work proposes to extract visual information by applying three segmentation techniques. The first is a texture segmentation performed with edge-orientation maps together with a region-growing algorithm. The second is a color segmentation using an unsupervised clustering technique known as the mean-shift algorithm. The third combines both previous segmentations to determine whether there are relevant objects and to split them from the background. For each region, shape, texture and color features are extracted and combined into a single global descriptor of the image, formed by the number of regions, number of 'blobs', number of color regions, measures of global and local contrast, and a color histogram. Pinhas and Greenspan [21] proposed a general framework for the representation and comparison of medical images using Gaussian Mixture Models (GMM). This generative machine learning method is used to build a Gaussian model of the spatial distribution of gray levels in the image, called 'blobs', for each anatomical category (skull, chest, hands, etc.). To determine the category of an image, it is compared with each model using the Kullback-Leibler (KL) divergence, an information-theoretic measure for comparing probability distributions. In 2007 they presented an improved version of the system [9], and in [8] they propose an improved algorithm to reduce the GMM representation based on the Unscented Transform. In 2005 Lehmann et al. [13] presented a system for the categorization of medical images that classifies images into over 80 categories describing the modality, direction, body part and biological system examined. The features combine global texture features with scaled-down versions of the images.
For the classification they use a simple k-NN classifier on a dataset of the IRMA project [12]. Florea et al. [7] propose and evaluate a symbolic representation of the image based on statistical texture characteristics over 16 blocks of an image scaled to 256x256, and also use a k-NN classifier. Recently, more complex image representations have been explored in the medical domain. Silva et al. [23] used the wavelet transform to represent medical images, taking the distribution of its low-frequency coefficients as a feature vector in a categorization task, with a Self-Organizing Map in the learning stage. Another machine learning approach used in the annotation of medical images is semi-supervised learning. This approach aims to use not only the labelled data but also the structure of the unlabelled data to improve the construction of the model. One of the assumptions of semi-supervised learning is that the labelled and unlabelled data come from the same distribution; hence examples with similar representations (feature vectors) should share the same labels. The problem of medical image annotation has been explored from this semi-supervised perspective. Yuen and Li [15] propose a clustering algorithm based on stochastic graph contraction between nodes, where each node represents an image and the similarity between two images is represented by the edge that joins them, weighted by a color similarity measure. Labels, which are associated with only a few images, are propagated when a labelled node is bound to its closest unlabelled node, that is, where the similarity between the nodes is highest. These labels refer to symptoms present in the pictures and are intended to support tongue diagnosis. Najjar et al. in 2003 [19] propose a method for feature selection for content-based retrieval in a database of medical images with diagnostic purposes. Recently, in 2008, Seguí et al. [22] proposed a method based on a stability criterion, which consists in properly selecting which unlabelled examples are added to the training set alongside the labelled data, by applying perturbation methods to modify the space of generated hypotheses. Semi-supervised learning is then performed on this training set with a conventional Support Vector Machine (SVM) for the automatic diagnosis of intestinal motility disorders from video endoscopy. Yao et al. [29] use the semi-supervised approach and active learning in a context of semantic (not appearance-based) image retrieval; their approach seeks to infer the label of the query image in order to retrieve medical images with similar labels. Some properties of the bag of features are useful in the analysis of medical images, and in fact this approach has been used satisfactorily in some specific tasks in this domain. Anna Bosch [1] used the bag of words concept for the annotation of mammography images. The visual words used were textons and SIFT descriptors, with the former obtaining a better performance in the classification stage. The image is represented by the histogram of occurrences of the visual vocabulary in the image, which was built by the k-means algorithm from the visual features.
Classification is performed using probabilistic latent semantic analysis (pLSA) on the image representation, training k-NN and SVM classifiers, with the latter obtaining better results in the task of annotating breast tissue according to BI-RADS categories in the MIAS and DDSM databases. Tommasi et al. [27] adapted the bag of features representation to classification according to the IRMA code for X-ray images in the ImageCLEF automatic annotation challenge.1 Taking into account that radiological images are strongly influenced by spatial relationships, they proposed a simple strategy that builds a bag of features histogram for each quadrant of the image: the image is divided into four parts, and SIFT descriptors are extracted in each block and quantized as a histogram of the visual vocabulary. The classification methods are kernel-based, making a linear combination of the quadrant bag of features representations. Iakovidis et al. [11] proposed a scheme for efficient content-based retrieval of medical images within a bag of features framework. The purpose was to identify the semantic meaning of the patterns in the collection in order to build the image representation. Using a collection of images of the IRMA project, low-level features were extracted from regions of the images and then grouped in feature space to identify high-level visual patterns for each of the 116 semantic concepts.

1 Medical Image Annotation task in CLEF. http://imageclef.org/


2.3

Radiological Images


Chapter 3

Visual Features

From a topological point of view, the features used to describe the information contained in an image can be classified into global and local. Parameters such as the mean intensity, the spatial distribution of intensities, etc., are often used as global features, allowing the description of information common to the image as a whole. Likewise, patterns of color, texture, shape and edge orientation allow the description of content in a localized portion of the image. The easiest way to extract local features is to partition the image into blocks of fixed size [25, 26]. This partition is constructed independently of the distribution of intensities in the image (or any other form of representing the information contained therein). However, with the intervention of the user in selecting regions of interest (ROIs, a supervised approach), semantic information external to the image itself can be drawn from this selection process. Some works that use this strategy address the problem through direct selection of regions of interest [6], through the delineation of objects [20], or by pre-segmentation of the image into areas with similar visual properties [28]. The features extracted in these regions contain additional information about the objects in the image or the underlying structures, which can be used for semantic interpretation by association with the visual information.

3.1

Global features

Global features model the image content as random variables with a given probability distribution over different feature spaces. The following low-level features have been proposed [3] to analyze the content of an image:

1. Grayscale histogram: The histogram is computed by resampling the entire intensity range into a discrete scale of intensity segments.

2. Color histogram: For color images, the RGB cube is divided into segments along each color component.

3. LBP (Local Binary Pattern) histogram: This is a texture feature used in some content-based image retrieval systems. For each pixel in the image, its


neighbors are examined by comparing their intensity values to the intensity of the central pixel. If the neighboring pixel is greater, a 1 is assigned to the corresponding position of the neighbor; otherwise a 0 is assigned. The calculated values are used to construct a binary string over the neighbor positions. Since this binary string can take only a limited number of distinct values, a histogram of these values is computed over segments.

4. Sobel histogram: The application of the Sobel operator is one of the simplest techniques for edge detection in digital images. It calculates the difference of intensity in the vicinity of a pixel in the horizontal and vertical directions; this difference can be interpreted as the local derivative of the digital image seen as a discrete function. In this implementation, the operator analyzes the neighbors of each pixel, and a histogram of segments is calculated from the gradient estimate obtained by applying the operator.

5. Tamura texture histogram: Tamura [10] proposed six different characteristics to describe the texture of the image in a given region: coarseness, contrast, directionality, line-likeness, regularity and roughness. An approach adapted from the original Tamura features for the local calculation of each characteristic was proposed in [3], dividing the space generated by three of the characteristics into segments to construct a texture histogram.
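As a concrete illustration of item 3, a minimal NumPy sketch of the LBP histogram is shown below. It assumes the common 8-neighbor, 256-bin variant; the exact neighborhood size and bin count used in the referenced systems are not specified here.

```python
import numpy as np

def lbp_histogram(image, n_bins=256):
    """Minimal LBP histogram sketch: each interior pixel gets an 8-bit
    code, one bit per neighbor (1 if neighbor >= center, else 0), and
    the codes are accumulated into a normalized histogram."""
    img = np.asarray(image, dtype=np.int32)
    center = img[1:-1, 1:-1]
    # Offsets of the 8 neighbors, clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:img.shape[0] - 1 + dy,
                       1 + dx:img.shape[1] - 1 + dx]
        codes |= ((neighbor >= center).astype(np.int32) << bit)
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / hist.sum()  # normalized texture histogram
```

The resulting vector can be compared between images with any histogram distance, which is how such features are typically used in retrieval.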

3.2

Local features

One of the approaches applied in more recent work for the representation of images is the bag of features method, because of its simplicity and its good performance in applications such as natural scene classification and object identification. This approach is a combination of the representation based on textons [2] and the concept of a bag of words used in text classification and retrieval in the context of Artificial Intelligence [14]. The bag of words approach, used in the classification and retrieval of textual information, is based on the selection of a dictionary for the identification and differentiation of different types of documents. Once this dictionary is fixed, each text document is processed and the occurrence of each word in the document is quantified. This representation of the document ignores the relationships between words and therefore does not take into account the structure (syntax). In the case of image representation, an analogy to the standard dictionary is implemented [?, 5], using visual characteristics to build a vocabulary through the selection of patterns relevant to the visual representation of each image within the collection. The representation of the image as a bag of features has proven its applicability in different tasks of classification, categorization and retrieval of images [2, 5, 24], using visual words based on blocks or SIFT (Scale-Invariant Feature Transform) points [16]. The bag of features has the advantage of adapting to a particular collection, because the visual vocabulary is generated from it. Moreover, this approach has proven to be robust to changes in the objects in images caused by occlusion or related phenomena [5]. In this work, we studied the computational efficiency of


the bag of features method, with satisfactory results.

3.3

Features Detection and Description

Feature detection is the process in which the relevant components of an image are identified. Usually, the goal of feature detection is to identify a spatially limited image region that is salient or prominent. Different strategies have been proposed by the computer vision community to detect local features, motivated by different visual properties such as corners, edges or saliency. Once local features are detected, the next step is to describe or characterize the content of such local regions. Ideally, two local features should have the same descriptor values if they refer to the same visual concept. This motivates the implementation of descriptors that are invariant to affine transformations and illumination changes. A comprehensive survey of image detectors and descriptors can be found in [?]. In this work, two feature detection strategies with their corresponding feature descriptors have been evaluated. The first strategy is dense random sampling. The goal of this strategy is to select points in the image plane randomly and then define a block of pixels of fixed size around each coordinate. The descriptor for these blocks is the vector of explicit pixel values in gray scale, sometimes known as a texton or raw-pixel descriptor. The advantage of this strategy is its simplicity and computational efficiency. In addition, a large number of blocks may be extracted from different image scales, and such a sample is a good approximation of the probability distribution of visual patterns in the image [?]. The second strategy is based on Scale-Invariant Feature Transform (SIFT) points [16]. This strategy uses a keypoint detector based on the identification of interesting points in the location-scale space, implemented efficiently by processing a series of difference-of-Gaussian images. The final stage of this algorithm calculates a rotation-invariant descriptor using a predefined set of orientations over a grid of blocks. We use SIFT points with the most common parameter configuration: 8 orientations over a 4x4 grid of cells, resulting in a descriptor of 128 dimensions. The SIFT algorithm has been demonstrated to be a robust keypoint descriptor in different image retrieval and matching applications, since it is invariant to common image transformations, illumination changes and noise.
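The first strategy, dense random sampling with a raw-pixel descriptor, can be sketched in a few lines of NumPy. The block size of 9x9 below is an illustrative choice, not the value used in the thesis experiments.

```python
import numpy as np

def random_block_descriptors(image, n_points=500, block=9, rng=None):
    """Sample n_points square blocks at random image positions and
    flatten each into a raw gray-level vector (texton descriptor)."""
    rng = np.random.default_rng(rng)
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    half = block // 2
    # Sample block centers far enough from the border to fit a full block.
    ys = rng.integers(half, h - half, size=n_points)
    xs = rng.integers(half, w - half, size=n_points)
    descs = np.stack([
        img[y - half:y + half + 1, x - half:x + half + 1].ravel()
        for y, x in zip(ys, xs)
    ])
    return descs  # shape: (n_points, block * block)
```

Because sampling is position-independent, the same routine can be run at several image scales and the descriptor sets concatenated, approximating the distribution of visual patterns as described above.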

3.4

Visual Words definition


Chapter 4

Visual Codebook

4.1

Visual Codebook Construction

The visual dictionary or codebook is built using a clustering or vector quantization algorithm. In the previous stage of the bag of features framework, a set of local features has been extracted. All local features over a training image set are brought together, independently of their source image, and clustered to learn a set of representative visual words for the whole collection. The k-means algorithm is used in this work to find a set of centroids in the local features dataset. Nowak et al. [?] have reported that applying a clustering algorithm does not bring a big improvement in the classification of natural images, compared with a random selection of codeblocks. In this work, randomly selected codeblocks are also evaluated. An important decision in the construction of the codebook is the selection of its size, that is, how many codeblocks are needed to represent image contents. According to different works on natural image classification, the larger the codebook the better [?, 5]. However, Tommasi et al. [27] found that the size of the codebook is not a significant aspect in a medical image classification task. We evaluated different codebook sizes to analyze the impact of this parameter on the classification of histopathology images. The goal of codebook construction is to identify a set of visual patterns that reflects the image collection contents. Fei-Fei et al. [?] have illustrated a codebook for natural scene image categorization that contains several visual primitives, such as orientations and edges; that codebook is consistent with the contents of that image collection. In the same way, we illustrate a codebook extracted from the collection of histopathology images. This codebook, composed of 150 codeblocks, is shown in Figure [fig:Codebook]. In this case the codeblocks also reflect the contents of the histopathology image collection; the visual primitives may be representing cells and nuclei of different sizes and shapes. That codebook has been generated using the texton descriptor presented in the previous Subsection.
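The codebook construction step can be sketched with a minimal Lloyd's k-means over the pooled local descriptors. This is an illustrative implementation, not the exact k-means variant used in the experiments; note that its initialization step alone corresponds to the random-codeblock baseline mentioned above.

```python
import numpy as np

def build_codebook(descriptors, k=150, n_iter=20, rng=0):
    """Learn k codeblocks (centroids) from pooled local descriptors
    with a few iterations of Lloyd's k-means."""
    rng = np.random.default_rng(rng)
    X = np.asarray(descriptors, dtype=np.float64)
    # Initialization: k randomly chosen descriptors. Stopping here
    # gives the 'randomly selected codeblocks' baseline.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each descriptor to its nearest centroid (Euclidean).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # Recompute centroids; keep the old one if a cluster empties.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(0)
    return centroids  # the k codeblocks of the visual codebook
```

The returned centroids play the role of the codeblocks illustrated in Figure [fig:Codebook].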


Figure 4.1: A codebook with 150 codeblocks for the histopathology image collection. Codeblocks are sorted by frequency.

4.2

Codebook size definition

4.3

Statistical Analysis between Visual Words and Concepts

4.3.1

Correlation Analysis

The correlation analysis is based on the following assumptions:

• X and Y are random variables; there is no distinction between an explanatory variable and an explained variable.

• The main assumption is that the data are generated by a bivariate normal distribution.

• There is a linear relationship between the variables, which is measured by the correlation coefficient.

The correlation coefficient measures the degree to which two variables vary jointly and is defined by

ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y),

where (X, Y) is bivariate normally distributed with parameters (μ_X, μ_Y, σ_X², σ_Y², ρ).
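The sample version of this coefficient can be computed directly from its definition; the following small NumPy sketch is given for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Sample Pearson correlation: r = cov(X, Y) / (sd_X * sd_Y),
    computed from centered samples of the two variables."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
```

Applied to the occurrence counts of two visual words across the image collection, a value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates none.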

4.3.2

Visual Words and Semantic Concepts

In the previous subsection we compared visual words with each other, treating each codeblock as a random variable. In this subsection we use the codeblocks as one random variable and the semantic concept as the other random variable. We can suppose that the semantic concepts follow a Bernoulli distribution, but for the correlation analysis a normal distribution is assumed.

4.3.3

Clustering Analysis

Agglomerative Hierarchical Clustering

Agglomerative hierarchical clustering is a bottom-up clustering method where clusters have sub-clusters, which in turn have sub-clusters, etc. The classic example of this is species taxonomy. Gene expression data might also exhibit this hierarchical quality (e.g. neurotransmitter gene families). Agglomerative hierarchical clustering starts with every single object (gene or sample) in its own cluster. Then, in each successive iteration, it agglomerates (merges) the closest pair of clusters according to some similarity criterion, until all of the data is in one cluster. The hierarchy within the final cluster has the following properties:

• Clusters generated in early stages are nested in those generated in later stages.

• Clusters with different sizes in the tree can be valuable for discovery.

Cluster distances

In order to decide which clusters should be combined, a measure of distance between sets of points is required. The distance between pairs of data points, in this case images, is typically the Euclidean distance, and a linkage criterion specifies the distance between clusters as a function of the pairwise distances of the images in the sets. In this work the linkage used is the average distance between the data in each cluster.

4.4

Machine learning approaches to Analyze Visual Codebook

4.5

Selecting codebook for specific concepts


Chapter 5

Bag of Features Image Representation

5.1

Bag of Feature construction

The bag of features framework is an adaptation of the bag of words scheme used for text categorization and text retrieval. The key idea is the construction of a codebook, that is, a visual vocabulary in which the most representative patterns are codified as codewords or visual words. Then, the image representation is generated through a simple frequency analysis of each codeword inside the image. Csurka et al. [5] describe four steps to classify images using a bag of features representation: (1) feature detection and description, (2) codebook generation, (3) bag of features construction, and finally (4) training of learning algorithms. Figure [fig:Overview] shows an overview of these steps. The bag of features approach is a novel and simple method to represent image contents using collection-dependent patterns. It is also a flexible and adaptable framework, since each step may be carried out with different techniques according to the needs of the application domain. The following subsections present the particular methods and techniques that have been evaluated in this work.
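Step (3), the bag of features construction itself, reduces to quantizing each local descriptor against the codebook and counting occurrences. A minimal sketch under these assumptions:

```python
import numpy as np

def bag_of_features(descriptors, codebook):
    """Quantize each local descriptor to its nearest codeword
    (Euclidean distance) and return the normalized histogram of
    codeword occurrences: the image's bag of features vector."""
    D = np.asarray(descriptors, dtype=np.float64)
    C = np.asarray(codebook, dtype=np.float64)
    # Nearest codeword index for every descriptor.
    d2 = ((D[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(1)
    hist = np.bincount(words, minlength=len(C)).astype(np.float64)
    return hist / hist.sum()  # normalized codeword frequencies
```

The resulting fixed-length vector, one entry per codeword, is what the learning algorithms in step (4) consume, regardless of how many local features each image produced.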

5.2

Importance of codebook selection

5.3

TF-IDF vs Absolute Frequencies

5.4

Visual words weighting strategy

5.5

An index based on Bag of Features representation

Figure 5.1: Overview of the Bag of Features framework


Chapter 6

Automatic Annotation using Machine Learning

6.1

Visual Similarity Metric definition

6.2

Feature Combination based on Bag of Features

6.3

A Visual Kernel based on Bag of Features

6.4

Latent Semantic Indexing for Bag of Features

6.5

Semi-supervised learning approach

6.6

Medical Image Annotation Task Challenge

6.7

Automatic Medical Image Annotation System


Chapter 7

Validation

7.1

Radiological image database

7.2

Other medical image databases

7.3

Kernel Methods for Learning Concepts

7.4

Performance Measures for annotation task

7.4.1

Effectiveness Evaluation

7.4.2

Efficiency Evaluation

7.4.3

Computational Time cost Evaluation

7.5

Impact of the codebook in annotation task


Chapter 8

Conclusion and Future Work

8.1

Discussion

8.2

Global vs local features

8.3

Visual codebook

8.4

Relation between visual words and semantics

8.5

Kernel methods and Bag of Features

8.6

Semi-supervised learning and Bag of Features

8.7

Future Work


Bibliography

[1] Anna Bosch, Xavier Muñoz, Arnau Oliver, and Joan Martí. Modeling and classifying breast tissue density in mammograms. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), 2006.

[2] Anna Bosch, Xavier Muñoz, and Robert Martí. Which is the best way to organize/classify images by content? Image and Vision Computing, 25:778–791, 2007.

[3] Juan C. Caicedo, Fabio A. Gonzalez, and Eduardo Romero. Content-based medical image retrieval using low-level visual features and modality identification. Lecture Notes in Computer Science, 5152:615–622, 2008.

[4] Juan C. Caicedo, Fabio A. Gonzalez, and Eduardo Romero. A semantic content-based retrieval method for histopathology images. Information Retrieval Technology, Lecture Notes in Computer Science, 4993:51–60, 2008.

[5] Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, and Cédric Bray. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, 2004.

[6] D. Comaniciu, P. Meer, D. Foran, and A. Medl. Bimodal system for interactive indexing and retrieval of pathology images. In Proceedings of the Fourth IEEE Workshop on Applications of Computer Vision (WACV'98), pages 76–81, Princeton, NJ, USA, 1998.

[7] F. Florea, E. Barbu, A. Rogozan, A. Bensrhair, and V. Buzuloiu. Medical image categorization using a texture based symbolic description. In IEEE International Conference on Image Processing, pages 1489–1492, 2006.

[8] J. Goldberger, H. Greenspan, and J. Dreyfuss. An optimal reduced representation of a MoG with applications to medical image database classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07), pages 1–6, 2007.

[9] H. Greenspan and A. T. Pinhas. Medical image categorization and retrieval for PACS using the GMM-KL framework. IEEE Transactions on Information Technology in Biomedicine, 11(2):190–202, 2007.

[10] H. Tamura, S. Mori, and T. Yamawaki. Texture features corresponding to visual perception. IEEE Transactions on Systems, Man, and Cybernetics, 8(6):460–473, 1978.

[11] Dimitris K. Iakovidis, Nikos Pelekis, Evangelos E. Kotsifakos, Ioannis Kopanakis, Haralampos Karanikas, and Yannis Theodoridis. A pattern similarity scheme for medical image retrieval. IEEE Transactions on Information Technology in Biomedicine, 2008.

[12] Fischer and Lehmann. The IRMA reference database and its use for content-based image retrieval in medical applications. http://www.egms.de/de/meetings/gmds2004/04gmds088.shtml, 2004.

[13] Thomas M. Lehmann, Mark O. Güld, Thomas Deselaers, Daniel Keysers, Henning Schubert, Klaus Spitzer, Hermann Ney, and Berthold B. Wein. Automatic categorization of medical images for content-based retrieval and data mining. Computerized Medical Imaging and Graphics, 29(2-3):143–155, 2005. PMID: 15755534.


[14] David D. Lewis. Naive (Bayes) at forty: The independence assumption in information retrieval. Pages 4–15, Springer Verlag, 1998.

[15] Chun Li and Pong Yuen. Semi-supervised learning in medical image database. Pages 154–160, 2001.

[16] David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60:91–110, 2004.

[17] Maria-Luiza Antonie. Application of data mining techniques for medical image classification. In Proc. of the Second Intl. Workshop on Multimedia Data Mining (MDM/KDD'2001), in conjunction with the Seventh ACM SIGKDD, pages 94–101, 2001.

[18] A. Mojsilovic and Jose Gomes. Semantic based categorization, browsing and retrieval in medical image databases. In Proceedings of the International Conference on Image Processing, 2002.

[19] M. Najjar, C. Ambroise, and J.-P. Cocquerez. Feature selection for semi-supervised learning applied to image retrieval. In Proceedings of the 2003 International Conference on Image Processing (ICIP 2003), volume 2, pages 559–562, 2003.

[20] S. T. Perry and P. H. Lewis. A novel image viewer providing fast object delineation for content based retrieval and navigation. In I. K. Sethi and R. C. Jain, editors, Storage and Retrieval for Image and Video Databases VI, volume 3312 of SPIE Proceedings, 1997.

[21] A. Pinhas and H. Greenspan. A continuous and probabilistic framework for medical image representation and categorization. In Proceedings of SPIE Medical Imaging, San Diego, 2004.

[22] Santi Seguí, Laura Igual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, and Jordi Vitrià. A semi-supervised learning method for motility disease diagnostic. Pages 773–782, 2008.

[23] Leonardo Augusto da Silva, Ramon Alfredo Moreno, Sergio Shiguemi Furuie, and Emilio Del Moral Hernandez. Medical image categorization based on wavelet transform and self-organizing map. In ISDA '07: Proceedings of the Seventh International Conference on Intelligent Systems Design and Applications, pages 353–356, Washington, DC, USA, 2007. IEEE Computer Society.

[24] J. Sivic and A. Zisserman. Video Google: a text retrieval approach to object matching in videos. In Proceedings of the Ninth IEEE International Conference on Computer Vision, pages 1470–1477, 2003.

[25] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:1349–1380, 2000.

[26] D. M. Squire, W. Müller, H. Müller, and T. Pun. Content-based query of image databases: inspirations from text retrieval. Pattern Recognition Letters, 21:1193–1198, 2000.

[27] T. Tommasi, F. Orabona, and B. Caputo. CLEF2007 image annotation task: an SVM-based cue integration approach. In Working Notes of the 2007 CLEF Workshop, Budapest, Hungary, 2007.

[28] A. Winter and C. Nastar. Differential feature distribution maps for image segmentation and region queries in image databases. In IEEE Workshop on Content-based Access of Image and Video Libraries (CBAIVL'99), pages 9–17, Fort Collins, Colorado, USA, 1999.

[29] Jian Yao, Zhongfei (Mark) Zhang, Sameer Antani, Rodney Long, and George Thoma. Automatic medical image annotation and retrieval. Neurocomputing, 71(10-12):2012–2022, 2008.

