Interactive Cluster-Based Personalized Retrieval on Large Document Collections

Petros Belsis1, Charalampos Konstantopoulos2, Basilis Mamalis1, Grammati Pantziou1, and Christos Skourlas1

1 Department of Informatics, TEI of Athens
2 Department of Informatics, University of Piraeus
[email protected], [email protected], {pantziou,vmamalis,cskourlas}@teiath.gr

Abstract. Many systems and websites now include personalization among the services they provide. In large document collections, however, it is difficult for the user to pose effective queries from the beginning of his/her search, since accurate query terms may not be known in advance. In this paper we describe a system that applies a hybrid approach to help a user identify the most relevant documents: at the beginning it applies dynamic personalization techniques based on user modeling to initiate the search on a large collection of documents and multimedia content; next, the query is further refined using a clustering-based approach which, after processing a sub-collection of documents, presents the user with further categories, described by a list of new keywords, to select from. We analyze the most prominent implementation choices for the modular components of the proposed architecture: a machine learning approach for personalized services, a clustering-based approach to user-directed query refinement, and a parallel processing module that supports document clustering in order to decrease the system’s response times.

1 Introduction

The continuous growth of data stored in different types of systems, such as information portals and digital libraries, has created an overwhelming amount of information that a user has to deal with. Many query-refinement approaches have emerged to help the user retrieve information more efficiently with respect to his/her personal interests. A wide variety of systems also integrate personalization features that aim to assist the user in identifying knowledge items that match his/her preferences. Among others, digital libraries, document management systems and multimedia data warehouses that focus on scientific data storage grow significantly in size, as new scientific content is gathered on a daily basis in different areas of research. Considering that each user has specific areas of expertise or interest, a digital library constitutes a good test-bed domain where personalization techniques may prove beneficial. Still, in large document collections it is hard to identify an efficient user model that contains adequate sub-categories to support the user preferences, for two reasons: first, it would be difficult to identify appropriate sub-categories with respect to the number of existing users; second, classification of incoming documents would impose a significant overhead on the system.

G.A. Tsihrintzis et al. (Eds.): New Direct. in Intel. Interac. Multimedia, SCI 142, pp. 211–220, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com


In this paper, we describe a hybrid approach that uses personalization techniques at the start of the user’s interaction with the system and then proceeds towards a more user-oriented interaction, in which the user participates in the dynamic clustering process by selecting the sub-categories that arise dynamically after processing subsets of the documents. In order to keep the system’s response times low, we also apply parallel processing techniques when processing the document sub-clusters selected by the user. The rest of the paper is organized as follows. Section 2 presents related work; Section 3 presents the main principles that drive the design of the proposed architecture and discusses the structure of its modular components, while Section 4 concludes the paper.

2 Related Work

A wide variety of research prototype systems as well as commercial solutions have emerged lately offering personalized services to their users. Many of the successful deployments use machine learning methods, which aim to give the system the ability to adapt to the user’s needs and to perform many of the necessary tasks in an automated way [7].

2.1 User Models, Stereotypes, Communities and Personalization Systems

Personalization technology aims to adapt information systems, information retrieval and filtering systems, etc. to the needs of the individual user. A user model may contain personal details about the user, such as occupation, interests, etc., and information gathered through the interaction of the user with the system. User community models are generic models that apply to the needs of groups of users and usually do not use explicitly provided personal information. If personal information is given, the community models are called stereotypes. Machine learning techniques have been applied to construct all these types of models and are used in digital library services, personalized news services, etc. For example:

The MyView system [1] collects bibliographic data and assists the user in browsing digital libraries. MyView supports direct on-line reorganization, browsing and selection as specified by the user. Among its strong features is that it supports browsing in heterogeneous distributed repositories. It does not store the actual data sources but metadata pointing to the actual sources. It also supports user-directed browsing.

The PNS [6] is a generic system that offers personalized news services to its users. Its architecture consists of sub-modules that collect user-related data, either explicitly entered by the user or gathered implicitly by monitoring the user’s behavior. A personalization module builds the user’s model and makes recommendations on topics that fall within the user’s interests. The PNS also contains a content management module that collects information about the actual content sources and indexes them; it does not store the actual sources but only the indexing information collected by special-purpose wrappers.


2.2 Document Clustering and Parallel Techniques

There exists a large number of document clustering algorithms. They are usually classified into two main categories: hierarchical algorithms and partitional algorithms. Partitional algorithms iteratively assign every document to a single cluster [17] in an attempt to determine k partitions that optimize a certain criterion function [18]. Partitional clustering algorithms usually have better time complexity than hierarchical algorithms. The K-means algorithm [21] is a popular clustering method of this category. A hierarchical clustering is a sequence of partitions in which each partition is nested into the next partition in the sequence. Hierarchical clustering methods generally fall into two categories: splitting and agglomerative methods. Splitting methods work top-down, splitting clusters until a certain threshold is reached. The more popular agglomerative clustering algorithms use a bottom-up approach to merge documents into a hierarchy of clusters [19]. Agglomerative algorithms typically use a stored matrix or stored data approach [20]. There also exist several hybrid algorithms that combine the accuracy of the hierarchical approach with the lower time complexity of the partitional approach. A popular algorithm of this kind is the Buckshot algorithm [8] (see also Section 3.2). A detailed overview of sequential document clustering algorithms can be found in [9] and [16].

Many authors have also examined parallel algorithms for both hierarchical clustering and partitional clustering [22]. In [23], Olson provides a comprehensive review of parallel hierarchical clustering algorithms. Two versions of parallel K-means algorithms are discussed in the recent literature. In [21], Dhillon and Modha proposed a parallel K-means algorithm for distributed memory multiprocessors. Xu and Zhang [24] designed a parallel K-means algorithm with low communication overhead for clustering high-dimensional document datasets. Besides K-means, some other classical clustering algorithms also have corresponding parallel versions, such as the parallel PDDP algorithm [24] and the parallel Buckshot algorithm (given earlier in [15] and most recently in [9]).

2.3 The Scatter/Gather Approach

Scatter/Gather was first proposed by Cutting et al. [8] as a cluster-based method for browsing large document collections. The method works as follows. In the beginning, the system scatters the initial document collection into a small set of clusters (i.e., document groups) and presents to the user short descriptive summaries of these clusters. The summaries may include text that characterizes the cluster in general, as well as terms that sample the contents of the cluster. Based on these summaries, the user can select one or more of the clusters for further examination. The clusters selected by the user are gathered together into a subcollection. Next, online clustering is applied again to scatter the subcollection into a new small set of clusters, whose summaries are presented to the user. This process may be repeated, and after each iteration the clusters become smaller and more detailed. With the Scatter/Gather method the user is not forced to provide query terms; instead, from the beginning he/she is presented with a set of clusters. The successive iterations of the method help the user to find the desired information in a large document collection. Therefore, the Scatter/Gather approach is very useful when the user cannot


or does not want to express a query formally. In addition, as Hearst and Pedersen showed in [13],[14], the Scatter/Gather method can also significantly improve the retrieval results over a very large document collection. Since each iteration of the Scatter/Gather method requires online clustering of a large document collection, fast clustering algorithms should be employed. Cutting et al. [8] proposed and applied to Scatter/Gather two clustering procedures: Buckshot (which is also used in our hybrid approach) and Fractionation. In [12], a scheme is proposed that, after near-linear-time pre-processing (O(kN log N)), requires constant time for the online phase for arbitrarily large document collections. The method involves the construction of a cluster hierarchy. Liu et al. [16] also proposed a new algorithm for Scatter/Gather browsing which achieves constant response time for each Scatter/Gather iteration. Like the algorithm in [12], their algorithm requires the construction of a cluster hierarchy.
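The Scatter/Gather iteration described in this section can be illustrated with a minimal Python sketch. It is not the implementation of [8]: the names `scatter_gather`, `summarize` and `chunk_clusterer`, the word-frequency summaries and the stopping rule are assumptions made for illustration only. The clustering routine is passed in as a parameter, so that any fast online method (for example Buckshot) could be plugged in, and the user's choice is modeled as a callback.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type aliases: a clusterer splits documents into groups,
# a chooser maps cluster summaries to the indices the user selects.
Clusterer = Callable[[List[str], int], List[List[str]]]
Chooser = Callable[[List[List[str]]], List[int]]


def summarize(cluster: Sequence[str], n_terms: int = 3) -> List[str]:
    """Crude summary: the most frequent terms of the cluster."""
    counts: Dict[str, int] = {}
    for doc in cluster:
        for term in doc.lower().split():
            counts[term] = counts.get(term, 0) + 1
    # Most frequent first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda t: (-counts[t], t))[:n_terms]


def chunk_clusterer(docs: List[str], k: int) -> List[List[str]]:
    """Toy stand-in for a real clustering routine: sort, then cut into runs."""
    ordered = sorted(docs)
    size = max(1, -(-len(ordered) // k))  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def scatter_gather(docs: Sequence[str], cluster_fn: Clusterer,
                   choose_fn: Chooser, k: int = 3,
                   stop_size: int = 3) -> List[str]:
    """Repeat scatter (cluster) and gather (merge chosen clusters)."""
    collection = list(docs)
    while len(collection) > stop_size:
        clusters = cluster_fn(collection, k)                  # scatter
        summaries = [summarize(c) for c in clusters]          # shown to user
        chosen = choose_fn(summaries)                         # user selection
        gathered = [d for i in chosen for d in clusters[i]]   # gather
        if not gathered or len(gathered) >= len(collection):
            break  # nothing selected, or the selection did not narrow
        collection = gathered
    return collection


docs = ["parallel kmeans", "parallel buckshot", "parallel sorting",
        "user model", "user profile", "user stereotype",
        "news portal", "news filter", "news wrapper"]


def pick_parallel(summaries: List[List[str]]) -> List[int]:
    """Simulated user who always selects clusters mentioning 'parallel'."""
    return [i for i, terms in enumerate(summaries) if "parallel" in terms]


result = scatter_gather(docs, chunk_clusterer, pick_parallel)
```

In this toy run the simulated user narrows nine documents down to the three that mention "parallel" in a single iteration; with a real clustering routine and a larger collection, several iterations would normally be needed.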

3 System Architecture

The proposed architecture consists of three sub-modules: i) the personalization sub-module, which collects user-related data and initially recommends categories containing documents related to the user’s interests; ii) the content repository, which is responsible for storing the documents and facilitates a user-directed search by performing a scatter/gather approach; and iii) the parallel processing module, which is responsible for speeding up online clustering procedures as well as for preprocessing documents in real time. In the following sections we explain the main concepts behind the functionality of each sub-module. Fig. 1 shows a generic overview of the proposed architecture.

Fig. 1. Generic overview of the system’s architecture

3.1 Personalization Module

In order to build an accurate and effective user model, the system should support two main tasks: i) at the time of registration, the user should be able to provide details about his/her personal preferences, so that his/her model / stereotype can be created easily; or ii) the sequence of topics that he/she usually selects should be monitored and used to create a model which will direct the system to classify him/her into a community. In general, the personalization module should support the following operations:

• Support the registration of new users
• Keep track of users’ preferences with respect to the topics of interest that affect their interaction with the system
• Present personalized information to users that have similar interests or, in general, could be classified into a common behavior stereotype.

With respect to the classification model adopted, the system must support the creation of user models / communities using a feature-based representation. To this end, a user model / community is created from a list of generic sub-categories that the user usually explores, using a machine learning approach. Typical algorithms which have been successfully applied in this direction are the COBWEB algorithm and the Cluster Mining algorithm [3] and its variations [4][6]. Paliouras et al. [5] studied a free-text query based information retrieval service and constructed models of user communities using these two algorithms. They compared the two approaches using two evaluation criteria: 1) coverage, the proportion of features covered by the models, and 2) distinctiveness, the number of distinct features that appear in at least one model divided by the sum of the sizes of all models. They concluded that “the cluster mining method is doing consistently better than COBWEB, in terms of coverage, and distinctiveness”. The main principle of the Cluster Mining algorithm is to create, from a graph that contains all the possible features, a weighted sub-graph containing all the features associated with a given user model. In other words, the algorithm constructs a weighted graph G(A,E,wA,wE), where the set of vertices A contains all the features and the set of edges E corresponds to the co-occurrence of two features in the corresponding model. Weights are then assigned to both the edges E and the vertices A as aggregate usage statistics. In order to lower the complexity of the graph, a threshold can be imposed, so that edges with an assigned value below that threshold are rejected. In Fig. 2, assuming a threshold of 0.09, the edge between the categories hardware and databases, which has a lower value, is rejected (meaning that there is no strong evidence that the user or set of users in this specific stereotype are

Fig. 2. The feature-based graph that allows creation of the personalization model


interested in both categories). The remaining subset of the graph results in the construction of the feature group.

3.2 Clustering-Based Browsing for Large Document Collections

In addition to personalization, our system also provides effective automatic browsing using the well-known Scatter/Gather approach (which is mainly based on the iterative application of document clustering procedures; see Section 2) in order to further facilitate the user’s search procedure. Moreover, we apply parallelism over a distributed memory environment in order to achieve acceptable overall performance for very large document collections. Specifically, we first follow the typical scatter/gather approach proposed in [8], slightly changed because in our system a personalized categorization of documents for each user has already been done via the personalization module. The predefined categories for each specific user (e.g. based on the user model / stereotype) can serve here as the basic initial clusters for the scatter/gather procedure. Thus, initially, the documents belonging to the specific user-profile categories (in other words, the set of initial clusters assigned to the specific user) are gathered together to form a dynamic, user-specific subcollection. An appropriate reclustering procedure is then applied to scatter the user subcollection into a number of document groups, and short summaries of them are presented to the user. Based on these summaries, the user selects one or more of the groups for further study. The selected groups are gathered together again to form a new (smaller) subcollection. The system then applies the same reclustering procedure again to scatter the new subcollection into a small number of document groups, whose summaries are again presented to the user. The user selects again, and so on. With each successive iteration the groups become smaller, and therefore more detailed.
Ultimately, when the groups become small enough, this process bottoms out by enumerating individual documents. Note that, since an initial document recommendation and selection has already been done via the personalization module (assignment of specific categories to each user), a heavy (and more accurate) initial clustering, such as the Fractionation procedure proposed in [8], is not necessary. The initial personalized filtering can take the place of that initial clustering step. Thus, in our hybrid approach, only fast online reclustering procedures have to be considered. To this end, we have used a customized version of the Buckshot algorithm (see [8] for a general description), which is a fast clustering algorithm suitable for the online re-clustering essential for scatter/gather. The Buckshot algorithm is a combination of hierarchical and partitional algorithms designed to take advantage of the accuracy of hierarchical clustering as well as the low computational complexity of partitional algorithms. Specifically, it assumes the existence of some (e.g. hierarchical) algorithm which clusters well but which may run slowly. This procedure is usually called ‘the cluster subroutine’. In our system, we use the single-link hierarchical agglomerative clustering method for this subroutine (instead of group-average or complete-link), in order to obtain initial clusters that are not very tight. Hierarchical conceptual clustering plays an important role in our


work because we plan, in the future, to combine knowledge acquisition with machine learning to extract semantics from resources found on the Web [2]. Then, the algorithm takes a random sample of s = √(kn) documents from the collection and uses the specific ‘cluster subroutine’ (hierarchical single-link) as the high-precision clustering routine to find initial centers from this random sample. The initial centers generated by the hierarchical agglomerative clustering subroutine can be used as the basis for clustering the entire collection in a high-performance manner, by assigning the remaining documents in the collection to the most appropriate initial center. The original Buckshot algorithm does not prescribe how best to assign the remaining documents to appropriate centers, although various techniques are given. In our work we use an iterated assign-to-nearest algorithm with two iterations, similar to the one proposed in [9]. The Buckshot algorithm typically requires linear time (since s = √(kn), a quadratic-time cluster subroutine costs O(s²) = O(kn) on the sample, so the total time is O(kn), where k is much smaller than n), which is very satisfactory. This establishes the feasibility of the scatter/gather method for browsing moderately large document collections. For very large document collections, however, the linear time requirement of the online phase makes the scatter/gather browsing method less efficient. On the other hand, in our system the Buckshot procedure is usually expected to run over subcollections of controlled size (since the user subcollections are the results of personalized filtering procedures). In either case, in order to address this inefficiency, we apply parallelism over a distributed memory parallel environment, aiming at acceptable performance even for very large document collections.

3.3 Parallel Processing Module

As mentioned above, even the Buckshot algorithm in sequential execution tends to be quite slow for today’s very large collections.
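For reference, the sequential Buckshot procedure of Section 3.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the customized version used in the system: documents are dense term vectors, cosine similarity is the similarity measure, the single-link subroutine is a naive implementation, both assign-to-nearest iterations reassign all documents (not only the unsampled ones), and the names `buckshot`, `single_link` and the `sample_size` parameter are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def single_link(vectors: Sequence[Vector], k: int) -> List[List[int]]:
    """Naive single-link agglomerative clustering down to k clusters.

    Returns clusters as lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > max(1, k):
        best_sim, best_a, best_b = -2.0, 0, 1
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single link: similarity of the closest pair of members.
                sim = max(cosine(vectors[i], vectors[j])
                          for i in clusters[a] for j in clusters[b])
                if sim > best_sim:
                    best_sim, best_a, best_b = sim, a, b
        clusters[best_a].extend(clusters.pop(best_b))
    return clusters


def buckshot(vectors: Sequence[Vector], k: int, iterations: int = 2,
             sample_size: Optional[int] = None, seed: int = 0) -> List[int]:
    """Return a cluster label for every vector."""
    n = len(vectors)
    s = sample_size if sample_size is not None else math.ceil(math.sqrt(k * n))
    s = min(n, max(k, s))
    # Phase 1: cluster a random sample of s documents to obtain the centers.
    sample = random.Random(seed).sample(range(n), s)
    groups = single_link([vectors[i] for i in sample], k)
    centers = [centroid([vectors[sample[i]] for i in g]) for g in groups]
    # Phase 2: iterated assign-to-nearest (two iterations by default).
    labels = [0] * n
    for _ in range(iterations):
        labels = [max(range(len(centers)), key=lambda c: cosine(v, centers[c]))
                  for v in vectors]
        for c in range(len(centers)):
            members = [vectors[i] for i in range(n) if labels[i] == c]
            if members:  # keep the old center if a cluster became empty
                centers[c] = centroid(members)
    return labels
```

With the default sample size, the naive subroutine only ever sees about √(kn) documents, which is what keeps the overall cost near O(kn); passing `sample_size=len(vectors)` degenerates to clustering the whole collection hierarchically and is useful only for small tests.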
Even the simplest modern clustering techniques tend to be quite slow too. Naturally, parallel processing is a promising approach. In our proposed system, we use efficient parallel techniques in order to achieve acceptable performance even for very large document collections. Moreover, using a distributed memory architecture we can reduce the time and memory complexity of the sequential algorithms by a factor of p, where p is the number of nodes used. Specifically, for an efficient design and implementation of the scatter/gather clustering techniques proposed in the previous section, we follow the parallel approach presented in [9]. First, an efficient implementation of the underlying hierarchical agglomerative clustering subroutine is constructed: the pair-wise document similarity matrix is first calculated in parallel, distributed over the multiple processors, and the cluster hierarchy is then built iteratively using the single-link criterion. Based on this parallel execution of the underlying clustering subroutine, we build an efficient parallel implementation of the Buckshot algorithm (again similar to the one proposed in [9]). The first phase of the parallel Buckshot algorithm uses the parallel hierarchical clustering subroutine to cluster s random documents. The second phase groups the remaining documents in parallel. After the clustering subroutine has finished, k initial clusters have been created from the random sample of s = √(kn) documents. From the total collection, n − s


documents remain that have not yet been assigned to any cluster. The second phase of the Buckshot algorithm assigns these documents according to their similarity to the centroids of the initial clusters. This phase of the algorithm is trivially parallelized via data partitioning. First, the initial cluster centroids are calculated on every node (using appropriate collective parallel functions, in order to reduce the total communication cost). After the centroid calculation is complete, each node is assigned approximately (n − s)/p documents to process. Each node iterates through these documents in place, comparing each document’s term vector to each centroid and making the assignment, until all documents are assigned. The second phase is iterated twice. The second iteration recalculates the centroids and reassigns all the documents to one of the k clusters. Moreover, we also apply parallelism during the document preprocessing phase, based on our previous work (see [10],[11]). As part of these techniques, a more accurate off-line clustering algorithm is also given (partitional clustering based on the iterative calculation of connected components of the document similarity matrix, as a specialization of the single-link hierarchical agglomerative clustering algorithm). This global initial clustering method is quite useful if the user wishes to perform global searches from the beginning (entering natural language keywords etc.) without using any personalization-based categorization feature. The document indexing process used (an essential part of the off-line setup/preprocessing phase of the system, which makes it possible to apply the similarity-based clustering techniques effectively) follows the basics of the Vector Space Model (construction of weighted document vectors, based on the statistical extraction of word-stems, phrases and thesaurus classes).
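As a rough illustration of the weighted document vectors mentioned above, the following Python sketch builds sparse vectors and compares them with cosine similarity. The weighting scheme is an assumption: the text does not state which weights are used, so a plain tf-idf weight is shown, and whitespace-separated lowercase tokens stand in for the word-stems, phrases and thesaurus classes of the actual indexing process. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

SparseVector = Dict[str, float]


def tfidf_vectors(docs: List[str]) -> List[SparseVector]:
    """Build one sparse tf-idf vector per document (assumed weighting)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(a: SparseVector, b: SparseVector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = ["parallel document clustering",
        "parallel clustering algorithms",
        "personalized news service"]
v = tfidf_vectors(docs)
```

Here `cosine(v[0], v[1])` is positive because the first two documents share terms, while `cosine(v[0], v[2])` is zero because they share none.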
To speed up the similarity calculations we also extract and extensively use a global inverted index (as well as partial local inverted lists when needed for the parallel clustering procedures). Some of our parallel processing methods (see [10],[11]) have been extensively tested in a real distributed memory environment, yielding very good performance. As the underlying distributed memory platform we use a Beowulf-class Linux cluster with the MPI-2 message-passing library (MPICH implementation). Specifically, our cluster consists of 8 Pentium-4 based processors with 1GB RAM and a dedicated Myrinet network interface which provides 2Gbps communication speed. The corresponding experiments have been done over a part of the well-known TIPSTER/TREC standard document collections.
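The data-partitioned second phase of the parallel Buckshot algorithm described in this section can be illustrated with the Python sketch below. It is a single-process simulation under stated assumptions, not the MPI implementation: the p "nodes" are simply p chunks processed one after another, the merging of partial sums stands in for a collective reduction (such as an all-reduce) across nodes, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def partition(items: Sequence[Vector], p: int) -> List[Sequence[Vector]]:
    """Split items into p contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(items), p)
    chunks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def local_assign(chunk: Sequence[Vector], centroids: Sequence[Vector]
                 ) -> Tuple[List[int], List[List[float]], List[int]]:
    """Work done by one node: label its documents, accumulate partial sums."""
    k, dim = len(centroids), len(centroids[0])
    labels: List[int] = []
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for vec in chunk:
        best = max(range(k), key=lambda c: cosine(vec, centroids[c]))
        labels.append(best)
        counts[best] += 1
        for d, x in enumerate(vec):
            sums[best][d] += x
    return labels, sums, counts


def assign_step(vectors: Sequence[Vector], centroids: Sequence[Vector],
                p: int) -> Tuple[List[int], List[List[float]]]:
    """One assign-to-nearest iteration over p simulated nodes."""
    # Each call below would run on its own node in a real deployment.
    partials = [local_assign(chunk, centroids)
                for chunk in partition(vectors, p)]
    k, dim = len(centroids), len(centroids[0])
    labels = [lab for part in partials for lab in part[0]]
    # Simulated reduction: add up the per-node sums and counts.
    counts = [sum(part[2][c] for part in partials) for c in range(k)]
    new_centroids = []
    for c in range(k):
        if counts[c]:
            total = [sum(part[1][c][d] for part in partials)
                     for d in range(dim)]
            new_centroids.append([x / counts[c] for x in total])
        else:
            new_centroids.append(list(centroids[c]))  # empty cluster
    return labels, new_centroids
```

Calling `assign_step` twice, feeding the centroids returned by the first call into the second, mirrors the two iterations described above. Only the per-cluster sums and counts need to be exchanged between nodes, not the documents themselves, which is why this phase parallelizes so easily.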

4 Conclusion

Adding personalization features to websites and other systems is becoming a very popular approach. Machine learning algorithms have proved to be an effective solution in this direction. Personalization models base their operation on a limited set of features. In large document collections, though, it is not sufficient to direct the user queries based on the generic categories that help build up the personalization model. We have presented a hybrid approach that initiates the user-system interaction by making suggestions to the user based on the user model, which is created either from user feedback or by a machine learning approach that tracks his/her past interaction with the system; accordingly the selected sub-clusters of documents are


processed and new keywords arise which help build up a new set of sub-clusters of the remaining documents. This process is repeated with the user’s participation until the user-directed queries have narrowed the collection down to an adequately small number of documents. The benefit of our approach is that it proceeds in a highly dynamic manner, not limiting the number of features that arise in each step of the query process. Thus, the feature set associated with the resources is updated frequently, resulting in an effective and dynamic re-clustering of documents. In order to keep the response times low, parallel processing techniques are employed. We have described the modular components of a proof-of-concept architecture that embodies the basic principles of our approach, and we have described suitable choices for the implementation of the system, which is still under continuous development; still, previous experimentation with some of its sub-modules [10][11] provides supporting evidence for the validity of our approach.

Acknowledgments. We are grateful to George Paliouras for his helpful comments on the early version of this article.

References

1. Wolff, E., Cremers, A.: The MyVIEW Project: A Data Warehousing Approach to Personalized Digital Libraries. Next Generation Information Technologies and Systems, 277–294 (1999)
2. Godoy, D., Amandi, A.: Modeling user interests by conceptual clustering. Information Systems 31, 247–265 (2006)
3. Perkowitz, M., Etzioni, O.: Learning and revising user profiles: The identification of interesting Web sites. Machine learning 27, 313–331 (1998)
4. Paliouras, G., Papatheodorou, C., Karkaletsis, V., Spyropoulos, C.D.: Clustering the Users of Large Web Sites into Communities. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 719–726 (2000)
5. Paliouras, G., Papatheodorou, C., Karkaletsis, V., Spyropoulos, C.D.: Discovering User Communities on the Internet Using Unsupervised Machine Learning Techniques. Interacting with Computers 14(6), 761–791 (2002)
6. Paliouras, G., Mouzakidis, A., Ntoutsis, C., Alexopoulos, A., Skourlas, C.: PNS: Personalized Multi-source News Delivery. In: Proceedings of the 10th International Conference on Knowledge-Based & Intelligent Information & Engineering Systems (KES), Bournemouth, UK, October 2006, pp. 1152–1161 (2006)
7. Langley, P.: User modeling in adaptive interfaces. In: Proceedings of the 7th International conference on user modeling, pp. 357–370. Springer, Heidelberg (1999)
8. Cutting, D.R., Pedersen, J.O., Karger, D., Tukey, J.W.: Scatter/gather: A cluster-based approach to browsing large document collections. In: Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 318–329. ACM, New York (1992)
9. Cathey, R., Jensen, E., Beitzel, S., Frieder, O., Grossman, D.: Exploiting parallelism to support scalable hierarchical clustering. JASIST 58(8), 1207–1221 (2007)
10. Kehagias, D., Mamalis, B., Pantziou, G.: Efficient VSM-based Parallel Text Retrieval on a PC-Cluster Environment using MPI. In: Proceedings, ISCA 18th Intl. Conf. on Parallel Distributed Computing Systems (PDCS’05), Las Vegas, Nevada, USA, September 12-14, pp. 334–341 (2005)


11. Gavalas, D., Konstantopoulos, C., Mamalis, B., Pantziou, G.: Efficient BSP/CGM Algorithms for Text Retrieval. In: Proceedings of the 17th IASTED Intl. Conf. on Parallel and Distributed Computing and Systems (PDCS’05), Phoenix, Arizona, USA, November 14-16, pp. 301–306 (2005)
12. Cutting, D.R., Karger, D.R., Pedersen, J.O.: Constant interaction-time scatter/gather browsing of very large document collections. In: Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 126–134. ACM, New York (1993)
13. Hearst, M.A., Karger, D., Pedersen, J.O.: Scatter/gather as a tool for the navigation of retrieval results. In: Burke, R. (ed.) Working Notes of the AAAI Fall Symposium on AI Applications in Knowledge Navigation and Retrieval, Cambridge, MA. AAAI, Menlo Park (1995)
14. Hearst, M.A., Pedersen, J.O.: Reexamining the cluster hypothesis: scatter/gather on retrieval results. In: SIGIR 1996: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 76–84. ACM Press, New York (1996)
15. Jensen, E.C., Beitzel, S.M., Pilotto, A.J., Goharian, N., Frieder, O.: Parallelizing the buckshot algorithm for efficient document clustering. In: CIKM 2002: 11th int. conf. on Information and knowledge management, pp. 684–686. ACM Press, New York (2002)
16. Liu, Y., Mostafa, J., Ke, W.: A Fast Online Clustering Algorithm for Scatter/Gather Browsing (2006)
17. Hartigan, J.A.: Clustering Algorithms. Wiley, Chichester (1975)
18. Guha, S., Rastogi, R., Shim, K.: CURE: An Efficient Clustering Algorithm for Large Databases. In: Proceedings of the 1998 ACM-SIGMOD, pp. 73–84 (1998)
19. Jardine, N., van Rijsbergen, C.J.: The Use of Hierarchical Clustering in Information Retrieval. Information Storage and Retrieval (1971)
20. Dash, M., Petrutiu, S., Scheuermann, P.: Efficient Parallel Hierarchical Clustering. In: Danelutto, M., Vanneschi, M., Laforenza, D. (eds.) Euro-Par 2004. LNCS, vol. 3149. Springer, Heidelberg (2004)
21. Dhillon, I.S., Modha, D.S.: A data-clustering algorithm on distributed memory multiprocessors. In: Zaki, M.J., Ho, C.-T. (eds.) KDD 1999. LNCS (LNAI), vol. 1759, pp. 245–260. Springer, Heidelberg (2000)
22. Heckel, B., Hamann, B.: Divisive parallel clustering for multiresolution analysis. In: Geometric Modeling for Scientific Visualization, Germany, pp. 345–358 (2004)
23. Olson, C.: Parallel algorithm for hierarchical clustering. Parallel Computing 21 (1995)
24. Xu, S., Zhang, J.: A hybrid parallel web document clustering algorithm and its performance study (366-03) (2003)


leaf extracts on germination and - Springer Link
compared to distil water (control.). ... lebbeck so, before selecting as a tree in agroforestry system, it is ... The control was treated with distilled water only.

On Community Leadership: Stories About ... - Springer Link
Apr 19, 2004 - research team with members of the community, how research questions emerged, method- ologies were developed, ways of gathering data ...

Interactive and progressive image retrieval on the ...
INTERNET, we present the principle of an interactive and progressive search ... make difficult to find a precise piece of information with the use of traditional text .... images, extracted from sites of the architect and providers of building produc

Tinospora crispa - Springer Link
naturally free from side effects are still in use by diabetic patients, especially in Third .... For the perifusion studies, data from rat islets are presented as mean absolute .... treated animals showed signs of recovery in body weight gains, reach

Chloraea alpina - Springer Link
Many floral characters influence not only pollen receipt and seed set but also pollen export and the number of seeds sired in the .... inserted by natural agents were not included in the final data set. Data were analysed with a ..... Ashman, T.L. an

GOODMAN'S - Springer Link
relation (evidential support) in “grue” contexts, not a logical relation (the ...... Fitelson, B.: The paradox of confirmation, Philosophy Compass, in B. Weatherson.

Bubo bubo - Springer Link
a local spatial-scale analysis. Joaquın Ortego Æ Pedro J. Cordero. Received: 16 March 2009 / Accepted: 17 August 2009 / Published online: 4 September 2009. Ó Springer Science+Business Media B.V. 2009. Abstract Knowledge of the factors influencing

Quantum Programming - Springer Link
Abstract. In this paper a programming language, qGCL, is presented for the expression of quantum algorithms. It contains the features re- quired to program a 'universal' quantum computer (including initiali- sation and observation), has a formal sema

BMC Bioinformatics - Springer Link
Apr 11, 2008 - Abstract. Background: This paper describes the design of an event ontology being developed for application in the machine understanding of infectious disease-related events reported in natural language text. This event ontology is desi

Candidate quality - Springer Link
didate quality when the campaigning costs are sufficiently high. Keywords Politicians' competence . Career concerns . Campaigning costs . Rewards for elected ...

Mathematical Biology - Springer Link
Here φ is the general form of free energy density. ... surfaces. γ is the edge energy density on the boundary. ..... According to the conventional Green theorem.

Artificial Emotions - Springer Link
Department of Computer Engineering and Industrial Automation. School of ... researchers in Computer Science and Artificial Intelligence (AI). It is believed that ...

Bayesian optimism - Springer Link
Jun 17, 2017 - also use the convention that for any f, g ∈ F and E ∈ , the act f Eg ...... and ESEM 2016 (Geneva) for helpful conversations and comments.

Contents - Springer Link
Dec 31, 2010 - Value-at-risk: The new benchmark for managing financial risk (3rd ed.). New. York: McGraw-Hill. 6. Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7, 77–91. 7. Reilly, F., & Brown, K. (2002). Investment analysis & port