Semantic Digital Library Services

Nabonita Guha
[email protected]; [email protected]
Documentation Research & Training Centre
Bangalore (India)

Abstract

To a great extent, ‘semantics’ depends upon the context in which the user is seeking information. The proposed model aims at delivering information in the light of ontology-based document annotation, user annotation and domain ontology. The present study addresses the problem of semantic interoperability in the digital library environment. In a heterogeneous environment, a mediation-based service is of immense help in resolving ontology-level, schema-level and service-level ambiguity, for which the WSMO framework has been chosen. This study is confined to bringing interoperability at the schema and ontology levels for the domain of agriculture. Further studies can address service-level interoperability among various digital library services.

INTRODUCTION

Faster and more meaningful retrieval has been the driving aim of information retrieval systems since the beginning of automated information retrieval. Database information systems, artificial intelligence and other computer-aided retrieval systems made a very optimistic start. In the Internet era, the Semantic Web came as a model of semantic retrieval in the web environment. Traditional libraries are in transition towards becoming libraries without boundaries, with global access through the Internet. Many information storage and retrieval systems have been used for meaningful retrieval in print-based libraries. In a Web environment, the traditional means and techniques for information storage and retrieval must be modified to suit the changed needs. The classification systems used for book classification have evolved into ontologies that represent domain knowledge in machine-processable form; the cataloguing codes have taken the shape of metadata schemas for the description of web resources. The model of the Semantic Web goes further, to inferencing and proof, which are essential for information retrieval in the web environment. With all these components (XML, RDF, ontology, inferencing, proof, etc.) [1], the vision of meaningful retrieval on the Web seems quite practical. Still, the following two core aspects have to be looked into to achieve meaningful retrieval:

• User’s context; and
• Document context

The vision of semantic retrieval can be accomplished with an ontological model of the user’s interest areas and a model of the context of the information dealt with in the document. This will make it easier to match the user’s context with the document context at the search stage. Due to the distributed and heterogeneous nature of the Web, interoperability at the semantic level has become a great challenge for system developers.

Hence the present study aims at addressing both challenges using the tools and technologies available to implement Semantic Web based services. The present work intends to bring semantic retrieval to digital libraries using Semantic Web technologies for key tasks such as ontology-based annotation, semantic mediation and inferencing.

CONTEXT-BASED SERVICE MODEL

[Figure 1 components, with flow steps numbered 1 to 5: Domain Ontology; Inference Engine & Rule Base; Classaurus; Digital Docs with Metadata; Onto-based Document Annotation; Onto-based User Annotation; Search Interface; Context Based Index]

Fig. 1: Proposed Architecture of Context-based Service Model

Semantic Annotation: In the model above, the ontology is used not only for modeling domain knowledge, but also for annotating documents and the registered users of the repository. The context of the information dealt with in a document can be made explicit by document annotation, which can be represented in OWL or WSMO. Similarly, user profiles can be represented as ontology-based annotations.

Classaurus [2]: It is a faceted thesaurus used as a vocabulary control mechanism for automated permuted indexing in traditional bibliographic databases.

METADATA INTEROPERABILITY FOR SEMANTIC RETRIEVAL

There is a wide variety of metadata schemas available for different kinds of digital resources. In their classification table for metadata, Kashyap and Sheth [3] have categorized the various types of metadata under the following broad categories:

• Content independent metadata
• Content dependent metadata
  o Direct content-based metadata
  o Content descriptive metadata
    - Domain independent metadata
    - Domain specific metadata

This categorization makes it quite clear that, to bring semantic interoperability among bibliographic repositories, various types of metadata schemas have to be considered. The JeromeDL [4] project has made an effort to bring semantic interoperability among digital repositories using different bibliographic metadata standards such as Dublin Core, BibTeX and MARC21. The WSMO framework is a well-defined model for bringing semantic interoperability among heterogeneous automated retrieval systems through its key components, such as the Web Service Modeling Ontology (WSMO) [5] and the Web Service Execution Environment (WSMX) [6], including various mediators.

VIRTUAL DOCUMENT DELIVERY SERVICE

The above two approaches are planned to be incorporated into a virtual document delivery service. The proposed service aims to be context-based and interoperable at the metadata level. The domain of agriculture has been chosen for implementing this service among various repositories in that domain. Various agricultural metadata schemas and thesauri that are in use have been studied.
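Interoperability at the metadata level, among standards such as Dublin Core, BibTeX and MARC21, can be illustrated with a simple field crosswalk. The following is a minimal sketch only: the table CROSSWALKS, the function to_common_record and the choice of target field names are assumptions made for illustration, and they do not represent the mapping used by JeromeDL, WSMX or any official crosswalk.

```python
from typing import Dict

# Hypothetical crosswalk tables: source field -> common, Dublin Core-like field.
# The selected fields are illustrative assumptions, not an official mapping.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "dublin_core": {"creator": "creator", "title": "title",
                    "date": "date", "subject": "subject"},
    "bibtex": {"author": "creator", "title": "title",
               "year": "date", "keywords": "subject"},
    # MARC21 tag plus subfield code, written as one short key
    "marc21": {"100a": "creator", "245a": "title",
               "260c": "date", "650a": "subject"},
}


def to_common_record(record: Dict[str, str], schema: str) -> Dict[str, str]:
    """Rename the fields of a record to the common field names.

    Fields without an entry in the crosswalk are dropped.
    """
    mapping = CROSSWALKS[schema]
    return {mapping[field]: value
            for field, value in record.items()
            if field in mapping}


if __name__ == "__main__":
    marc_record = {"100a": "Guha, N.",
                   "245a": "Semantic Digital Library Services",
                   "999x": "local note"}
    print(to_common_record(marc_record, "marc21"))
```

Once records from different repositories share one set of field names, they can be indexed and compared together. A mediator in the sense of the WSMO framework would additionally handle differences in values and in ontology terms, which this sketch does not attempt.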

[Figure 2 components: Domain Ontology; Classaurus; Inference Engine & Rule Base; Onto-based Document Annotation; Onto-based User Annotation; Context Matching; Mediation; WSMX; Service Registry; Digital Docs with Metadata; CAB Thesaurus; AGROVOC; ASFA Thesaurus]

Fig. 2: Ontology Mediation and Context Matching for Semantic Document Retrieval

METHODOLOGY

In the above model of virtual document delivery (Fig. 2), the key tasks are service discovery, ontology mediation and context matching. The semantic digital library searches the service registry [7, 8] to locate appropriate repositories. After service discovery, ontology mediators are used to map the ontologies and metadata schemas. The ontology mediator converts the mapped ontology to the native syntax. The retrieved document descriptions are used to generate a context-based index. This context-based index is then matched with the user query and annotation.

RELATED WORKS

Context-based retrieval: Context-based information retrieval has been a major focus of research. The Context Ontology Language (CoOL) [11] is an ontology-based context modeling approach which uses the Aspect-Scale-Context (ASC) model, where each aspect (e.g. spatial distance) can have several scales (e.g. kilometer scale or mile scale) to express some context information (e.g. 20). Chen et al. [12] propose a context broker architecture (CoBrA) using an ontology to describe persons, places and intentions. Less emphasis is put on the notion of services and related aspects, such as user interfaces and mobile devices on which these services are deployed.

Semantic Annotation: To make the content and context of information explicit to the system, many tools and techniques have been developed for the semantic annotation of web resources. The KIM semantic annotation platform and the KIM plug-in are among the useful tools available for digital resource annotation. Maedche and Staab [10] have proposed a semi-automatic acquisition of ontologies from domain texts. My proposed study aims not only to annotate digital resources semantically but also to interpret annotations described in different annotation languages.

Semantic interoperability in Digital Libraries: The JeromeDL [13] project developed software aimed at bringing personalized services to users, with a search algorithm based on user profiles. My present study also considers the user’s interest areas as the key component of the retrieval system for deciding the context of the retrieved information, but here the domain ontology, the user’s annotation and the document annotation are supported by the Classaurus mechanism, which makes communication among all three components easier. Regarding metadata interoperability, my study does not aim at introducing any new ontology language but at bringing interoperability among existing ontology languages, thesauri and metadata schemas.

Ontology based mediation in digital libraries: The MarcOnt Initiative [14] is an ongoing project aiming at creating a new bibliographic description standard (MarcOnt) and mediation services to support different legacy bibliographic formats. The outcome of the project is yet to come.

FUTURE RESEARCH AND CONCLUSION

Interoperability in a heterogeneous environment is a broad term encompassing syntactic as well as semantic interoperability. Interoperability at the semantic level is a challenging task. Context-sensitive query processing over heterogeneous information resources requires the matching of concepts. Vocabularies, semantic relationships and mappings are information objects themselves; their life cycle (creation, acquisition, collection, modeling, identification, integration, mediation, search, use, maintenance, preservation, etc.) is of primary importance and a necessary prerequisite to improved semantic interoperability [9]. Steps are to be taken in all future research in this regard.
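The matching of concepts across heterogeneous vocabularies, followed by the comparison of user context with document context, can be sketched as below. This is an illustration under stated assumptions, not the matching procedure of the proposed system: the paper does not specify a similarity measure, so Jaccard overlap is used here as a stand-in, and the names CONCEPT_MAP, to_concepts and context_score, as well as every term in the mapping table, are invented for the example and are not taken from the actual AGROVOC, CAB or ASFA thesauri.

```python
from typing import Dict, Set

# Hypothetical mapping from thesaurus-specific terms to shared concept labels.
# All entries are invented for illustration only.
CONCEPT_MAP: Dict[str, str] = {
    "agrovoc:rice": "concept:rice",
    "cab:oryza sativa": "concept:rice",
    "agrovoc:irrigation": "concept:irrigation",
    "asfa:aquaculture": "concept:aquaculture",
}


def to_concepts(terms: Set[str]) -> Set[str]:
    """Mediation step: translate source terms to shared concepts.

    Terms without a mapping are ignored.
    """
    return {CONCEPT_MAP[term] for term in terms if term in CONCEPT_MAP}


def context_score(user_terms: Set[str], doc_terms: Set[str]) -> float:
    """Jaccard overlap between user concepts and document concepts."""
    user_concepts = to_concepts(user_terms)
    doc_concepts = to_concepts(doc_terms)
    union = user_concepts | doc_concepts
    if not union:
        return 0.0
    return len(user_concepts & doc_concepts) / len(union)


if __name__ == "__main__":
    user_annotation = {"agrovoc:rice", "agrovoc:irrigation"}
    document_annotation = {"cab:oryza sativa"}
    print(context_score(user_annotation, document_annotation))
```

In this sketch a user annotated with terms from one thesaurus still matches a document annotated with a different thesaurus, because both are first translated to shared concepts. An ontology-based mediator could replace the flat table with subclass and related-term relations.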

REFERENCES

[1] Tim Berners-Lee, James Hendler and Ora Lassila. The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. May 17, 2001.
[2] G. Bhattacharya. Fundamentals of subject indexing languages. In: Proceedings of Third International Study Conference on Classification Research. Bombay, Jan. 6-11, 1975. Bangalore: DRTC, 1979, pp. 86-98.
[3] Vipul Kashyap and Amit Sheth. Semantic heterogeneity in global information systems: The role of metadata, context and ontologies. Cooperative Information Systems. Academic Press: San Diego, 1998, pp. 139-178.
[4] Sebastian Ryszard Kruk, Stefan Decker, and Lech Zieborak. JeromeDL: Reconnecting digital libraries and the Semantic Web. WWW2005, May 10-14, 2005, Chiba, Japan. http://www.marcont.org/marcont/pdf/www2005_jeromedl.pdf
[5] John Domingue, Dumitru Roman, and Michael Stollberg. Web Service Modeling Ontology (WSMO) - An ontology for Semantic Web services. Position paper at the W3C Workshop on Frameworks for Semantics in Web Services, June 9-10, 2005, Innsbruck, Austria. http://www.w3.org/2005/04/FSWS/Submissions/1/wsmo_position_paper.html
[6] Emilia Cimpian and Michal Zaremba (Eds). Web Service Execution Environment (WSMX). W3C Member Submission 3 June 2005. http://www.w3.org/Submission/WSMX/
[7] Ann Apps. A middleware registry for the discovery of collections and services. In: The First International Conference on e-Social Science, Manchester, UK, 22-24 June 2005. http://epub.mimas.ac.uk/papers/ncess2005/apps-ncess2005.pdf
[8] Eric Lease Morgan, Jeremy Frumkin and Edward A. Fox. The OCKHAM Initiative: Building component-based digital library services and collections. D-Lib Magazine, 10 (11), Nov. 2004. http://dlib.org/dlib/november04/11inbrief.html#FOX
[9] Manjula Patel et al. Semantic interoperability in digital libraries. Task 3: Semantic Interoperability. WP5: Knowledge Extraction and Semantic Interoperability. DELOS2 Network of Excellence in Digital Libraries, July 2004 - June 2005. Project no. 507618, UKOLN, University of Bath.
[10] Alexander Maedche and Steffen Staab. Semi-automatic engineering of ontologies from text. In: Proceedings of the Twelfth International Conference on Software Engineering and Knowledge Engineering (SEKE'2000). Chicago, July 6-8, 2000.
[11] T. Strang, C. Linnhoff-Popien, and K. Frank. CoOL: A Context Ontology Language to enable Contextual Interoperability. In: J.B. Stefani, I. Dameure, D. Hagimont (Eds). Proceedings of 4th IFIP WG 6.1 International Conference on Distributed Applications and Interoperable Systems (DAIS2003). Volume 2893 of Lecture Notes in Computer Science (LNCS). Springer Verlag: Paris, 2003, pp. 236-247.
[12] H. Chen, T. Finin, and A. Joshi. An ontology for context-aware pervasive computing environments. Special Issue on Ontologies for Distributed Systems, Knowledge Engineering Review, 2003.
[13] Sebastian Ryszard Kruk, Stefan Decker, and Lech Zieborak. JeromeDL: Reconnecting Digital Libraries and the Semantic Web. http://www.marcont.org/marcont/pdf/www2005_jeromedl.pdf
[14] Sebastian Kruk, Marcin Synak and Kerstin Zimmermann. MarcOnt Initiative: Mediation services for digital libraries. http://library.deri.ie/servlet/showDoc?docId=http://library.deri.ie/pages/show.jsp?id=51c4 c7ff&chapter=0&view=pdf
