LIBR 244-02 Online Searching
Folksonomy and Social Tagging: When is it useful?
By Quincy Dalton McCrary

LIBR 244-02_McCrary_FinalPaper_Spring 2009

Abstract

Many open internet sites have begun allowing users to submit items to a collection and tag them with keywords. This technique has been called “folksonomy” (also known as collaborative tagging, social classification, social indexing, and social tagging). Thomas Vander Wal (2006) defined folksonomies as “The result of personal free-tagging of information and objects (anything with a URL) for one's own retrieval. The tagging is done in a social environment (shared and open to others).” In contrast to professionally developed controlled vocabularies (also called taxonomies), folksonomies are unsystematic and, from an information scientist's point of view, unsophisticated (Gordon-Murnane 2006). For Internet users, they dramatically lower content categorization costs because there is no complicated, hierarchically organized nomenclature to learn. Combining standard classification schemes with user-added content indicators can produce materials with more access points and richer metadata, making results more findable (Arch 2007). How can this be done for the open internet? This paper investigates one method, “folksonomies”, and addresses the issue of cataloging the massive amount of data currently on the Internet. The intention of such a move is to make the internet truly searchable. Can the development of a folksonomy make online searching of the open internet a legitimate resource?

Quincy D McCrary San Jose State University School of Library & Information Science Spring 2009 (01/22/2009 – 5/13/2009) Instructor: Jean Bedord


Introduction

The invention of networked databases and the open internet has revolutionized brick-and-mortar libraries (Mann 2005). More and more information is being provided in a digital format, and more and more information is being “published” in digital format, often on the open internet (Thompson 2005). The resulting content frequently does not exist in a static state, or in one unique and fixed location (Bolter 1991). As a result, providing access to digital content can be complicated. Making such materials accessible to an average library user, whether they are doing a “known-item search” or simply browsing, is not always straightforward. How does one classify a webpage vs. an online document? How does one attempt to classify the 74,300,000 “hits” Google returns for a search on “diabetes”? Combining standard classification schemes with user-added content indicators can produce materials with more access points and richer metadata, making results more findable (Arch 2007). How can this be done for the open internet? This paper investigates one method, “folksonomies”, and addresses the issue of cataloging the massive amount of data currently on the Internet. The intention of such a move is to make the internet truly searchable. Can the development of a folksonomy make online searching of the open internet a legitimate resource?

A Short History of Library Catalogs

Library catalogs mostly began as manuscript lists, arranged by format or in alphabetical order by author (Strout 1956). Printed catalogs, sometimes called “dictionary catalogs”, were the next evolutionary step; they enabled scholars outside a library to gain an idea of its contents (Carpenter 1986). These dictionary catalogs would sometimes contain blank leaves where additions could be recorded, or were bound as “guard-books” containing slips of paper for new entries (Carpenter 1986). Slips were also sometimes kept loose in cardboard or tin boxes and stored on shelves (ibid). The first card catalogs appeared by the nineteenth century, enabling much more flexibility. Even the earliest catalogs used controlled vocabulary and classification schemes as methods to describe the materials collected, curated, and shared (Strout 1956). These materials have historically been “known” items, occupying a fixed spatial location in the library and with static content. Items cataloged in this manner require that metadata (data about other data) be added to the record as a way of describing the item in the catalog (Martin 1982). Metadata is added to an item in two primary ways: by the creator of the content and by trained professionals who are cataloging content in order to provide access to that content (ibid).

By the end of the twentieth century, the Online Public Access Catalog had developed (Mann 2005). Ambitious as the idea of cataloging and offering online access to metadata about the item was, catalogers now looked to a vast sea of “published” material on the Web with the same level of interest they once had in monographs. How this was to be done was as confusing and contested as cataloging the published works of the world once was. Could the internet be indexed? Could it be cataloged? If the internet were to be cataloged, would it be possible to automate the process?

Classification

There are substantial issues involved with automated classification, but fewer with automated indexing. To classify is to organize a body of information according to some meaningful scheme (Mann 2005). The body of information may be as small as a single article or as large as a society’s entire corpus of recorded knowledge. The classification scheme involved may be as simple as the table of contents of a book, or as intricate and widespread as the Dewey Decimal System. The distinctive feature of classification is that it should reflect a system of meaningful relationships among the components of the body of information being classified (Hanson 2007).

An index is a finding device that connects a symbol for a topic (usually in the form of an image or a word) with whatever material is pertinent to that topic (Rosenfeld and Morville 2002). Indexing conveys nothing about relationships that may exist among different topics. Pure indexing results in a single topic with no identified connection to any other topic. When pure indexing will not describe a topic properly, entries are internally organized as general topics and divided into categories. This hybrid approach is called classified indexing, and is believed by many to be the most efficient and relevant method to catalog the open internet (Mann 2005).

Artificial intelligence is poor at classifying because it can deal only with controlled vocabularies (Hanson 2007). It cannot understand multiple meanings that are couched in metaphor, satire, double entendre, or that depend on context or nuances (ibid). Indexing operates by locating matches for particular topics or queries. Automation of indexing has led to massive increases in the speed and quality of indexing.
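
The difference between pure indexing and classified indexing described in this section can be illustrated with a small sketch. It is only an illustration under stated assumptions: the three sample documents, the two-level topic scheme, and the function names are all hypothetical and do not come from any of the cited sources.

```python
from collections import defaultdict

# Hypothetical mini-collection: document id -> text.
DOCUMENTS = {
    "doc1": "diabetes diet and exercise",
    "doc2": "insulin therapy for diabetes",
    "doc3": "exercise physiology",
}

# Hypothetical two-level scheme: general topic -> narrower index terms.
CLASSIFICATION = {
    "medicine": ["diabetes", "insulin"],
    "fitness": ["exercise", "diet"],
}


def build_index(documents):
    """Pure index: map each term to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


def build_classified_index(index, classification):
    """Classified index: group index entries under general topics."""
    return {
        topic: {term: sorted(index.get(term, set())) for term in terms}
        for topic, terms in classification.items()
    }


index = build_index(DOCUMENTS)
classified = build_classified_index(index, CLASSIFICATION)
```

In the pure index, "diabetes" and "insulin" are unrelated entries that happen to point at documents. Only the classified index records that both belong under a shared general topic, and that grouping had to be supplied by a person, which is the step the section argues is hard to automate.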

Organizing the Internet

Traditionally, librarians have been collecting, organizing, and annotating “hyperlinks” (links between webpages on the open internet) in the form of subject or course pages, indexes, bibliographies, pathfinders, and the like (Gordon-Murnane 2006). Library users were presented with lists of links, which might be ordered alphabetically, chronologically, or by subject or format. Sometimes librarians included feature descriptions or access instructions. Highly structured, hierarchical, and often in a linear format, these bibliographic tools are not dynamic, multi-faceted, or holistic. Furthermore, in the era of social networking and intense personalization, users' responses to information, usage patterns or even specific information needs are not incorporated into current internet cataloging (Guy and Tonkin 2006).

Folksonomies

Today, many open internet sites have begun allowing users to submit items to a collection and tag them with keywords. This cataloging technique has been labeled “folksonomy” (also known as collaborative tagging, social classification, social indexing, and social tagging). A specific definition of folksonomy was developed by Thomas Vander Wal (2006): “The result of personal free-tagging of information and objects (anything with a URL) for one's own retrieval. The tagging is done in a social environment (shared and open to others).” In contrast to professionally developed controlled vocabularies (also called taxonomies), folksonomies are unsystematic and, from an information scientist's point of view, unsophisticated (Gordon-Murnane 2006). For Internet users, they dramatically lower content categorization costs because there is no complicated, hierarchically organized nomenclature to learn.

Folksonomy also describes a classification system that emerges from the process of social tagging described above. Folksonomies became popular around 2004 as part of the Web 2.0 social software application phenomenon (Gordon-Murnane 2006). Tagging, which is characteristic of many Web 2.0 services, allows non-expert users to collectively classify and find information on the open internet. Some websites include "tag clouds" as a way to visualize tags in a folksonomy (Sinclair and Cardew-Hall 2008). Typically, folksonomies are Internet-based, although they can be used in other contexts (Gordon-Murnane 2006).

Aggregating the tags of many users creates a folksonomy. Aggregation is the pulling together of all of the tags in an automated way. Folksonomic tagging is intended to make an un-indexed body of information easier to search, discover, and navigate. A well-developed folksonomy is accessible as a shared vocabulary that is both originated by, and familiar to, its primary users. As folksonomies develop, users can discover who has used a given tag and see the other tags that this person has used and/or produced (Sinclair and Cardew-Hall 2008). Folksonomy users can discover the tag sets of another user who interprets and tags content in a way that makes sense to them.

Folksonomy creation and searching tools are not part of the underlying World Wide Web protocols. Folksonomies arise in Web-based communities where provisions are made at the site level for creating and using tags (Vander Wal 2006). These communities were established to enable internet users to label and share user-generated content. Websites like CiteULike [http://www.citeulike.org/] and Connotea [http://www.connotea.org/] are designed for academics to "share, store, and organize the academic papers they are reading" [http://www.citeulike.org/faq/all.adp] (Arch 2007). The software automatically extracts the citation details from sites with HTML/CSS headers. Faculty can use tools like this to simplify disseminating reference lists, bibliographies, papers, and other resources among peers or to students. Connotea is also designed for the scientific community. It pulls bibliographic information from scientific articles and journals, developing tags that are relevant to the users.
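
The aggregation step described in this section, along with the "who has used this tag" lookup and the tag cloud, can be sketched roughly as follows. This is a minimal illustration, not how any particular site implements it: the sample tagging records, the example.org URLs, the function names, and the linear font-size scaling are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical tagging events: (user, resource URL, tag).
TAGGINGS = [
    ("ann", "http://example.org/a", "folksonomy"),
    ("ann", "http://example.org/a", "tagging"),
    ("bob", "http://example.org/a", "folksonomy"),
    ("bob", "http://example.org/b", "libraries"),
    ("cat", "http://example.org/b", "folksonomy"),
]


def aggregate(taggings):
    """Pull all users' tags together: overall tag counts and per-user tag sets."""
    counts = Counter(tag for _, _, tag in taggings)
    by_user = defaultdict(set)
    for user, _, tag in taggings:
        by_user[user].add(tag)
    return counts, dict(by_user)


def users_of(tag, by_user):
    """Return the users who have applied a given tag."""
    return sorted(user for user, tags in by_user.items() if tag in tags)


def cloud_sizes(counts, min_pt=10, max_pt=30):
    """Linearly scale tag counts to font sizes for a simple tag cloud."""
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_pt + (n - low) * (max_pt - min_pt) / span)
        for tag, n in counts.items()
    }


counts, by_user = aggregate(TAGGINGS)
sizes = cloud_sizes(counts)
```

Here `users_of("folksonomy", by_user)` lists everyone who applied that tag, and `by_user["ann"]` shows the other tags one of those users has applied, which mirrors the discovery path the section describes. The most frequently used tag receives the largest font size in `sizes`.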


Criticisms

Folksonomies are not taxonomies. Folksonomies don't support searching and other types of browsing nearly as well as tags from controlled vocabularies applied by professionals (Morrison 2008). Folksonomies are flat and unstructured. Tags are sometimes messy. They often lack precision and have no ability to control synonyms or related terms. Without vocabulary controls, a search engine cannot determine whether a tag such as bunny, bunnies, rabbit, or Easter refers to the same concept, or whether it is a proper name. Many other variants of this problem can arise: plural, singular, spelling errors, and typos. Tags composed of multiple words are handled in different ways, which results in even more variation. Is it "Santa Rosa" or santa_rosa or SantaRosa?

A final criticism one could make of folksonomies as classification systems is this: supporters assume everything on the internet needs to be organized and classified. Folksonomy advocates seem not to recognize that critical, first decision. The free labor available to create folksonomies is appealing only to those who have already agreed that the entire internet needs to be structured, organized and cataloged. However, unlike the holdings of a traditional library, many Internet items could be eliminated, ignored, or allowed to die off. Most people put flyers, ads and newsletters into the wastebasket (physically or digitally), and would not bother to organize ephemera. There is no difference between trashing a flyer and trashing a single-use website. The question we should ask is: is this single-use website of enough historical significance to take up the digital space and energy it will require?

In comparison to folksonomies, a traditional classification scheme based on established categories yields search results that are more exact. However, traditional cataloging can be more time consuming. It is by definition more limiting, but it does result in consistency within its scheme. Folksonomy allows for variant opinions. Most information seekers want the most relevant hits when keying in a search query. However, because folksonomies are a scheme based on relativism, the scheme will always include the failings of relativism. A traditional classification scheme will consistently provide better results to information seekers.

The choice to use folksonomy for organizing information on the Internet is not a simple, straightforward decision. Although folksonomy advocates are beginning to correct some linguistic and cultural variations when applying tags, inconsistencies within the folksonomic classification scheme will always be present. There are no right or wrong classification terms in a folksonomic world, and the system can break down when applied to databases of journal articles or dissertations. I find folksonomies to be a confusing cataloging structure interjected with personal opinions and stereotypes.

Traditional classification is best for certain types of searches, for example ones where you want precision over recall and relevancy, and especially where there is a confined “domain of contents” that you have to be sure you’ve searched thoroughly. On the other hand, it is not as good as a folksonomy for other types of searches. Using emotionally loaded language as search keywords can produce many, many variant results. But those searches, when conducted in an environment where folksonomies have been used to catalog information, may produce more relevant returns than in a traditional controlled environment. In the end, neither traditional controlled-vocabulary nor folksonomic classifications are “the best.” Each is “the best” for something.
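
The tag-variant problem raised in this section (plural and singular forms, and multi-word tags such as "Santa Rosa", santa_rosa, and SantaRosa) can be partly reduced with simple normalization. The sketch below is an assumption-laden illustration rather than a method taken from the cited literature: the function name is hypothetical and the plural-folding rule is deliberately naive.

```python
import re


def normalize_tag(tag):
    """Fold common spelling variants of a tag into one canonical key."""
    # Split camelCase ("SantaRosa" -> "Santa Rosa") before lowercasing.
    tag = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", tag)
    # Treat underscores, hyphens, and whitespace runs as a single separator.
    words = re.split(r"[\s_\-]+", tag.strip().lower())
    # Naive plural folding (an assumption): "bunnies" -> "bunny", "tags" -> "tag".
    folded = []
    for word in words:
        if word.endswith("ies") and len(word) > 4:
            word = word[:-3] + "y"
        elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
            word = word[:-1]
        folded.append(word)
    return " ".join(w for w in folded if w)
```

With this sketch, "Santa Rosa", "santa_rosa", and "SantaRosa" all map to the same key, as do "bunny" and "bunnies". It still cannot tell that "rabbit" and "bunny" name the same thing, or that "Easter" is related to both; resolving synonyms and related terms requires the kind of vocabulary control that a folksonomy lacks.
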
In building taxonomies of documents, the librarian tries to capture the “author’s” intent; but in a folksonomy of the same documents you capture the “reader’s” understanding. Both are relevant and useful in modern cataloging, and even more so in relation to the open internet. Drawing a line in the sand, or developing rigid dichotomies, does nothing for science, but often acts to limit the evolution of an idea. Folksonomies are no different. There is no one best way to catalog the internet, but I do feel it should be attempted. For good or bad, the information monster that the internet has become is here to stay. I am not advocating taming the monster, but rather giving us children a vibrant trail of breadcrumbs with which we can find our way…


Works Cited

Arch, X. (2007). Creating the Academic Library Folksonomy: put social tagging to work at your institution. C&RL News. 68(2). http://www.ala.org/ala/mgrps/divs/acrl/publications/crlnews/2007/feb/libraryfolksonomy.cfm. (Accessed Apr. 1, 2008).
Bolter, D.J. (1991). Writing space: the computer, hypertext, and the history of writing. Hillsdale NJ: Lawrence Erlbaum.
Dye, J. (2006). Folksonomy: A game of high-tech (and High-Stakes) tag. EContent. 29(3): 8-43.
Etches-Johnson, A. (2006). The Brave New World of Social Bookmarking: Everything You Always Wanted to Know But Were Afraid to Ask. Feliciter no. 2. www.blogwithoutalibrary.net/talk/brave_new_world.pdf. (Accessed Apr. 1, 2008).
Fichter, D. (2006). Intranet Applications for Tagging and Folksonomies. Online. May/June. 30(3): 43-45.
Gordon, M., and Pathak, P. (1999). Finding information on the World Wide Web: The retrieval effectiveness of search engines. Information Processing and Management. 35: 141–180. http://www.cindoc.csic.es/cybermetrics/pdf/60.pdf. (Accessed Apr. 1, 2008).
Gordon-Murnane, L. (2006). Social Bookmarking, Folksonomies, and Web 2.0 Tools. Searcher. 14(6): 26–38.
Guy, M., and Tonkin, E. (2006). Folksonomies: Tidying Up Tags? D-Lib Magazine. 12(1). http://www.dlib.org/dlib/january06/guy/01guy.html. (Accessed Apr. 1, 2008).
Mann, T. (2005). Oxford guide to library research. Oxford: Oxford University Press.
Martin, J. (1982). Strategic Data Planning Methodologies. New Jersey: Prentice-Hall, Inc.
Morrison, J.P. (2008). Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web. Information Processing and Management. 44: 1562–1579.
Rosenfeld, L., and Morville, P. (2002). Information architecture for the World Wide Web. Sebastopol: O’Reilly.
Sinclair, J., and Cardew-Hall, M. (2008). The folksonomy tag cloud: when is it useful? Journal of Information Science. 34(1): 15–29.
Strout, R.F. (1956). The development of the catalog and cataloging codes. Library Quarterly. 26(4): 254-275.
Tan, W. (2001). Cataloging Websites for a Library Online Catalog. Journal of Educational Media & Library Sciences. 39(2): 98-105.
Thompson, J. (2005). Books in the Digital Age: the transformation of academic and higher education publishing in Britain and the United States. Brookshire: Polity.
Vander Wal, T. (2009). Optimizing Tagging UI for People & Search. www.vanderwal.net. http://www.vanderwal.net/random/category.php?cat=153. (Accessed Apr. 1, 2008).
Vander Wal, T. (2007). Folksonomy Coinage and Definition. www.vanderwal.net. http://vanderwal.net/folksonomy.html. (Accessed Apr. 1, 2008).
