Conditions of Trust for Completely-Remote Methods: A Proposal for Collaboration Catherine L. Smith Kent State University PO Box 5019 – Library Room 314v Kent, OH 44242
[email protected] ABSTRACT Client-side logs are an important data source for research in HCIR. While methods for privacy protection are essential, and development continues on this front, an additional challenge remains. In order to collect client-side log data, volunteers must consent to participate in research; consent requires significant trust. For completely-remote methods, where volunteers download and install logging software, it is particularly difficult to establish the trustworthiness of a study. This paper proposes collaboration in a community of practice for remote methods, with emphasis on expressing the trustworthiness of our methods in a manner that can be understood by the public.
1. INTRODUCTION
Client-side logs are an essential data source for much research in HCIR. Logs may be collected on a laboratory computer or remotely, on a research volunteer’s machine. Researchers may install logging software on volunteers’ machines, or volunteers may be required to download and install the software on their own; we term the latter case a completely-remote protocol. Remote methods are important for the observation of exploratory and complex search in situ and for longitudinal designs. Completely-remote methods are essential for studies focused on geographically dispersed populations. Although a first generation of client-side logging systems has been developed, and technological advances continue, sociological concerns remain. As a research community, we focus on the important need for robust privacy protections; however, the population at large remains skeptical about allowing researchers to collect log data. Access to willing volunteers is critical, and this requires sufficient trust among the public. As we continue to enhance privacy protections and security, we may also wish to build a community of practice that expresses the trustworthiness of our methods.
The remainder of this paper is structured as follows. We briefly review methods for log-data collection. We then discuss sociological concerns and the role of trust in relationships with research volunteers. We then propose a collaborative effort to develop a community of practice for remote methods. We conclude with an example of methods-oriented information that might be exchanged in such a collaborative.
2. BACKGROUND Log analysis is essential to our understanding of how people use the web to search for information [17]. In information retrieval research, the analysis of server-side query logs and click-through data is fundamental. These data are often supplemented with client-side browser logs, which are gathered routinely from users who install proprietary browser add-ons and opt-in to data-sharing agreements [e.g. 2].
While industry labs have access to both client-side and server-side logs, academic researchers have very limited access, generally only through collaboration with industry labs [e.g. 29, 31] or in special programs [27, 32]. The lack of accessible logs impedes progress on many research questions [1, 15]. Because industry logs are essentially unavailable, academic researchers have attempted to collect equivalent logs by other means, with little success [6]. For academics, the power of learning over immense datasets has been possible only when industry shares its data; this situation stimulates research on methods for the collection of anonymized log data [11]. On the other hand, for many research questions in HCIR, small-scale client-side logs may be sufficient [13, 21], and this motivates the development of client-side methods for work at this scale.
Client-side logging systems may collect data at various levels of a user’s operational environment. Typically, data collection occurs within a single browser application; however, systems may reach beyond the browser, conceivably recording all applications and processes active on a volunteer’s machine [18], or collecting screenshots at the application or machine level [29]. The logging function is often integrated with other components. For example, in order to control the flow of a protocol, a client-side system may monitor events in the browser. Upon detecting an event of interest, the system may initiate an action, such as opening a dialogue box to request a response from the user or starting a data upload. In observational studies, components of this type may be used to collect contextual information from volunteers, such as the description of a task [22], an evaluation of success, or a reason for switching search engines [14].
A toolbar interface is often included; it allows volunteers to control the client-side system, with actions such as switching logging on and off, setting preferences, or actively initiating a data upload [23]. Client-side components may also be integrated with a secure website to control the assignment of experimental factors [24] or a sequence of tasks [30]. A proxy server may also be used to control experimental manipulations on both inbound and outbound transactions [24]. Client-side logging software may run on a machine in a laboratory [4, 10] or it may run on a research volunteer’s own machine. In the latter case, a researcher may install the software on volunteers’ machines [20, 22, 29], or volunteers may be required to download and install the software on their own [19, 29, 30]; we refer to the second case as a completely-remote protocol. In HCIR, remote client-side methods are particularly important for the observation of exploratory and complex search in situ, and for longitudinal designs. Completely-remote methods are essential when a population of interest is geographically dispersed, and it is
infeasible for a researcher to physically install logging software on research participants’ machines.
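The architecture surveyed above — event capture in the browser, a volunteer-controlled on/off switch, and data uploads — can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions, not any of the cited systems: the `SearchLogger` class, its salted-hash privacy measure, and all names are hypothetical.

```python
import hashlib
import json
import time


class SearchLogger:
    """Minimal sketch of a client-side logging component.

    Volunteers can pause logging (a toolbar-style control), and query
    text is salted-hashed before upload as one simple privacy measure.
    """

    def __init__(self, salt: str):
        self.salt = salt          # per-volunteer salt for query hashing
        self.enabled = True       # toolbar on/off switch
        self.buffer = []          # events awaiting upload

    def toggle(self, enabled: bool) -> None:
        """Let the volunteer switch logging on or off."""
        self.enabled = enabled

    def on_event(self, kind: str, query: str = "") -> None:
        """Record a browser event; drop it silently if logging is paused."""
        if not self.enabled:
            return
        record = {"t": time.time(), "kind": kind}
        if query:
            # store only a salted hash, never the raw query text
            digest = hashlib.sha256((self.salt + query).encode()).hexdigest()
            record["q"] = digest
        self.buffer.append(record)

    def flush(self) -> str:
        """Serialize buffered events for upload and clear the buffer."""
        payload = json.dumps(self.buffer)
        self.buffer = []
        return payload
```

A real system would add consent checks, secure transport, and finer-grained controls; the point here is only that the logging, control, and upload roles described in the text are separable components.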
3. SOCIOLOGICAL CONCERNS
While there are technical challenges in developing and fielding completely-remote protocols, there are also sociological concerns. The most prominent issues pertain to risks to privacy in the collection and sharing of log data [5, 7, 11, 25]. In addition, the need to download and install logging software raises security concerns. Any protocol collecting search logs, no matter the scale, must protect privacy, and downloadable software must not create security problems; however, as researchers, we cannot guarantee the elimination of all risk.
When participating in a completely-remote study, a potential volunteer (PV) is likely to consider risks to informational privacy, “the ability to control who gathers and disseminates information about one’s self … and under what circumstances” [3, p. 134], and accessibility privacy, “cases where acquisition … of information involves gaining access to the individual” [9, p. 76], as well as risks to computer security [16]. The role of informed consent is to enable PVs to understand the risks, so that they may assess their particular vulnerabilities prior to consent. For a completely-remote study, participation is a trust-requiring situation [28]. Participation requires sufficient trust, “an attitude of confident expectation in an online situation of risk that one’s vulnerabilities will not be exploited” [8, p. 740]. Of course, trust is necessary, but not sufficient, for a decision to participate.
Online trust is a complex, multidimensional concept. We focus here on conditions of trust [26]. Factors associated with the formation of trust apply in many different dimensions of human-computer interaction, including, for example, trust developed during repeated use of a system, or cues to trustworthiness as expressed in an interface [8]. In this discussion we apply Nissenbaum’s [26] five general conditions of trust to the trust-requiring situations encountered in a completely-remote protocol.
4. TRUST AND PARTICIPATION
In applying Nissenbaum’s conditions of trust, we define the PV as trustor in relationship to the study, which we define as the trustee; that is, the study is the object of the PV’s trust. The relationship is complex, in that the study comprises the researcher (as represented in electronic communications), the logging software (which must be downloaded), and other components such as instruments and attendant systems. We summarize Nissenbaum’s [26] five general conditions of trust as: (1) the reputation of the trustee; (2) inference of the trustee’s trustworthiness based on shared personal characteristics; (3) mutuality of a common condition and the expectation of reciprocity; (4) knowledge of the trustee’s qualifications for a specific role; and (5) contextual factors such as community norms for disclosure, rewards, punishments, and values.
For a completely-remote study, unless the PV has had personal contact with the researcher, it is unlikely that trust will be predicated on shared personal characteristics. The mutuality of a common condition is absent, given that in a typical protocol the researcher assumes no risk comparable to that taken by the participant. The most salient factors may be reputation, qualifications and roles, and contexts. The promise of a monetary reward may create an expectation of reciprocity; however, the lack of mutuality may make this a weak factor.
The PV’s initial assessment of the trustworthiness of a study is likely to be based on an announcement such as a recruiting email. If the researcher is unknown to the PV, conditions of trust may depend on knowledge of the community from which the study originates. Knowledge of the researcher’s role and qualifications, as well as the reputation of the researcher’s community, may also be important.
If a PV has sufficient trust in an initial communication, the next step in participation is consent. A standard disclosure provides specific information about risks, while also conveying contextual information. It may provide information on roles and qualifications associated with the study, while expressing norms for human-subjects research in the community from which it originates. In order to consent, the PV must have sufficient trust that the disclosure conveys the risks, and that vulnerability to those risks will not be exploited.
The PV’s decision to download and install client-side software requires sufficient trust that the software will cause no harm, that systems will perform as disclosed, and that privacy will be protected. Prior to this juncture in the relationship, risk is hypothetical. At download, the risks will be encountered directly, and vulnerabilities will be exposed. Here, the lack of mutuality may be the key factor in a decision not to participate.
After downloading the software, continued participation requires sustained trust. As the study proceeds and the system is used repeatedly, it accrues a reputation, which affects trust [8]. The volunteer must continue to have sufficient trust that the system is operating as disclosed, and that privacy will not be breached. For completely-remote protocols, the absence of personal knowledge of the researcher or the researcher’s community, a lack of mutuality, and weak contextual factors create weak conditions of trust over the progression of the relationship.
5. PROPOSAL FOR COLLABORATION
Kelly, Dumais, and Pedersen [21] have called for collaborative development efforts and a “living laboratory” for the evaluation of information-seeking support systems. Bar-Ilan suggests that “the IR community set up rules for the proper conduct of research” [1, sect. 3, para. 2]. These ideas extend naturally to HCIR, and more specifically to those in the research community using completely-remote methods. In order for completely-remote studies to reach volunteers outside researcher-affiliated communities, a stronger basis for trust may be required.
We see the need for collaboration in the development of completely-remote methods. A group focused on this objective might undertake the review or development of reusable client-side data-collection systems, with attention to assessing and reducing risk to research volunteers. It might also provide information on the roles and qualifications of those involved in remote research. Such a group might provide contextual information by expressing the community’s norms, or by serving as an information resource on informational risks and privacy protections involved in completely-remote studies. Ideally, by its actions, the group would enhance a reputation of trustworthiness for the research methods within its purview. We conclude this paper with a small example of the type of methods-oriented information that might be exchanged in such a collaborative endeavor.
6. EXAMPLE: SUBJECT RECRUITING
For studies with relatively small populations of interest, concerns about participation rates are particularly acute. Few studies using completely-remote methods report the details of recruiting and participation rates. Known exceptions include the study of Guo et al. [14], which reports a 10% participation rate among Microsoft employees. Russell and Oren [29] comment on the difficulty of recruiting research volunteers and suggest best practices for completely-remote methods. We conclude this paper by describing a small exploratory study in which we examined attitudes toward a completely-remote study, and factors that may affect the decision to participate in such a study.
6.1 Participants
Sixteen students completed the test session (5 undergraduates, 5 masters students in library science, and 6 graduate students in other areas). All were registered at a large mid-western university and were recruited via departmental listservs within the university. An anonymous eligibility-screening questionnaire was used to fill quotas for each group.
6.2 Procedure
As part of a usability test for a completely-remote protocol, we sought to learn about attitudes toward the protocol and factors likely to affect a decision to participate. The protocol involved downloading and installing logging software, and using a password-secured website to complete four assigned searches. We conducted the test by bringing volunteers into the lab, where we observed them completing what we termed a mock-study, in which they stepped through the remote protocol using our machines as if they were their own. At various points during the test session, we collected data on attitudes. Due to limited space, we report here on three data points: two collected before, and one collected after, the mock-study.
(Q1) Response to protocol description (questionnaire A). At the beginning of the test session volunteers completed a questionnaire. The concluding question described a completely-remote study, and mentioned a payment of $60. The question then asked: “How likely are you to volunteer for this study if you learn about it in an email from:…” A list of eight entities followed (the name of the volunteer’s affiliated university; your best friend; Harvard University; Google; Microsoft; a friend on Facebook; a U.S. government agency; someone you don’t know). The order of the entities was randomized. Volunteers answered by sliding markers along a continuous horizontal rating scale for each entity. The scales ranged from -10 (highly unlikely) to +10 (highly likely), and were initialized at a central, neutral zero.
(Q2) Response to recruiting email (verbal exchange). As an introduction to the mock-study, volunteers read a one-page recruiting email, which contained detailed information about the completely-remote study, including mention of a $40 payment. Before reading the letter, volunteers were asked to pretend the email had been sent from within their university to their university email address. Once the volunteer finished reading the email, the researcher asked, “How would you respond to this email if you received it in your university email inbox?” and “Why would you respond that way?” Responses were audiotaped and the recordings were later transcribed and hand-coded.
(Q3) Response to remote vs. lab protocol (questionnaire B). After completing the mock-study, volunteers completed a final questionnaire, which began: “Thank you for testing the [protocol]. Thinking about the work, time, and risk involved in participating, please answer the following questions: How likely would you be to participate in the study if you had to… (1) install the add-on software on your own personal computer? (2) visit an office on the [local] campus, as you did today?” Volunteers responded using a seven-point Likert scale ranging from Very Unlikely to Very Likely. We also asked about expected payment, but we omit these questions due to space constraints.
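As a sketch of how an instrument like Q1 might be implemented, the fragment below randomizes the presentation order of the eight invitation sources and initializes each continuous -10..+10 scale at a neutral zero. The function names and data structures are hypothetical illustrations, not the instrument actually used.

```python
import random

# The eight invitation sources from Q1; presented in random order,
# each with a slider initialized at 0 on a continuous -10..+10 scale.
SOURCES = [
    "your affiliated university",
    "your best friend",
    "Harvard University",
    "Google",
    "Microsoft",
    "a friend on Facebook",
    "a U.S. government agency",
    "someone you don't know",
]


def build_q1(rng: random.Random):
    """Return (per-volunteer presentation order, initial slider positions)."""
    order = SOURCES[:]
    rng.shuffle(order)                 # randomize order per volunteer
    ratings = {s: 0.0 for s in order}  # sliders start at a neutral zero
    return order, ratings


def record_rating(ratings, source, value):
    """Store a slider position, enforcing the -10..+10 range."""
    if not -10.0 <= value <= 10.0:
        raise ValueError("rating must lie in [-10, +10]")
    ratings[source] = value
```

Seeding the generator per volunteer makes the randomization reproducible for audit, a property worth having in any remotely administered instrument.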
Figure 1. Likelihood of participating, by invitation source.
6.3 Analysis and Results
In response to Q1, “How likely are you to volunteer for this study if you learn about it in an email from:”, volunteers expressed a clear bias toward participating if the invitation came from within their affiliated university (see Figure 1). Volunteers also had a clear bias against participating if the invitation came from someone they didn’t know. With the exception of an invitation from a best friend, on average, all other sources evoked responses around a neutral zone.
Verbal responses to the recruiting email (Q2) fit four basic categories. Two volunteers stated they would not participate. Four were undecided. Five said they were interested or would take action to learn more, and five said they would sign up or participate. Volunteers who said they would participate were most likely to cite interest in the topic and the $40 payment as their reasons. Among the ten volunteers who said they would participate or learn more, university affiliation was mentioned four times, and the credibility of university email was mentioned three times. Three of the six who were undecided or indicated they would not participate cited concerns about downloading software and “tracking.”

Table 1. Likelihood of Remote/Lab Participation (Q3)

vol  (Q2)    remote  lab   |  vol  (Q2)   remote  lab
 1   n-part  --      +++   |   9   learn  -       +
 2   n-part  +       +++   |  10   learn  +       ++
 3   undec   -o-     ++    |  11   learn  +++     +++
 4   undec   -       ++    |  12   part   --      +
 5   undec   -             |  13   part   +       +++
 6   undec   ++      +++   |  14   part   +       +++
 7   learn   +       ++    |  15   part   ++      ++
 8   learn   +             |  16   part   ++      +++

n-part: will not participate; undec: undecided; learn: will take action to learn more; part: will participate. Ratings: - - - very unlikely to participate; - - unlikely; - somewhat unlikely; -o- undecided; + somewhat likely; ++ likely; +++ very likely
The final questionnaire (Q3) captured changes in attitude after the mock-study, as well as differences in attitudes toward remote and laboratory protocols (see Table 1). Not one volunteer preferred the remote protocol over the option to participate in a lab. Among the nine volunteers who were initially (in response to Q2) undecided or who wanted to learn more, after completing the mock-study, seven were somewhat likely to very likely to participate in the lab, while only four were somewhat likely to very likely to participate in the remote protocol. Four of the five volunteers who had an initial inclination to participate remained likely or somewhat likely to do so.
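A tally of responses like those in Table 1 can be computed mechanically. The sketch below assumes a hypothetical coding of the seven-point scale as integers -3 (“very unlikely”) through +3 (“very likely”), with None for a missing cell; the function name and coding are ours, not part of the study instruments.

```python
from collections import Counter


def tally(responses):
    """Summarize (q2_category, remote_rating, lab_rating) triples.

    Ratings are coded -3 ("very unlikely") .. +3 ("very likely"),
    or None where a cell is missing.
    """
    by_category = Counter(cat for cat, _, _ in responses)
    # "somewhat likely" (+1) or better counts as willing to participate
    willing_remote = sum(1 for _, r, _ in responses if r is not None and r >= 1)
    willing_lab = sum(1 for _, _, l in responses if l is not None and l >= 1)
    # volunteers who rated the remote protocol strictly above the lab one
    prefer_remote = sum(
        1 for _, r, l in responses
        if r is not None and l is not None and r > l
    )
    return by_category, willing_remote, willing_lab, prefer_remote
```

Run over the full table, `prefer_remote` would come out zero, matching the observation above that no volunteer preferred the remote protocol to the lab.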
6.4 Discussion and Future Work
An important limitation of this study is the effect of the researcher, who conveyed encouragement throughout the usability test. It is possible that this encouragement inflated the stated likelihood of participation in the remote protocol. If we treat the above results conservatively, we may assume only volunteers 15 and 16 (see Table 1, above) would be likely to participate in the remote protocol, as both were initially positively disposed, and both remained likely to participate after experiencing the mock-study. A participation rate of 2/16 is consistent with a study done within Microsoft, which reports a 10% participation rate among employees [14]. Clearly, recruiting a sufficient sample is challenging, even within a researcher-affiliated community. In addition, the relative rarity of people willing to volunteer raises questions about the characteristics of this population, and any biases this may produce in research results. Our results also echo the discussion of Russell and Oren [29], who found that privacy concerns made volunteer recruiting challenging. They recommend providing PVs with clear, thorough, and detailed information about the logging system, privacy protections, and the data to be collected.
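Participation rates in this range have a direct planning consequence: dividing a target sample size by the expected participation rate estimates how many PVs must be reached. The helper below is a hypothetical back-of-the-envelope sketch, not a method from the study.

```python
import math


def invitations_needed(target_n: int, participation_rate: float) -> int:
    """Rough estimate of how many PVs must be reached to obtain
    target_n volunteers, given an expected participation rate."""
    if not 0 < participation_rate <= 1:
        raise ValueError("participation rate must be in (0, 1]")
    return math.ceil(target_n / participation_rate)
```

At the conservative 2/16 (12.5%) rate above, recruiting 30 volunteers would mean reaching roughly 240 PVs, which underscores how costly completely-remote recruiting can be outside researcher-affiliated communities.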
In future work we plan to investigate differences in the characteristics of those who volunteer for remote studies, and those who refuse to participate. In particular, we are interested in how models of privacy concern and online trust might reveal and explain factors that affect participation.
7. ACKNOWLEDGMENTS
This work is supported in part by a Google Faculty Research Award. Thanks to Guan Wang for software development, to Melanie Kowolski and Merriam Kahn for testing, and to our sixteen hardworking research volunteers.
8. REFERENCES
[1] Bar-Ilan, J. (2007). Position paper: Access to query logs – an academic researcher's point of view. WWW.
[2] Bilenko, M. and White, R. W. (2008). Mining the search trails of surfing crowds: identifying relevant websites from user activity. WWW.
[3] Burgoon, J. K., Parrott, R., LePoire, B. A., Kelley, D. L., Walther, J. B., and Perry, D. (1989). Maintaining and restoring privacy through communication in different types of relationship. Journal of Social and Personal Relationships, 6, 131–158.
[4] Capra, R. (2009). HCI Browser: a tool for studying web search behavior. ASIS&T Annual Meeting.
[5] Clough, P. and Berendt, B. (2009). Report on the TrebleCLEF query log analysis workshop 2009. SIGIR Forum.
[6] Community Query Log Project Results (2010). http://lemurstudy.cs.umass.edu/
[7] Cooper, A. (2008). A survey of query log privacy-enhancing techniques from a policy perspective. ACM Transactions on the Web, 2(4), 1–27.
[8] Corritore, C. L., Kracher, B., and Wiedenbeck, S. (2003). On-line trust: concepts, evolving themes, a model. International Journal of Human-Computer Studies, 58(6), 737–758.
[9] DeCew, J. (1997). In pursuit of privacy: Law, ethics, and the rise of technology. Ithaca, NY: Cornell University Press.
[10] Feild, H. A., Allan, J., and Jones, R. (2010). Predicting searcher frustration. SIGIR.
[11] Feild, H. A., Allan, J., and Glatt, J. (2011). CrowdLogging: distributed, private, and anonymous search logging. SIGIR.
[12] Friedman, B., Kahn, P. H., and Howe, D. C. (2000). Trust online. Communications of the ACM, 43(12), 34–40.
[13] Grimes, C., Tang, D., and Russell, D. M. (2007). Query logs alone are not enough. Workshop on Query Log Analysis at WWW.
[14] Guo, Q., White, R. W., Zhang, Y., Anderson, B., and Dumais, S. (2011). Why searchers switch: understanding and predicting engine switching rationales. SIGIR.
[15] Hearst, M. A. (2009). Search User Interfaces. Cambridge University Press.
[16] Huang, D.-L., Rau, P.-L. P., and Salvendy, G. (2010). Perception of information security. Behaviour & Information Technology, 29(3), 221–232.
[17] Jansen, B. J. (2006). Search log analysis: What it is, what's been done, how to do it. Library & Information Science Research, 28(3), 407–432.
[18] Jansen, B. J., Ramadoss, R., Zhang, M., and Zang, N. (2006). Wrapper: An application for evaluating exploratory searching outside of the lab. SIGIR.
[19] Jones, T., Hawking, D., and Sankaranarayana, R. (2010). Live web search experiments for the rest of us. WWW.
[20] Kelly, D. (2006). Measuring online information-seeking context, part 1. Journal of the American Society for Information Science & Technology, 57(13), 1729–1739.
[21] Kelly, D., Dumais, S. T., and Pedersen, J. (2009). Evaluation challenges and directions for information-seeking support systems. Computer.
[22] Kellar, M., Watters, C., and Shepherd, M. (2007). A field study characterizing web-based information-seeking tasks. Journal of the American Society for Information Science & Technology, 58(7), 999–1018.
[23] Lemur Query Log Toolbar. http://www.lemurproject.org/querylogtoolbar/
[24] Matthijs, N. and Radlinski, F. (2011). Personalizing web search using long term browsing history. WSDM.
[25] Murray, G. C. and Teevan, J. (2007). Query log analysis: social and technological challenges. SIGIR Forum.
[26] Nissenbaum, H. (2001). Securing trust online: Wisdom or oxymoron? Boston University Law Review, 81(3), 635–664.
[27] Pushing the Boundaries of Search City (2006). http://www.microsoft.com/presspass/features/2006/may06/0531LiveLabs.mspx
[28] Riegelsberger, J., Sasse, M. A., and McCarthy, J. D. (2005). The mechanics of trust: A framework for research and design. International Journal of Human-Computer Studies, 62(3), 381–422.
[29] Russell, D. M. and Oren, M. (2009). Retrospective cued recall: A method for accurately recalling previous user behaviors. HICSS.
[30] Singer, G., Norbisrath, U., Vainikko, E., Kikkas, H., and Lewandowski, D. (2011). Search-logger: analyzing exploratory search tasks. SAC.
[31] Singla, A., White, R. W., and Huang, J. (2010). Studying trailfinding algorithms for enhanced web search. SIGIR.
[32] The Shared Dataset (2009). http://research.microsoft.com/en-us/um/people/nickcr/wscd09/
[33] Toms, E. G., Freund, L., and Li, C. (2004). WiIRE: the Web interactive information retrieval experimentation system prototype. Information Processing and Management, 40(4), 655–675.
[34] White, R. W. and Drucker, S. M. (2007). Investigating behavioral variability in web search. WWW.