You Are What You Use: An Initial Study of Authenticating Mobile Users via Application Usage

Jonathan Voris
Department of Computer Science, New York Institute of Technology
[email protected]

Yingbo Song
Allure Security Technology
[email protected]

Malek Ben Salem
Accenture Technology Labs
[email protected]

Salvatore Stolfo
Allure Security Technology
[email protected]

ABSTRACT

Mobile smartphone devices are vulnerable to masquerade attacks because they can be easily lost or stolen. This paper introduces a technique for detecting unauthorized users by modeling the legitimate user's typical behavior when using their mobile phone. The user's behavior model augments typical authentication mechanisms (such as PINs or fingerprints) to provide continuous authentication while a device is in use. A preliminary human user study was conducted in order to assess the viability of our application-usage-oriented authentication approach. The results of our initial experiment demonstrate that our system is capable of detecting an unauthorized user within 2 minutes.

CCS Concepts •Security and privacy → Mobile platform security;

1. INTRODUCTION

Much research has focused on methods for preventing unauthorized access to information. Yet despite the development of various access control mechanisms and the body of work which exists on authentication, a masquerader problem still exists: credentials are easily stolen. This issue is exacerbated in a mobile setting, where devices containing sensitive information are more easily misplaced or stolen. Further, mobile screen unlock passwords are easier to guess [3]. Over a billion dollars of mobile devices are stolen each year [15], which underscores the need to provide users with strong authentication mechanisms to protect their mobile data. However, there are usability challenges associated with authenticating users in a mobile context. As a result, many users do not utilize the authentication mechanisms which are currently available to them. A recent study revealed that 47% of smartphone users do not control access to their device with an authentication mechanism [4].

The goal of our work is to enhance existing standard methods for authenticating users on mobile devices, such as text or graphical passwords, with behavioral models by periodically validating user identity. This validation is performed by comparing the latest set of device interactions with a model calculated from previously observed interactions. The core advantage of this approach is the ability to detect attacks which take place after traditional security mechanisms have been bypassed, and to reduce an attacker's window of opportunity by shortening time to detection. The proposed technique is also easily composable with other security mechanisms, such as using decoy applications as behavioral sensors [11]. As the results of our study indicate, mobile usage patterns differ between users to such an extent that it is highly unlikely that two distinct individuals with similar habits could masquerade as each other undetected.

In order to gain an initial sense of the viability of authentication based on mobile device user behavior, we developed monitoring software for Android devices which collects a variety of information on phone activity, such as application usage, phone calls, and SMS activity. A preliminary IRB-approved study was conducted, in which 50 participants installed our software on their personal devices in order to monitor how they interacted with their devices. Observations from this study allowed us to construct an active authentication prototype which forces users to re-authenticate to their devices if their observed usage deviates too far from their established pattern of behavior. Our approach can also be applied to track different users of a device.

The rest of this paper is organized as follows. Section 2 summarizes pertinent related work. Section 3 describes our threat model. Section 4 covers the implementation of the active authentication prototype software we developed. Section 5 discusses the human subject study of our proposed model-based authentication approach. Section 6 describes our behavior model design. Section 7 covers the results of our user study. Lastly, Section 8 presents our ideas for investigating this topic further.

2. RELATED WORK

2.1 Mobile Authentication

Mobile device authentication is a well-studied topic with research that dates back to 2002 [14]. Many authentication mechanisms which were designed with desktop systems in mind can be translated to a mobile context, although this may result in poor usability due to the relatively limited display and input capabilities available on these platforms. Passwords, for example, can be difficult to enter on devices with small touchscreens [5]. Though traditional biometrics based on physical user properties, such as faces or fingerprints, have long been deployed on mobile systems, they have generally served as "point-of-entry" authentication mechanisms, i.e., they authenticate mobile users only at the beginning of a session. Furthermore, these physical biometric techniques have been shown to be susceptible to circumvention [19, 8]. Graphical passwords [20], which replace text strings with patterns or images drawn or selected by the user, seem particularly well suited to mobile authentication scenarios. Unfortunately, due to user selection biases, the secrets that users share when using graphical techniques are frequently lower in entropy than anticipated [18].

2.2 Behavioral Biometrics

As an alternative to physical biometrics, research has focused on analyzing how users interact with a system in order to distinguish them from potential adversaries [16]. Both physical and behavioral biometrics involve deriving features from user characteristics which can be compared against previously observed feature sets. Since behavioral biometrics are derived from the very act of using the device they protect, they do not require any additional hardware to function. Further, this allows authentication to be performed continuously and transparently throughout a session rather than authenticating a user once when his or her session begins. Examples of behavioral biometric mechanisms include touchscreen usage [1, 7, 6, 17], gait recognition, handwriting recognition, and linguistic analysis, typically referred to as stylometry [12, 13]. These techniques have their own limitations: for instance, stylometry is better suited to continuous authentication on laptops and desktops than on small mobile devices, while gait and handwriting recognition only work while the mobile device user is engaged in a specific activity (walking or writing by hand).

2.3 Modeling Users Based on OS Interactions

The approach that we present is based on previous research investigating authentication by modeling a user's interactions with the OS on desktop and laptop computers [10]. This approach was found to be viable in a desktop setting; the modeling approach was capable of detecting an attack with 95% accuracy within 15 minutes with at most one false positive per 40-hour work week [9]. This detection performance was achieved in part by analyzing the manner in which users accessed files. In a mobile setting, however, file systems are less robust and are often less accessible to end users. As a result, our current research relies on an application-centric rather than file-centric approach to user modeling.

To augment the discriminative power of application use modeling, applications which appear to be authentic but are in fact spurious may be deployed on a system as "tripwires" to detect attacks when accessed. An example of such a countermeasure would be installing banking applications with which a user does not have an account and monitoring these applications for access events. The true owner of the device knows which banking institutions he or she does business with and will seldom access these applications as a result. An attacker with no insight into the real owner's banking habits would have a more difficult time making such a determination. The idea of using "decoy applications" has previously been suggested [11]; assessing the efficacy of an authentication solution which combines decoy applications with the modeling approach described in this paper is an area of future work.

3. THREAT MODEL

The authentication solution presented in this paper is designed for scenarios where a mobile device is acquired by an attacker who is able to bypass any required authentication challenges and present himself or herself to the device as though he or she were a legitimate user. It is also meant to address "lunchtime" attacks, in which an adversary picks up a device which has been left unlocked. In this paper, we refer to this type of malicious activity as a "masquerade attack." Examples of masquerade attacks on mobile devices include situations in which a device is lost or stolen without proper hardware security (such as tamper-resistant mechanisms) in place to prevent adversaries from using debug interfaces to bypass operating-system-level security measures. Assuming an attacker can bypass the device's initial authentication challenge, our solution is intended to detect that such an attack has taken place by noticing ways in which the attacker's actions deviate from how the legitimate owner of the device would utilize it. Whether the unexpected user is a malicious actor, a "curious" friend, or a jilted lover, our modeling technique is intended to detect differences between the ways in which various people utilize their mobile devices.

4. MOBILE SENSOR ARCHITECTURE

We present a mobile sensor for the Android platform to implement our authentication approach. The sensor consists of three core components, as depicted in Figure 1: an Activity Collector, an Identity Engine, and a Verifier. The Activity Collector senses Android events that are triggered by user actions, particularly those related to application usage. This component is also responsible for filtering out Android log events which are not the result of user-initiated activity. The Activity Collector stores behavioral data in a local database which is sandboxed with the sensor. The Identity Engine reads activity data from this table and uses it to incrementally construct user models.

The Activity Collector logs data items that can serve as discriminative features for profiling unique user behavior on mobile devices, namely: application usage, application installation and removal, GPS location data, phone calls, text messages, and contact list access. Note that no content from any of these sources was collected during our experiment. That is, we collected the times at which phone calls and text messages were made or sent, but the sensor does not monitor the content of those calls and messages. Similarly, the sensor records which applications were running on a device at any given time, but does not gather any details about how an application was being used.
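As an illustrative sketch only, the Activity Collector's sandboxed local store can be modeled as a small event table with a filter for non-user-initiated events. The event-type names and schema below are hypothetical, not taken from the actual sensor implementation:

```python
import sqlite3
import time

# Hypothetical set of user-initiated event kinds; the real sensor's
# event taxonomy is described in Table 1.
USER_EVENT_TYPES = {"app_start", "app_stop", "call_out", "sms_out",
                    "gps_sample", "contact_access", "pkg_change"}

def init_store(path=":memory:"):
    """Create the sandboxed local event table."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS events "
               "(ts REAL, kind TEXT, detail TEXT)")
    return db

def record_event(db, kind, detail=""):
    """Store a user-initiated event; system-generated log noise is filtered."""
    if kind not in USER_EVENT_TYPES:
        return False  # e.g. background service events from the Android log
    db.execute("INSERT INTO events VALUES (?, ?, ?)",
               (time.time(), kind, detail))
    return True
```

Note that only event metadata (time stamp, kind, identifier) is stored, mirroring the paper's content-free collection policy.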

Figure 1: Architecture of the Continuous Authentication Engine.

The Identity Engine adjusts its model of the user while the sensor is running, continuously updating the model as new activity data is generated. During this process, older data is expunged, both to keep the model fresh and to enforce a low memory and power consumption footprint. The frequency of modeling is adjustable based on operational requirements. The Identity Engine also detects abnormal behaviors not included in the training data by using the results of re-authentication to "label" data as a true or false positive. If a user successfully re-authenticates after suspicious activity is detected, the questionable session is marked as a false positive and the new activity which triggered the alert is incorporated into the model of the user's regular behavior. If re-authentication fails, the alert is confirmed and the user's model is not updated to include the new behavior.

The Activity Collector and Identity Engine components of our mobile sensor were built to be extensible in order to permit arbitrary user data features to be analyzed by various machine learning techniques. This provided flexibility regarding the modalities and algorithms which our solution was capable of supporting. A single machine learning algorithm was selected for use in our prototype based upon the results of our user study, which are described in Section 7. Our final component, the Verifier, presents an authentication challenge whenever the Identity Engine observes behavior that is abnormal. This feature was implemented using the native Android screen locking functionality, and will be enhanced in our future work to use configurable challenge question sets.

5. USER STUDY

An initial IRB-approved user study was conducted among a population of IT professionals. The main objectives of the user study were: (1) to collect user behavior data in order to build unique personalized user behavior models, and (2) to identify the most discriminative features for modeling based on real usage data.

5.1 Study Design

Volunteers were solicited from a pool of 1000 IT professionals working for the same corporation. A total of 50 employees with similar roles (i.e., the same job function, but possibly different seniority levels) ultimately participated in the user study. Each participant was provided with instructions on how to download and install the Android sensor app on their personal device. The sensor uploads a ZIP archive to a central data collection server every hour over a WiFi network. Each participant ran the sensor for an average of four weeks on his or her mobile device. Table 2 shows summary statistics of the captured dataset.

5.2 Study Sensor Modifications

The focus of the user study was on gathering data for offline analysis and model construction rather than actively authenticating users. To meet this need, the sensor was modified to minimize CPU usage and battery consumption. Any application components which were not needed for the study were removed, leaving a smaller sensor consisting of an Activity Collector which outputs directly to logs and the application's GUI; the Verifier and Identity Engine components were both disabled. The study sensor app monitors and records volunteers' device usage behaviors and uploads the collected data to a study server periodically when WiFi is available. If no WiFi connection is available at the scheduled upload time, the sensor logs are stored locally on the device until a connection becomes available. The sensor app comprises different loggers which monitor different aspects of device usage. Table 1 lists the loggers used by the sensor app and their descriptions, along with each logger's default logging interval.

6. USER MODEL DESIGN AND EVALUATION

Our fundamental hypothesis is that individual users interact with their mobile devices in different and statistically discernible ways, and that by leveraging certain statistical measurements of such interaction it is possible to identify foreign access via anomaly detection. The user study described in Section 5 was designed to provide data to refine the authentication system as well as to validate our methods. The study yielded data from 50 different users, drawn from a pool of IT professionals at one company. These users exhibited a range of behavior characteristics, each differing from one another to varying degrees. For example, certain users used their phones more frequently, while others moved around more frequently.

Table 2 shows a summary of the data collected from this user study. Each row represents a particular study participant.

• The Days column specifies the number of days during the study period on which a user's sensor uploaded data.
• The Hours column specifies how many hours of data were collected for a particular volunteer.
• The Apps column shows how many applications were started and terminated on a user's device over the course of the study.
• The Contact List column shows how many times a participant accessed his or her contact list during the study.
• The GPS column contains the number of GPS samples collected by the sensor for the duration of the experiment.
• The Phone column displays how many phone calls the sensor recorded over the course of the user study.

Activity Logger  | Description                                             | Logging Frequency (msec)
AppMonitor       | Monitors apps that are on the top of the process stack. | 5000
PhoneLogReceiver | Monitors call logs for phone dials.                     | 3000000
GPSLogMonitor    | Monitors GPS coordinates.                               | 5000
SMSIn            | Monitors incoming SMS messages.                         | 3000000
SMSOut           | Monitors outgoing SMS messages.                         | 3000000
AppInstall       | Monitors application installation and uninstallation.   | 3000000
ScreenLock       | Monitors screen lock / unlock events.                   | 3000000
ContactService   | Monitors whether the user has inserted/viewed contacts. | 3000000

Table 1: List of Sensor Activity Loggers

User | Days | Hours    | Apps       | Contact List | GPS     | Phone | SMS In | SMS Out | Screen | Pkg
1    | 6    | 11:20:00 | 2,718,911  | 3,728        | 2,051   | 12    | 31     | 19      | 1,075  | 105
2    | 22   | 20:12:00 | 3,490,217  | 143          | 7,940   | 107   | 77     | 13      | 4,131  | 106
3    | 24   | 0:41:00  | 15,425,996 | 2,861        | 13,938  | 151   | 103    | 28      | 7,275  | 1,009
4    | 3    | 3:19:00  | 3,767,563  | 83           | 1,870   | 10    | 4      | 0       | 980    | 93
5    | 27   | 17:44:00 | 9,415,586  | 412          | 16,235  | 249   | 34     | 29      | 8,473  | 229
6    | 20   | 5:49:00  | 3,142,314  | 10           | 3,506   | 0     | 0      | 0       | 1,834  | 103
7    | 27   | 12:00:00 | 5,628,189  | 2,353        | 7,082   | 368   | 185    | 4       | 4,276  | 689
8    | 10   | 11:06:00 | 9,311,562  | 119          | 6,255   | 53    | 812    | 166     | 3,255  | 198
9    | 90   | 0:07:00  | 9,793,582  | 16,261       | 8,840   | 197   | 0      | 0       | 4,603  | 160
10   | 46   | 21:27:00 | 14,548,709 | 717          | 8,965   | 176   | 478    | 384     | 4,684  | 692
11   | 14   | 21:13:00 | 4,659,372  | 72           | 2,871   | 7     | 42     | 25      | 1,494  | 245
12   | 27   | 13:28:00 | 35,406,045 | 131          | 16,170  | 27    | 516    | 349     | 8,439  | 564
13   | 26   | 22:09:00 | 27,127,850 | 1,081        | 15,252  | 369   | 474    | 380     | 7,997  | 490
14   | 16   | 21:31:00 | 7,335,354  | 863          | 7,829   | 109   | 1,359  | 907     | 4,085  | 540
15   | 24   | 19:32:00 | 5,216,493  | 77           | 12,157  | 0     | 0      | 0       | 6,332  | 241
16   | 24   | 1:54:00  | 17,703,599 | 4,189        | 13,290  | 265   | 322    | 268     | 6,933  | 754
17   | 3    | 10:38:00 | 2,739,994  | 103          | 2,029   | 26    | 11     | 11      | 1,055  | 26
18   | 19   | 21:54:00 | 12,941,415 | 318          | 8,095   | 34    | 164    | 139     | 4,219  | 154
19   | 8    | 21:22:00 | 8,623,558  | 131          | 5,239   | 49    | 29     | 27      | 2,725  | 21
20   | 24   | 6:08:00  | 9,884,105  | 518          | 14,189  | 376   | 221    | 31      | 7,546  | 808
21   | 27   | 13:41:00 | 14,385,298 | 99           | 15,899  | 7     | 21     | 10      | 8,422  | 106
22   | 26   | 6:33:00  | 23,098,123 | 835          | 14,694  | 480   | 243    | 76      | 7,872  | 319
23   | 5    | 9:09:00  | 2,174,814  | 103          | 3,088   | 55    | 6      | 0       | 1,628  | 12
24   | 16   | 6:00:00  | 5,402,378  | 0            | 4,690   | 6     | 6      | 0       | 2,463  | 208
25   | 24   | 0:59:00  | 19,911,837 | 493          | 14,427  | 220   | 161    | 28      | 7,501  | 197
26   | 21   | 2:42:00  | 10,352,455 | 1,160        | 12,126  | 80    | 38     | 21      | 6,344  | 197
27   | 14   | 6:27:00  | 1,705,646  | 168          | 1,429   | 75    | 97     | 76      | 751    | 200
28   | 24   | 8:01:00  | 21,069,378 | 7,447        | 11,176  | 71    | 205    | 166     | 5,818  | 476
29   | 19   | 18:39:00 | 3,563,176  | 61           | 2,402   | 9     | 38     | 16      | 1,252  | 67
30   | 21   | 3:50:00  | 4,384,186  | 337          | 3,442   | 85    | 163    | 135     | 1,789  | 115
31   | 3    | 18:44:00 | 1,300,835  | 503          | 2,270   | 146   | 10     | 8       | 1,182  | 17
32   | 6    | 9:05:00  | 2,108,198  | 130          | 3,792   | 77    | 23     | 12      | 1,974  | 114
33   | 6    | 9:21:00  | 5,838,549  | 178          | 3,777   | 57    | 30     | 24      | 1,968  | 27
34   | 22   | 5:50:00  | 10,327,190 | 448          | 13,209  | 118   | 77     | 75      | 6,903  | 996
35   | 19   | 19:01:00 | 10,606,083 | 4,891        | 11,436  | 145   | 119    | 116     | 6,007  | 32
36   | 14   | 18:12:00 | 9,454,240  | 257          | 8,631   | 87    | 80     | 80      | 4,495  | 82
37   | 17   | 2:21:00  | 4,191,359  | 344          | 4,684   | 150   | 81     | 6       | 2,439  | 95
38   | 24   | 5:36:00  | 7,242,166  | 480          | 10,522  | 204   | 137    | 4       | 5,559  | 363
39   | 97   | 2:12:00  | 11,180,641 | 12           | 242,826 | 0     | 0      | 0       | 4,615  | 2
40   | 27   | 18:10:00 | 13,494,011 | 2,070        | 13,351  | 130   | 424    | 297     | 6,952  | 968
41   | 27   | 1:49:00  | 7,151,134  | 1,708        | 16,163  | 454   | 215    | 66      | 8,411  | 446
42   | 29   | 4:23:00  | 17,107,484 | 497          | 15,683  | 144   | 26     | 19      | 8,173  | 616
43   | 24   | 6:40:00  | 31,312,912 | 2,641        | 14,435  | 93    | 428    | 223     | 7,523  | 248
44   | 27   | 1:47:00  | 23,612,473 | 3,967        | 9,488   | 58    | 728    | 699     | 4,976  | 354
45   | 12   | 2:58:00  | 15,124,051 | 347          | 5,797   | 105   | 216    | 202     | 3,017  | 64
46   | 22   | 8:53:00  | 2,891,920  | 254          | 5,545   | 30    | 8      | 0       | 2,884  | 221
47   | 3    | 6:28:00  | 853,956    | 3            | 1,182   | 0     | 0      | 0       | 733    | 94
48   | 27   | 21:22:00 | 27,009,815 | 572          | 16,639  | 175   | 590    | 568     | 8,657  | 791
49   | 23   | 7:13:00  | 6,811,432  | 2,506        | 9,431   | 368   | 252    | 28      | 4,908  | 84
50   | 0    | 21:56:00 | 787,440    | 5            | 550     | 0     | 11     | 2       | 289    | 8

Table 2: Summary statistics for the user study. Each row shows the aggregate result for one user. The numbers represent the number of log entries collected for that particular measurement.

• The SMS In and SMS Out columns show how many text messages were received and sent, respectively, as observed by the sensor.
• The Screen column counts screen lock and unlock events that occurred while the sensor was running during the user experiment.
• The Pkg column shows how many applications were installed and deleted while the study took place.

6.1 Data Normalization

The raw data collected needed to be transformed into a normalized form in order to facilitate analysis. This step included the following changes to each of the different data components:

Application data: The sensor captured a snapshot of all running processes at each polling interval. This redundant information is trimmed to reduce storage and processing requirements. For each captured set of running apps, the difference with the previous set is taken to find the set of newly opened apps, which may be empty. Android background processes are removed from the data, as these entries reflect system behavior rather than user behavior. The total number of distinct processes is counted. Finally, the user's GPS location is estimated and associated with this entry, as described later in this section.

Phone and SMS data: These logs capture information for both incoming and outgoing calls and text messages. For each phone- and SMS-related entry, we use post-processing to append the approximated GPS location of the user at the time the event occurred. From our capture, we use only the outgoing call and SMS events, as these are user-initiated and reflect the user's intent.

Contact list access, Screen, and Pkg installation: These entries describe when a user has accessed the contact list, locked the screen, or installed a new app. For each entry we associate a GPS position in the same manner as previously described.

Our sensor learns to associate events with time and location. However, the sensor collects the measurements described in Table 2 independently in order to optimize efficiency. For example, application-related data and GPS data are captured at different intervals. The association between user event and user location is therefore done in post-processing. For each event (app use, phone call, etc.), the user's location is estimated via predictive sampling. This is done as follows: let H represent a hashmap and t_1, t_2, ..., t_N represent the time stamps for all N available GPS entries. GPS coordinates are truncated to 4 decimal places (11-meter granularity). Let l_j^{t_i} denote the j-th distinct GPS location entry taken at time t_i. The time-location association hashmap is then defined as:

H[t_i] = \{ l_j^{t_i} \;\forall j \}.

In this hashmap, each time stamp points to a non-unique set of GPS location estimates. The truncated entries overlap to form a probability distribution, where more entries at a location indicate a higher probability that the user was at that location. We then use a very simple Monte Carlo method to obtain the estimate:

k = \mathrm{rand}(0, |H[t_i]|)   (1)
\hat{l}^{t_i} = H[t_i][k].      (2)

Figure 2: Graphical dependency model for user behavior.

These operations are post-processing tasks that align our data into a consistent form to be used by the behavior modeling component of our sensor.
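The predictive-sampling association can be sketched in a few lines. This is a minimal illustration of the hashmap H and the Monte Carlo draw of Equations (1)-(2), not the sensor's actual code (and it uses rounding where the paper specifies truncation):

```python
import random
from collections import defaultdict

def build_time_location_map(gps_entries):
    """H[t] -> list of (lat, lon) pairs observed at time stamp t.
    Coordinates are reduced to 4 decimal places (~11 m granularity);
    repeated entries overlap, so frequent locations are drawn more often."""
    H = defaultdict(list)
    for t, lat, lon in gps_entries:
        H[t].append((round(lat, 4), round(lon, 4)))
    return H

def estimate_location(H, t, rng=random):
    """Monte Carlo estimate per Equations (1)-(2): draw a random index k
    and return the k-th stored location for time stamp t."""
    entries = H[t]
    k = rng.randrange(len(entries))  # Eq. (1)
    return entries[k]                # Eq. (2)
```

Because duplicates are kept in the list, sampling uniformly over indices is equivalent to sampling locations in proportion to how often they were observed.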

6.2 Behavior Modeling

This section describes how our system can be used to detect masquerade attacks through abnormal application usage patterns. In essence, sufficient deviations from previously observed patterns of behavior are considered evidence of a potential masquerade attack, which results in the current user being presented with a re-authentication challenge.

Our user behavior model is a probability-based method in which different device events are considered instantiations of random variables. Our assumption is that these variables interact according to the graphical dependency structure shown in Figure 2, which indicates that app usage behavior is assumed to be dependent on time and on geographical position. That is, a user's behavior is determined based on when and where he or she may be. The directional arrows indicate causal dependency. For example, application usage, outgoing phone call patterns, and outgoing SMS patterns are described as dependent on time and location and independent of each other; that is to say, the time of day, and likewise the location of the user, directly influences what that user is doing with his or her phone. Expressed mathematically, this represents a simple Bayesian network, where nodes are variables, arrows indicate conditional dependence assumptions, and nodes with no in-arrows carry prior distributions. Translating this to equation form, the arrows between the "Time", "GPS", and "App" nodes represent the following Bayesian conditional probability relation:

\Pr(a_i \mid t_i, l_i) = \frac{\Pr(t_i \mid a_i)\,\Pr(l_i \mid a_i)\,\Pr(t_i)}{\sum_{k=1}^{N} \Pr(t_k \mid a_k)\,\Pr(l_k \mid a_k)\,\Pr(t_k)}.   (3)

For this system, we chose a non-parametric density estimation method to represent Pr(·), similar to the well-known Parzen windows method. Non-parametric estimation is necessary for two reasons: 1) given that the GPS coordinates represent human movement, it is not reasonable to assume that a person moves according to a probability distribution of a known mathematical form (such as a Gaussian); we therefore aim to model location probability directly as a function of distance to other observed samples; 2) given latency requirements, we need a fast and efficient model, and complex distributions that require heavy matrix multiplications are less desirable.

The time distribution for our model is estimated as follows:

1. Let h_i represent the hour element of time stamp t_i.
2. Let T[h_i] represent a hash map with keys h_1, h_2, ..., h_N. Initialize T[h_i] = 0 ∀ h_i.
3. Increment T[h_i] for each time stamp t_1, t_2, ..., t_N ↦ h_1, h_2, ..., h_N.
4. Normalize the function via T[h_i] = T[h_i] / Σ_{k=1}^{N} T[h_k].
5. Smooth T[h_i] via convolution with a mean filter, so that if the user did not exhibit any activity at a particular hour, that entry does not have an undefined value.

The variable T[h_i] represents the prior probability of observing an event occurring at time t_i. When modeling time, picking the right granularity is a complex problem: if set too high, the information is diluted; if set too low, the model might over-fit the training data and generalize poorly. In the end, a time granularity of one hour proved optimal. This variable is implemented as an array with a fixed boundary indexed between [0, 23].

For the GPS location model, a two-dimensional table is created. Given a latitude and longitude position, this table returns a normalized probability value. The challenge in this modality is that, unlike time, GPS coordinates are not well bounded, and the table size could potentially be unmanageable. This problem is solved by structuring the probability as a function of the number of "neighbor measurements" at any given location, where neighbors are historically captured GPS positions.

1. Let x_i and y_i represent the GPS latitude and longitude values for entry i, trimmed to 4 decimal places (a granularity of 11 meters). Create the key (x_i, y_i) ↦ k_i by concatenating the strings x_i, y_i with a non-numeric separator character.
2. Randomly sub-sample this dataset to extract 4000 samples; this is done for performance reasons. Initialization is then complete.
3. Starting with 0, increment G[k_i] for each location (x_1, y_1), (x_2, y_2), ..., (x_N, y_N) ↦ k_1, k_2, ..., k_N.
4. Form the distance matrix D[i, j] = G[k_j] if Haversine(k_i, k_j) < 5, where Haversine(·) refers to the Haversine geodesic distance measurement that calculates the distance between two GPS positions in meters.
5. Find the row marginals d[i] = Σ_j D[i, j] ∀ i.
6. Find c[0] = 0 and c[i] = d[i] + c[i − 1] ∀ i.
7. Find the normalized value for c as c[i] = c[i] / Σ_j c[j] ∀ i.
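The hour-of-day model can be sketched as follows. This is a minimal illustration assuming a circular mean filter of width three; the actual filter width used by the sensor is not specified above:

```python
def build_time_model(timestamps_hours, window=3):
    """Hour-of-day prior T[h]: count events per hour, normalize, then
    smooth with a circular mean filter so hours with no observed
    activity still receive a defined probability value."""
    T = [0.0] * 24
    for h in timestamps_hours:
        T[h] += 1
    total = sum(T) or 1.0
    T = [v / total for v in T]          # step 4: normalize
    half = window // 2
    # step 5: circular convolution with a mean filter of the given width
    return [sum(T[(h + d) % 24] for d in range(-half, half + 1)) / window
            for h in range(24)]
```

Because the mean filter is circular and normalized, the smoothed array still sums to one, so it remains a valid distribution over the 24 hourly bins.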

8. Given a new sample (x_i, y_i) ↦ k_i, search the set of GPS positions indexed by G[k_i] for those with Haversine distance less than 5 meters; let this count be n. Return c[n] as the probability for this position.

If space and processing time were not constraints, the probability table could be normalized by calculating the marginal distributions for x_i and y_i and dividing each entry by this value. However, this is intractable in practice, since GPS is dependent on user activity and not easily bounded in general. Thus, this model is approximated via the aforementioned hash method G[k_i]. This hashmap creates a rough index for GPS positions by binning positions into different grid points. The number of entries, or neighbors, in each grid cell is used to calculate a probability for observing that position: given a new sample, "more neighbors" translates to a higher probability.

Application usage is associated with time and location using a three-dimensional table. Using the same hash-map approximation method:

1. Let c_i represent the count of distinct apps running at time t_i. Estimate a histogram of app counts with the mapping method used previously, such that C[c_i] represents the normalized probability of observing c_i apps running.
2. Let a_{i,j} be a string value that represents a particular app j opened at time t_i at location (x_i, y_i). Multiple apps could have been opened at any given entry, so multiple indices are possible, j = 1, ..., d. Generate the key a_{i,j} ↦ α_j by concatenating the values h_i, k_i, a_{i,j} with non-numeric separator characters.
3. Let A[α_j] represent a hash map with keys α_1, α_2, ..., α_N. No initialization is done.
4. Starting with 0, increment A[α_j] for each data entry a_{i,j} ↦ α_j.
5. Normalize the function via A[α_j] = A[α_j] / Σ_{n=1}^{N} A[α_n].
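The neighbor-count idea behind the GPS model can be illustrated as follows. This is a simplified sketch: it scans all stored bins directly rather than using the grid index, and omits the cumulative normalization c[·]; only the Haversine distance and the counting step are shown:

```python
import math
from collections import defaultdict

def haversine_m(lat1, lon1, lat2, lon2):
    """Haversine geodesic distance between two GPS points, in meters."""
    r = 6371000.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def build_gps_model(points):
    """Bin coordinates (reduced to 4 decimal places) into a grid G;
    each bin counts the historical samples that fell into it."""
    G = defaultdict(int)
    for lat, lon in points:
        G[(round(lat, 4), round(lon, 4))] += 1
    return G

def neighbor_count(G, lat, lon, radius_m=5.0):
    """Count stored samples within radius_m meters of a new position;
    'more neighbors' maps to a higher probability for that position."""
    return sum(c for (glat, glon), c in G.items()
               if haversine_m(lat, lon, glat, glon) < radius_m)
```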

This yields a quick look-up of the probability of observing a specific app being used given a time and location. Expanding Equation 3:

\Pr(a_i \mid t_i, l_i) = \frac{\Pr(t_i \mid a_i)\,\Pr(l_i \mid a_i)\,\Pr(t_i)}{\sum_{k=1}^{N} \Pr(t_k \mid a_k)\,\Pr(l_k \mid a_k)\,\Pr(t_k)}   (4)

\Pr(a_{i,j} \mid t_i, x_i, y_i) = T[h_i]\, G[k_i]\, C[c_i]\, \frac{1}{d}\sum_{j=1}^{d} A[\alpha_j]   (5)
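Treating the tables as plain dictionaries, the look-up in Equation (5) reduces to a product of table values times the mean of A over the keys for the currently open apps. This is an illustrative sketch; the key formats are hypothetical:

```python
def combined_prob(T, G, C, A, h, k, c, alphas):
    """Equation (5): Pr(app | time, location) approximated as the product
    T[h] * G[k] * C[c] times the mean of A over the keys alpha_j for the
    d apps open in this entry. Unknown keys default to probability 0."""
    mean_a = sum(A.get(a, 0.0) for a in alphas) / max(len(alphas), 1)
    return T.get(h, 0.0) * G.get(k, 0.0) * C.get(c, 0.0) * mean_a
```

A single look-up per table keeps the per-event cost constant, which matches the latency argument given for the non-parametric design.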

The phone (P[·]) and SMS (S[·]) models are estimated using the same hashmap-based approximation method described above. Here, special keys are constructed using time and location information; these keys map to an observation counter that translates to a probability distribution. Given these modalities, the probability of normal behavior is calculated as:

\Pr(a_{i,j}, p_i, s_i \mid t_i, x_i, y_i) = P[p_i]\, S[s_i]\, T[h_i]\, G[k_i]\, C[c_i]\, \frac{1}{d}\sum_{j=1}^{d} A[\alpha_j].   (6)

For runtime efficiency, the log of these probability estimates is used. Using log-probabilities allows us to replace costly multiplications with faster additions, and since log is a monotonic transformation, the relative ordering of the probability measurements is preserved. The following equation is used in our sensor:

\log \Pr(a_{i,j}, p_i, s_i \mid t_i, x_i, y_i) = \log P[p_i] + \log S[s_i] + \log T[h_i] + \log G[k_i] + \log C[c_i] + \log\left(\frac{1}{d}\sum_{j=1}^{d} A[\alpha_j]\right).

Beyond runtime efficiency, working in log-space prevents potential under-flow errors.

To find the proper anomaly detection threshold for a given user, Equation 6 is used to evaluate all available training data for that user, yielding a list of log-probabilities L[i] = \log \Pr(X_i), where X_i represents all measured user logs at time i. This list of values is then sorted in ascending order. The index value τ is calculated by multiplying the number of training samples by the desired false positive (FP) rate. Using this index in the sorted array, L[τ] then yields the threshold for anomaly detection.
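The log-space scoring and threshold selection can be sketched as follows. The small floor inside the logarithm is our own guard against log(0) for unseen events, not part of the described method:

```python
import math

def log_score(probs):
    """Sum of log-probabilities (the sensor's runtime score); the 1e-12
    floor avoids a math domain error when an event was never observed."""
    return sum(math.log(max(p, 1e-12)) for p in probs)

def anomaly_threshold(training_scores, fp_rate):
    """Sort training log-probabilities ascending and take the value at
    index tau = N * fp_rate; test-time scores below this threshold
    trigger a re-authentication challenge."""
    ranked = sorted(training_scores)
    tau = int(len(ranked) * fp_rate)
    return ranked[min(tau, len(ranked) - 1)]
```

By construction, roughly the fp_rate fraction of the user's own training samples falls below the threshold, which is what makes the FP rate directly tunable.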

7. RESULTS

Figure 3: Accuracy of classifying logged user actions related to application usage. Full False Positive Range.

Our models were evaluated via multi-category user classification using the data collected during our study. For each user, the captured data was randomly split into training and test sets at a ratio of 85% to 15%, respectively. A behavior model was trained for each user on their training data. The models were then evaluated against all available testing data from all users; each test entry was labeled by the model that yielded the highest probability. Test data was truncated to 2000 randomly selected samples to ensure consistency across all users and to avoid bias towards any particular user based on the volume of available data. Accuracy was measured as the sensor's ability to correctly identify a user's test data from among the test data belonging to all other users. Performance is displayed in the following figures via ROC curves that plot accuracy against false positive rate. All results presented are averages across 50 users.

Figures 3 and 4 show the ROC curve for the application usage model. Figure 3 shows the full range, while Figure 4 shows the performance curve between 0% and 1% FP, a range more representative of deployment conditions.

Next, the different components of the behavior model are examined individually to measure their discriminative power. The availability of GPS as a feature distinguishes mobile behavior modeling from other domains, such as desktop monitoring. With sufficient time and samples, we hypothesize that a user's movement pattern is consistent enough that a statistical model can be estimated and inference on new samples is possible. This may be used to detect whether a device has been moved out of a familiar region. Figures 5 and 6 examine the performance of our nonparametric GPS position model across the 50 users in our study. Figure 5 shows the full-range performance curve,
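The multi-category classification protocol described above amounts to scoring each test sample under every user's model and assigning it to the highest-scoring model. The sketch below illustrates that argmax rule with toy scoring functions; the names and the toy models are assumptions for illustration, not the study's actual models.

```python
# Hypothetical per-user models: each maps a user id to a scoring function
# that returns a log-probability-like score for a sample.
def classify(sample, user_models):
    """Label a sample with the user whose model scores it highest."""
    return max(user_models, key=lambda u: user_models[u](sample))

# Toy stand-ins for trained behavior models: user "a" typically produces
# feature values near 1.0, user "b" values near 9.0.
models = {
    "a": lambda x: -abs(x - 1.0),
    "b": lambda x: -abs(x - 9.0),
}

# A sample near 1.0 is attributed to "a", one near 9.0 to "b".
labels = [classify(x, models) for x in (0.5, 8.7)]
```

Classification accuracy in the study is then the fraction of a user's test samples for which this argmax returns that user's own model.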

Figure 4: Accuracy of classifying logged user actions related to application usage. Low False Positive Range.

while Figure 6 shows the low false positive rate range. As the figures show, the participating subjects exhibited consistent movement patterns over the course of the three-week study. This made it possible to accurately classify different users based entirely on their GPS coordinates, despite the fact that many of these users worked in the same office.

The performance rates for the phone and SMS models were lower by comparison; Figures 7 and 8 show these results. While the classification rates for these modalities remain statistically significant compared to random guessing, which would yield a baseline of 2% expected accuracy in a 50-user experiment, the performance curves are noticeably lower than those of the app usage and GPS models. This is partially attributable to low sampling rates: compared to the volume of data available for GPS and application usage, phone and SMS log entries were sparse and limited in number for most users. It is expected that, while phone and SMS data may be useful in predicting abnormal usage of mobile devices, they should not be used as the primary indicators; instead, they should complement other detection modalities.

Figure 5: Accuracy of classifying users based on GPS position model. Full False Positive Range.

Figure 6: Accuracy of classifying users based on GPS position model. Low False Positive Range.

Figure 7: Outgoing phone call model.

Figure 8: Outgoing SMS model.

Finally, Figures 9 and 10 show performance curves for user identification based on contact list access and new application installation behavior. These models were designed to recognize when and how often users access their contact lists and when they install applications on their devices, based on the intuition that device owners would do both at consistent, predictable times. However, the results show that these modalities were not particularly discriminative compared to the other behavioral measurements.
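One way the weaker modalities can complement the stronger ones, rather than dominate them, is through a weighted combination of per-modality log-likelihood scores. This fusion rule is our illustrative assumption, not a mechanism specified in the paper:

```python
# Weighted fusion of per-modality log-likelihood scores (illustrative).
# Weaker modalities such as phone and SMS receive smaller weights so they
# refine, rather than override, the app-usage and GPS models' decision.
def fused_score(log_likelihoods, weights):
    return sum(weights[m] * log_likelihoods[m] for m in log_likelihoods)

scores = {"app": -1.2, "gps": -0.8, "phone": -3.0, "sms": -2.5}
weights = {"app": 1.0, "gps": 1.0, "phone": 0.25, "sms": 0.25}
s = fused_score(scores, weights)
```

The weights would themselves need to be tuned per deployment, for instance against each modality's observed discriminative power.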

7.1 Time to Detection Estimation

Estimating an expected "time until detection" for an unauthorized access event is more difficult on mobile devices than on desktop platforms. On mobile devices, activity tends to be sporadic, with bursts distributed throughout the day, whereas desktop usage exhibits more prolonged and consistent sessions. The use of location-based information adds a robust dimension to the problem, simplifying the challenge in some cases while complicating it in others. For example, our models can classify user movement patterns in terms of GPS locations fairly accurately. If unauthorized access occurs where the device is taken outside of its recognized area, detection becomes a simpler problem. If the device is kept within a highly familiar area, however, GPS-based evaluation skews the measurement towards normal behavior. In practice, this needs to be tuned on a per-user basis to find the best setting.

Time until detection can be estimated by selecting a desired false positive rate and extrapolating from the model's performance at that rate. For example, the sensor can be restricted to evaluate the likelihood of an app being used only when that app has been opened. If this is assumed to occur 100 times per day, and at most one false positive per day is tolerated, the target rate can be set to 1%. At this rate the application model has an expected classification accuracy of roughly 58%: it will recognize 58% of truly anomalous use on the first evaluation while making only 1 false positive error in 100 evaluations. However, an unauthorized user can be expected to open several apps in succession as they work through the device. Since a challenge only needs to be issued once, each evaluation can be treated as an independent Bernoulli trial, and the number of trials x needed to reach a 95% chance of detection can be estimated as follows:

(1 − 0.58)^x < (1 − 0.95)   (7)

x log(0.42) < log(0.05)   (8)

x > 3.45   (9)

This shows that the application model needs to be evaluated roughly three to four times during one unauthorized usage session before there is a 95% chance of detection. How long this takes depends on the frequency of sensor polling and will need to be adjusted based on user behavior. If we assume a 30-second interval to poll for whether any new apps have been launched, then time until detection is under two minutes.

The GPS model can be examined similarly. While the application model performs inference using GPS data, and so intrinsically integrates location information, location probability can also be evaluated on its own, in parallel and asynchronously with the application model; a challenge is then triggered whenever the device is taken outside a familiar area. The time to detection of this modality can be estimated in the same manner. Assuming a 5-minute check interval, there would be 12 × 24 = 288 evaluations per day. With a desired false positive rate of 1 per day (roughly 0.3% FP), the classifier performs at around an 88% detection baseline; that is, within two tries the location-based classifier is expected to detect an anomaly with over 95% confidence. The time to detection under this setting is thus 10 minutes.

These detection performance figures are provided as examples; in practice, the sensor's error rates are configurable based on the security requirements of a specific deployment context. Devices used to access highly sensitive data can speed up detection by accepting a higher challenge rate for device owners; if frequent challenges are more of a concern than detection time, the false positive rate can be made more restrictive. These modalities should be tuned on a per-user basis, since accuracy depends on each user's behavior patterns. Some users are relatively stationary and consistent in their usage, in which case the sensor may be tuned to be more sensitive; for highly mobile users, the influence of the location-based probability might be reduced in favor of the set of applications used and their timing. The results described in this paper provide a balanced estimate of performance and of how to adjust the sensor accordingly; as such, they should not be construed as performance guarantees for every possible user.

Figure 9: Contacts list access model.

Figure 10: Application installation model.

8. FUTURE RESEARCH

The results of our user study provide preliminary evidence that application-level interactions can be used to authenticate users to their mobile devices. Our initial experiments were limited in scope, however, and thus leave several important questions regarding application-oriented authentication unanswered.

One shortcoming of our study is that although our threat model is intended to capture attacks launched by close friends, roommates, or coworkers, our test subjects were individuals who worked at the same organization but otherwise did not share any preexisting relationships. Our results may therefore not be representative of this class of adversary; for example, very close friends or relatives may have access to knowledge about how the target of an attack typically utilizes his or her device. Additional studies are needed to determine whether such knowledge can be leveraged to impersonate a device's owner while an illicit phone session is in progress. We intend to perform several mimicry studies in which participants are provided with information about each other's device usage and then asked to replicate each other's actions while searching for information. Potential test cases include strangers who are provided with no information, strangers who are given a target's written description of how their phone is used, strangers who have watched the target's phone usage directly or through a video, and finally sessions conducted with close friends or roommates.

A similar but distinct question is how much information is typically leaked while a phone is used in a manner which matches its "typical" user's profile. This question suggests the need to collect additional data from study participants about how sensitive they consider the data they store and the actions they take with their phones.

Aside from information concerning a victim's phone usage habits, another piece of knowledge which may be available to adversaries in a real-world setting is awareness of the security mechanisms deployed on the target device. Clearly, any information an adversary has regarding what precautions are in use may aid them in circumventing these safeguards. In future studies, we will assess the impact of this awareness on an adversary's ability to evade detection by performing a simulated masquerader experiment in which some users are informed of the mechanism and others are not.

For the sake of our proof-of-concept experiments, mobile behavior data was collected from users' devices and uploaded to a server for offline modeling and analysis. In a real-world deployment, however, modeling would take place on the device itself and a mitigation strategy would be employed in response to unusual behavior. One limitation of the analysis presented in this paper is that its offline nature did not allow us to evaluate the usability of our approach; specifically, we were not able to gauge how users react when posed with an authentication challenge. The development of mitigation strategies was beyond the scope of our initial study, but will be explored in future research. For example, a recent study explored the effect of gradually darkening or locking out a screen to reduce interruptions in users' workflow [2]. We intend to conduct a study during which users install our full behavioral authentication system on their mobile devices, including variations on mitigation strategies.

An important aspect of our modeling approach is its ability to adapt as a legitimate device user's behavior gradually alters over time. This will require our model to have a feedback component in which the results of re-authentication challenges are used to label alarms as true or false positives. Deploying our solution on real users' devices during a future study will allow us to assess our model's ability to be retrained over time.

Another security component which we will assess in future work is the ability of decoy applications to improve the performance of mobile behavioral authentication. Prior research has suggested the use of decoy applications to detect intrusions on mobile devices [11]. We will measure the effectiveness of decoys on mobile behavior modeling via a human subject study.

9. CONCLUSION

To summarize, this paper presented a novel approach to authenticating users on mobile devices via their device usage habits. The technique we describe is intended to detect masquerader attacks in which a lost or stolen mobile device is used by someone other than the device’s intended user. A behavior model was developed based on different categories of phone activity, such as application usage and phone calls,

and when and where these events occurred. The results of our preliminary human user study suggest that our behavioral authentication mechanism is capable of detecting a masquerade attack in approximately two minutes with one false positive per day. These results provide initial evidence of the efficacy of our proposed authentication technology based upon application usage modeling. Our study also yielded a dataset of mobile phone usage behavior and patterns of application usage acquired from real users. Our results warrant further investigation, including deeper research into mitigation strategies, improved sensor performance, and experimentation with different complementary behavioral authentication techniques and sensor inputs. We intend to perform additional experiments to assess the effect of adversarial knowledge and the potential for behavioral mimicry attacks.

10. REFERENCES

[1] A. De Luca, A. Hang, F. Brudy, C. Lindner, and H. Hussmann. Touch Me Once and I Know It's You! Implicit Authentication Based on Touch Screen Patterns. In 30th ACM Conference on Human Factors in Computing Systems (CHI), 2012.

[2] L. Agarwal, H. Khan, and U. Hengartner. Ask Me Again but Don't Annoy Me: Evaluating Re-authentication Strategies for Smartphones. In Twelfth Symposium on Usable Privacy and Security (SOUPS 2016), pages 221–236, Denver, CO, June 2016. USENIX Association.

[3] A. J. Aviv, K. Gibson, E. Mossop, M. Blaze, and J. M. Smith. Smudge Attacks on Smartphone Touch Screens. In Proceedings of the 4th USENIX Conference on Offensive Technologies (WOOT), pages 1–7, Berkeley, CA, USA, 2010. USENIX Association.

[4] D. Tapellini. Smart Phone Thefts Rose to 3.1 Million Last Year, Consumer Reports Finds. Available at: http://www.consumerreports.org/cro/news/2014/04/smart-phone-thefts-rose-to-3-1-million-last-year/index.htm, 2014.

[5] F. Schaub, R. Deyhle, and M. Weber. Password Entry Usability and Shoulder Surfing Susceptibility on Different Smartphone Platforms. In 11th International Conference on Mobile and Ubiquitous Multimedia (MUM), 2012.

[6] T. Feng, Z. Liu, K.-A. Kwon, W. Shi, B. Carbunar, Y. Jiang, and N. Nguyen. Continuous Mobile Authentication Using Touchscreen Gestures. In IEEE Conference on Technologies for Homeland Security (HST), pages 451–456, Nov 2012.

[7] H. Xu, Y. Zhou, and M. Lyu. Towards Continuous and Passive Authentication via Touch Biometrics: An Experimental Study on Smartphones. In 10th Symposium on Usable Privacy and Security (SOUPS), 2014.

[8] M. Ichahane, M. Chiny, A. Abou Elkalam, and A. Ouahman. Introduction to Identity Usurpation Applied to Biometric Modalities. In International Conference on Multimedia Computing and Systems (ICMCS), pages 1248–1255, April 2014.

[9] J. Voris, Y. Song, P. Du, M. Ben Salem, S. Hershkop, and S. Stolfo. Active Authentication Using File System Decoys and User Behavior Modeling: Results of a Large Scale Study. Under submission to the 18th International Symposium on Recent Advances in Intrusion Detection, 2015.

[10] M. Ben Salem and S. Stolfo. Modeling User Search-Behavior for Masquerade Detection. In Proceedings of the 14th International Symposium on Recent Advances in Intrusion Detection, 2011.

[11] M. Ben Salem, J. Voris, and S. Stolfo. On the Use of Decoy Applications for Continuous Authentication on Mobile Devices. In 1st Who Are You?! Adventures in Authentication Workshop (WAY), co-located with the 10th Symposium on Usable Privacy and Security (SOUPS), 2014.

[12] M. Brennan, S. Afroz, and R. Greenstadt. Adversarial Stylometry: Circumventing Authorship Recognition to Preserve Privacy and Anonymity. ACM Transactions on Information and System Security (TISSEC), 2012.

[13] M. Brocardo, I. Traore, and I. Woungang. Toward a Framework for Continuous Authentication Using Stylometry. In IEEE Conference on Advanced Information Networking and Applications (AINA), 2014.

[14] M. Jakobsson and D. Pointcheval. Mutual Authentication for Low-Power Mobile Devices. In Financial Cryptography (FC), 2002.

[15] R. Nieva. Smartphone Kill Switch Could Save US Consumers $3.4B, Study Says. Available at: http://www.cnet.com/news/smartphone-kill-switch-could-save-consumers-3-4b-study-says, 2014.

[16] R. Yampolskiy and V. Govindaraju. Behavioural Biometrics: A Survey and Classification. International Journal of Biometrics, 2008.

[17] A. Roy, T. Halevi, and N. Memon. An HMM-based Behavior Modeling Approach for Continuous Mobile Authentication. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3789–3793, May 2014.

[18] S. Uellenbeck, M. Dürmuth, C. Wolf, and T. Holz. Quantifying the Security of Graphical Passwords: The Case of Android Unlock Patterns. In 20th ACM Conference on Computer and Communications Security (CCS), 2013.

[19] S. Venugopalan and M. Savvides. How to Generate Spoofed Irises from an Iris Code Template. IEEE Transactions on Information Forensics and Security, 6(2):385–395, June 2011.

[20] X. Suo, Y. Zhu, and G. S. Owen. Graphical Passwords: A Survey. In 21st Annual Computer Security Applications Conference (ACSAC), 2005.
