Sensor Placement for Outbreak Detection in Computer Security

Andreas Krause SCS, CMU

H. Brendan McMahan Google Inc.

Carlos Guestrin SCS, CMU

Anupam Gupta SCS, CMU

Abstract

We consider the important computer security problem of outbreak detection, in which we want to place sensors (monitoring stations, probes) to detect events (such as computer viruses) spreading over a network. We show that such problems can be modeled as the problem of simultaneously maximizing a collection of submodular set functions. We show how the SATURATE algorithm [3] performs near-optimally in this setting, even if sensors can fail (accidentally or through adversarial manipulation).

1 Introduction

An important problem in computer security is outbreak detection in networks. In this problem, we are given a network $G = (V, E)$ and a process spreading dynamically over the network. We can place a set of monitoring stations that detect the events. Examples include detecting viruses spreading over computer networks, monitoring municipal water distribution networks for contamination [5], and even selecting informative weblogs to read in order to detect citation cascades [6].

More formally, we are given a set of outbreak scenarios $I$. Each scenario $i \in I$ models an event starting at a node $s \in V$ and spreading over the graph. With each node $v \in V$, we associate the detection time $T(i, v)$, the earliest time at which the event reaches node $v$; we set $T(i, v) = \infty$ if $v$ is never reached. We can place $k$ sensors at a set of nodes $A \subseteq V$, $|A| = k$. These sensors detect event $i$ at time $T(i, A) = \min_{v \in A} T(i, v)$. With each scenario $i$, we also associate a penalty function $\pi_i$, where $\pi_i(t)$ quantifies our loss if the outbreak is detected at time $t$. For example, we can set $\pi_i(t)$ to model the monetary loss associated with servers failing due to the virus infection, or the amount of contaminated water consumed, if the outbreak is detected at time $t$. With each scenario, we can then associate the reward function $R_i(A) = \pi_i(\infty) - \pi_i(T(i, A))$, which is defined over all subsets $A \in 2^V$ and quantifies the utility of placing sensors at locations $A$. If no sensors are placed, then $T(i, \emptyset) = \infty$ and no utility is obtained.

The goal in outbreak detection is then to place a set of sensors such that the rewards $R_i$ are simultaneously maximized over all scenarios $i \in I$. From this goal, one can formalize different optimization problems. If one believes that outbreaks happen at random, with scenario $i$ occurring with probability $P(i)$, then one can define an average-case objective $R_{\mathrm{avg}}(A) = \sum_{i \in I} P(i) R_i(A)$.
If an adversary selects the outbreak scenario $i$ knowing our sensor placement (and hence picks the worst possible scenario), our objective is $R_{\mathrm{adv}}(A) = \min_{i \in I} R_i(A)$. Our goal is, given a budget $k$ on the number of sensors we can place, to find a placement

$$A^* = \operatorname*{argmax}_{|A| \le k} R(A), \qquad (1)$$

where $R$ is either $R_{\mathrm{avg}}$ or $R_{\mathrm{adv}}$.
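The definitions above can be made concrete with a minimal Python sketch. The detection times, scenario probabilities, the `HORIZON` cap, and the linear penalty are all hypothetical choices made for illustration; the paper does not prescribe a particular penalty function.

```python
import math

# Hypothetical toy data: detection_time[i][v] is T(i, v), the earliest time
# at which scenario i reaches node v (math.inf if v is never reached).
detection_time = {
    "i1": {"a": 1.0, "b": 4.0, "c": math.inf},
    "i2": {"a": math.inf, "b": 2.0, "c": 1.0},
}
# Assumed scenario probabilities P(i) for the average-case objective.
prob = {"i1": 0.5, "i2": 0.5}

HORIZON = 10.0  # assumed cap, so that an undetected outbreak costs pi(inf) = HORIZON


def penalty(t):
    """Illustrative penalty pi_i(t): loss grows linearly with t up to HORIZON."""
    return min(t, HORIZON)


def detect(i, A):
    """T(i, A) = min over v in A of T(i, v); infinity for the empty set."""
    return min((detection_time[i][v] for v in A), default=math.inf)


def reward(i, A):
    """R_i(A) = pi_i(inf) - pi_i(T(i, A))."""
    return penalty(math.inf) - penalty(detect(i, A))


def r_avg(A):
    """Average-case objective: sum over scenarios of P(i) * R_i(A)."""
    return sum(prob[i] * reward(i, A) for i in detection_time)


def r_adv(A):
    """Adversarial objective: the reward in the worst scenario."""
    return min(reward(i, A) for i in detection_time)
```

In this toy instance, a single sensor at node `a` detects scenario `i1` early but never sees `i2`, so its adversarial score is zero even though its average score is positive; a sensor at `b` is slower on both scenarios but scores well under both objectives.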

2 Sensor placement algorithms

Unfortunately, both problems are NP-hard [3]. The key to obtaining approximate solutions is to realize that the objective functions $R_i$ satisfy an important property, which we proved in [5]: adding a sensor helps more if we have placed few sensors so far, and less if we have already placed many sensors. This property is formalized by the combinatorial concept of submodularity (cf. [7]). A set function $F$ is called submodular if, for all $A \subseteq B \subseteq V$ and $s \in V \setminus B$, it holds that $F(A \cup \{s\}) - F(A) \ge F(B \cup \{s\}) - F(B)$, i.e., adding $s$ to $A$ helps at least as much as adding $s$ to a superset $B$. $F$ is called nondecreasing if, for all $A \subseteq B$, it holds that $F(A) \le F(B)$.

A key result about submodular functions states that the greedy algorithm, which iteratively adds to the set $A$ of chosen locations the sensor $s$ that maximizes $F(A \cup \{s\})$, is near-optimal: for nondecreasing submodular $F$, it is guaranteed to obtain a solution $A_G$ that achieves at least a constant fraction $(1 - 1/e) \approx 63\%$ of the optimal value [7]. In fact, this is the best possible guarantee achievable in polynomial time unless $P = NP$ [1]. Since submodular functions are closed under nonnegative linear combinations, the average-case objective $R_{\mathrm{avg}}$ is submodular as well, and hence the greedy algorithm solves Problem (1) near-optimally for $R = R_{\mathrm{avg}}$. It is also possible to use submodularity to obtain online bounds and to speed up algorithms [5].

Unfortunately, the adversarial objective $R_{\mathrm{adv}}$, which is far more relevant for computer security, is not submodular. In fact, it can be shown that in this setting the greedy algorithm can perform arbitrarily badly. In [3], we consider Problem (1) with $R = R_{\mathrm{adv}}$ for arbitrary nondecreasing submodular functions $R_i$. We develop the SATURATE algorithm, which is guaranteed to find a sensor placement $A$ for which $R_{\mathrm{adv}}(A) \ge \max_{|A'| \le k} R_{\mathrm{adv}}(A')$ and $|A| \le \alpha k$ for some small $\alpha$; i.e., it finds a solution whose adversarial score is at least that of the optimal solution, at slightly increased cost. Similarly to the greedy algorithm, SATURATE is shown to be best-possible under reasonable complexity-theoretic assumptions [3].
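To illustrate the contrast between the two objectives, the following Python sketch implements the greedy rule described above and runs it on a small hypothetical instance. The weights `w1` and `w2`, the value of `n`, and the function names are assumptions made for this example. It only demonstrates the greedy algorithm and one instance on which it fails for the minimum; it is not an implementation of SATURATE.

```python
def greedy(F, ground_set, k):
    """Pick up to k elements, each time adding the one that maximizes F(A | {s})."""
    A = set()
    for _ in range(k):
        candidates = sorted(ground_set - A)  # sorted so that ties break deterministically
        if not candidates:
            break
        best = max(candidates, key=lambda s: F(A | {s}))
        A.add(best)
    return A


# Hypothetical instance: two modular (hence submodular) reward functions.
n = 100
w1 = {"s1": n, "s2": 0, "t1": 1, "t2": 1}
w2 = {"s1": 0, "s2": n, "t1": 1, "t2": 1}
V = set(w1)


def R1(A):
    return sum(w1[v] for v in A)


def R2(A):
    return sum(w2[v] for v in A)


def R_avg(A):
    """Average-case objective with equal scenario probabilities (submodular)."""
    return 0.5 * R1(A) + 0.5 * R2(A)


def R_adv(A):
    """Adversarial objective: minimum over the two scenarios (not submodular)."""
    return min(R1(A), R2(A))


A_avg = greedy(R_avg, V, 2)  # greedy on the average objective
A_adv = greedy(R_adv, V, 2)  # greedy on the minimum
```

On this instance, greedy on `R_avg` selects `{s1, s2}`, which is optimal. Greedy on `R_adv` first picks `t1`, because every other single element has minimum reward 0, and then `t2`, ending with an adversarial score of 2, while `{s1, s2}` achieves `n`. Since `n` can be made arbitrarily large, the gap is unbounded. The same instance shows that the minimum is not submodular: adding `s2` to the empty set gains nothing, but adding it to `{s1}` gains `n`.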

3 Other applications and connection to machine learning

Sensor failures. The problem of maximizing the minimum over a set of submodular functions arises in other settings as well. For example, in the outbreak detection problem, sensors might fail due to hardware failures or manipulation by an adversary. We can model this problem in the following way: given a submodular function $F$ (e.g., the utility of placing a set of sensors) and a set $B \subseteq V$, we define a new function $F_B(A) = F(A \setminus B)$. This set function corresponds to the (reduced) utility if all the sensors at locations in $B$ fail. It is easy to show that if $F$ is nondecreasing and submodular, so is $F_B$. Hence, the problem of optimizing sensor placements that are robust to sensor failures results in a problem of simultaneously maximizing a collection of submodular functions; e.g., for the worst-case failure of $k' < k$ sensors, we solve $\max_{|A| \le k} \min_{|B| \le k'} F_B(A)$. In fact, we can combine probabilistic or adversarial outbreak scenarios with probabilistic or adversarial sensor failures in an arbitrary manner. For example, we can optimize for placements that are robust against an adversarial virus infection with probabilistic sensor failures, and vice versa. The SATURATE algorithm can be applied to any such combination.

Connection to machine learning. One important problem in machine learning is feature selection, where the goal is to select a subset of features that are informative with respect to, e.g., a given classification task. One objective frequently considered is to select a set of features $A$ that maximizes the information gained about the class variable $Y$ after observing the features, $F(A) = H(Y) - H(Y \mid A)$, where $H$ denotes the Shannon entropy. In [4], it was shown that in a large class of graphical models, the information gain $F(A)$ is in fact a submodular function. Now we can consider a setting where an adversary can delete features that we selected (as considered, e.g., in [2]). The problem of selecting features robustly against such arbitrary deletion of, e.g., $m$ features is hence equivalent to the problem of maximizing $\min_{|B| \le m} F_B(A)$, where $B$ is the set of deleted features. In [3], we draw other connections, e.g., to the problem of minimizing the maximum posterior variance in Gaussian process regression and to robust experimental design. We believe that the problem of maximizing an adversarially chosen submodular objective function is relevant to a variety of security and machine learning problems.
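The sensor-failure construction $F_B(A) = F(A \setminus B)$ can be sketched in Python as follows. The coverage utility `covers` and all function names are hypothetical, and the brute-force search is only meant to make the max-min objective concrete on a tiny ground set; it is not the SATURATE algorithm.

```python
from itertools import combinations

# Hypothetical coverage utility: each sensor location detects some scenarios.
covers = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {1, 2},
    "d": {4, 5},
}


def F(A):
    """Number of scenarios detected by at least one sensor in A."""
    return len(set().union(*(covers[v] for v in A)))


def F_fail(A, B):
    """F_B(A) = F(A minus B): utility when every sensor placed in B fails."""
    return F(set(A) - set(B))


def robust_value(A, k_fail):
    """min over |B| <= k_fail of F_B(A).

    Because F is nondecreasing, the worst case removes exactly
    min(k_fail, |A|) sensors from A, so only those subsets are enumerated.
    """
    A = set(A)
    r = min(k_fail, len(A))
    return min(F_fail(A, B) for B in combinations(sorted(A), r))


def best_robust_placement(V, k, k_fail):
    """Brute-force max over |A| = k of robust_value; feasible only for tiny V."""
    return max(
        (set(A) for A in combinations(sorted(V), k)),
        key=lambda A: robust_value(A, k_fail),
    )
```

For example, `robust_value({"a", "d"}, 1)` evaluates the placement `{a, d}` under the worst single failure. The same wrapper applies unchanged to the feature-deletion setting if `F` is replaced by an information-gain function.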

References

[1] Uriel Feige, A threshold of ln n for approximating set cover, Journal of the ACM 45 (1998), no. 4, 634–652.
[2] Amir Globerson and Sam Roweis, Nightmare at test time: Robust learning by feature deletion, ICML, 2006.
[3] A. Krause, B. McMahan, C. Guestrin, and A. Gupta, Selecting observations against adversarial objectives, Advances in Neural Information Processing Systems (Vancouver, Canada), 2007.
[4] Andreas Krause and Carlos Guestrin, Near-optimal value of information in graphical models, UAI, 2005.
[5] Andreas Krause, Jure Leskovec, Carlos Guestrin, Jeanne VanBriesen, and Christos Faloutsos, Efficient sensor placement optimization for securing large water distribution networks, submitted to the Journal of Water Resources Planning and Management (2007).
[6] Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance, Cost-effective outbreak detection in networks, 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007.
[7] G. Nemhauser, L. Wolsey, and M. Fisher, An analysis of the approximations for maximizing submodular set functions, Mathematical Programming 14 (1978), 265–294.
