Adaptive Sequential Bayesian Change Point Detection

Ryan Turner

Whistler, BC, December 12, 2009

Joint work with Yunus Saatci and Carl Edward Rasmussen

Turner (Engineering, Cambridge)

Adaptive Sequential Bayesian Change Point Detection

1

Motivation

Handle nonstationarity in time series
Avoid making point estimates of (changing) parameters
Modular framework
Tractability
Online
Probabilistic predictions
Minimal hand tuning


Ingredients

The time since the last change point, namely the run length $r_t$
The underlying predictive model (UPM) $p(x_t \mid x_{(t-\tau):t-1} =: x_t^{(\tau)}, \theta_m)$ for any $\tau \in [1, \ldots, (t-1)]$, at time $t$
The hazard function $H(r \mid \theta_h)$
The hyper-parameters $\theta := \{\theta_h, \theta_m\}$

Figure: Sample drawn from BOCPD. Panels: Observations and Run Length, both plotted against Time (0 to 1000).


Previous Work

Test-based approaches
Retrospective Bayesian approaches
Bayesian Online Change Point Detection (BOCPD) (e.g., Adams & MacKay 2007)
BOCPD is sensitive to hyper-parameters


The BOCPD Algorithm

The goal in BOCPD is to calculate the posterior run length at time $t$, i.e., $p(r_t \mid x_{1:t})$, sequentially.

$$p(x_{t+1} \mid x_{1:t}) = \sum_{r_t} p(x_{t+1} \mid x_{1:t}, r_t)\, p(r_t \mid x_{1:t}) = \sum_{r_t} p(x_{t+1} \mid x_t^{(r)})\, p(r_t \mid x_{1:t}), \tag{1}$$

$$\gamma_t := p(r_t, x_{1:t}) = \sum_{r_{t-1}} p(r_t, r_{t-1}, x_{1:t}) = \sum_{r_{t-1}} \underbrace{p(r_t \mid r_{t-1})}_{\text{hazard}}\, \underbrace{p(x_t \mid r_{t-1}, x_t^{(r)})}_{\text{likelihood (UPM)}}\, \underbrace{p(r_{t-1}, x_{1:t-1})}_{\gamma_{t-1}}. \tag{2}$$

This defines a forward message passing scheme, with $p(r_t \mid x_{1:t}) \propto \gamma_t$.
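The forward recursion in (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a constant hazard and a Gaussian UPM with known observation variance and a conjugate normal prior on the segment mean. The function name `bocpd` and all of its parameters are hypothetical. The messages are renormalized at every step, so the code stores $p(r_t \mid x_{1:t})$ rather than the joint $\gamma_t$.

```python
import numpy as np

def gauss_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x (vectorized over mean/var)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def bocpd(x, hazard=0.01, mu0=0.0, var0=1.0, obs_var=1.0):
    """Return R with R[t, r] = p(r_t = r | x_{1:t}) and the log evidence.

    Assumptions (not from the slides): constant hazard, Gaussian UPM with
    known observation variance obs_var and a N(mu0, var0) prior on the mean.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0                 # before any data the run length is 0
    mu = np.array([mu0])          # posterior mean of the segment mean, per run length
    var = np.array([var0])        # posterior variance of the segment mean, per run length
    log_evidence = 0.0
    for t in range(T):
        # UPM: predictive density of x[t] under each run length hypothesis
        pred = np.exp(gauss_logpdf(x[t], mu, var + obs_var))
        weighted = R[t, : t + 1] * pred
        # One-step-ahead predictive p(x_t | x_{1:t-1}), as in equation (1)
        evidence = weighted.sum()
        log_evidence += np.log(evidence)
        # Growth: the run continues with probability 1 - hazard
        R[t + 1, 1 : t + 2] = weighted * (1.0 - hazard) / evidence
        # Change point: all mass that hits the hazard collapses to run length 0
        R[t + 1, 0] = weighted.sum() * hazard / evidence
        # Conjugate update of the mean posterior for every surviving run
        post_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        post_mu = post_var * (mu / var + x[t] / obs_var)
        mu = np.concatenate(([mu0], post_mu))
        var = np.concatenate(([var0], post_var))
    return R, log_evidence
```

The returned `log_evidence` is the sum of the log one-step-ahead predictive densities, so the same pass that filters the run length also yields the quantity one would maximize when tuning `hazard` or the prior parameters.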


Learning

Learn by maximizing the (log) marginal likelihood, the evidence
Done by decomposing into the one-step-ahead predictive likelihoods

$$\log p(x_{1:T} \mid \theta) = \sum_{t=1}^{T} \log p(x_t \mid x_{1:t-1}, \theta) \tag{3}$$

Compute derivatives using forward propagation
  The derivatives of the UPM, $\frac{\partial}{\partial \theta_m} p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)$
  The derivatives of the hazard function, $\frac{\partial}{\partial \theta_h} p(r_t \mid r_{t-1}, \theta_h)$
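One way to see how the derivatives propagate forward is to apply the product rule to the message recursion $\gamma_t = \sum_{r_{t-1}} p(r_t \mid r_{t-1}, \theta_h)\, p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)\, \gamma_{t-1}$. The following is a sketch under the assumption that the hazard depends only on $\theta_h$ and the UPM only on $\theta_m$; it is a worked consequence of the recursion, not a formula quoted from the slides.

```latex
\begin{align*}
\frac{\partial \gamma_t}{\partial \theta_h}
  &= \sum_{r_{t-1}} p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)
     \left[ \frac{\partial p(r_t \mid r_{t-1}, \theta_h)}{\partial \theta_h}\, \gamma_{t-1}
          + p(r_t \mid r_{t-1}, \theta_h)\, \frac{\partial \gamma_{t-1}}{\partial \theta_h} \right], \\
\frac{\partial \gamma_t}{\partial \theta_m}
  &= \sum_{r_{t-1}} p(r_t \mid r_{t-1}, \theta_h)
     \left[ \frac{\partial p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)}{\partial \theta_m}\, \gamma_{t-1}
          + p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)\, \frac{\partial \gamma_{t-1}}{\partial \theta_m} \right], \\
\frac{\partial}{\partial \theta} \log p(x_{1:T} \mid \theta)
  &= \frac{\sum_{r_T} \partial \gamma_T / \partial \theta}{\sum_{r_T} \gamma_T},
  \qquad \text{since } p(x_{1:T} \mid \theta) = \sum_{r_T} \gamma_T .
\end{align*}
```

Each derivative message has the same shape as $\gamma_t$, so it can be carried along in the same forward pass.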


Improvements

Pruning
  Naive implementation is $O(T^2)$
  Eliminate low-probability messages for $O(T)$

Modularity
  Any hazard function $H(t) \in [0, 1]$
  Any model that provides a posterior predictive
  Gaussian process regression, Bayesian linear regression, and kernel density estimation

Caching
  Repetitive predictions under given run length
  Use intelligent caching of $p(x_t \mid r_{t-1}, x_t^{(r)})$
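The pruning step can be illustrated with a small helper that drops low-probability run length hypotheses and renormalizes what is left. This is only a sketch: the slides do not state the pruning rule, so the fixed probability cutoff, the function name `prune`, and the `threshold` parameter are all assumptions.

```python
import numpy as np

def prune(run_length_probs, threshold=1e-4):
    """Drop run lengths whose posterior mass is below threshold (assumed rule).

    Returns the surviving run lengths and their renormalized probabilities.
    """
    probs = np.asarray(run_length_probs, dtype=float)
    keep = probs >= threshold
    keep[np.argmax(probs)] = True          # always keep the most probable run length
    kept_run_lengths = np.flatnonzero(keep)
    kept_probs = probs[keep] / probs[keep].sum()
    return kept_run_lengths, kept_probs
```

If the number of surviving hypotheses stays bounded, each time step costs a bounded amount of work, which is where the linear total cost comes from.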


Well Log Data

We used the logistic hazard, $H(t) = h\,\sigma(at + b)$, and an IID Gaussian UPM, with the aim of detecting changes in mean and variance. After learning the parameters, our method has a better predictive likelihood than Adams & MacKay 2007.
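As a concrete reading of the logistic hazard $H(t) = h\,\sigma(at + b)$, here is a minimal Python sketch. It assumes $\sigma$ is the standard logistic sigmoid and that $h \in [0, 1]$ so that the hazard stays a valid probability. The function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logistic_hazard(t, h, a, b):
    """H(t) = h * sigmoid(a * t + b); h is assumed to lie in [0, 1]."""
    return h * sigmoid(a * t + b)
```

With `a = 0` the hazard does not depend on `t` and reduces to the constant `h * sigmoid(b)`, so a constant hazard is a special case of this parameterization.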

Figure: The BOCPD run length distribution on the well log data. Panels: NMR and Run Length, both plotted against Measurements (0 to 4000).


Industry Portfolios

Tried the “30 industry portfolios” data set (from the Ken French repository). Change points found coincide with significant events: the climax of the Internet bubble, the burst of the Internet bubble, and the 2004 presidential election.

Figure: The BOCPD run length distribution between 1998 and 2008. Axes: Run Length (trading days) against Date (years). Annotated events: Dot-com bubble burst; September 11; Asia crisis, Dot-com bubble; US presidential election; Major rate cut; Northern Rock bank run; Lehman collapse.


Results

Table: A summary comparing the negative log predictive likelihoods (NLL) (nats/observation) on test data. We also include the 95% error bars on the NLL and the p-value that the joint model/learned hypers has a higher NLL using a one-sided t-test.

Method            NLL      error bars   p-value

Well Log
TIM               1.53     0.0449       <1e-10
fixed hypers      0.313    0.0267       6e-04
learned hypers    0.247    0.0293       NA

Industry Portfolios
TIM               42.6     0.246        <1e-10
indep.            39.64    0.217        0.271
joint             39.54    0.213        NA
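For readers who want to reproduce this kind of summary on their own data, the sketch below computes a mean NLL in nats per observation together with a 95% error bar. The slides do not say how the error bars were computed, so the normal-approximation half width of 1.96 standard errors is an assumption, as is the function name `nll_summary`.

```python
import math

def nll_summary(log_pred_densities):
    """Mean NLL (nats/observation) and an assumed 95% half width (1.96 * SE).

    log_pred_densities holds log p(x_t | x_{1:t-1}) for each test observation.
    """
    n = len(log_pred_densities)
    nll = [-lp for lp in log_pred_densities]
    mean = sum(nll) / n
    # Unbiased sample variance of the per-observation NLL
    var = sum((v - mean) ** 2 for v in nll) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width
```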


Summary

Extended work of Adams and MacKay 2007
Made more general through hyperparameter learning
Increases predictive performance on real-world datasets
Extended modularity to non-trivial UPMs
Improved efficiency using pruning and caching

