Bayesian Hidden Markov Models for UAV-Enabled ...

Viewer
Transcript

Bayesian Hidden Markov Models for UAV-Enabled Target Localization on Road Networks with Soft-Hard Data Nisar Ahmeda , David Casbeerb , Yongcan Caob , and Derek Kingstonb a Aerospace b Air

Engineering Sciences, University of Colorado Boulder, Boulder, CO, USA; Force Research Laboratory, Wright-Patterson AFB, Dayton, OH, USA ABSTRACT

This work addresses the problem of localizing a mobile intruder on a road network with a small UAV through fusion of event-based ‘hard data’ collected from a network of unattended ground sensors (UGS) and ‘soft data’ provided by human dismount operators (HDOs) whose statistical characteristics may be unknown. Current approaches to road network intruder detection/tracking have two key limitations: predictions become computationally expensive with highly uncertain target motions and sparse data, and they cannot easily accommodate fusion with uncertain sensor models. This work shows that these issues can be addressed in a practical and theoretically sound way using hidden Markov models (HMMs) within a comprehensive Bayesian framework. A formal procedure is derived for automatically generating sparse Markov chain approximations for target state dynamics based on standard motion assumptions. This leads to efficient online implementation via fast sparse matrix operations for non-Gaussian localization aboard small UAV platforms, and also leads to useful statistical insights about stochastic target dynamics that could be exploited by autonomous UAV guidance and control laws. The computational efficiency of the HMM can be leveraged in Rao-Blackwellized sampling schemes to address the problem of simultaneously fusing and characterizing uncertain HDO soft sensor data via hierarchical Bayesian estimation. Simulation results are provided to demonstrate the proposed approach. Keywords: Target localization, hidden Markov models, road network surveillance, UAVs, UGS, soft/hard data fusion, Bayesian estimation, human dismount operators

1. INTRODUCTION Unmanned aerial vehicles (UAVs) and unattended ground sensors (UGSs) have become vital tools in global efforts to enhance situational awareness and tactical superiority in a variety of applications. Large-scale integrated surveillance, target acquisition, and reconnaissance (ISTAR) missions form a particularly important family of applications for networked UAVs and UGSs, in which human supervisors must be provided with timely information about adversarial activities, uncertain environmental conditions, and friendly asset status to support coordinated decision-making in complex dynamic environments. With expanded deployment of autonomous UAV and UGS networks, researchers must tackle the inherent performance limitations of such systems that arise from physical and tactical constraints in large-scale settings. To help overcome these limitations and improve robustness, there is much interest in developing formal strategies to enhance and extend the reasoning of autonomous systems through combined exploitation of formal mathematical insights about uncertain dynamical systems and the natural planning/perception abilities of human assets. Recent work has focused on cooperative control and sensing strategies for the UAV-UGS road reconnaissance (U2R2) problem,1 in which a network of autonomous UAVs and stationary UGSs must cooperatively detect a potentially hostile intruder moving on a known road network and deliver images of the localized intruder to a remote base station. This work examines two particular problems of broad interest: 1. how can stochastic target dynamics be modeled in a computationally efficient and compact manner for onboard predictive planning and control by an autonomous UAV with constrained computing capabilities? Further author information: Send correspondence to N.A. and D.C: E-mail: N.A.: [email protected]; D.C.: [email protected]; Y.C.: [email protected]; D.K.: [email protected]

Figure 1.

U2R2 problem sketch: the target intruder vehicle must be detected, localized, and imaged along a known road network as quickly

and precisely as possible by a UAV that gathers and fuses data from UGS nodes and in situ HDOs (who can report information through mobile computer interfaces).

2. how can ‘hard data’ provided by UGS be correctly fused with ‘soft data’ reports provided by in situ human dismount operators (HDOs), especially if the statistical ‘human sensor’ characteristics are unknown a priori (i.e. if HDO reports have uncertain true detection/false alarm rates)? This work shows that dynamic Bayesian estimation algorithms based on hidden Markov models (HMMs) provide a highly practical and theoretically sound framework to address both issues. The main elements of the proposed HMM estimation framework are summarized here in the context of the U2R2 problem, although the proposed approach can be easily extended to other related applications. A formal procedure is presented for automatically generating HMMs from standard descriptions of stochastic target behavior on road networks. It is shown that the highly structured form of the resulting HMMs enables the use of computationally efficient sparse matrix operations for fast multi-step target state prediction as well as filtering, smoothing and MAP estimation with multiple UGS and HDOs whose true detection/false alarm rates are known. These characteristics make HMM models well-suited for constrained real-time predictive planning and data fusion aboard autonomous UAVs. It is then shown that the efficiency of the HMM estimation framework can be leveraged in Rao-Blackwellized sampling strategies for simultaneous Bayesian target state estimation and uncertain soft sensor model parameter learning with prior information. Numerical simulations are provided to illustrate the proposed methods.

2. BACKGROUND AND RELATED WORK 2.1 Problem Setup Figure 1 shows the general setup of the U2R2 problem. A small autonomous patrol UAV is equipped with a visual sensor to send snapshot images back to the base station for analysis, but cannot automatically track or identify intruders due to onboard processing limitations. To reduce bandwidth and processing loads for image analysis, the UAV only collects images with a high statistical likelihood of containing the intruder. The UAV gathers data for inferring the intruder’s likely position via wireless communication with stationary UGSs deployed at various known strategic points of the road network. These concealed battery-powered UGSs can monitor a variety of signals (seismic, Doppler, magnetic, infrared), but only locally store data over short time windows following trigger events. Since the UGSs have restricted communication ranges and lines of sight, they are unable to send data directly to each other or to the base station. The UAV is therefore an airborne data ferry (for camera data) and fusion center (for UGS data) that can only receive new data from UGSs within communication range. The U2R2 problem has four operational phases that are handled autonomously by the UAV/UGS system.1 In the patrol phase, the UAV repeatedly visits and interrogate UGS nodes to gather evidence for possible intruder detections. Upon confirmed detection, the UAV moves into the isolation phase to localize and capture informative images of the intruder along the road network. In the delivery and response phases, the UAV send data back to the base station for further analysis, intruder confirmation and response instructions. It is assumed here that the UGS network has detected a single target and that the UAV is in the patrol phase.

The UAV planning stage generates ‘traveling salesman’ flight paths to visit each UGS and collect enough useful data to confidently localize and image the intruder. A recursive Bayes filter can be used to fuse UGS data and construct a statistical estimate of the intruder’s position on the road network at any given time.2 The Bayes filter uses knowledge of the road network and a generic model of intruder motion dynamics to update the probability distribution function (pdf) of the intruder state along feasible directions of travel. The intruder’s state pdf and most likely location along the road network can then be updated with new ‘detection’/‘no detection’ information from the UGSs interrogated by the UAV. The updated target state pdf can be used by an intelligent path planning algorithm to dynamically steer the UAV toward areas that maximize the probability of either imaging the target and/or receiving useful UGS data to maintain low target state uncertainty. Since the UAV cannot process visual information for target localization, it must make frequent visits to the UGSs, which cover only a limited portion of the road network and are also susceptible to missed detections as well as frequent false alarms.1 The target location pdf may cover a majority of the road network well before the UAV enters the isolation phase, making fast and reliable autonomous target localization and imaging very challenging. The system is also unable to adapt to environmental changes that impact predictions made by the intruder dynamics model (e.g. weather, road conditions), and neither the sensors nor the onboard Bayesian model accounts for characteristics (vehicle type, size, motion patterns, etc.) that could help refine knowledge of intruder intent. It is therefore natural to consider how readily available ‘soft’ information from ‘human sensors’, such as in situ HDOs on patrol, can be used to opportunistically fill in these ‘perception gaps’ to help improve autonomous target state detection and estimation performance, even if these human sensors are not statistically well-characterized or ‘calibrated’ in advance.

2.2 Related Work The problem of estimating the continuous position of a mobile intruder on a known road network given UGS and/or UAV measurements has received much attention in the target tracking literature. Virtually all proposed solutions come in the form of either parametric Kalman filters (KFs) or variants thereof (e.g. the IMM KF3 ), or nonparametric Bayes filter approximations, such as the particle filter4 or histogram filter.2 Although parametric KFs lead to simple and highly efficient recursive updates for target state estimation, they are limited to describing linear/linearized target state dynamics with Gaussian process uncertainties. As such, they are inappropriate for applications characterized by infrequent target measurements or large open road networks with multiple intersections/turns, since these quickly lead to highly multimodal non-Gaussian target state pdf predictions. Nonparametric particle or histogram filters are far more flexible and can approximate arbitrarily complex pdfs, but are much more computationally expensive to implement for predictive planning, especially if the target pdf becomes highly uncertain/multimodal due to infrequent measurements or frequent false alarms. The problem of soft/hard data fusion has lately received increasing attention, e.g. see for instance [5–8], although Bayesian fusion of hard UGS data with statistically uncharacterized soft HDO observations has not been previously considered for road network target localization. In this work, we assume HDOs have a mobile computing interface which allows them to report time-stamped information to UAVs through wireless link; at a minimum, a human can be thought of as ‘mobile UGS’, but in general could be capable of providing varying degrees of positive/negative information for physical/low-level phenomena (e.g. target position, velocity, heading) or abstract/high-level phenomena (e.g. target characteristics, intentions, behaviors; road network conditions, UGS conditions). The techniques presented in [5, 7, 8] are the most relevant in this regard, but are not directly applicable to the U2R2 problem since they restrict attention to either simpler static target localization/classification problems or to linear-Gaussian target dynamics and human sensor models. Hidden Markov models (HMMs) of the target state dynamics and UGS/HDO measurement processes can easily overcome these limitations. As shown next, HMMs combine the advantageous predictive and computational features of compact parametric models (such as the KF) with the flexibility of grid-based models (such as the histogram filter) for representing highly non-Gaussian distributions in discrete time. To the best of the authors’ knowledge, this is the first time fully Bayesian road network target localization problems like U2R2 and soft/hard data fusion problems have been studied completely within a discrete-time discrete-space (DTDS) parametric HMM framework.

3. HIDDEN MARKOV MODELS FOR ROAD TARGET LOCALIZATION This section formally derives and applies HMMs for the isolation phase of the U2R2 problem. A formal convolution-based procedure is presented for automatically generating highly sparse parametric DTDS Markov chain models from discrete-time hybrid-space stochastic road network motion models. Several attractive analytical and computational features of the Markov chain formulation are discussed. Recursive Bayesian estimation algorithms for fusing hard/soft data are then presented for the ideal baseline case of undelayed centralized fusion with fully characterized sensors. Section 4 further develops these estimation strategies for the important special case of fusion with unknown soft sensor data quality.

3.1 Derivation of Markov chains from road network specification A hidden Markov model is generally described by two dependent discrete time stochastic processes: (i) a (first order time invariant) Markov chain over a finite number of discrete unobservable states ζ, and (ii) an observation process that produces random ‘measurement’ variables y that are conditionally dependent on the hidden state ζ.9 It is proposed to model the target’s unknown position on the road network by a Markov chain, which is characterized by a set of conditional transition probabilities between different possible discrete state realizations ζ. The Markov model can be used to efficiently update and predict the probability of the target’s future location at any future point in time starting from any initial probability distribution over ζ. Furthermore, the Markov model can be used by efficient recursive Bayesian inference algorithms to update the probability distribution over ζ given new UGS/HDO data y collected by the UAV. The underlying model of the stochastic target dynamics is first described according to a standard graph-based road network specification, followed by the procedure for automatically converting this into a time invariant Markov chain. The procedure for recursive Bayesian fusion of UGS/HDO data with this model is then described. 3.1.1 Road Network and Target State Specification The road network is specified by a graph model GR = (N , E) consisting of a finite set of nodes N and edges E. Each edge i ∈ E, i = 1, ..., Ne represents one of Ne roads. Each road i defines a single direction of travel for a single target between start and nodes ns (i), ne (i) ∈ N , according to the following 1D discrete-time stochastic kinematics model, xk+1 = xk + vk (i),

(1)

vk (i) ∼ P(vk (i)), where xk ∈ [0, 1] is the (normalized) 1D position of the target along edge i at discrete time step k = 0, 1, 2, ... and vk (i) is the (normalized) discrete-time random discrete-time velocity drawn from probability distribution P(vk (i)). For convenience and without loss of generality, assume P(vk (i)) = U(vmin (i), vmax (i)), i.e. a uniform distribution defined by upper and lower bound velocities 0 ≤ vmin (i) ≤ vmax (i)∗ . Given that the target is on road i, this model encodes the assumption that the target moves in a single direction with an unknown but bounded velocity, i.e. the target cannot perform a U-turn in the middle of the road. To model the target’s possible motion along the same road in the other direction (i.e. after possibly performing a U-turn at the end node ne (i)), another parallel ‘reverse’ edge er can be added with the same velocity bounds but with the implied direction of travel reversed (i.e. with the start and end nodes reversed). As such, the discrete-time target state is fully described by the hybrid state vector ξ(k) = [ek , xk ]T , where the discrete state ek ∈ E indicates which road the target is on at time k and xk ∈ [0, 1] indicates how far along road ek the target is at time k. All edges are connected by intersection nodes nh ∈ N , h = 1, ..., Nn , which define the discrete-time conditional probabilities of: (i) transitioning from the end of one road edge to the start of a new road edge, i.e. P (ek+1 = j|ek = i, xk = 1) (where j 6= i); (ii) performing a U-turn along the same road edge, i.e. P (ek+1 = i|ek = i, xk = 1); or (iii) of terminating at the node and ‘escaping’/‘exiting’ the road network, i.e. P (ek+1 = >|ek = i, xk = 1), where ek+1 = > denotes the terminal escape/exit state for which P (ek+1 = >|ek = >) = 1 and P P(ek+1 = i|ek = >) = 0 for all i 6= >. Note that, if E(nh ) denotes the set of edges connected to node nh , then (j)∈E(nh ) P (ek+1 = j|ek = i, x = 1) = 1 for all i ∈ E(nh ). ∗

other 1D velocity distributions such as truncated normals, exponentials, etc. can also be used

3.1.2 Markov chain model of target location Given an initial state probability P (ξ0 ) = P ([e0 , x0 ]T ), a state estimator is sought to: (i) predict the possible target states at future discrete time steps k = 1, 2, ...,; and (ii) update beliefs about the current target state at time T given new sensor data Y1:T = {y1 , ..., yT } for time steps k = 1 : T . Requirement (i) amounts to calculating the future state marginal distribution via the Chapman-Kolmogorov equation p(ξk ) = p([ek , xk ]T ) =

Z

Z ...

ξk−1

ξk−2

Z Y k

p(ξt |ξt−1 )p(ξ0 )dξ0 ... dξk−2 dξk−1 ,

(2)

ξ0 t=1

where p(ξt |ξt−1 ) represents the discrete-time transition probability kernel from state ξt−1 to state ξt for all R ξt ∈ E × [0, 1] and the abused operator ξt (·)dξt informally represents marginalization over the hybrid state (i.e. summation over et and integration over xt ). Requirement (ii) amounts to recursively calculating the Bayes’ posterior distribution at time k, p(ξk |Y1:k ) ∝ p(ξk |Y1:k−1 )p(Yk |ξk )

(3)

where p(Yk |ξk ) represents the conditional likelihood of data Yk given ξk and p(ξk |Y1:k−1 ) represents the posterior from time k − 1 predicted forward one time step via eq. (2). Note that (2) is in general analytically intractable if the state is described by the hybrid distribution function p(ξk ) = P (ek )p(xk |ek ), where the road position xk is described by conditional probability density function p(xk |ek ) and P (ek ) is a probability mass function. Indeed, (2) is generally an exponentially growing sum of complicated truncated probability density functions, since each transition kernel must account for the possibility that ek = i 6= > switches to ek = j 6= i any time the probability of xk = 1 is strictly greater than zero (i.e. whenever it is possible the target has reached an intersection node). This issue can be addressed through spatial discretization of the position state xk along each edge. PNe Suppose edge i is discretized into bi cells, so that the total number of cells in the road network is ¯b = i=1 bi . This allows each possible discrete edge-road pair of the discretized state vector ξt = [et , xt ]T to be treated as a unique discrete coordinate point along the road network, so that ξk = [ek , xk ]T 7→ ζk0 ∈ 1, 2, ..., ¯b . (4) Suppose also that each intersection node nh is counted as a separate discrete position cell; if the set of possible discretized edge-road position states ζk0 is augmented to include nodes as temporary road network ‘position’ states located between all possible edge transitions, then the total number of discrete spatial coordinates (i.e. cells) in the network is d = Nn + ¯b, so that the full discretized state transformation becomes ζk ∈ {1, 2, ..., d}. Under this discretization scheme, eqs. (2) and (3) become P (ζk ) =

X X ζk−1 ζk−2

...

k XY

P (ζt |ζt−1 )P (ζt−1 )

(5)

ζ1 t=1

P (ζk |Y1:k ) ∝ P (ζk |Y1:k−1 )P (Yk |ζk ),

(6)

where all intractable integrals have now been replaced by a recursive set of tractable summations over discrete state cell realizations. The state abstraction ξ 7→ ζ thus permits us to restrict attention to discrete state transition probabilities in discrete time. This is naturally formulated as a Markov chain, which is specified by an initial probability distribution P (ζ0 ) and the transition kernel P (ζt |ζt−1 ) for t = 1, 2, ..., k.9 While P (ζ0 ) is easily derived by integrating p(ξ0 ), it is less obvious how to obtain P (ζt |ζt−1 ) from the node transition probabilities and the edge-dependent stochastic position transition kernels p(ξt |ξt−1 ). Indeed, d2 conditional transition probabilities P (ζt |ζt−1 ) must be specified to describe how the target can move from any one of d states ζk at time k to any one of d states ζk+1 at time k + 1. This issue is resolved by noting that the following conditional independence relationship holds from the Markov assumption on the target position dynamics, p(ξt |ξt−1 ) = p([et , xt ]T |[et−1 , xt−1 ]T ) = P (et |et−1 )p(xt |xt−1 , et−1 ).

(7)

(a)

Figure 2.

(b)

(a) Set up for convolution integral (9) for computing joint probability that target occupies cell cj at time k and cell ci at time

k + 1 while remaining on the same road edge; (b) function integrated in convolution to obtain cell-to-cell state transition probability aij along a given road edge

The conditional probabilities P (et |et−1 ) are already defined for transitions from the end of an edge to another edge via an intersection node. For transitions from one edge i onto itself prior to reaching the end of edge i, P (et = i|et−1 = i) = 1. Hence, it remains to define cell-to-cell transitions p(xt |xt−1 , et−1 = i) for two cases. If the target transitions from an intersection node onto a new edge, this can be defined such that the target always lands on the first cell of the new edge at the next time step (with probability equal to the probability of turning onto that new edge); otherwise, if et = et−1 , then p(xt |xt−1 , et−1 ) should be derived directly through integration of the stochastic dynamics model along the road edge, eq. (1). In this case, prior to discretization of xt−1 , it follows that (for discretization interval ∆T between time step k and k − 1) p(xt |xt−1 , et−1 ) = U(xt−1 + vmin (et−1 )∆T, xt−1 + vmax (et−1 )∆T ) = U(xk+1,min , xk+1,max ),

(8)

i.e. the predicted position at xt is uniformly distributed given the precise position xt−1 along edge et−1 = et . Now, assuming that the grid used to discretized the position xt and xt−1 along the edge is uniform and constant in time, and assuming that P (xt−1 ∈ cj |et−1 ) can be well-approximated by a uniform distribution (i.e. by ‘smearing out’ the pdf p(xt−1 |et−1 ) within cell cj ), then it is straightforward to show that the discretized transition probability P (ζt = cl |ζt−1 = cj ) corresponding to P (xt ∈ cl |xt−1 ∈ cj , et−1 ) is given by Z P (ζt = cl |ζt−1 = cj ) ∝ P (ζt = cl , ζt−1 = cj ) =

c+ j

c− j

+ U(c− j , cj )

Z

c+ l

c− l

U(xk+1,min , xk+1,max )dxk+1 dxk

(9)

+ where cl and cj are the discretized cells for xt and xt−1 along the edge i = et = et−1 (where c− l and cl respectively correspond to the lower and upper bound x values of cell cl ). As shown in Fig. 2(a), this expression corresponds to a convolution integral representing the 1-step probability transition function for a point mass starting anywhere in cell cj transitioning to any point in cell cl along edge i. Applying this calculation for all possible cell transitions from cj at time k − 1 to cl at time k, we obtain the desired set of discretized transition probabilities p(xt |xt−1 , et−1 ) 7→ P (ζt |ζt−1 ) along a fixed edge. Note that P (ζt |ζt−1 ) = 0 for all ζt < ζt−1 along the same edge, since the target can only move forwards along the edge. Fig. 2(b) shows the function that is effectively integrated with respect to xt−1 in the convolution process to obtain the cell transition probability P (ζt = cl |ζt−1 = cj ) = alj along a single edge i, where the grid size along i is selected such that |cj | = |cl | = vmax (i)∆t ∀i (note that t − 1 = k and t = k + 1 in this illustration).

Combining the inter-edge transition probabilities from the node specifications with the intra-edge position transition probabilities derived via eq. (9) above, a total of d2 transition probabilities are recovered. Fortunately, the vast majority of these probabilities are zero; in fact, the only non-zero probabilities are those transitions corresponding to physically feasible target state transitions, namely : (i) between neighboring states along a given edge (given by the convolution integral above); (ii) from the end of one edge to a node (with probability 1); (iii) from a node to an edge (with a pre-specified probability of making the turn onto a new outgoing edge

(a)

(b)

(c)

(d)

Figure 3. (a) Benchmark small road network consisting of 5 one-way road edges and 4 turn intersection nodess (1 terminating); (b) sparsity pattern of STM A for 151 discretized target location states along road network in (a) (∆T = 4.167 secs, ∆c = 50 m), showing 300 non-zero entries out of 22, 801 total matrix entries; (c) benchmark large road network consisting of 31 one-way road edges and 17 turn intersection nodes (3 terminating); (d) sparsity pattern of A for 500 discretized target location states along the road network in (c) (∆T = 5.7692 secs, ∆c = 75 m), showing 994 non-zero entries out of 250, 000 total matrix entries.

or performing a U-turn back onto the old incoming edge); or (iv) to a terminating node (with a pre-specified termination probability). + − + If p(xk ) and p(xk+1 ) can be well-approximated by uniform pdfs over [c− j , cj ] and [cl , cl ] for all time k (which can be ensured by choosing a suitable grid size |cj | = |cl |, as shown in Fig. 2), then all forward state transition probability calculations for ζk can be expressed in a compact recursive form. Define πk ≡ P (ζk ) to be a d × 1 vector of probabilities over the discretized state ζ such that 0 ≤ πk (z) ≤ 1 for index variable z ∈ {1, ..., d}, where Pd πk (z) = P (ζk = z) and z=1 πk (z) = 1. Furthermore, define the d × d column stochastic matrix A to be the matrix whose (r, q)th entry defines the transition probability P (ζk+1 = r|ζk = q) = arq (such that each column Pd q must sum up to 1, i.e. r=1 arq = 1). Then, the 1-step prediction update for (5) is equal to

πk+1 = Aπk ,

(10)

where A is a highly sparse matrix. Extending this prediction update to the general form of (5), πk = Ak π0 = Φ(k, 0)π0 ,

(11)

where the general Markov state transition matrix (STM) is Φ(k, 0) ≡ Ak . Therefore, probabilistic predictions of the target location can be readily approximated via fast highly sparse d-dimensional vector-matrix operations: even if d is on the order of hundreds or thousands, the required operations for stochastic state prediction can be carried out for real-time online operation with modern linear algebra libraries. 3.1.3 Markov chain analysis of stochastic target motion dynamics Figure 3 (a) and (c) shows two benchmark road networks used to study the U2R2 problem, which each encode velocity distributions and intersection turn probabilities for targets along edges and nodes according to (1). Figure 3 (b) and (d) show the sparsity pattern of the resulting Markov transition matrix A for this network obtained via the proposed convolution approximation: all non-zero elements are found exclusively on either the main diagonal, the first superdiagonals, or a handful of off-diagonal elements. This model clearly captures the ‘highly local’ nature of the stochastic target dynamics that stems from constrained forward motion along each edge (corresponding to the non-zero main band diagonal entries) as well as more ‘global’ aspects related to turn/termination events at nodes (corresponding to the non-zero off-diagonal elements). It is interesting to note that the nearly diagonal block matrix entries are sub-Markov chains with the so-called ‘left-to-right’ transition property, which encodes the constraint of ‘one way travel’ along a particular edge with random (forward) velocities (such Markov chains are also called ‘Bakis models’ in the signal processing literature9 ). For the road networks considered here, the mean velocities along each edge are constant, and therefore the values on these band diagonal entries consist of only a handful of unique numbers that are completely determined by the resolution of the spatial and temporal discretizations (e.g. for both examples in Fig. 3, the band diagonals for each subblock

of A contain only 2-3 unique numbers corresponding to conditional transition probabilities). These highly sparse and constrained structural properties make it quite trivial to add, subtract or modify road edges and intersection nodes in an existing road network model. Eq. (11) shows that the discretized target state probability can be predicted an arbitrary number of time steps s into the future using straightforward matrix power calculations, starting from any initial distribution p(ζk ). In particular, the conditional probability of reaching any state ζk+s = i in s steps starting from ζk = j is given by the (i, j) entry of the STM Φ(k + s, k) = Φ(s, 0) = As . This provides a simple direct way to compute target state reachability probabilities, which in turn could be used in intelligent target interception strategies for UAV planning and control. For instance, such information could be used to rule out or to minimize UAV fly-bys over portions of the road network where certain entries of As are guaranteed to fall below a certain threshold for at least s time steps in the future. In general, the fact that A is sparse and As can be quickly computed (e.g. via pre-calculated eigenvalue/eigenvector decomposition) makes the discrete Markov chain model particularly attractive for closed-loop planning/control when compared to alternative continuous non-Gaussian target state prediction models such as the IMM-KF or particle filter, which require greater effort to implement and analyze. Eq. (11) can also be used to derive the Markov chain’s fundamental matrix ,10 N=

∞ X

Qk = (I − Q)−1

(12)

k=0

where Q is the sub-matrix of A corresponding to entries for only those transitions between all non-terminating states, and the (i, j) entry of N gives the expected number of times the target visits non-terminating state i starting from non-terminating state j. Several other important quantities can be computed from N , such as the variance in the number of times the target visits state i starting from j, as well as the expected number of time steps to reach any terminating node starting from any non-terminating state j. If A is invertible, then powers of the inverse matrix A−1 can also be used for probabilistic state retrodiction (i.e. to determine where the target possibly could have arrived from). Such information leads to a concise set of statistics related to expected target behavior over the entire road network, and could thus provide additional information for designing intelligent UAV planning/control algorithms in the U2R2 isolation phase. 3.1.4 Caveats Appropriate space and time discretization parameters are needed to generate a sufficiently accurate Markov chain approximation of the target’s stochastic motion model through the road network, since the derivation above rests on the assumption that p(xk ) and p(xk+1 ) can be well-approximated by uniform distributions inside + − + the cell intervals [c− j , cj ] and [cl , cl ] along any given edge for any single time step. The state transition probabilities given by the convolution integral approximation tend towards 1 as the spatial interval increases (leading to unrealistic predictions of extremely slow target motion along edges), whereas the the probabilities tend to ‘bleed’ out of cells faster as the time discretization interval increases (leading to unrealistic predictions of fast target motion along edges). While it is possible to obtain a more accurate prediction model using very small spatial and time discretization intervals, this also leads to a significant increase in the size of d and hence A and the computation time for predictions via eq. (11) (although A itself remains sparse). These issues can be avoided by choosing an arbitrary discretization time interval ∆T and constraining spatial intervals to only be as large as the maximum expected target travel time along any given edge within a single ∆T time interval. Furthermore, the Markov chain model implies that the target’s stochastic residency time in some specific state ζ is exponentially distributed. In general, however, marginalization of highly variable target velocities from the road network motion model actually implies that the discretized space/time dynamics follows a semi-Markov process, which in turn would result in target state residency times that are distributed according to mixtures of exponentials. As such, the distributions for edge/road traversal times throughout the network would not be characterized by single mean and variance values for each edge, but rather by upper and lower bounding collections of traversal time statistics for each edge. Such processes are better described by more flexible variable duration Markov models, in which the expected traversal time through any ζ state is modeled by an additional ‘countdown’ random variable augmented to the original state vector.9 Although such models could lead to

even higher fidelity target location probability predictions at arbitrary space/time scales, they are significantly more difficult to analyze and implement than the regular time invariant Markov chains derived here (which are adequate and sufficiently accurate for the relatively small velocity variances considered), and are thus not considered further. However, more sophisticated models such as these will be investigated in future work. 3.1.5 Pure prediction accuracy of the Markov model vs. Monte Carlo particle simulation The state transition probabilities and predicted target state probabilities obtained via the Markov chain approximation were validated against ‘ground truth’ Monte Carlo particle simulations for both the small and large road networks shown in Fig. 3. Figure 4 (a)-(d) and (i)-(l) show ground truth results for the predicted target state probabilities obtained with 10,000 particles at two different future time steps for the small and large networks starting from the initial belief state π0 (1) = 1 and π0 (2 : d) = 0. The ground truth distributions show the relative frequency of obtaining continuous state particles in the discretized cell states used for the HMM at the indicated time steps (using the same ∆T discretization), if all particles are propagated from the initial time and position through the continuous time stochastic motion model for each edge given by eq. (1) and edge transition probabilities specified for the intersection nodes. Fig. 4 (e)-(h) and (m)-(p) show the predicted state probabilities obtained by the HMM prediction equation (11) for the same two scenarios and time steps using the corresponding STMs shown in Fig. 3 (b) and (d). Figure 5 shows the Kullback-Leibler divergence (KLD) profile between the particle-predicted ‘ground truth’ distribution ρk and HMM-predicted approximate distributions πk for 150 consecutive forward time steps k, where the KLD for time k is given by KL[ρk ||πk ] =

d X z=1

ρk (z) log

ρk (z) . πk (z)

(13)

Since KL[ρk ||πk ] ≥ 0 and KL[ρk ||πk ] = 0 if and only if πk = ρk , the KLD provides an informal notion of ‘distance’ between the predicted distributions πk and ρk : the smaller (13) is, the harder it becomes to tell the two distributions apart.11 The KLD provides a general notion of statistical consistency for approximations to highly non-Gaussian/multi-modal distributions, since (13) grows larger when πk (z) < ρk (z) for z where ρk (z) > 0, i.e. if the approximation πk incorrectly diminishes the support of the ‘exact’ distribution ρk , thus indicating inconsistency due to excessive information loss. These results show that the predicted state distributions obtained with the computationally cheaper HMM model are in good agreement with the predictions made by the more computationally expensive Monte Carlo sampling procedure; indeed, only minor discrepencies can be found between the two sets of predictions in each case for areas with low probability mass. As Fig. 4 shows, these discrepencies can be attributed to the fact that a small amount of probability mass tends to ‘bleed out’ from the front and back tails of the predicted HMM distributions, although the bulk of the probability mass still tends to concentrate around the actual support of the ‘true’ predicted state distribution. These tail errors are a direct consequence of the key assumption used to derive the recursive Markov chain prediction update, i.e. that p(xk ) and p(xk+1 ) are always approximately uniform over each discrete cell, since this causes small portions of ‘redistributed’ probability mass to either repeatedely lag slightly behind or jump slightly ahead of the true support for the predicted distribution. Nevertheless, as shown in Fig. 5, this leads to relatively small and bounded errors between HMM and Monte Carlo results, even as the target passes through several highly connected intersections to yield highly multi-modal state distributions.

3.2 Fusing UGS and HDO Data for Bayesian Target Localization The observation process for the HMM is parametrically modeled by a set of conditional probabilities that relate measurements yk obtained from UGS and HDO assets to the true discretized target state ζk , i.e. P (yku |ζk , z u ) and P (ykh |ζk , z h ), where yku is an observation from UGS u at location z u in the road network and ykh is an observation from human h at location z h . For the initial investigation here, it is assumed that yku and ykh correspond to binary ‘detection’/‘no detection’ data at known fixed geographical points along the road network, i.e. UGS and HDO positions are assumed to be known at each time step k, and hence each observation can be assigned to a unique cell z u or z h corresponding to the reporting sensor’s location at a single point along a single edge in

Particles @ step 30

Particles @ step 60

3

1000

Particles @ step 90

3

1000

Particles @ step 120

900

900

900

800

800

800

800

700

700

700

700

600

600

600

3

500

3

500

2

2

600

3

500

400

400

300

300

200

200

100

100

400

3

1000

3

1000

900

2

3

500

2

400

300

300

200

200

100

100

4

4

0

4

0 -500

4

10

1

500

1000

2

1500

2000

1

1

-500

0

500

(a)

2

1000

1500

2000

4

4

0

1

1

-500

0

500

(b)

HMM @ step, 30 1000

3

800

600

600

4

0 -500

4

10

p(E3) = 0

1

500

1000

2

1500

2000

HMM @ step, 120

3

1000

3

900

900

800

800 p(E2) = 0.18071

p(E3) = 0.31929

p(E2) = 0.45946

700 p(E1) = 1

2000

(d)

1000

900

700

1500

HMM @ step, 90

3

1000

800

1000

(c)

HMM @ step, 60

900

2

p(E3) = 0.49718

700

700

600

600

p(E2) = 1.0726e-07

p(E2) = 0

3

p(E1001) = 0

500

3

500

2

2

3

500

2

400

3

500

p(E3) = 0

400

2

400

400

p(E4) = 0

300

p(Exit) = 0

300

300

300

p(E4) = 0.24356

200 p(E4) = 0

100

4

4

10

1

500

1000

2

1500

2000

1

1

-500

0

500

(e)

0

16 1

9

11

2000

4

1000

2000

3000

5

8

14

2

9

1

9 21

4000

22 5000

6000

1 2

17 5 3

7

6

4

1000

2000

5

14

2 3

0

8

3000

16 1

1 2

6

4

18

7

5

2000

3000

8

0

14

2 9 3 21

1000

23 500

-500

0

11 1000 4

9 21

4000

22 5000

4000

1 2

5000

6000

20 7000

6000

20

(m)

Figure 4.

7 10

11

1

1 2

15

10 6

1000

4

19

13

1000

23

7

5

2000

8

0

14

2

17 5 3

7

6

4

2000

5

8

0

14

2

3000

3000

11

1513 8 1212

9

9 21

22

16

5000

6000

4000

5000

6000

7000

2000

10 6

16 1

19

13 1 2

17 5 3

6

4

20

3

23

7

5

9 21

22

16

17

7000

0

1000

2000

3000

8

14

2

-500

4

4000

5000

6000

20

17

7000

(l)

16 1

9

HMM @ step, 120

15

10 6

1000

19

4

23

7

5

500

17 5 3

6

4

14

2

(n)

2000

3000

4000

16 1

1513 8 1212

9

5000

6000

17 5 3

6

4

9 3 21

4

23

7

5

22 16

17

7000

(o)

0

1000

2000

3000

4000

8

14

2

-500

20

10 6

19

13 1 2

7 10

11

18 22 16

17 1000

8

0

9 3 21 0

11

14

13 1 2

7 10

11

-500

20

7 10

11

18

4000

18 22 16

1500

1500

1513 8 14 1212

11 1000 4

2

1000

23 500

3 0

15

17 1000

500

Particles @ step 120

17

7000

10 6

6

4

9 3 21 0

0

(h)

-500

500

-500

-500

HMM @ step, 90

19

17 5 3

1

1

(k)

7 10

11

13

16 1

9

18 22 16

2000

1500

1513 8 14 1212

15

10 6

19

17 5 3

16

HMM @ step, 60

7 10

1500

14

(j)

11

13

9

18 16

1500

9

1000

23

0

17

7000

1500

1513 8 14 1212

500

p(E1001) = 0.0157

4

4

1500

1513 8 11 1000 14 1212 500

HMM @ step, 30

11 1000

0

15

10 6 4

-500

20

p(E1) = 0.24356

Particles @ step 90

19

18 16

7 10

11

13

16

(i)

15

-500

2

0

(g)

23

0

3 0

7

6

4

17 5 3

1

1

1500

1513 8 11 1000 14 1212 15

10 6

500

18

0

1500

4

4

Particles @ step 60

19

13 1 2

7 10

-500

500

1000

1500

1513 8 11 1000 14 1212 15

500

2

0

100

p(E1001) = 0.49993

(f)

Particles @ step 30 1500

p(Exit) = 0.05559

p(E1) = 3.3219e-05

p(Exit) = 0

100

p(E1001) = 0.45946

4

4

0

p(E4) = 3.3218e-05

p(E1) = 0.081078

p(Exit) = 0

100 0 -500

200

200

200

5000

6000

20

17

7000

(p)

(best viewed in color) Predicted target state probabilities for discretized road networks for 4 different forward time steps k =

30, 60, 90, 120 (from left to right): (a)-(d) Monte Carlo predictions for small road network; (e)-(h) HMM predictions for small road network; (i)-(l) Monte Carlo predictions for large road network; (m)-(p) HMM predictions for large road network. Darker/lighter tiles indicate higher/lower probability that target will be in particular cell.

the road network. The required probability parameters for P (yku |ζk , z u ) and P (ykh |ζk , z h ) correspond to the ‘true detection’ and ‘false alarm’ rates for each UGS u and HDO h, e.g. for each UGS (and likewise for each HDO): P (yku = ‘detection’|ζk , z u ) = au if ζk = z u (‘true detection’) P (yku

u

u

= ‘detection’|ζk , z ) = b if ζk 6= z

u

(‘false alarm’).

(14) (15)

Additional data could also be extracted from UGS and HDO measurements, e.g. target velocity and type, and these can easily be incorporated into the HMM given above (or extensions thereof) by specifying suitable conditional probability distributions. Mobile HDO observations can also be considered if z h is not fixed and

KL Divergence Between PF and HMM Predicted Probabilities 2 Small Road Network Large Road Network 1.8

KL Divergence (nats)

1.6

1.4

1.2

1

0.8

0.6

0.4 0

50

100

150

Time Step

Figure 5.

KL divergences between ground-truth particle predicted and approximate HMM predicted target state distributions for small

and large road networks over 150 consecutive prediction time steps, showing that HMM predictions are generally consistent with reasonably bounded and stable long-term prediction errors.

possibly uncertain. Such variations are ignored here for simplicity, but will be explored in future work. 3.2.1 The forward-backward algorithm for recursive state filtering/smoothing Assuming (au , bu ) and (ah , bh ) are known for each UGS u and HDO h, all available ykh and yku at each time step k can be fused together to recursively update prior probabilistic predictions of the discretized target state ζk via Bayes’ rule, u h u h P (ζk |Y1:k , Y1:k ) ∝ P (ζk |Y1:k−1 , Y1:k−1 )P (yku |ζk )P (ykh |ζk ),

(16)

in accordance with the Bayes’ filtering eqs. (5)-(6). The required prediction-update filtering calculations correspond to the forward step of the well-known and highly efficient HMM ‘forward-backward’ algorithm,11 which is in fact the mathematical DTDS analog of the Kalman filter-smoother (or Rauch-Tung-Striebel smoother) algorithm for discrete time continuous state space models.11 Fig. 6 shows example HMM filtering results for UGS and HDO data generated by a simulated target on a simple road network. In this case, data is processed from all UGS at every time step simultaneously, and the probability that the target is located on a given road at any given time step is shown on the side of each plot as the sum of probabilities for all ζk along the corresponding edge (here, 1 time step is 4.6 secs). These results show that the target probability distribution maintained by the HMM Bayes’ filter provides a reasonable probabilistic representation of the target state ζk at each time step k using a relatively sparse set of measurements over an extended period of time, as long as the (au , bu ) and (ah , bh ) do not lead to an excessive number of false alarms or missed detections. 3.2.2 The Viterbi algorithm for maximum a posteriori (MAP) trajectory estimation One notable drawback of the HMM Bayes’ filter is that it only returns the marginal posterior distribution for each ζk individually, i.e. it does not return a joint posterior distribution over the entire set of complete target trajectories from k = 1 : T . Indeed, the number of possible trajectories becomes exponentially large over time, so this joint posterior cannot tractably represented anyway. However, it is possible to modify the HMM data fusion procedure to estimate the single most likely trajectory taken by the target. In other words, rather than M AP computing the full joint posterior over all ζ1:k , the maximum a posteriori (MAP) joint sequence of states ζ1:T u h given Y1:T and Y1:T can be identified, where M AP u h ζ1:T = arg max P (ζ1:T |Y1:T , Y1:T ) ∝ arg max P (ζ0 )

T Y k=1

P (ζk |ζk−1 )

Nu Y u=1

P (yku |ζk , au , bu )

Nh Y

P (ykh |ζk , ah , bh ).

h=1

(17)

Figure 6.

(a)

(b)

(c)

(d)

(e)

(f)

Truth model simulation results for HMM Bayes’ filter for small benchmark road network model consisting of 5 edges (1 two-

way road), 4 nodes (1 terminating), 9 UGS and 1 HDO (shown as green dots, indicating ‘no detection’ at each location; dots turn red for single time step when detection event is registered).

Top row: true target position shown as single red square tile; bottom row:

HMM updated target location distribution (white/red tiles = low/high probability locations). UGS and HDO parameters are all set to (au , bu ) = (ah , bh ) = (0.99, 0.025) in all cases.

(a)

Figure 7.

(b)

Cell normalized MAP geographic 2D grid cell coordinate errors under different UGS specifications for road network truth

simulation shown in Fig. 6 (error values given in # of cells, where 1 cell = 50 m) : (a) nominal UGS/HDO specs (au = 0.99, bu = 0.025), showing x and y geolocalization errors to be within ±250 m (±5 cells) ; (b) noisy UGS/HDO specs (au = 0.99, bu = 0.10), showing error ranges ∼ 4-6 times as large.

The well-known Viterbi algorithm provides a computationally efficient procedure for computing this MAP sequence via dynamic programming.11 Fig. 7 shows example Viterbi MAP target trajectory estimation results for a simulated target on the same road network from Fig. 6. These results show that the MAP-estimated target trajectory can be quite good for nominal UGS/HDO specs (i.e. low au , high bu ), but can be quite unreliable for noisier UGS/HDO data. Unlike the HMM Bayes filter, the Viterbi MAP procedure is unable to provide a measure of the uncertainty M AP around the point trajectory estimate ζ1:T , and thus should be used/interpreted with caution when dealing with sparse/noisy information sets. Nevertheless, the MAP trajectory estimate can be a useful ‘what-if’ analysis tool for intelligent target interception planning, i.e. if it is believed that the target is attempting to move between two

specific nodes of interest on the road network over a certain period of time, then the MAP trajectory estimate could be used to determine the most likely route taken (or to be taken) by the target given the current sequence of available observations. 3.2.3 Fusing delayed and out-of-order UGS/HDO data in the HMM So far, it has been assumed that all ‘available’ UGS/HDO measurements have been already collected and stored aboard the UAV for ‘on time in order’ processing with either the forward-backward or Viterbi algorithms. In practice, however, UGS and HDO data will be gathered out of order and with possibly significant delays (e.g. data from an UGS may have been recorded earlier than a report made by an HDO, but was only collected after the HDO’s data was transmitted to the UAV). Such cases can be handled in two general ways. If delayed/outof-order measurements picked up by the UAV can be ignored if they are very out of date or uninformative. Otherwise, both the foward-backward and Viterbi algorithms can be modified to quickly and repeatedly ‘scrub through’ a length w window of target states ζk:k−w so that past and current target state beliefs can be properly updated with new information from time-delayed measurements. Such repeated scrubbing with delayed data can be very valuable for removing old unlikely target state hypotheses and/or spawning new more likely state hypotheses, especially when the target state distribution becomes highly multi-modal.

4. SOFT-HARD DATA FUSION WITH UNCERTAIN SOFT SENSOR MODELS The parameters (au , bu ) for each UGS u can generally be obtained through offline calibration procedures, and it is often feasible to assume that these are static with au = a and bu = b ∀u (within perhaps some small expected range of variation under nominal operating conditions). On the other hand, a large amount of data is often required to accurately identify parameters (ah , bh ) for even a single HDO, due to the context-sensitive nature of soft information. Furthermore, it is possible that (ah , bh ) might change over time (e.g. due to fatigue, physical stress), especially if in situ observation scenarios cannot be matched precisely under controlled experimental conditions for HDO observation calibration. In addition, unlike UGS which are permanently stationed alongside fixed points of the road network, HDO availability and location z h are not guaranteed to stay fixed, e.g. if some HDOs rotate patrol duties at a specific observation post. The technical challenge then becomes to modify the HMM-based Bayesian data fusion procedure for applications where the parameters for some or all (ah , bh ) are uncertain; this can be extended more generally to include learning of uncertain UGS parameters (au , bu ) (e.g. for damaged or malfunctioning units) and exploitation of hard/soft constraints on sensor characteristics in the form of prior probability distributions.

4.1 Maximum Likelihood Point Estimation One way to tackle this problem is to simultaneously derive a set of point estimates for (ah , bh ) that lead enable consistent calculation of the desired posterior probabilities over ζk via the HMM Bayes’ filter. This can be solved with maximum likelihood estimation techniques, which maximize the so-called ‘incomplete data’ likelihood function of all unknown HDO parameters (ah , bh ) † , u h p(Y1:T , Y1:T |ah , bh , au , bu ) ∝

X ζT

...

X ζ0

P (ζ0 )

T Y k=1

P (ζk |ζk−1 )

Nu Y u=1

P (yku |ζk , au , bu )

Nh Y

P (ykh |ζk , ah , bh ).

(18)

h=1

Direct maximization of (18) (or its logarithm) is feasible for a single set of unknown HDO parameters, but is not scalable for batch estimation with multiple unknown HDO parameters. Alternatively, the expectationmaximization (EM) algorithm can overcome these limitations by alternately performing HMM filtering and iterative parameter optimization with the observed data set.11 The EM algorithm is guaranteed to monotonically improve eq. (18) at each iteration and will always eventually converge to a local maximum in the (ah , bh ) space. However, EM can get stuck near poor local maxima, and thus requires expensive restarts or multiple intializations to obtain good estimates. EM can also perform poorly if only sparse HDO data sets are available for online calibration. In such cases, Bayesian learning with parameter priors can lead to better performance. †

where incomplete data refers to the fact that ζ1:T are unknown; were these known, then maximum likelihood estimates for (ah , bh ) would be trivial to obtain based on relative event frequencies

(a)

(b)

Figure 8. Single HDO parameter ID results for road network truth simulation shown in Fig.

(c) 6, with noisy UGS specs (au = 0.99, bu = 0.10)

and (atrue , btrue , ) = (0.85, 0.18): (a) log-likelihood surface for (ah , bh ), showing maximum likelihood estimate at (0.18, 1); (b) RBMC log h h particle weights along ah axis, indicating that all possible values for the HDO’s true detection rate ah are roughly equally probable; (c) RBMC log particle weights along bh axis, indicating high probability for the HDO’s false alarm rate in the neighborhood of bh = 0.18.

4.2 Fully Bayesian Parameter and State Estimation An alternative to maximum likelihood point estimation is hierarchical fully Bayesian estimation, in which priors over the unknown (ah , bh ) are assumed and predictions over ζk are made by either: (i) averaging over the unknown (ah , bh ) parameters with respect to their posterior joint distributions; or (ii) updating a set of evidence-weighted sample hypotheses for (ah , bh ) and maintaining a separate HMM Bayes’ filter for each. Although the calculations involved for both approaches are analytically intractable, good approximate solutions can be obtained in either case using modern Bayesian analysis techniques. Approach (i) is referred to here as marginal Bayesian estimation (MBE), whereas approach (ii) is referred to here as Rao-Blackwellized Monte Carlo (RBMC).12 MBE provides a naturally conservative scheme to mitigate the risk of assigning incorrect parameters, whereas RBMC provides a ‘survival of the fittest’ mechanism for narrowing down the posterior distribution over ζk and unknown (au , bu ). RBMC coincides with MBE if the weighted sample hypotheses are averaged together to form predictions about ζk . However, MBE can also be implemented in a variety of other more sophisticated ways than weighted sampling, e.g. MCMC Gibbs sampling or variational inference.11 However, these approaches are generally better suited to ‘offline’ fusion, since they require iteratively converge to final approximations of the desired posterior distributions. RBMC is naturally parallelizable and better suited to ‘online’ implementation, as each hypothesis sample can directly utilize the HMM Bayes’ filter to significantly improve estimation efficiency. RBMC also enables approximate MAP point estimation, i.e. if it is only desired to retain the highest weighted hypotheses. However, RBMC becomes computationally expensive and difficult to implement properly if many samples are desired for thorough exploration of the uncertain (au , bu ) parameter space. Fig. 8 shows example maximum likelihood and RBMC Bayesian parameter estimation results for a single HDO in the same example road network shown in Fig. 6, where data from the HDO and 10 UGS with known parameters are used to infer the unknown (ah , bh ) values in the presence of an unknown set of target states ζ1:T . The log-likelihood surface shown in Fig. 8 (a) was evaluated using a brute force calculation of the renormalization u h constant of the HMM Bayes filter for an entire time series of data Y1:T and Y1:T (T = 350 time steps) over a grid of (ah , bh ) points. This surface shows that the maximum likelihood estimate effectively converges to a ridge corresponding to the true bh (false alarm) parameter and all possible ah values, where a maximum is achieved for the point estimate ah = 1. The 150 sample RBMC weights shown in Fig. 6 (b) and (c) show the shapes of the log-posterior probability distributions over ah and bh are in fact directly related to the log-likelihoods obtained in Fig. 8 (a). These results indicate that bh is relatively easy to estimate in most cases, whereas ah is generally much more difficult to identify, especially when au is high and bu is low. This is not too suprising due to the fact that (true) observations of the target are rare, so negative observations from UGS and HDO u h naturally tend to dominate Y1:T and Y1:T . As such, it is easy to rule out false alarms by corroborating observed UGS/HDO measurements against each other and expected target location probabilities obtained the Markov chain. Furthermore, these results suggest that a mixed MBE-point estimation Bayesian approach should be used to estimate ζk on the basis of UGS and HDO data; in particular, the full distribution over ah should be taken into account due to its high uncertainty, whereas a point estimate of bh may actually be acceptable in practice

due to its relatively low uncertainty. A priori information about the reliability of human detection could also be used to strengthen the fusion results, e.g. if it is given that ah > 0.5 should hold for a ‘realistically acceptable’ human sensor.

5. CONCLUSIONS AND FUTURE WORK This work showed that the issues of efficient target state prediction, soft/hard data fusion and soft sensor parameter estimation can be addressed in a practical and theoretically sound way for UAV-enabled road target localization using hidden Markov models (HMMs) within a comprehensive Bayesian framework. A formal procedure was derived for automatically generating sparse Markov chain approximations for target state dynamics based on standard motion assumptions. The HMM model can be implemented via fast matrix operations for non-Gaussian localization aboard resource constrained UAVs, and also leads to useful statistical insights about stochastic target dynamics. The HMM’s computational efficiency can be leveraged in several ways for online data fusion, and can be leveraged to address the problem of simultaneously fusing and characterizing uncertain HDO soft sensor data via hierarchical Bayesian estimation. Future work will including integratiotn of the proposed HMM dynamics modeling approach with recently developed techniques for fusing more complex information from HDOs, e.g. using map-based sketch information.5 We will also explore algorithms for closed-loop UAV path planning that account for the HDO availability and parameter uncertainty. We will also assess different strategies for handling out-of-order/delayed data in the HMM framework and coping with data association issues for multi-target localization scenarios.

ACKNOWLEDGMENTS N. Ahmed was supported by an American Society for Engineering Education (ASEE) Air Force Summer Faculty Fellowship. Y. Cao was supported by a National Research Council (NRC) Postdoctoral Research Fellowship.

REFERENCES [1] Kingston, D., “Intruder Tracking Using UAV Teams and Ground Sensor Networks,” in [German Aviation and Aerospace Conference 2012 (DLRK 2012) ], (2012). [2] Niedfeldt, P., Kingston, D., and Beard, R., “Vehicle State Estimation within a Road Network using a Bayesian Filter,” in [American Control Conference 2011 (ACC 2011) ], 4910–4915 (2011). [3] Pannetier, B., Moras, J., Dezert, J., and Sella, G., “Study of data fusion algorithms applied to unattended ground sensor network,” in [Proceedings of SPIE: Signal Processing, Sensor/Information Fusion, and Target Recognition XXIII ], (2000). [4] Ulmke, M. and Koch, W., “Road-map assisted ground moving target tracking,” IEEE Transactions on Aerospace and Electronic Systems (4), 1264–1274 (2006). [5] Ahmed, N., Campbell, M., Casbeer, D., Cao, Y., and Kingston, D., “Fully Bayesian learning and spatial reasoning with flexible human sensor networks,” in [Proceedings of the 2015 Int’l Conf. on Cyberphysical Systems (ICCPS 2015) ], accepted, to appear, ACM/IEEE (2015). [6] Ahmed, N., Sample, E., and Campbell, M., “Bayesian multi-categorical soft data fusion for human-robot collaboration,” IEEE Transactions on Robotics 29(1), 189–206 (2013). [7] Wang, D., Kaplan, L., Le, H., and Abdelzaher, T., “On truth discovery in social sensing: A maximum likelihood estimation approach,” in [Proc. of the 11th Int’l Conf. on Information Proc. in Sensor Networks], 233–244, ACM (2012). [8] Kaupp, T., Douillard, B., Ramos, F., Makarenko, A., and Upcroft, B., “Shared Environment Representation for a Human-Robot Team Performing Information Fusion,” Journal of Field Robotics 24(11), 911–942 (2007). [9] Rabiner, L., “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE 77(2), 257–286 (1989). [10] Grinstead, C. M. and Snell, J. L., [Introduction to probability ], American Mathematical Soc. (1998). [11] Bishop, C., “Pattern Recognition and Machine Learning,” (2006). [12] Casella, G. and Robert, C. P., [Monte Carlo Statistical methods ], Springer-Verlag, New York (1999).

Bayesian Hidden Markov Models for UAV-Enabled ...

online bayesian estimation of hidden markov models ...

Hidden Markov Models - Semantic Scholar

Variational Nonparametric Bayesian Hidden Markov ...

Bayesian Variable Order Markov Models

Bayesian variable order Markov models - Frankfurt Institute for ...

101_Paper 380-Hidden Markov Models for churn prediction.pdf ...

Supertagging with Factorial Hidden Markov Models - Jason Baldridge

Discriminative Training of Hidden Markov Models by ...

Learning Hidden Markov Models Using Probabilistic ...

Supertagging with Factorial Hidden Markov Models - Jason Baldridge

Mining Trajectory Patterns Using Hidden Markov Models

Speech emotion recognition using hidden Markov models

Word Confusability --- Measuring Hidden Markov Model Similarity

Implementing a Hidden Markov Model Speech ... - CiteSeerX

variational bhattacharyya divergence for hidden markov ...

Structured Inference for Recurrent Hidden Semi-markov ...

Causal Hidden Markov Model for View Independent ...