
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 6, JUNE 2011

Hierarchical Method for Foreground Detection Using Codebook Model

Jing-Ming Guo, Senior Member, IEEE, Yun-Fu Liu, Student Member, IEEE, Chih-Hsien Hsia, Member, IEEE, Min-Hsiung Shih, and Chih-Sheng Hsu

Abstract—This paper presents a hierarchical scheme with block-based and pixel-based codebooks for foreground detection. The codebook is mainly used to compress information to achieve a high processing speed. In the block-based stage, 12 intensity values are employed to represent a block. The algorithm extends the concept of block truncation coding, and thus further improves the processing efficiency by exploiting its low-complexity advantage. Specifically, the block-based stage removes most of the background without reducing the true positive rate, yet it has low precision. To overcome this problem, the pixel-based stage is adopted to enhance the precision, which also reduces the false positive rate. Moreover, short-term information is employed to improve background updating in adaptive environments. As documented in the experimental results, the proposed algorithm provides performance superior to that of the former related approaches.

Index Terms—Background subtraction, block truncation coding, foreground detection, shadow detection, surveillance.

I. Introduction

Manuscript received December 17, 2009; revised June 23, 2010 and December 13, 2010; accepted February 5, 2011. Date of publication March 28, 2011; date of current version June 3, 2011. This work was supported by the National Science Council, Taiwan, under Contract NSC 99-2631-H-011-001. This paper was recommended by Associate Editor T. Hang. The authors are with the National Taiwan University of Science and Technology, Taipei 10607, Taiwan (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSVT.2011.2133270

IN VISUAL surveillance, background subtraction is an important technique for extracting foreground objects for further analysis, such as human motion analysis. A challenging problem for background subtraction is that backgrounds are usually non-stationary in practice, e.g., waving trees, rippling water, and changing illumination. Another difficult problem is that the foreground generally suffers from shadow interference, which leads to wrong analysis of foreground objects. Hence, the background model must be adaptively manipulated via background maintenance. Further well-known issues in background maintenance are discussed in [1]. To overcome shadows, some well-known color models or methods can be adopted, such as the RGB or HSV models, gradient information, or the ratio edge. In particular, Horprasert et al. [2] proposed to employ a statistical RGB color model to remove shadows. However, it suffers from some drawbacks, including: 1) more processing time is required to compute thresholds;

2) the non-stationary background problem cannot be solved; and 3) a fixed threshold near the origin is used, which offers less flexibility. Another RGB color model, proposed by Carmona et al. [3], solves the third problem of [2], yet it needs too many parameters. In [4] and [5], the HSV color model is employed to detect shadows. Shadows are identified by a diminution of the luminance and saturation values when the hue variation is smaller than a predefined threshold. In [6] and [7], gradient information is employed to detect shadows and achieves good results; yet multiple steps are required for removing shadows, which increases the complexity. Shoaib et al. [8] proposed a ratio-edge method to detect shadows, and geometric heuristics were used to improve the performance. However, the main problem of this scheme is its high complexity.

Most foreground detection methods are pixel-based, and one of the most popular is the mixture of Gaussians (MOG). Stauffer and Grimson [9], [10] proposed the MOG, using multiple Gaussian distributions to represent each pixel in background modeling. Its advantage is that it overcomes non-stationary backgrounds and thus provides better adaptation for background modeling. Yet it has some drawbacks. One is that if the standard deviation is too small, a pixel may easily be judged as foreground, and vice versa. Another drawback is that it cannot remove shadows, since the matching criterion simply classifies a new input pixel as background when it is within 2.5 standard deviations. Chen et al. [11] proposed a hierarchical method with the MOG; the method also employs block-based and pixel-based strategies, yet shadows cannot be removed. Martel-Brisson and Zaccarin [12] presented a novel pixel-based statistical approach to model moving cast shadows of non-uniform and intensity-varying objects.
This approach employs the MOG's learning strategy to build statistical models describing moving cast shadows, yet the model requires more time for learning. Benedek and Sziranyi [13] chose the CIE L*u*v* color space to detect foregrounds or shadows with the MOG, and texture features are employed to enhance the segmentation results. The main problem of this scheme is its low processing speed. Kim et al. [14] presented a real-time algorithm for foreground detection, which samples background pixel values and then quantizes them into codebooks (CBs). This approach improves the processing speed by compressing background information. Moreover, two features, layered modeling/detection

1051-8215/$26.00 © 2011 IEEE


and adaptive CB updating, are presented to further improve the algorithm. In [15] and [16], the concepts of Kohonen networks and self-organizing maps [17] were used to build the background model, which can automatically adapt in a self-organizing manner without prior knowledge. Patwardhan et al. [18] proposed robust foreground detection by propagating layers using maximum-likelihood assignment; pixels that share similar statistics are clustered into "layers" and modeled as a union of such nonparametric layer models. This pixel-layer manner of foreground detection requires more processing time, running at around 10 frames/s on a standard laptop computer. In our observation, classifying each pixel to represent various types of features after the background training period is a good way to build an adaptive background model, and it can overcome the non-stationary problem in background classification. Other foreground detection methods can be classified as texture-based; among these, Heikkila and Pietikainen [19] presented an efficient texture-based method using adaptive local binary pattern (LBP) histograms to model the background of each pixel. The LBP method thresholds the differences between circular neighboring pixels and the center pixel, and the results are considered as a binary number which represents the texture of a pattern.

In this paper, a hierarchical method is proposed for highly efficient foreground detection by employing block-based and pixel-based filtering to model the background. The idea of the proposed block-based strategy comes from a traditional compression scheme, block truncation coding (BTC) [20], which divides an image into multiple non-overlapped blocks, each represented only by its high-mean and low-mean.
In this paper, four intensity values are employed to represent a block in each color channel, and each pixel in a block is substituted by the high-top mean, high-bottom mean, low-top mean, or low-bottom mean of the corresponding block. The block-based background modeling is able to efficiently detect foreground without reducing the true positive (TP) rate, yet its precision is rather low. To overcome this problem, the pixel-based CB strategy is introduced to compress background information, simultaneously maintaining the high-speed advantage and enhancing the accuracy. Moreover, a color model from a former approach [3], which can distinguish shadow, highlight, background, and foreground, is modified with fewer parameters to improve efficiency. As documented in the experimental results, the proposed method can effectively solve the non-stationary background problem.

One specific problem for background subtraction is that a moving object becomes stationary foreground when it stands still for a while during the period of background construction; consequently, this object should become part of the background model. For this, short-term information is employed in background model construction.

This paper is organized as follows. Section II presents the construction of the two types of background models in the training phase, and Section III elaborates the hierarchical scheme for foreground detection. Afterward, the mechanism of the short-term information models, which updates the background models from foreground information, is introduced in Section IV. Finally, Section V documents the experimental results, and the conclusions are given in Section VI.

Fig. 1. Conceptual flowchart of the proposed algorithm.

II. Background Model Construction

In this paper, a coarse-to-fine foreground detection strategy is proposed, which involves two types of CBs, called block-based and pixel-based CBs, to filter areas of different sizes. The background modeling of the two proposed CBs is similar to the former CB [14]. Although the former CB provides high efficiency in background model updating, in our observation it still involves many redundancies. To ease this problem, the weighting concept of the MOG [9] is adopted in this paper, preserving the advantage of the CB while further speeding up the classification of foreground and background.

Fig. 1 shows the overall conceptual flowchart of the proposed method. The horizontal axis on the top denotes the frames with different time indices ($x^t$). The time axis is separated into two parts, in which the first part ($1 \le t \le T$) is for training the background models and the second part ($t > T$) is for foreground detection. Thus, the model updating algorithms are also separated into two parts for the different time zones. The training of the background models is introduced first in this section.

A. The Features Used in Block-Based Background Subtraction

Suppose a frame ($x^t$, where $t \le T$) of size $P \times Q$ is divided into multiple non-overlapped blocks of size $M \times N$, and each block is processed independently. To extract feature values that fully represent the characteristics of a block, the image compression method BTC is employed. In BTC, the high-mean and low-mean, which preserve the first- and second-moment characteristics of the block, are adopted, while in this paper more mean parameters are utilized to provide a better discrimination capability, as given below:

$$\mu = \frac{1}{M \times N}\sum_{m=1}^{M}\sum_{n=1}^{N} x_{m,n} \tag{1}$$

$$\mu_h = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid x_{m,n} \ge \mu\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } x_{m,n} \ge \mu\\0, & \text{o.w.}\end{cases}}, \qquad \mu_l = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid x_{m,n} < \mu\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } x_{m,n} < \mu\\0, & \text{o.w.}\end{cases}} \tag{2}$$

$$\mu_{ht} = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid x_{m,n} \ge \mu_h\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } x_{m,n} \ge \mu_h\\0, & \text{o.w.}\end{cases}}, \qquad \mu_{hb} = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid \mu \le x_{m,n} < \mu_h\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } \mu \le x_{m,n} < \mu_h\\0, & \text{o.w.}\end{cases}} \tag{3}$$

$$\mu_{lt} = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid \mu_l \le x_{m,n} < \mu\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } \mu_l \le x_{m,n} < \mu\\0, & \text{o.w.}\end{cases}}, \qquad \mu_{lb} = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n} \mid x_{m,n} < \mu_l\right)}{\sum_{m=1}^{M}\sum_{n=1}^{N}\begin{cases}1, & \text{if } x_{m,n} < \mu_l\\0, & \text{o.w.}\end{cases}} \tag{4}$$

Fig. 2. Algorithms of the background model training. (a) Block-based CBs. (b) Pixel-based CBs.
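To make the four-mean feature extraction of (1)–(4) concrete, the sketch below computes the block means for one color channel, including the degenerate special cases described in the following paragraph. It is an illustration under assumed NumPy conventions, not the authors' implementation.

```python
import numpy as np

def block_features(block):
    """Four BTC-style means of (1)-(4) for one color channel of a block.

    `block` is an M x N array of intensities. Illustrative sketch only.
    """
    mu = block.mean()                     # overall block mean, (1)
    high = block[block >= mu]             # pixels at or above the mean
    low = block[block < mu]               # pixels below the mean

    # Special case 1): all pixels identical -> every mean collapses to mu.
    if low.size == 0:
        return mu, mu, mu, mu

    mu_h, mu_l = high.mean(), low.mean()  # high-mean and low-mean, (2)

    ht = high[high >= mu_h]               # high-top partition, (3)
    hb = high[high < mu_h]                # high-bottom partition, (3)
    lt = low[low >= mu_l]                 # low-top partition, (4)
    lb = low[low < mu_l]                  # low-bottom partition, (4)

    # Special cases 2) and 3): degenerate partitions fall back to mu_h / mu_l.
    return (ht.mean() if ht.size else mu_h,
            hb.mean() if hb.size else mu_h,
            lt.mean() if lt.size else mu_l,
            lb.mean() if lb.size else mu_l)
```

Applying this to the R, G, and B channels of a block and concatenating the three 4-tuples yields the 12-D block descriptor v used throughout the paper.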

The variable $x_{m,n}$ denotes a pixel value in the block. As a result, four values, named the high-top mean ($\mu_{ht}$), high-bottom mean ($\mu_{hb}$), low-top mean ($\mu_{lt}$), and low-bottom mean ($\mu_{lb}$), are derived to represent a block. Some special cases apply: 1) if all $x_{m,n}$ in a block are identical, then all four values are set to $\mu$; 2) if all $x_{m,n} \ge \mu$ in a block are identical, then $\mu_{ht}$ and $\mu_{hb}$ are set to $\mu_h$; and 3) if all $x_{m,n} < \mu$ in a block are identical, then $\mu_{lt}$ and $\mu_{lb}$ are set to $\mu_l$. Since the frame is in the RGB color space and each block in each color channel is represented by the four values, a block can be represented by $v = \{\mu_{ht}^R, \mu_{hb}^R, \mu_{lt}^R, \mu_{lb}^R, \mu_{ht}^G, \mu_{hb}^G, \mu_{lt}^G, \mu_{lb}^G, \mu_{ht}^B, \mu_{hb}^B, \mu_{lt}^B, \mu_{lb}^B\}$. Compared with Chen et al.'s hierarchical method [11], in which texture information is employed to form a 48-D feature, the proposed method can effectively classify foreground and background using only 12 dimensions. Thus, the processing speed is superior to Chen et al.'s method, as will be demonstrated in Section V.

B. Updating Block-Based Background Models (CBs) in the Training Phase

During the training period, the features of a specific block can be represented as a vector $V_b = \{v_b^t \mid 1 \le t \le T\}$. A CB for a block can be represented as $C = \{c_i \mid 1 \le i \le L\}$, consisting of $L$ codewords, and is employed to describe the possible background of the block. Each codeword ($c_i$) is constructed with 12-D features as in $v$. An additional weight $w_i$ is maintained to indicate the importance of the $i$th codeword, similar to the MOG method [9], in that a codeword with a greater weight has a higher likelihood of being a

background codeword in the CB. Notably, each block of a frame (across various time slots) has an independent block-based CB with its own size $L$; thus there are in total $(P/M) \times (Q/N)$ block-based CBs in a frame.

Fig. 2(a) illustrates the algorithm of the proposed block-based background model updating in the training phase, where $T$ denotes the number of frames in the training phase, the variable $K$ is a temporary constant recording the number of current codewords, and the parameter $\alpha$ denotes the learning rate. The goal of this algorithm is to utilize multiple $c_i$ with different features to describe the entire block contents across various time slots. To judge whether a block feature $v_b^t$ has appeared before, a match function, which is used widely throughout this paper, measures the correlation with the vectors in the CB:

$$\mathrm{match}(r_{source}, r_{codeword}) = \begin{cases} \text{true}, & \text{if } \dfrac{d^{T} d}{\dim(d)} < \lambda^{2} \\ \text{false}, & \text{o.w.} \end{cases} \tag{5}$$

where $r_{source}$ and $r_{codeword}$ denote the compared vectors of arbitrary (equal) dimension, $\dim(\cdot)$ denotes the dimension of the input vector, the parameter $\lambda$ denotes the threshold determining whether the two compared vectors match, and

$$d = r_{source} - r_{codeword}. \tag{6}$$
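The match test of (5)–(6) is just an average squared component difference compared against $\lambda^2$; a minimal sketch:

```python
import numpy as np

def match(r_source, r_codeword, lam):
    """Match function of (5)-(6): mean squared per-dimension difference
    compared against lambda squared. Illustrative sketch only."""
    d = np.asarray(r_source, float) - np.asarray(r_codeword, float)  # (6)
    return float(d @ d) / d.size < lam ** 2                          # (5)
```

The same test serves both the 12-D block vectors and the 3-D pixel vectors, with a different threshold λ for each stage.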

When a codeword is matched and updated, the importance of the $i$th vector is increased by increasing the value of the weight $w_i$. The concept is straightforward: normally the background does not change as time goes by. To extract the $L$ ($L \le K$) codewords most similar to the background from the obtained pairs $(C, w) = \{(c_i, w_i) \mid 1 \le i \le K\}$, the codewords are sorted in descending order of their weights, and the $L$ codewords meeting the following equation are selected for the codebook:

$$L = \arg\min_{k}\left\{k \,\Big|\, \sum_{i=1}^{k} w_i' > \eta\right\}, \quad k \le K \tag{7}$$

where $w_i'$ denotes the weight of the sorted $c_i'$, $K$ and $L$ denote the sizes of the CB before and after the refining procedure, and the parameter $\eta$ denotes the threshold for reserving the qualified codewords. The refined CB $C' = \{c_i' \mid 1 \le i \le L\}$ will be employed for further foreground detection.
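The refinement step of (7) can be sketched as follows; it assumes the weights have been normalized to sum to one (an assumption, consistent with the threshold η = 0.7 used in Section V, not something the paper states explicitly):

```python
def refine_codebook(codewords, weights, eta):
    """Keep the top-weighted codewords per (7): sort by weight descending
    and retain the smallest prefix whose cumulative weight exceeds eta.
    A sketch of the refinement step, not the authors' code."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(codewords[i])
        total += weights[i]
        if total > eta:            # cumulative weight exceeds threshold
            break
    return kept
```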



C. Updating Pixel-Based Background Models (CBs) in the Training Phase

The algorithm for updating the pixel-based CBs is similar to the block-based updating, and the corresponding algorithm is illustrated in Fig. 2(b). The training sources of the two updating algorithms, block-based and pixel-based, are independent, as shown in Fig. 1. Notably, the unit of updating changes from a block to a pixel, which means the number of trained pixel-based CBs equals the size of a frame ($P \times Q$), not $(P/M) \times (Q/N)$ as in block-based training.

The training pixels at a specific position during the training period can be represented as a vector $X_p = \{x_p^t \mid 1 \le t \le T,\ x_p^t = (x_{R,p}^t, x_{G,p}^t, x_{B,p}^t)\}$, in which the variables $x_{R,p}^t$, $x_{G,p}^t$, and $x_{B,p}^t$ denote the pixel values in the R, G, and B channels, respectively. Moreover, the pixel-based CB ($F = \{f_i \mid 1 \le i \le K\}$), consisting of $K$ codewords, is the output of the algorithm of Fig. 2(b). The pixel-based CB is more powerful in rendering detail information than the block-based CB. Each codeword $f_i = (f_{R,i}, f_{G,i}, f_{B,i})$ is a 3-D feature representing the background pixel values in the R, G, and B channels. The parameter $\lambda$ in (5) used in pixel-based CB updating is different from that used in the block-based CB; normally, the $\lambda$ in block-based updating should be smaller than that in pixel-based updating to ensure a high TP rate. After the CB updating as indicated in Fig. 2(b), the $K$ $f_i$ are refined by (7) with the $f_i$ sorted in descending order, in which the constant $L$ is replaced by $J$, representing the number of reserved pixel-based codewords. The refined CB $F' = \{f_i' \mid 1 \le i \le J\}$ is then employed for further foreground detection, as with the block-based CB.
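The per-codeword training update of Fig. 2, which is shared in structure by both CB types, can be sketched as below. The list-based representation, the running-average blending rule, and the helper names are assumptions made for illustration, not the paper's published pseudocode:

```python
import numpy as np

def update_codebook(codebook, weights, times, x, t, lam=6.0, alpha=0.05):
    """One training step: if feature x matches an existing codeword under
    (5), blend the codeword toward x with learning rate alpha, increase its
    weight, and stamp the update time; otherwise add x as a new codeword.
    Returns the index of the matched or created codeword."""
    x = np.asarray(x, float)
    for i, c in enumerate(codebook):
        d = x - c
        if float(d @ d) / d.size < lam ** 2:          # match test of (5)
            codebook[i] = (1 - alpha) * c + alpha * x  # running average
            weights[i] += 1
            times[i] = t
            return i
    codebook.append(x.copy())                          # unseen content
    weights.append(1)
    times.append(t)
    return len(codebook) - 1
```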

III. Hierarchical Foreground Detection

After the background model training before time point $T$, as shown in Fig. 1, the constructed block-based and pixel-based CBs are obtained and applied to the proposed hierarchical foreground detection. The series of frames denoted $x^{T+1}, x^{T+2}, \ldots$ is fed into the proposed system for detecting moving objects. To reduce redundant foreground detection operations, most of the background regions are filtered out by the block-based background subtraction. In this process, the block-based and pixel-based background CBs keep updating to cope with dynamic environments. Subsequently, the coarse foreground obtained by the block-based CB is precisely refined by the pixel-based foreground detection mechanism. Finally, the detected objects (foregrounds) are further classified into three different regions: true foreground, highlight, or shadow. The details are elaborated below.

A. Foreground Detection with the Block-Based CB

In the detection phase, input frames ($x^t$) are first divided into non-overlapped blocks for block-based background subtraction. The 12-D vectors $v$ calculated from each block are employed for this process. Fig. 3(a) shows the algorithm of block-based background subtraction, where

Fig. 3. Algorithms of the foreground detection. (a) Block-based coarse background filtering and CBs updating. (b) Foreground detection and color matching.

Fig. 4. Proposed color model for foreground classification.

the variable $K$ denotes a temporary constant which stores the updated counts of the block-based codewords; $time^{c_i}$ and $time^{f_i}$ denote the times at which the $i$th block-/pixel-based codeword was last updated, and they will be employed in the short-term updating introduced in Section IV. First, the input vector ($v_b^t$) extracted from a block is compared with the $i$th block-based codeword ($c_i$) to determine whether a match is found, using the match function defined in (5). Regarding the parameter $\lambda$, although the block-based stage allows more background to be classified as foreground (false positives), it ensures a high correct-detection (true positive) rate as well. When a $v_b^t$ is classified as background, the corresponding block is also used to update the pixel-based CB. This ensures that the pixel-based CB can still adapt to the various backgrounds classified by the block-based stage. However, this strategy has the disadvantage of increasing the processing time in foreground detection. To cope with this, the parameter $D_{update}$ is utilized to gate the pixel-based updating in this algorithm; the updating function is performed only every $D_{update}$ frames.

1) Foreground Detection with the Pixel-Based CB: This subsection introduces how to classify a pixel in a block as foreground or background, as shown in Fig. 3(b). The constant $time^{f_i}$ denotes the last updated time, and will be discussed in Section IV. The input pixel $x_p^t$ with three colors (red, green, and blue) is compared in turn with the $J$ pixel-based codewords ($f_i$), in which the match function is



simply able to determine whether $x_p^t$ belongs to foreground or background. Notably, the foregrounds judged by match(·) are further classified into one true foreground and two fake foregrounds (shadow and highlight) by the proposed color model.

In Carmona et al.'s work [3], a color model was proposed which classifies a pixel into four states, shadow, highlight, background, and foreground, in the RGB color space. However, many parameters are employed in this model, which increases the computational complexity. In this paper, the number of parameters is reduced to three, namely $\theta_{color}$, $\beta$, and $\gamma$, to reduce the complexity.

Fig. 4 illustrates the proposed color model. Given two vectors $x_p^t = (x_{R,p}^t, x_{G,p}^t, x_{B,p}^t)$ and $f_i = (f_{R,i}, f_{G,i}, f_{B,i})$, the first vector denotes the input color pixel and the second denotes the $i$th codeword in the pixel-based CB of size $J$. The color model forms a cone with angle $\theta_{color}$, bounded by a high bound and a low bound as formulated below:

$$Y_{max} = \beta \|f_i\| \tag{8}$$

$$Y_{min} = \gamma \|f_i\| \tag{9}$$

where $\beta > 1$ and $\gamma < 1$; the value $\|f_i\|$ is the norm of the pixel-based codeword $f_i$, defined as

$$\|f_i\| = \sqrt{f_{R,i}^2 + f_{G,i}^2 + f_{B,i}^2}. \tag{10}$$

The bounded cone region shown in gray in Fig. 4 denotes the possible background affected by highlight or shadow; if the input pixel $x_p^t$ is beyond the gray range, as indicated by Region (III), it is classified as true foreground. First, the norm $\|x_{proj}^t\|$ is calculated for the highlight and shadow decision, where

$$x_{proj}^t = \frac{\langle x_p^t, f_i \rangle}{\|f_i\|} \hat{f}_i \tag{11}$$

and $\hat{f}_i = f_i / \|f_i\|$ denotes the unit vector of $f_i$ and

$$\langle x_p^t, f_i \rangle = x_{R,p}^t f_{R,i} + x_{G,p}^t f_{G,i} + x_{B,p}^t f_{B,i}. \tag{12}$$

Thus $\|x_{proj}^t\|$ can be calculated as

$$\|x_{proj}^t\| = \frac{\langle x_p^t, f_i \rangle}{\|f_i\|}, \quad \text{where } \|\hat{f}_i\| = 1. \tag{13}$$

If the calculated value $\|x_{proj}^t\|$ falls into Region (I) or (II) as labeled in Fig. 4, then $x_p^t$ belongs to a fake foreground, shadow or highlight, respectively. Moreover, the angle $\theta_x^t$ between the two vectors $x_p^t$ and $x_{proj}^t$ is also calculated to determine whether $x_p^t$ is a new foreground [Region (III)] or an existing background codeword $f_i$ [Region (I) $\cup$ Region (II)]:

$$\theta_x^t = \tan^{-1}\left(\frac{dist_x^t}{\|x_{proj}^t\|}\right) \tag{14}$$

where $dist_x^t$ denotes the distance between the two vectors $x_p^t$ and $x_{proj}^t$, defined as

$$dist_x^t = \sqrt{\|x_p^t\|^2 - \|x_{proj}^t\|^2} \tag{15}$$

Fig. 5. 2-D color model in red–green space. (a) CB’s color model [14]. (b) Proposed color model.

where $\|x_p^t\|$ is the norm of $x_p^t$, defined as

$$\|x_p^t\| = \sqrt{(x_{R,p}^t)^2 + (x_{G,p}^t)^2 + (x_{B,p}^t)^2}. \tag{16}$$

According to the two calculated values, $\|x_{proj}^t\|$ and $\theta_x^t$, the overall color match function based on the proposed color model is organized as:

$$\mathrm{match}_{color}(x_p^t, f_i) = \begin{cases} \text{Shadow}, & \text{if } \theta_x^t < \theta_{color} \,\cap\, Y_{min} \le \|x_{proj}^t\| < \|f_i\| \\ \text{Highlight}, & \text{if } \theta_x^t < \theta_{color} \,\cap\, \|f_i\| \le \|x_{proj}^t\| < Y_{max} \\ \text{Foreground}, & \text{o.w.} \end{cases} \tag{17}$$

The previous cylinder color model used in the CB [14] cannot deal with rapid light variation such as moving clouds. Fig. 5(a) illustrates the 2-D color model in red-green space to indicate the shortcoming of the cylinder color model, in which the two vectors $x_p^t$ and $x_{proj}^t$ and the distance $dist_x^t$ in the lower part correspond to the normal light condition. Under a highlight condition, the R, G, and B values increase simultaneously, and the vector $x_p^t$ may exceed the range of the cylinder color model (indicated in gray), causing an erroneous judgment. In [14], the color distortion in the color space is employed to normalize $dist_x^t$ and thus solve this problem. Conversely, in this paper the problem is solved by changing the shape of the color model from a cylinder to a cone. As can be seen in Fig. 5(b), even when $x_p^t$ is affected by the highlight, it is still covered by the proposed cone color model, avoiding the above problem. It is noteworthy that the main objective of this paper is to present a highly efficient foreground detection using the proposed hierarchical strategy; the color model can in fact be replaced with any up-to-date method without affecting the overall structure of the proposed scheme.
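The cone-model classification of (8)–(17) can be sketched as follows. The angle θ_color is treated here as degrees, which is an assumption; the paper gives θ_color = 3 without stating the unit:

```python
import math

def match_color(x, f, theta_color_deg=3.0, beta=1.15, gamma=0.72):
    """Cone color model of (8)-(17): classify a matched pixel (a sketch).

    x and f are RGB triples (input pixel and pixel-based codeword).
    Returns 'shadow', 'highlight', or 'foreground'.
    """
    norm_f = math.sqrt(sum(c * c for c in f))               # (10)
    y_max, y_min = beta * norm_f, gamma * norm_f            # (8), (9)
    inner = sum(a * b for a, b in zip(x, f))                # (12)
    proj = inner / norm_f                                   # (13)
    norm_x_sq = sum(c * c for c in x)
    dist = math.sqrt(max(norm_x_sq - proj * proj, 0.0))     # (15)
    theta = math.degrees(math.atan2(dist, proj))            # (14)
    if theta < theta_color_deg and y_min <= proj < norm_f:  # Region (I)
        return 'shadow'
    if theta < theta_color_deg and norm_f <= proj < y_max:  # Region (II)
        return 'highlight'
    return 'foreground'                                     # Region (III)
```

A uniformly darkened pixel (e.g., 0.8× the codeword) lands in Region (I), a uniformly brightened one in Region (II), and a pixel whose hue deviates from the codeword exceeds θ_color and is kept as true foreground.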

IV. Background Model Updating with the Short-Term Information Models

As shown in Fig. 1, the blocks/pixels classified as foreground by the block-/pixel-based stages in the foreground detection phase are also adopted to update the block-based and pixel-based background models (CBs). This deals with the situation in which a moving object becomes stationary background when it stands still for a while. Notably, the blocks/pixels of the foreground have to update the temporary


background model (the short-term information model) first, to filter out fake backgrounds which do not meet the construction conditions of the true background. Fig. 6 shows the mutual relationships between the background CBs and the short-term information models. The two algorithms shown in Fig. 6(a) and (b) are highly similar except for some variable usages; thus, in this section the block-based CB updating algorithm from the short-term information model is adopted for explanation. The reader can easily apply the concepts to the pixel-based updating algorithm of Fig. 6(b).

If the 12-D vector ($v_b^t$) extracted from a block is classified as foreground by the first decision flow in Fig. 3(a), it is then fed into the flowchart of Fig. 6(a). The first part of this algorithm updates the block-based short-term information model ($C^s = \{c_i^s \mid 1 \le i \le L^s\}$) with $v_b^t$, identically to the training phase shown in Fig. 2(a). Yet, herein an additional variable $time^{c_i^s}$ is involved to store the updated time, to estimate whether the corresponding $i$th codeword ($c_i^s$) has been updated within a specific period. If the duration is longer than a predefined parameter $D_{delete}^s$, the corresponding $c_i^s$ is merely a temporary foreground, not a stationary background; consequently, codewords with this property are removed to save memory space. In addition, another parameter $w^{c_i^s}$ represents the updated count of $c_i^s$, which indicates the stationarity of the temporary background. When $c_i^s$ is sufficiently stationary ($w^{c_i^s} \ge D_{add}$), it is promoted from the short-term information model to the true background model ($C = \{c_i \mid 1 \le i \le L\}$).

Notably, the true background CBs may still change back to foreground; for instance, a vehicle parked for a long time is eventually driven away. To cope with this, an additional parameter $time^{c_i}$ is employed for each background codeword to store the updated time. This value is employed to filter out $c_i$ which eventually moves as foreground, using the predefined parameter $D_{delete}$. The parameter $time^{c_i}$ is also employed for the block-based CB in Fig. 3(a) [the parameter $time^{f_i}$ for the pixel-based CB also appears in Fig. 3(a) and (b)]. In fact, the former CB [14] also employed short-term information models to deal with various environments. The parameters $D_{delete}^s$ and $D_{delete}$ are employed in both the proposed and the CB schemes. The main differences between our short-term information and the CB's are elaborated below.

1) $D_{delete}^s$: The former CB employed the maximum negative run-length (MNRL) and the last updated time to record information, and MNRL was used to make decisions for the codewords in the short-term models. Conversely, the proposed method simply employs the parameter $D_{delete}^s$ applied to the updated time: the $i$th codeword is removed when the difference between the current time ($t$) and the last updated time of the codeword ($time^{c_i^s}$ for the block-based short-term model or $time^{f_i^s}$ for the pixel-based short-term model) is longer than $D_{delete}^s$. Thus, the proposed method is simpler than the former CB method.
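The prune-or-promote logic of the short-term model can be sketched as below; the dict-based codeword representation and function name are assumptions for illustration:

```python
def age_short_term_model(short_term, background, t, d_add=100, d_delete_s=200):
    """Prune stale short-term codewords and promote stationary ones.

    Each codeword is a dict with 'time' (last update) and 'w' (update count).
    Codewords stale for more than D^s_delete frames are dropped as temporary
    foreground; codewords updated at least D_add times are promoted to the
    true background model. Returns the surviving short-term codewords.
    """
    survivors = []
    for cw in short_term:
        if t - cw['time'] > d_delete_s:
            continue                  # temporary foreground: discard
        if cw['w'] >= d_add:
            background.append(cw)     # stationary enough: promote
        else:
            survivors.append(cw)
    return survivors
```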


Fig. 6. Background model (CB) updating algorithms from short-term information models. (a) Block-based CBs. (b) Pixel-based CBs.

2) $D_{delete}$: The former CB employed the last updated time to record the time information; if the period between the last updated time and the current time is longer than $D_{delete}$, the corresponding codeword is removed from the codebook. However, in most occasions this period is unpredictable, and thus $D_{delete}$ was not given in the former CB. Conversely, the proposed method sets $D_{delete}$ equal to $D_{add}$ times $D_{delete}^s$ (the worst case) to ensure the last updated times for codewords in the background model are reserved. Moreover, it also determines whether a codeword is removed when the short-term information is promoted to long-term information. With this strategy, the proposed method is more efficient than the former CB method.

V. Experimental Results

To measure the accuracy of the results, the criteria false positive (FP) rate, TP rate, precision, and similarity [15] are employed as follows:

$$\mathrm{FP\ rate} = \frac{fp}{fp + tn} \tag{18}$$

$$\mathrm{TP\ rate} = \frac{tp}{tp + fn} \tag{19}$$

$$\mathrm{precision} = \frac{tp}{tp + fp} \tag{20}$$

$$\mathrm{similarity} = \frac{tp}{tp + fp + fn} \tag{21}$$

where $tp$, $tn$, $fp$, and $fn$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively; $(tp + fn)$ is the total number of pixels representing foreground, and $(fp + tn)$ is the total number of pixels representing background.

There are in total 11 parameters used in this paper, and these can be separated into two groups: 1) performance-guided parameters: the threshold $\lambda = 5$ for the block-based CB, the threshold $\lambda = 6$ for the pixel-based CB, the threshold $\eta = 0.7$, the angle of the color model $\theta_{color} = 3$, the ratios of the upper and lower boundaries $\beta = 1.15$ and $\gamma = 0.72$, and the period of updating $D_{update} = 3$; and



Fig. 7. Classified results of sequence [24] for Intelligent_Room (row 1), Campus (row 2), Highway_I (row 3), and Laboratory (row 4) with shadow (red), highlight (green), and foreground (blue) labels. (a) Original image. (b) Block-based stage with block of size 10 × 10. (c) Hierarchical results.

TABLE I
Averaged Performance Comparisons Using Test Sequence WAVINGTREES [21]

Method                         FP Rate  TP Rate  Precision  Similarity  Frames/s
MOG [9]                        0.0913   0.9307   0.6955     0.6729      40.13
Cucchiara et al.'s method [5]  0.3041   0.901    0.4628     0.4422      30.56
CB [14]                        0.0075   0.9434   0.929      0.8913      102.43
Chen et al.'s method [11]      0.1165   0.8562   0.645      0.5962      64.35
Chiu et al.'s method [25]      0.0603   0.5641   0.7037     0.4599      320.05
Block-based stage 5 × 5        0.0213   0.9775   0.8445     0.8281      266.96
Block-based stage 8 × 8        0.0173   0.9665   0.8523     0.8273      317.38
Block-based stage 10 × 10      0.0171   0.9741   0.8369     0.8194      363.64
Block-based stage 12 × 12      0.0178   0.918    0.8064     0.7683      390.36
Hierarchical method (5 × 5)    0.0032   0.9525   0.9694     0.9258      163.35
Hierarchical method (8 × 8)    0.0027   0.9395   0.9759     0.9193      183.41
Hierarchical method (10 × 10)  0.0019   0.9463   0.9787     0.9281      196.27
Hierarchical method (12 × 12)  0.002    0.9052   0.9707     0.8842      203.16
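The block-only and hierarchical rows above reflect the two-stage structure: a cheap block-level test discards most of the background, and only the surviving blocks pay for the per-pixel test. The control flow can be sketched as below; the `block_is_background` and `pixel_is_background` callables are placeholders standing in for the codebook matching, not the paper's API:

```python
def hierarchical_foreground(blocks, block_is_background, pixel_is_background):
    """Two-stage detection sketch: skip whole blocks that match the
    background model, then refine only the surviving blocks pixel by pixel."""
    foreground = set()
    for block_id, pixels in blocks.items():
        if block_is_background(block_id):   # stage 1: block-based test
            continue                        # whole block removed at low cost
        for p in pixels:                    # stage 2: pixel-based refinement
            if not pixel_is_background(p):
                foreground.add(p)
    return foreground

# Toy frame: block "sky" is static background, block "road" contains a car.
blocks = {"sky": [(0, 0), (0, 1)], "road": [(1, 0), (1, 1)]}
result = hierarchical_foreground(
    blocks,
    block_is_background=lambda b: b == "sky",
    pixel_is_background=lambda p: p == (1, 0),
)
print(result)  # only the car pixel survives both stages
```

Because stage 1 never visits the pixels of a rejected block, the per-pixel cost is paid only where foreground is plausible, which is why the block-only rows run faster and the hierarchical rows trade some speed for precision.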

2) application-guided parameters: the learning rate α = 0.05, and the three parameters for adding or deleting codewords, Dadd = 100, Ds_delete = 200, and Ddelete = 200. The parameters in the second group are normally changed for specific applications, for instance, when limited hardware memory is available, when high processing speed is preferred, or when various background determination times are required. The first group of parameters affects the system performance significantly. Thus, in this paper, three sequences, WAVINGTREES [21], WATERSURFACE [22], and CAMPUS [22], are adopted as the training sequences. In fact, nine different sequences are tested in this paper, while only the above three are involved in

Fig. 8. Foreground (white) classified results with sequence WAVINGTREES [21]. (a) Original image (frame 247). (b) Ground truth. (c) MOG [9]. (d) Cucchiara et al.’s method [5]. (e) CB [14]. (f) Chen et al.’s method [11]. (g) Chiu et al.’s method [25]. (h)–(k) Block-based stage only with block of size (h) 5 × 5, (i) 8 × 8, (j) 10 × 10, (k) 12 × 12, and (l)–(o) with proposed hierarchical method with block of size (l) 5 × 5, (m) 8 × 8, (n) 10 × 10, and (o) 12 × 12.

the training, to provide a more objective experimental result for the performance comparisons on the remaining sequences. It is noteworthy that prospective readers can choose the sequences best suited to their applications for training. Herein, the genetic algorithm [23] is employed to optimize the parameters in the first group, with the objective of maximizing the estimation function

    Estimation value = (tp + tn) / (tp + fp + tn + fn)    (22)

where tp, tn, fp, and fn are calculated from frames 242–258 of WAVINGTREES, frames 481–525 of WATERSURFACE, and frames 636–709 of CAMPUS with the corresponding ground truths. The estimation value is directly proportional to performance and is employed to rate each combination of parameters. The optimized results are organized above. Fig. 7 shows the test sequences [24] of size 320 × 240 with Intelligent_Room (row 1), Campus (row 2), Highway_I (row 3), and Laboratory (row 4). To provide a better understanding of the detected results, three colors, including


Fig. 9. Accuracy values of each frame for sequence WAVINGTREES [21]. (a) FP rate. (b) TP rate. (c) Precision. (d) Similarity.
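The per-frame curves in Fig. 9 follow the criteria of (18)–(21); given the pixel counts of a frame, they can be computed directly, as in this minimal sketch:

```python
def accuracy_criteria(tp, tn, fp, fn):
    """FP rate, TP rate, precision, and similarity of (18)-(21)."""
    return {
        "fp_rate": fp / (fp + tn),          # (18): false alarms over background
        "tp_rate": tp / (tp + fn),          # (19): recall over true foreground
        "precision": tp / (tp + fp),        # (20): correct among detections
        "similarity": tp / (tp + fp + fn),  # (21): Jaccard-style overlap
    }

# Example frame: 900 of 1000 foreground pixels detected, with 50 false alarms.
m = accuracy_criteria(tp=900, tn=8950, fp=50, fn=100)
print(m["tp_rate"], m["precision"])  # 0.9 and roughly 0.947
```

Note that a method can score a high TP rate while precision stays low (many false positives), which is exactly the behavior of the block-based stage alone in the tables.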

TABLE II
Averaged Performance Comparisons Using Test Sequence WATERSURFACE [22]

Method                         FP Rate  TP Rate  Precision  Similarity  Frames/s
MOG [9]                        0.0431   0.8969   0.5515     0.5183      46.26
Cucchiara et al.'s method [5]  0.0265   0.8122   0.637      0.5595      30.23
CB [14]                        0.0038   0.8118   0.9247     0.7639      101.01
Chen et al.'s method [11]      0.0228   0.8215   0.668      0.5835      62.48
Chiu et al.'s method [25]      0.0012   0.7153   0.9539     0.6965      284.36
Block-based stage 5 × 5        0.0406   0.9582   0.5828     0.5717      212.34
Block-based stage 8 × 8        0.0554   0.9561   0.5137     0.5045      272.07
Block-based stage 10 × 10      0.0594   0.9285   0.4889     0.4749      318.58
Block-based stage 12 × 12      0.0738   0.9348   0.4411     0.4336      347.67
Hierarchical method (5 × 5)    0.0058   0.9081   0.8978     0.8279      146.05
Hierarchical method (8 × 8)    0.0052   0.9024   0.9092     0.8328      181.42
Hierarchical method (10 × 10)  0.0058   0.8791   0.8943     0.8018      190.58
Hierarchical method (12 × 12)  0.006    0.8806   0.8918     0.8077      200.64
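The genetic search over the first parameter group scores each candidate setting with (22) on the labeled frames (e.g., frames 481–525 of WATERSURFACE). A minimal sketch of that score, assuming each frame is a flat binary foreground mask:

```python
def estimation_value(detected_frames, truth_frames):
    """(22): fraction of correctly labeled pixels over all evaluated frames,
    i.e. (tp + tn) / (tp + fp + tn + fn)."""
    correct = total = 0
    for det, gt in zip(detected_frames, truth_frames):
        for d, g in zip(det, gt):
            correct += (d == g)   # counts both true positives and true negatives
            total += 1
    return correct / total

# Two tiny 4-pixel frames: 7 of the 8 pixel labels agree with the ground truth.
score = estimation_value([[1, 0, 1, 0], [0, 0, 1, 1]],
                         [[1, 0, 0, 0], [0, 0, 1, 1]])
print(score)  # 0.875
```

The optimizer simply keeps the parameter combination whose masks maximize this value over all labeled frames.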


Fig. 10. Foreground (white) classified results with WaterSurface [22]. (a) Original image (frame 529). (b) Ground truth. (c) MOG [9]. (d) Cucchiara et al.’s method [5]. (e) CB [14]. (f) Chen et al.’s method [11]. (g) Chiu et al.’s method [25]. (h)–(k) Block-based only with block of size (h) 5 × 5, (i) 8 × 8, (j) 10 × 10, (k) 12 × 12. (l)–(o) Proposed hierarchical method with block of size (l) 5 × 5, (m) 8 × 8, (n) 10 × 10, and (o) 12 × 12.

red, green, and blue, are employed to represent shadow, highlight, and foreground, respectively. Fig. 7(b) shows the detected results using the block-based stage with a block of size 10 × 10, in which most of the background can be removed. Fig. 7(c) shows the results obtained by the hierarchical strategy. Apparently, the pixel-based stage significantly enhances the detection precision. Yet, we would like to point out a weakness of the proposed method. As can be seen in the third row of Fig. 7 (Highway_I), when the color of the shadow is extremely dark, some shadow pixels are classified as foreground. The related parameter γ is optimized for the general case in this paper; if this setting cannot fully cope with the applied environment, users can adjust the parameter for the expected results. Fig. 8 shows the test sequence WAVINGTREES [21] with a non-stationary background of waving trees, containing 287 frames of size 160 × 120. Compared with the five former methods, MOG [9], Cucchiara et al.'s method [5], CB [14], Chen et al.'s method [11], and Chiu et al.'s method [25] in Fig. 8(c)–(g), the proposed method provides better performance in handling non-stationary backgrounds. Fig. 8(h)–(k) shows


Fig. 11. Accuracy values of each frame for sequence WaterSurface [22]. (a) FP rate. (b) TP rate. (c) Precision. (d) Similarity.

the detected results with different block sizes using the block-based CB only. Apparently, most of the background is removed without reducing the TP rate. Most importantly, the processing speed is highly efficient with the block-based strategy; yet, low precision is its drawback. To overcome this problem, the pixel-based stage is involved to enhance the precision, which can also reduce the FP rate. Fig. 8(l)–(o) shows the detected results using the proposed hierarchical scheme (block-based stage and pixel-based stage) with various block sizes. Fig. 9(a)–(d) shows the FP rate, TP rate, precision, and similarity, respectively, of each frame of the sequence WAVINGTREES [21]. Table I shows the average accuracy results using the test sequence WAVINGTREES. It is clear that the block-based stage provides a high TP rate and high processing speed, yet lower precision and similarity. To compensate for the precision and similarity lost by the block-based stage, the proposed hierarchical scheme (block-based + pixel-based) yields excellent performance in terms of all four indices. Although the proposed hierarchical scheme reduces the frame rate compared with the block-based stage alone, it still outperforms most of the five former methods. Similarly, Fig. 10 shows the test sequence WaterSurface [22] with a non-stationary background of sea waves containing 632

Fig. 12. Foreground (white) classified results with CAMPUS [22]. (a) Original image (frame 695). (b) Ground truth. (c) MOG [9]. (d) Cucchiara et al.’s method [5]. (e) CB [14]. (f) Chen et al.’s method [11]. (g) Chiu et al.’s method [25]. (h)–(k) Block-based only with block of size (h) 5 × 5, (i) 8 × 8, (j) 10 × 10, (k) 12 × 12. (l)–(o) Proposed hierarchical method with block of size (l) 5 × 5, (m) 8 × 8, (n) 10 × 10, and (o) 12 × 12.

frames of size 160 × 128, and Fig. 11 shows the corresponding accuracy values of each frame. Table II shows the corresponding average accuracy results. Fig. 12 shows the test sequence CAMPUS [22] with a non-stationary background of moving trees, containing 1439 frames of size 160 × 128, and Fig. 13 shows the corresponding accuracy values of each frame. Table III shows the corresponding average accuracy results. These two sequences yield the same conclusions as the sequence WAVINGTREES: the proposed method yields performance superior to the five former works in terms of the four indices. Fig. 14 shows the sequence MOVEDOBJECT [21] with a moved object, containing 1745 frames of size 160 × 120. This sequence is employed to test the adaptability of the background model. When the chair is moved at frame 888, after a period (Dadd) the chair becomes part of the background model. Frame 986 shows a good result without any misclassified background or foreground regions. Fig. 15 shows the classified results with the test sequence PETS-2001 DATA 3 [26]. The sequence is employed to test the adaptability


Fig. 13. Accuracy values of each frame for sequence Campus [22]. (a) FP rate. (b) TP rate. (c) Precision. (d) Similarity.

TABLE III
Averaged Performance Comparisons Using Test Sequence CAMPUS [22]

Method                         FP Rate  TP Rate  Precision  Similarity  Frames/s
MOG [9]                        0.1478   0.8811   0.2862     0.2725      53.26
Cucchiara et al.'s method [5]  0.1781   0.7225   0.231      0.203       23.16
CB [14]                        0.0342   0.9219   0.5567     0.528       85.87
Chen et al.'s method [11]      0.1614   0.7517   0.2562     0.2295      51.81
Chiu et al.'s method [25]      0.0604   0.4926   0.3533     0.2406      278.16
Block-based stage 5 × 5        0.0453   0.9239   0.4964     0.479       172.84
Block-based stage 8 × 8        0.0398   0.925    0.5039     0.4876      276.57
Block-based stage 10 × 10      0.0366   0.9267   0.5171     0.5018      303.03
Block-based stage 12 × 12      0.0441   0.8561   0.445      0.4257      331.14
Hierarchical method (5 × 5)    0.0132   0.9057   0.7188     0.6706      109.28
Hierarchical method (8 × 8)    0.0103   0.9019   0.7666     0.7118      140.05
Hierarchical method (10 × 10)  0.0101   0.9031   0.7698     0.7162      152.69
Hierarchical method (12 × 12)  0.0091   0.8342   0.7815     0.6957      159.74

Fig. 14. Foreground (blue) classified results using sequence MOVEDOBJECT [21] with the proposed method when short-term information is involved.

Fig. 15. Foreground (blue) classified results with the test sequence PETS-2001 DATA 3 [26], in which the short-term information is included in the proposed method.

TABLE IV
Average Performance

Method                         FP Rate  TP Rate  Precision  Similarity  Frames/s
MOG [9]                        0.0941   0.9029   0.5111     0.4879      64.22
Cucchiara et al.'s method [5]  0.1696   0.8119   0.4436     0.4016      27.98
CB [14]                        0.0152   0.8924   0.8035     0.7278      96.44
Chen et al.'s method [11]      0.1002   0.8098   0.5231     0.4698      59.55
Chiu et al.'s method [25]      0.0406   0.5907   0.6703     0.4657      294.19
Block-based stage 5 × 5        0.0357   0.9532   0.6412     0.6263      217.38
Block-based stage 8 × 8        0.0375   0.9492   0.6233     0.6065      288.67
Block-based stage 10 × 10      0.0377   0.9431   0.6143     0.5987      328.42
Block-based stage 12 × 12      0.0452   0.9030   0.5642     0.5425      356.39
Hierarchical method (5 × 5)    0.0074   0.9221   0.8620     0.8081      139.56
Hierarchical method (8 × 8)    0.0061   0.9146   0.8839     0.8213      168.29
Hierarchical method (10 × 10)  0.0059   0.9095   0.8809     0.8154      179.85
Hierarchical method (12 × 12)  0.0057   0.8733   0.8813     0.7959      187.85

of the illumination changes. According to the results, the proposed color model is able to deal with most light changes except for some extremely large variations, which proves the effectiveness of the proposed color model. Table IV organizes the average accuracy results from Tables I to III. It is clear that the proposed algorithm provides the highest accuracy among the compared methods. Moreover, the frames/s of the proposed method is

also superior to those of the former approaches, except for Chiu et al.'s method [25]. Nonetheless, the proposed method provides much better performance than Chiu et al.'s method in terms of the remaining four indices. The test platform is a C-language implementation on an Intel Core 2 2.4 GHz CPU with 2 GB RAM, running the Windows XP SP2 operating system. In general,


the larger block achieves a higher processing speed yet a lower TP rate, and vice versa. We recommend a larger block for processing-speed-oriented applications, while a smaller block is a promising choice for TP-rate-oriented applications.

VI. Conclusion

A hierarchical method for foreground detection with block-based and pixel-based CBs has been proposed. The block-based stage enjoys a high processing speed and detects most of the foreground without reducing the TP rate, and the pixel-based stage further improves the precision of the detected foreground objects while reducing the FP rate. Moreover, a modified color model and match function have also been introduced, which can classify a pixel into shadow, highlight, background, and foreground. To cope with adaptive environments, the short-term information is employed to improve background updating. As documented in the experimental results, the hierarchical method provides highly efficient background subtraction, and can be a good candidate for vision-based applications such as human motion analysis or surveillance systems.

References

[1] K. Toyama, J. Krumm, B. Brumitt, and B. Meyers, "Wallflower: Principles and practice of background maintenance," in Proc. IEEE Int. Conf. Comput. Vis., vol. 1. Sep. 1999, pp. 255–261.
[2] T. Horprasert, D. Harwood, and L. S. Davis, "A statistical approach for real-time robust background subtraction and shadow detection," in Proc. IEEE Int. Conf. Comput. Vis., vol. 99. Sep. 1999, pp. 1–19.
[3] E. J. Carmona, J. Martinez-Cantos, and J. Mira, "A new video segmentation method of moving objects based on blob-level knowledge," Pattern Recognit. Lett., vol. 29, no. 3, pp. 272–285, Feb. 2008.
[4] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S. Sirotti, "Improving shadow suppression in moving object detection with HSV color information," in Proc. IEEE Conf. Intell. Transportation Syst., Aug. 2001, pp. 334–339.
[5] R. Cucchiara, C. Grana, M.
Piccardi, and A. Prati, "Detecting moving objects, ghosts, and shadows in video streams," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 10, pp. 1337–1342, Oct. 2003.
[6] M. Izadi and P. Saeedi, "Robust region-based background subtraction and shadow removing using color and gradient information," in Proc. Int. Conf. Pattern Recognit., Dec. 2008, pp. 1–5.
[7] W. Zhang, X. Z. Fang, X. K. Yang, and Q. M. J. Wu, "Moving cast shadows detection using ratio edge," IEEE Trans. Multimedia, vol. 9, no. 6, pp. 1202–1213, Oct. 2007.
[8] M. Shoaib, R. Dragon, and J. Ostermann, "Shadow detection for moving humans using gradient-based background subtraction," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Apr. 2009, pp. 773–776.
[9] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2. Jun. 1999, pp. 246–252.
[10] C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 747–757, Aug. 2000.
[11] Y.-T. Chen, C.-S. Chen, C.-R. Huang, and Y.-P. Hung, "Efficient hierarchical method for background subtraction," Pattern Recognit., vol. 40, no. 10, pp. 2706–2715, Oct. 2007.
[12] N. Martel-Brisson and A. Zaccarin, "Learning and removing cast shadows through a multidistribution approach," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 7, pp. 1133–1146, Jul. 2007.
[13] C. Benedek and T. Sziranyi, "Bayesian foreground and shadow detection in uncertain frame rate surveillance videos," IEEE Trans. Image Process., vol. 17, no. 4, pp. 608–621, Apr. 2008.
[14] K. Kim, T. H. Chalidabhongse, D. Harwood, and L. Davis, "Real-time foreground-background segmentation using codebook model," Real-Time Imaging, vol. 11, no. 3, pp. 172–185, Jun. 2005.

[15] L. Maddalena and A. Petrosino, "A self-organizing approach to background subtraction for visual surveillance applications," IEEE Trans. Image Process., vol. 17, no. 7, pp. 1168–1177, Jul. 2008.
[16] L. Maddalena and A. Petrosino, "Multivalued background/foreground separation for moving object detection," in Proc. Fuzzy Logic Applicat., LNCS 5571, 2009, pp. 263–270.
[17] T. Kohonen, Self-Organization and Associative Memory, 2nd ed. Berlin, Germany: Springer-Verlag, 1988.
[18] K. Patwardhan, G. Sapiro, and V. Morellas, "Robust foreground detection in video using pixel layers," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 4, pp. 746–751, Apr. 2008.
[19] M. Heikkila and M. Pietikainen, "A texture-based method for modeling the background and detecting moving objects," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 657–662, Apr. 2006.
[20] E. J. Delp and O. R. Mitchell, "Image compression using block truncation coding," IEEE Trans. Commun., vol. 27, no. 9, pp. 1335–1342, Sep. 1979.
[21] Test Images for Wallflower Paper [Online]. Available: http://research.microsoft.com/en-us/um/people/jckrumm/WallFlower/TestImages.htm
[22] Statistical Modeling of Complex Background for Foreground Object Detection [Online]. Available: http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html
[23] L. Davis, Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold, 1991.
[24] Autonomous Agents for On-Scene Networked Incident Management [Online]. Available: http://cvrr.ucsd.edu/aton/shadow/index.html
[25] C.-C. Chiu, M.-Y. Ku, and L.-W. Liang, "A robust object segmentation system using a probability-based background extraction algorithm," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 4, pp. 518–528, Apr. 2010.
[26] PETS2001 Datasets [Online]. Available: http://www.cvg.cs.rdg.ac.uk/PETS2001/pets2001-dataset.html

Jing-Ming Guo (M’06–SM’10) was born in Kaohsiung, Taiwan, on November 19, 1972. He received the B.S.E.E. and M.S.E.E. degrees from National Central University, Taoyuan, Taiwan, in 1995 and 1997, respectively, and the Ph.D. degree from the Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan, in 2004. From 1998 to 1999, he was an Information Technique Officer with the Chinese Army. From 2003 to 2004, he was granted the National Science Council Scholarship for advanced research from the Department of Electrical and Computer Engineering, University of California, Santa Barbara. He is currently a Professor with the Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei. His current research interests include multimedia signal processing, multimedia security, digital halftoning, and digital watermarking. Dr. Guo is a senior member of the IEEE Signal Processing Society. He received the Excellence Teaching Award in 2009, the Research Excellence Award in 2008, the Acer Dragon Thesis Award in 2005, the Outstanding Paper Award from IPPR, Computer Vision, and Graphic Image Processing in 2005 and 2006, and the Outstanding Faculty Award in 2002 and 2003.

Yun-Fu Liu (S’09) was born in Hualien, Taiwan, on October 30, 1984. He received the M.S.E.E. degree from the Department of Electrical Engineering, Chang Gung University, Taoyuan, Taiwan, in 2009. Currently, he is pursuing the Doctoral degree from the Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan. His current research interests include digital halftoning, digital watermarking, pattern recognition, and intelligent surveillance systems. Mr. Liu is a student member of the IEEE Signal Processing Society. He received the Special Jury Award from Chimei Innolux Corporation, Jhunan, Taiwan, in 2009, and the Third Masters Thesis Award from the Fuzzy Society, China, in 2009.


Chih-Hsien Hsia (M’10) was born in Taipei, Taiwan, in 1979. He received the B.S. degree in electronics engineering from the Technology and Science Institute of Northern Taiwan, Taipei, Taiwan, in 2003, and the M.S. degree in electrical engineering and Ph.D. degree from Tamkang University, New Taipei, Taiwan, in 2005 and 2010, respectively. He was a Visiting Scholar with Iowa State University, Ames, from July 2007 to September 2007. He is now a Post-Doctoral Research Fellow with the Multimedia Signal Processing Laboratory, Graduate Institute of Electrical Engineering, National Taiwan University of Science and Technology, Taipei. He is currently an Adjunct Assistant Professor with the Department of Electrical Engineering, Tamkang University, New Taipei, Taiwan. His current research interests include DSP IC design, image/video processing, multimedia compression system design, multiresolution signal processing algorithms, and computer/robot vision processing. Dr. Hsia is a member of the Phi Tau Phi Scholastic Honor Society.

Min-Hsiung Shih was born in Kaohsiung, Taiwan, on December 25, 1987. He received the B.S. degree from the Department of Computer and Communication Engineering, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan, in 2010. Currently, he is pursuing the Masters degree from the Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan. His current research interests include pattern recognition and intelligent surveillance systems.


Chih-Sheng Hsu was born in Taichung, Taiwan, on December 20, 1984. He received the B.S.E.E. and M.S.E.E. degrees from the National Taiwan University of Science and Technology, Taipei, Taiwan, in 2007 and 2010, respectively. His current research interests include intelligent surveillance.
