EUDEMON: A System for Online Video Frame Copy Detection by Earth Mover’s Distance

Jia Xu 1, Qiushi Bai 1, Yu Gu 1, Anthony K.H. Tung 2, Guoren Wang 1, Ge Yu 1, Zhenjie Zhang 3

1 College of Information Science and Engineering, Northeastern University, China {xujia,baiqs,guyu,wangguoren,yuge}@ise.neu.edu.cn
2 School of Computing, National University of Singapore, Singapore [email protected]
3 Advanced Digital Sciences Center, Illinois at Singapore Pte., Singapore [email protected]

Abstract: The Earth Mover’s Distance, or EMD for short, has proven effective for content-based image retrieval. However, because of the cubic complexity of EMD computation, it remains difficult to use EMD in applications with stringent efficiency requirements. In this paper, we present our new system, called EUDEMON, which uses new techniques to support fast Online Video Frame Copy Detection based on the EMD. Given a group of registered frames as queries and a set of target videos, EUDEMON identifies relevant frames in the video stream in real time. The efficiency gain relies mainly on the primal-dual theory in linear programming and on well-designed B+ tree filters for adaptive candidate pruning. EUDEMON includes several new features that are crucial to its deployment in real applications. First, EUDEMON achieves high throughput even when a large number of queries are registered in the system. Second, EUDEMON contains a self-optimization component that automatically enhances the effectiveness of the filters based on the recent content of the video stream. Finally, EUDEMON provides a user-friendly visualization interface, named EMD Flow Chart, that helps users understand an alarm from the perspective of the EMD.

I. INTRODUCTION

With the proliferation of inexpensive video capture and storage equipment, online video sharing has become popular in recent years. Correspondingly, video copyright infringement is emerging as a major concern for video sharing systems, e.g., YouTube and Facebook. These systems urgently need to identify potentially problematic videos at the moment they are uploaded. To implement this functionality, many researchers are working on efficient and effective mechanisms for online video copy detection.

Currently, the major video copy detection techniques fall into two classes, namely watermark-based detection and content-based image retrieval (CBIR) [1], [2]. The watermark-based solution relies on adding artificial signals to videos, which can then be easily identified in video copies. However, this technique may fail when the watermarks are destroyed or distorted during transmission, and it is increasingly difficult to design encoding and decoding techniques for robust video watermarking. Content-based detection techniques, on the other hand, are believed to provide better detection accuracy, on the basis of the idea that “video itself is the best watermark” [1]. Given a video clip, key frames are extracted and then represented by a high-dimensional histogram on a specific color space, e.g., RGB space or HSV space [3].

In the database community, recent years have also witnessed some attempts at stream processing engines for video copy detection [4], [5], all of which model videos as combinations of high-dimensional points in a specific color space. While such models are effective at detecting near-duplicate videos, the problem remains difficult when only some of the frames in a video are copied from others.

In this demo, instead of searching for complete video clip copies, we focus on finding similar frames in high-speed video streams. Because of distortion, resolution adjustment, format transformation and other factors common to online videos, a major challenge is to select an effective distance metric that can distinguish different video frames in the form of histograms and to ensure that the match results are consistent with human perception of similarity. In this demo, we present our new system based on the CBIR scheme and employ the Earth Mover’s Distance (EMD), which has proven effective in the CBIR context.

Fig. 1. The advantage of Earth Mover’s Distance. [Figure: three histograms x, y and z. Under the Manhattan distance, d(x, y) = 1.4 > d(x, z) = 1.2; under EMD with ground distance Cij = |i − j|, EMD(x, y) = 1.0 < EMD(x, z) = 1.5.]

The Earth Mover’s Distance, originally developed within the computer vision community, provides a highly robust similarity measurement in the analysis of image databases [3], [6], [7]. Given two high-dimensional image features in the form of histograms, EMD models their dissimilarity as the minimal amount of work necessary to transform one histogram into

another. The physical meaning of EMD implicitly involves a ground distance matrix $[C_{ij}]$, where $C_{ij}$ is the distance between the $i$th bin and the $j$th bin of the histogram. EMD not only compares the differences between aligned bins of two histograms, but also takes neighboring bins into account. Compared with existing bin-by-bin distance measures, e.g., the $L_p$ norms $L_p(x, y) = \sqrt[p]{\sum_{i=1}^{n} |x_i - y_i|^p}$, EMD is more robust even under small distribution shifts.

In Figure 1, we present an example to illustrate the advantage of EMD. In the figure, x, y and z represent the color histograms extracted from three images. x and y differ only by a slight distribution shift in the color space, probably caused by differences between recording devices. On the other hand, z shows a totally different distribution from x. Under a bin-by-bin measure, e.g., the Manhattan distance, x appears more similar to z than to y, although x and z are intuitively very different. If we employ EMD instead, with the Manhattan distance as the ground distance ($C_{ij} = |i - j|$), the dissimilarity between x and y is considerably reduced, while the EMD between x and z becomes larger, because it costs more to move mass between bins that are far away from each other. This example suggests that EMD is generally more effective, since it better matches human perception.

However, the computation of EMD suffers from a cubic time complexity. To speed up query processing for EMD-based image search, we previously designed an effective indexing method based on the B+ tree and a similarity lower-bound filter that uses the primal-dual theory in linear programming [8]. In this demo, we extend our previous techniques and develop EUDEMON to support online video frame copy detection, with some new enhancements to meet the requirements of a real-time system.

II. TECHNICAL DETAILS
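The Figure 1 comparison from the introduction can be reproduced numerically. The sketch below is an illustration only and not part of EUDEMON: the three five-bin histograms are an assumption reconstructed from the values printed in Figure 1, and `emd_1d` is a hypothetical helper that relies on the closed form for one-dimensional histograms of equal total mass with ground distance |i − j| (the sum of absolute differences of the cumulative sums) instead of a general linear-program solver.

```python
from itertools import accumulate


def manhattan(a, b):
    """Bin-by-bin (L1) distance between two histograms of equal length."""
    return sum(abs(p - q) for p, q in zip(a, b))


def emd_1d(a, b):
    """EMD for 1-D histograms with equal total mass and ground distance |i - j|.

    In this special case the optimal transport cost equals the sum of the
    absolute differences between the two cumulative histograms.
    """
    return sum(abs(p - q) for p, q in zip(accumulate(a), accumulate(b)))


# Histograms assumed from the bin values shown in Figure 1 (each sums to 1).
x = [0.4, 0.0, 0.3, 0.3, 0.0]
y = [0.0, 0.4, 0.0, 0.3, 0.3]  # x shifted by one bin
z = [1.0, 0.0, 0.0, 0.0, 0.0]  # all mass in the first bin

print("Manhattan:", manhattan(x, y), manhattan(x, z))
print("EMD:      ", emd_1d(x, y), emd_1d(x, z))
```

Under these assumptions the Manhattan distance ranks z closer to x than y (about 1.2 versus 1.4), while the EMD reverses the order (about 1.0 versus 1.5), matching the inequalities shown in Figure 1.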

A. Research Challenges

In content-based image retrieval, the histogram of an image normally refers to a synopsis of the pixel intensity values. To facilitate similarity search, we transform the query images and video frames into greyscale histograms, each of which consists of 256 integers giving the number of pixels at each greyscale value. These histograms generally reflect the light and shade information of the images. In our demo, we use the Earth Mover’s Distance (EMD) as the similarity measure for histogram comparisons. EMD is formulated as a linear program and can be solved by the simplex method. In the following, we give its formal definition.

Definition 2.1 (Earth Mover’s Distance, EMD): Given two histograms $x$ and $y$ and the ground distance matrix $C = [c_{ij}]$, the Earth Mover’s Distance between $x$ and $y$ is the optimum achieved by the linear program

$$
\begin{aligned}
\text{Minimize:}\quad & \frac{\sum_{i,j} f_{ij} \cdot c_{ij}}{\min\{\sum_i x[i],\ \sum_j y[j]\}} \\
\text{s.t.}\quad & \forall i:\ \sum_j f_{ij} = x[i] \\
& \forall j:\ \sum_i f_{ij} = y[j] \\
& \forall i, j:\ f_{ij} \ge 0
\end{aligned}
\tag{1}
$$

where $f_{ij}$ denotes the flow moved from $x[i]$ to $y[j]$. The complexity of solving the linear program above is $O(n^3 \log n)$, where $n$ is the number of histogram bins. This cubic complexity is the major obstacle to online video frame copy detection with EMD as the distance metric. Instead of running an exact EMD computation against every registered query, a common approach is to adopt filtering methods of lower complexity to prune unpromising records.

In recent years, several attempts have been made in this direction to accelerate EMD-based query processing. In [6], Assent et al. presented some useful lower bound functions for EMD. However, their methods do not support high-dimensional data well. Wichterich et al. then proposed a flexible dimensionality reduction technique and lower bounds for EMD that can be calculated in the reduced space [7]. Although the dimensionality reduction framework extends the application area of EMD to high-dimensional data, it remains difficult to find an efficient index technique for EMD in high-dimensional space. In [8], we proposed a new database indexing approach to answer EMD-based similarity queries. Our solution is inspired by the primal-dual theory in linear programming and deploys a B+ tree index for effective candidate pruning. To the best of our knowledge, it is the most efficient algorithm for EMD-based similarity search.

In the EUDEMON system, we further extend our tree-based filtering technique to the processing of continuous queries over online video data. Moreover, considering the properties of the running environment for continuous queries, some new methods, namely the adaptive B+ tree filter adjustment and the multi-query optimization, are implemented in EUDEMON to further improve the system throughput.

B. The EUDEMON System and Techniques
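The primal-dual idea mentioned above, on which the B+ tree filters build, can be checked on a small instance. The sketch below is not the filter of [8]; it only illustrates weak duality for the linear program of Definition 2.1 when both histograms have total mass 1, so that the denominator equals 1. The dual constraint φ_i + π_j ≤ c_ij is the standard dual of this transportation problem. The function names, the two hand-picked dual solutions and the five-bin histograms (taken from the Figure 1 example) are assumptions made for illustration, and `emd_1d` again uses the one-dimensional closed form for ground distance |i − j|.

```python
from itertools import accumulate


def emd_1d(x, y):
    """Exact EMD for equal-mass 1-D histograms with ground distance |i - j|."""
    return sum(abs(a - b) for a, b in zip(accumulate(x), accumulate(y)))


def is_dual_feasible(phi, pi, eps=1e-12):
    """Check phi[i] + pi[j] <= c_ij for the ground distance c_ij = |i - j|."""
    return all(phi[i] + pi[j] <= abs(i - j) + eps
               for i in range(len(phi)) for j in range(len(pi)))


def dual_objective(x, y, phi, pi):
    """Dual objective value; a lower bound on EMD(x, y) if (phi, pi) is feasible."""
    return (sum(p * a for p, a in zip(phi, x))
            + sum(p * b for p, b in zip(pi, y)))


n = 5
x = [0.4, 0.0, 0.3, 0.3, 0.0]
y = [0.0, 0.4, 0.0, 0.3, 0.3]
z = [1.0, 0.0, 0.0, 0.0, 0.0]

# Two hand-picked feasible dual solutions (illustrative, not taken from [8]).
dual_solutions = [
    ([-float(i) for i in range(n)], [float(j) for j in range(n)]),  # phi_i = -i, pi_j = j
    ([float(i) for i in range(n)], [-float(j) for j in range(n)]),  # phi_i = i, pi_j = -j
]


def best_lower_bound(a, b):
    """Largest bound over the stored dual solutions, mimicking several filters."""
    return max(dual_objective(a, b, phi, pi) for phi, pi in dual_solutions)


for name, other in (("y", y), ("z", z)):
    print(name, "bound:", best_lower_bound(x, other), "exact:", emd_1d(x, other))
```

Each feasible dual solution yields a valid lower bound, but how tight it is depends on the pair of histograms being compared. This is one way to see why several filters built from different dual solutions can prune more than a single one.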

System Overview: The EUDEMON system is implemented in C++ with the support of the OpenCV library, version 1.0 (http://sourceforge.net/projects/opencvlibrary/). Figure 2 gives an overview of the general architecture of EUDEMON. A user can register his/her own images as queries in the EUDEMON Query Registrar. After the query images and the target video are loaded into the system, EUDEMON starts sampling frames from the video at a fixed rate. When a sampled frame arrives at the EUDEMON Query Processor, its greyscale histogram is constructed. The similarity between the extracted histogram and the histogram of each registered query is verified by the EUDEMON Filter Chain. Two more optimization submodules, namely the EUDEMON Adapter and the EUDEMON Optimizer, are also embedded in the EUDEMON Query Processor. The EUDEMON Adapter enhances the pruning ability of the filter chain by adaptively adjusting the B+ tree filters based on the latest incoming frame from the video. The EUDEMON Optimizer further reduces the number of exact EMD computations by incorporating the idea of multi-query optimization. Another important functionality of EUDEMON is provided by its Visualization Interface, which illustrates how one histogram is transformed into the other by moving ‘earth’ among the histogram bins. This helps users understand the matching results. In the following, we introduce the major modules and the key techniques used in each module.

Fig. 2. Overview of EUDEMON architecture. [Figure: components shown are the online video, the sampled frames, the EUDEMON Query Registrar, the EUDEMON Query Processor (EUDEMON Filter Chain, EUDEMON Adapter, EUDEMON Optimizer) and the EUDEMON Visualization Interface.]

EUDEMON Filter Chain: This module processes arriving frames in a filter-and-refine manner: it equips the system with a series of lower- and upper-bound filters on EMD and tries to eliminate as many no-hit records as possible before the final EMD refinement. The Filter Chain is efficiently supported by our B+ tree filter, which is deployed at the very beginning of the chain. The design of the B+ tree filter is inspired by the primal-dual theory in linear programming. That is, if we regard the computation of EMD (Equation 1) as a primal problem with a minimization objective function, it has one and only one dual problem, which has a maximization objective function. Given any feasible solution in the dual space, the value of the dual objective function is a lower bound on EMD. Our B+ tree filter builds on this observation (see Equation 6 in [8]). We deploy L B+ tree filters based on L different feasible solutions in the dual space, so that the filters complement one another and increase the overall pruning power. After the B+ tree filters, the chain also applies the R-EMD filter [7], the LBIM filter [6] and the UBp filter [8] before the exact EMD calculation. More technical details about the filter chain can be found in our previous paper [8].

EUDEMON Adapter: In this module, we designed two components to enhance the adaptivity of the B+ tree filters. The first component ensures that the best B+ tree filters are in use at any moment. The second component automatically adjusts the number of B+ tree filters based on statistics on recent queries.



Adaptively update the ineffective B+ tree filter. Since EUDEMON runs in a dynamic video stream environment, keeping the B+ tree filters effective requires updating them according to the recently sampled frames. One important property of a video stream is that most consecutive sampled frames are temporally correlated. To exploit this property, we implement an adaptive updating mechanism in EUDEMON that updates the most ineffective B+ tree filter using the most recently generated feasible solution Φ. Because Φ is obtained from a recently arrived frame, the newly updated B+ tree filter is expected to be effective in filtering the subsequent frames.

Automatically tune the number of B+ tree filters. Using more B+ tree filters may prune more unqualified frames. However, our empirical studies show that, as the number of filters increases, the combined pruning capability of the B+ tree filters gradually levels off. Meanwhile, using more B+ tree filters means spending more time on filter verification. In EUDEMON, the system recommends to the user an optimal number of B+ tree filters based on recent query-processing statistics.

EUDEMON Optimizer: This module exploits the correlation among registered queries and implements an efficient multi-query processing scheme to further reduce the system workload. Our approach is as follows. Whenever an unavoidable EMD refinement takes place, we learn the exact EMD between a query q and a frame f. Lower bounds on the EMD between f and the other queries similar to q can then be derived using the triangle inequality. These lower bounds are very helpful for pruning further no-hit queries.

EUDEMON Visualization Interface: Unlike bin-by-bin distance metrics, EMD cannot be perceived directly. Thus, we design the EMD Flow Chart to visualize the differences between the query histogram Hq and the result histogram Hr from the perspective of the EMD. See Figure 3 for an example. In the flow chart, different bins in Hq are painted with different colors. By coloring each bin in Hr according to the colors of the flows it receives, users can clearly recognize where each flow moving out of Hq is assigned. To the best of our knowledge, the EMD flow chart is the first visualization interface that focuses on helping users understand query results from the perspective of the Earth Mover’s Distance.

Fig. 3. EMD flow chart in EUDEMON.

III. DEMONSTRATION

Our EUDEMON system handles the task of Online Video Frame Copy Detection. It aims to help video sharing web sites, such as YouTube and Facebook, find potential copyright problems in user-uploaded videos. EUDEMON uses the EMD, which has shown outstanding performance in image retrieval, as the underlying distance. Frame copy surveillance can thus be achieved by registering the images of interest as queries. The processing efficiency of EUDEMON is ensured by our optimization techniques. We downloaded popular videos from YouTube, such as a music video of Lady GaGa, as the targets of supervision, and then extracted key frames from the videos as the query images.
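One of the optimization techniques referred to above, the triangle-inequality pruning of the EUDEMON Optimizer (Section II-B), can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the EUDEMON implementation: it assumes histograms normalized to equal total mass and a metric ground distance, in which case EMD obeys the triangle inequality [3]. The function names, the threshold parameter and the dictionary of precomputed query-to-query distances are hypothetical.

```python
def triangle_lower_bound(d_qf, d_qp):
    """Lower bound on EMD(p, f), given EMD(q, f) and EMD(q, p).

    Follows from the triangle inequality: EMD(p, f) >= |EMD(q, f) - EMD(q, p)|.
    """
    return abs(d_qf - d_qp)


def prunable_queries(d_qf, query_distances, threshold):
    """Ids of queries that cannot match frame f, decided without computing EMD.

    d_qf            -- exact EMD between the refined query q and the frame f
    query_distances -- {query_id: EMD(q, that query)}, assumed precomputed
    threshold       -- a frame matches a query only if their EMD <= threshold
    """
    return sorted(qid for qid, d_qp in query_distances.items()
                  if triangle_lower_bound(d_qf, d_qp) > threshold)


# Hypothetical numbers: q was refined against frame f with EMD(q, f) = 1.5.
skipped = prunable_queries(1.5, {"p1": 1.0, "p2": 1.4, "p3": 0.2}, threshold=0.3)
print(skipped)
```

Queries whose bound stays below the threshold (here `p2`) are not decided by this test and still have to pass through the remaining filters or the exact EMD computation.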

All frames are transformed into greyscale histograms in our system. The user of our EUDEMON system experiences three interfaces (see Figure 4). At the Basic Interface (Figure 4(a)), the user submits his/her query images and the video to be monitored. Clicking any query image pops up its greyscale histogram. After the user finishes loading the queries and the video, the system moves to the Processing Interface (Figure 4(b)). This interface displays the queries and their qualified frames in the top-left panel; in the bottom-left panel, the user can observe in real time how the filter rates of the different filters in the filter chain change. One interesting design choice in EUDEMON is the way we demonstrate the system performance. Instead of listing the system throughput in the interface, we play the processed frames one by one in the player shown in the top-right corner, so that the user can watch the processing speed directly. After all sampled frames have been processed, the system automatically switches to the Result Display Interface (Figure 4(c)), which displays the trend of the system throughput. When the user clicks a query image and one of its results, the EMD flow chart is shown on the left. More information about EUDEMON, including a demo video, is available on our project home page [9].

Fig. 4. Display of the EUDEMON interfaces: (a) Basic interface, (b) Processing interface, (c) Result display interface.

In practice, a frame of interest to the user might be altered deliberately in a pirated video. To test the robustness of our system in frame retrieval, we apply several modifications to each query frame (see Figure 5 for an illustration). The test results reported in [9] show that our system is quite robust: it returns similar frames for each modified query image without false positives. The robustness of our system stems from the use of the EMD, which considers both aligned bins and neighboring bins.

Fig. 5. Query setting for system robustness testing. [Figure: an original image and its modified versions: low color temperature, high color temperature, darker, lighter, high saturation, distortion.]

We also report the throughput test results of the EUDEMON system. The experimental results in Figure 6 show that even when the number of registered queries is as large as 80, EUDEMON still sustains a high average throughput, indicating that our system is efficient enough to support real-time operation.

Fig. 6. EUDEMON vs. SAR: (a) On Movie, (b) On Music Video Clip. [Plots: average frame processing throughput versus query number (5, 10, 20, 40, 80) for EUDEMON and SAR.]

IV. CONCLUSION

We propose to demonstrate our EUDEMON system which takes the Earth Mover’s Distance (EMD) as the underlying similarity measure to perform the online video frame copy detection. EUDEMON provides a series of optimization mechanisms to handle the online EMD-based similarity comparisons.

ACKNOWLEDGMENT

J. Xu, Q. Bai, Y. Gu, G. Wang and G. Yu are supported by the National Basic Research Program of China (973 Program) under grant 2012CB316201, the National Natural Science Foundation of China (No. 60933001 and No. 61003058), and the Fundamental Research Funds for the Central Universities (No. N100704001). A. Tung is supported by Singapore NRF grant R-252-000-376-279. Z. Zhang is supported by Human Sixth Sense Project (HSSP) from Singapore A*STAR.

REFERENCES

[1] A. Hampapur, K. Hyun, and R. M. Bolle, “Comparison of sequence matching techniques for video copy detection,” in Storage and Retrieval for Media Databases, 2002, pp. 194–201.
[2] A. Joly, O. Buisson, and C. Frélicot, “Statistical similarity search applied to content-based video copy detection,” in BDA, 2005.
[3] Y. Rubner, C. Tomasi, and L. J. Guibas, “The earth mover’s distance as a metric for image retrieval,” IJCV, vol. 40, no. 2, pp. 99–121, 2000.
[4] H. T. Shen, X. Zhou, Z. Huang, J. Shao, and X. Zhou, “Uqlips: A realtime near-duplicate video clip detection system,” in VLDB, 2007, pp. 1374–1377.
[5] Y. Yan, B. C. Ooi, and A. Zhou, “Continuous content-based copy detection over streaming videos,” in ICDE, 2008, pp. 853–862.
[6] I. Assent, A. Wenning, and T. Seidl, “Approximation techniques for indexing the earth mover’s distance in multimedia databases,” in ICDE, 2006, p. 11.
[7] M. Wichterich, I. Assent, P. Kranen, and T. Seidl, “Efficient emd-based similarity search in multimedia databases via flexible dimensionality reduction,” in SIGMOD, 2008, pp. 199–212.
[8] J. Xu, Z. Zhang, A. K. H. Tung, and G. Yu, “Efficient and effective similarity search over probabilistic data based on earth mover’s distance,” PVLDB, vol. 3, no. 1, pp. 758–769, 2010.
[9] EUDEMON Project. http://faculty.neu.edu.cn/ise/xujia/home/EUDEMONIntroduction.html
