Intelligent Database Laboratory, CSIE, NCKU

AN EFFECTIVE VIDEO RETRIEVAL SYSTEM BY COMBINING VISUAL AND TEXTUAL MINING TECHNIQUES

Ja-Hwung Su, Hsin-Ho Yeh, Vincent S. Tseng
Department of Computer Science and Information Engineering
National Cheng Kung University, Tainan, Taiwan, R.O.C.
[email protected]
2009/05/02


Outline

- Introduction
- System Architecture
- Experimental Evaluations
- Conclusions
- Future Work


Introduction (1) 

Previous work on video retrieval

- Textual search
  - Relies heavily on video metadata, which must match the query exactly
  - Manual annotation is costly
- Visual search
  - Traditional content-based video search loses effectiveness and efficiency when the visual content is compound and complex


Introduction (2) 

Goal: retrieve the videos users want by combining textual- and visual-based mining

Advantages of our approach over previous work

1. For textual-based search, videos can be retrieved through the automatically generated metadata we propose, without manual annotation.
2. The semantic gap between video concepts and query terms is reduced.
3. The proposed approach achieves high performance in visual-based search.


System Architecture


Visual Processing

Step 1. Shot Detection
Step 2. Feature Extraction
Step 3. Shot Clustering and Encoding
Step 4. Temporal Pattern Generation
Step 5. FPI Tree Construction

[Figure: the target videos pass through Steps 1 to 5 to produce ModelVisual (FPI tree).]


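Visual Processing – Shot Clustering and Encoding

[Figure: shots are clustered and each shot is encoded with the symbol of its cluster; the example shots carry the labels A, B, C and D.]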


Visual Processing – Temporal-Pattern Generation

Clip-id   Shot/Key-Frame Pattern
Clip 1    A, B, C, A
Clip 2    C, B, B, A, E, F
Clip 3    F, F, E, E, A, B, D, B, C, A, B
Clip 4    B, C, G, C, A, D, B

For Clip 1 (A B C A), with winsize=3:

Start Point   Two shot-patterns
A             A→B, A→C, A→A
B             B→C, B→A
C             C→A
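The pattern generation for Clip 1 can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `two_shot_patterns` is hypothetical, and it assumes that winsize=3 means each start shot is paired with up to three shots that follow it, keeping each distinct pair once. Under that assumption it reproduces the pattern lists given for the four example clips.

```python
def two_shot_patterns(shots, winsize=3):
    """Return the distinct ordered pairs (x, y) such that y occurs within
    `winsize` shots after x, in order of first appearance.

    Assumption: winsize counts the shots that follow the start point.
    """
    patterns = []
    seen = set()
    for i, first in enumerate(shots):
        # Look ahead at most `winsize` shots from the start point.
        for second in shots[i + 1 : i + 1 + winsize]:
            if (first, second) not in seen:
                seen.add((first, second))
                patterns.append((first, second))
    return patterns


clip1 = ["A", "B", "C", "A"]
for first, second in two_shot_patterns(clip1):
    print(f"{first}→{second}")
```

Deduplicating while preserving order keeps the output comparable with the lists on the slides; a plain set would work equally well for indexing.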

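Visual Processing – Fast-Pattern-Index Tree Construction

Clip-id   Two shot-patterns
Clip 1    A→B, A→C, A→A, B→C, B→A, C→A

Fast Pattern Index Tree after inserting Clip 1 (rows: first shot; columns: second shot; cells: ids of the clips that contain the pattern; "-" marks an empty cell)

     A    B    C    D    E    F    G
A    1    1    1    -    -    -    -
B    1    -    1    -    -    -    -
C    1    -    -    -    -    -    -
D    -    -    -    -    -    -    -
E    -    -    -    -    -    -    -
F    -    -    -    -    -    -    -
G    -    -    -    -    -    -    -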

Visual Processing – Fast-Pattern-Index Tree Construction

Clip-id   Two shot-patterns
Clip 1    A→B, A→C, A→A, B→C, B→A, C→A
Clip 2    C→B, C→A, B→B, B→A, B→E, B→F, A→E, A→F, E→F
Clip 3    F→F, F→E, F→A, E→E, E→A, E→B, E→D, A→B, A→D, B→D, B→B, B→C, D→B, D→C, D→A, B→A, C→A, C→B
Clip 4    B→C, B→G, C→G, C→C, C→A, G→C, G→A, G→D, C→D, C→B, A→D, A→B, D→B

Fast Pattern Index Tree after inserting all four clips (rows: first shot; columns: second shot; cells: ids of the clips that contain the pattern; "-" marks an empty cell)

     A         B         C         D         E         F         G
A    1         1,3,4     1         3,4       2         2         -
B    1,2,3     2,3       1,3,4     3         2         2         4
C    1,2,3,4   2,3,4     4         4         -         -         4
D    3         3,4       3         -         -         -         -
E    3         3         -         3         3         2         -
F    3         -         -         -         3         3         -
G    4         -         4         4         -         -         -
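The index construction can be approximated with an inverted index from each two-shot pattern to the clips that contain it. This is a sketch under stated assumptions, not the authors' FPI tree: a flat dictionary stands in for the tree, the names `build_pattern_index` and `candidate_clips` are hypothetical, winsize=3 is read as "up to three following shots", and ranking clips by the number of shared patterns is an illustrative choice that the slides do not specify.

```python
from collections import defaultdict


def two_shot_patterns(shots, winsize=3):
    """Distinct ordered pairs (x, y) with y at most `winsize` shots after x."""
    return {
        (first, second)
        for i, first in enumerate(shots)
        for second in shots[i + 1 : i + 1 + winsize]
    }


def build_pattern_index(clips, winsize=3):
    """Map each two-shot pattern to the sorted list of clip ids containing it."""
    index = defaultdict(list)
    for clip_id in sorted(clips):
        for pattern in two_shot_patterns(clips[clip_id], winsize):
            index[pattern].append(clip_id)
    return index


def candidate_clips(index, query_shots, winsize=3):
    """Rank clips by how many of the query's patterns they share (assumed scoring)."""
    votes = defaultdict(int)
    for pattern in two_shot_patterns(query_shots, winsize):
        for clip_id in index.get(pattern, []):
            votes[clip_id] += 1
    return sorted(votes, key=lambda clip_id: (-votes[clip_id], clip_id))


# The four example clips from the slides.
clips = {
    1: ["A", "B", "C", "A"],
    2: ["C", "B", "B", "A", "E", "F"],
    3: ["F", "F", "E", "E", "A", "B", "D", "B", "C", "A", "B"],
    4: ["B", "C", "G", "C", "A", "D", "B"],
}
index = build_pattern_index(clips)
print(index[("C", "A")])
print(candidate_clips(index, ["A", "B", "C"]))
```

Looking up a pattern touches only the clips listed in its cell, which is the property that lets a pattern-based index avoid scanning every clip.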

Textual Processing

Step 1. Collect the relevant web URLs
Step 2. Crawl the relevant web pages
Step 3. Extract the keyword features
Step 4. Calculate tf and idf

[Figure: the target video categories pass through Steps 1 to 4 to produce ModelTexture.]


Textual Processing – Collect and Crawl the Relevant Web Pages

Query Term → Google → Top 10 URLs → Crawl the link content

Textual Processing – Extract the Feature-Keywords

Word Set → Stop Word Removal → Word Stemming → Keyword Set

Example text: Hayley Dee Westenra is a New Zealand soprano. Her first internationally released album, Pure, reached No 1 on the UK classical charts in 2003 and has sold more than two million copies worldwide.

Stemmed forms shown on the slide: international, release, album, reach, sell, copy
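The stop-word removal and stemming stages can be illustrated with a toy pipeline. Everything here is an assumption made for illustration: the stop-word list is a tiny hand-picked sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, and neither is claimed to match what the system uses. In particular, a suffix stripper cannot map "sold" to "sell" as the slide does.

```python
import re

# Tiny illustrative stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "the", "is", "on", "in", "and", "has", "her", "more", "than", "no"}


def naive_stem(word):
    """Strip a few common suffixes (toy stand-in for a real stemmer)."""
    for suffix in ("ies", "ed", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            if suffix == "ies":
                return word[:-3] + "y"
            return word[: -len(suffix)]
    return word


def extract_keywords(text):
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(word) for word in words if word not in STOP_WORDS]


print(extract_keywords("Her first album reached No 1 on the UK classical charts"))
```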


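Textual Processing – Match by Feature-Keywords

[Figure: the query is matched by TFIDF against the feature keywords of each concept. The example shows the concepts Baseball and War with the keywords player, score, team, policy, field, force, hit, run, rule, tactic, weapon, pitch, batter and navy.]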
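Matching a query term against per-concept feature keywords can be sketched with a standard TF-IDF weighting. The slides do not give the exact formula, so this sketch assumes tf = count / total keywords in the concept and idf = log(N / df) over N concepts; the function names and the keyword lists are invented for illustration.

```python
import math
from collections import Counter


def build_tfidf(concept_keywords):
    """Return {concept: {keyword: tf * idf}} (assumed weighting, see lead-in)."""
    n_concepts = len(concept_keywords)
    doc_freq = Counter()
    for words in concept_keywords.values():
        doc_freq.update(set(words))
    model = {}
    for concept, words in concept_keywords.items():
        counts = Counter(words)
        model[concept] = {
            word: (count / len(words)) * math.log(n_concepts / doc_freq[word])
            for word, count in counts.items()
        }
    return model


def rank_concepts(model, query_term, k=3):
    """Return up to k concepts whose weight for the query term is positive."""
    scored = [
        (weights.get(query_term, 0.0), concept) for concept, weights in model.items()
    ]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [concept for _, concept in scored[:k]]


# Toy keyword lists, invented for this example.
toy_keywords = {
    "baseball": ["pitch", "batter", "hit", "run", "team"],
    "war": ["navy", "weapon", "force", "tactic", "team"],
}
model = build_tfidf(toy_keywords)
print(rank_concepts(model, "pitch"))
```

With this weighting a keyword that occurs in every concept (here "team") gets idf 0 and cannot discriminate between concepts, so it matches none of them.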

Query Refinement


Experiments 

Dataset

- Visual dataset
  - 13 video concepts
  - 258 video clips with 10464 shots
  - The total duration of the video data is about 20 hours
  - From each concept, 33% of the videos are randomly selected as testing data
- Textual dataset
  - For each concept, the top 2000 keywords from the top 10 Google search results are selected as feature keywords.
  - For each concept, the top 10 keywords from Wikipedia are collected as testing queries.

Experiments – Measurement 

Visual search evaluation

               Returned    Non-returned
Relevant       Correct     Incorrect
Non-relevant   Incorrect   Incorrect

precision = |Correct| / |Returned| * 100%
recall = |Correct| / |Relevant| * 100%

Example

Of 10 returned videos, five have the same concept as the query video, and there are 20 ground-truth videos.
- precision = 5 / 10 * 100% = 50%
- recall = 5 / 20 * 100% = 25%
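The two measures can be checked with a short sketch that reproduces the worked example. The function name `precision_recall` and the video ids are hypothetical; only the formulas come from the slide.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) in percent.

    `returned` is the list of retrieved video ids; `relevant` is the
    collection of ground-truth ids for the query concept.
    """
    relevant = set(relevant)
    correct = sum(1 for video in returned if video in relevant)
    precision = 100.0 * correct / len(returned) if returned else 0.0
    recall = 100.0 * correct / len(relevant) if relevant else 0.0
    return precision, recall


# 20 ground-truth videos; 10 returned, of which 5 are relevant.
ground_truth = [f"rel{i}" for i in range(20)]
returned = ground_truth[:5] + [f"other{i}" for i in range(5)]
print(precision_recall(returned, ground_truth))
```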

Experiments – Measurement 

Textual search evaluation

- “Hit” indicates whether the correct category appears among the returned ones.

  Hit = 100%, if the returned k results contain the query term
        0, otherwise

- Example: given the query term “homerun”, suppose our system returns 3 concepts (baseball, basketball, racing-car). Then “homerun” hits the “baseball” concept.
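The hit measure can be sketched as follows. The names `hit` and `hit_ratio` are hypothetical, and averaging hits over a set of queries to obtain a ratio is an assumption about how the per-query values are aggregated.

```python
def hit(returned_concepts, true_concept, k):
    """Return 100.0 if the query's true concept is among the top k results, else 0.0."""
    return 100.0 if true_concept in returned_concepts[:k] else 0.0


def hit_ratio(results, k):
    """Average hit over (returned_concepts, true_concept) pairs (assumed aggregation)."""
    if not results:
        return 0.0
    return sum(hit(returned, truth, k) for returned, truth in results) / len(results)


# The query term "homerun" belongs to the "baseball" concept.
returned = ["baseball", "basketball", "racing-car"]
print(hit(returned, "baseball", k=3))
```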


Experiment (1) – Visual Search 

[Figures: precision and recall of the visual search.]

Experiment (2) – Textual Search 

[Figure: ratio of hit for the textual search.]

System Prototype – Example


Conclusions 

We propose a hybrid approach that combines textual- and visual-based mining strategies

- Textual-based
  - Google serves as the backend mediator for finding the web pages most relevant to the target query term
  - Users' interests are associated with the target concepts by feature-keyword matching
- Visual-based
  - By exploiting temporal properties, the proposed pattern-based index accelerates the search
  - Pattern-based matching finds the user's desired videos effectively

Future Work 

- The pattern-based index can substantially reduce high-dimensional complexity
- Apply this index structure to other types of multimedia applications
  - Examples: music retrieval, multimedia recommendation

Audio and the textual content in videos can be of immense use ... An advanced video retrieval solution could identify the text ..... Conference on Computer.