Event-Based Analysis of Video

Lihi Zelnik-Manor   and  Michal Irani
CVPR 2001

Abstract
Dynamic events can be regarded as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences (possibly of different lengths) based on their behavioral content. This measure is non-parametric and can thus handle a wide range of dynamic events. We use this measure for isolating and clustering events within long continuous video sequences. This is done without prior knowledge of the types of events, their models, or their temporal extent. An outcome of such a clustering process is a temporal segmentation of long video sequences into event-consistent sub-sequences, and their grouping into event-consistent clusters.
Our event representation and associated distance measure can also be used for event-based indexing into long video sequences, even when only one short example-clip is available. However, when multiple example-clips of the same event are available (either as a result of the clustering process, or given manually), these can be used to refine the event representation, the associated distance measure, and accordingly the quality of the detection and clustering process.

Click here for full paper ZelnikIrani.CVPR2001.ps.zip (5591K)  or ZelnikIrani.CVPR2001.pdf (678K)
 
 


Clustering different human activities

 


Representative frames of a 6000 frame long video sequence, displaying different people wearing different clothes performing different activities:



WaveImage    WalkImage


Jog Image   WaveImage2              


RollImage   WalkImage2                                




Clustering results displayed on the time axis (top colored bar) vs. ground-truth information (bottom colored bar).

Outdoor Clustering Reesult



Short example clip (2657K)
                                                               

                                     



Full length sequence (54022K)

 



Clustering actions of a tennis player



 

Punch-Kick-Duck Sequence


 


Action Based Video Synthesis

Raw Footage (punch-duck-kick).

Guide and Synthetic Sequence.

 

Classifying dynamic textures

Water Mix Result

 


Questions and comments should be addressed to: lihi@wisdom.weizmann.ac.il

 
 

Lihi Zelnik-Manor

Computer Vision Group

Department of Computer Science and Applied Mathematics

The Weizmann Institute of Science
Rehovot 76100, Israel