Separating Transparent Layers of Repetitive Dynamic Behaviors

Bernard Sarel and Michal Irani
To appear in ICCV'2005

Abstract
In this paper we present an approach for separating two transparent layers of complex non-rigid scene dynamics. The dynamics in one of the layers is assumed to be repetitive, while the other layer can have arbitrary dynamics. Such repetitive dynamics include, among others, human actions in video (e.g., a walking person) or a repetitive musical tune in audio signals. We use a global-to-local space-time alignment approach to detect and align the repetitive behavior. Once aligned, a median operator applied to space-time derivatives is used to recover the intrinsic repeating behavior and separate it from the other transparent layer. We show results on synthetic and real video sequences. In addition, we show the applicability of our approach to separating mixed audio signals (from a single source).

The paper can be found here (PDF, PS).
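As a rough illustration of the median step mentioned in the abstract, here is a minimal sketch on a 1D signal rather than on video. It assumes that the repetition period is already known and that the repetitions are perfectly aligned; the paper obtains the alignment with its global-to-local space-time alignment, which is not reproduced here. The function name `recover_repeating_layer` and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import median


def recover_repeating_layer(mixed, period):
    """Estimate one period of the repeating layer, up to an additive constant.

    Assumes (for illustration only) a known period and perfect alignment.
    """
    # Temporal derivative of the mixed signal (forward difference).
    deriv = [mixed[t + 1] - mixed[t] for t in range(len(mixed) - 1)]
    # Number of complete repetitions available in the derivative signal.
    reps = len(deriv) // period
    # At each phase of the period, take the median derivative across repetitions.
    # Derivatives of the other layer act as outliers if they are sparse.
    med = [
        median(deriv[p + k * period] for k in range(reps))
        for p in range(period)
    ]
    # Integrate the median derivatives to get one period (first sample set to 0).
    out = [0.0]
    for d in med[:-1]:
        out.append(out[-1] + d)
    return out


# Toy data: a repeating pattern plus a second layer with a single jump.
pattern = [0, 2, 5, 3]
repeating = pattern * 5
other = [0] * 7 + [10] * 13
mixed = [r + o for r, o in zip(repeating, other)]

print(recover_repeating_layer(mixed, period=4))  # [0.0, 2.0, 5.0, 3.0]
```

In this toy case the second layer changes at only one sample, so its derivative is an outlier that the median discards. Subtracting the tiled recovered period from the mixed signal would then give an estimate of the second layer, again up to an additive constant.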

 

 

Background

Previously, we dealt with separating transparent layers in video sequences containing one non-rigid layer, while the other layer had to be rigid, up to a 2D parametric transformation (for details see our ECCV’2004 work). In this paper we lift that constraint so that one layer can include arbitrary non-rigid dynamics, while the other layer has repetitive non-rigid dynamics, thus achieving transparent layer separation of two non-rigid dynamic scenes.

 


Some Example Results

 

Video Examples

 

Example 1: Mixed Video Sequence

In this example we mixed two video sequences. We took a video of a man walking against a uniform background and superimposed it on another video sequence of running water in a small garden creek.  The walking man is the repetitive non-rigid dynamic scene, while the running water is a highly non-rigid dynamic scene.
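To make the synthetic setup concrete, the sketch below superimposes two grayscale videos with a per-pixel weighted sum. The page does not state the mixing formula, so the additive model and the weight `alpha` are assumptions, and the function name `superimpose` is hypothetical.

```python
def superimpose(video_a, video_b, alpha=0.5):
    """Blend two equally sized grayscale videos frame by frame.

    Each video is a list of frames; each frame is a list of rows of floats.
    The weighted-sum model is an assumption made for illustration.
    """
    if len(video_a) != len(video_b):
        raise ValueError("videos must have the same number of frames")
    return [
        [
            [alpha * pa + (1 - alpha) * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)
        ]
        for frame_a, frame_b in zip(video_a, video_b)
    ]


# Two tiny one-frame "videos" of size 1x2.
walking_man = [[[0.0, 1.0]]]
running_water = [[[1.0, 1.0]]]
print(superimpose(walking_man, running_water))  # [[[0.5, 1.0]]]
```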

 

The following MPEG file shows the input sequence and the two recovered video sequence layers.

 

 

 

 

Walking Man and Running Water (4MB MPEG)


 

 

Example 2: Real Video Sequence

In this example we show a real video transparency sequence. The scene was taken at the entrance to one of the buildings here at the Weizmann Institute of Science (see sketch below).

 

 

A camera was placed pointing at a transparent swiveling entrance door to a building. The view of the camera includes an arbitrary non-rigid scene reflected in the swinging door (people playing basketball and their background, represented by the blue ray), and a repetitive non-rigid scene perceived through the swinging door (a man jumping up and down, represented by the red ray). The door was swiveled slightly back and forth so the reflected scene (basketball players and background) is imaged as moving across the jumping man scene. This is a difficult sequence with non-rigid motions, texture, and noise.

 

 

The following MPEG file shows the input sequence and the two recovered video sequence layers.

 

 

Jumping Man (4MB MPEG)

 

 

 

Audio Examples

 

Note #1: In this audio section of the results, the images below are just for illustration; the separation is done on the audio files.

Note #2: If you download the high quality audio files, make sure you have adequate speakers.

 

 

Example 3: Mixed Audio – Bolero and a Talking Man

In this example we mixed a segment of the masterpiece “Bolero” by Ravel (which we repeated several times) with a recording of a man reciting a poem by T. S. Eliot, “Gus: The Theatre Cat”.
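The following toy sketch mimics this kind of mixture in 1D: a short segment is repeated several times, a second signal is added, and the repetition period is then estimated from the mixture alone. The estimator used here, picking the lag with the smallest mean squared self-difference, is a simple stand-in chosen for illustration and is not the global-to-local alignment used in the paper. The name `estimate_period` and all sample values are hypothetical.

```python
def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] at which the signal best matches
    a shifted copy of itself (smallest mean squared difference)."""
    best_lag, best_cost = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(signal) - lag  # number of overlapping samples at this lag
        cost = sum((signal[t] - signal[t + lag]) ** 2 for t in range(n)) / n
        if cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag


# A repeated segment (standing in for the repeated music) plus a slowly
# varying second signal (standing in for the other audio layer).
segment = [0, 3, 1, 4, 2]
repeated = segment * 6
other = [0.01 * t for t in range(len(repeated))]
mixed = [r + o for r, o in zip(repeated, other)]

print(estimate_period(mixed, min_lag=2, max_lag=12))  # 5
```

On real audio the second layer is far from slowly varying, so a simple self-difference like this would be much less reliable than in the toy case.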

 

The following MP3/WAV files contain the input audio and the two recovered audio layers.

 

 

The Input Audio

 

“Bolero” + Recited Poem about a Cat

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV file)

The First Recovered Audio Track

 

“Bolero”

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV)

The Second Recovered Audio Track

 

Recited Poem about a Cat

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV)

 

 

 

Example 4: Mixed Audio – Bolero and Another Song

In this example we mixed a segment of the masterpiece “Bolero” by Ravel (which we repeated several times) with a song by Nat King Cole, “When I Fall in Love”.

 

The following MP3/WAV files contain the input audio and the two recovered audio layers.

 

 

The Input Audio

 

“Bolero” + Song by Nat King Cole

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV file)

The First Recovered Audio Track

 

“Bolero”

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV)

The Second Recovered Audio Track

 

Song by Nat King Cole

 

 

1.5 MB mp3 file

 

(higher quality 35MB WAV)

 
