Real-Time Three-path Visual Saliency Model (TVSM)


The visual saliency model (called TVSM and described below) developed at GIPSA-Lab is available as an executable for Windows (XP, Vista, 7, 8) and runs on the CPU and/or an Nvidia GPU.

Download TVSM 32-bit exe file for CUDA 4.2: InstallationTVSM32Cuda42.exe
Download TVSM 64-bit exe file for CUDA 4.2: InstallationTVSM64Cuda42.exe

Download TVSM 32-bit exe file for CUDA 6.5: InstallationTVSM32Cuda65.exe
Download TVSM 64-bit exe file for CUDA 6.5: InstallationTVSM64Cuda65.exe

Input Video example (MP4): example1.mp4, example2.mp4
Output Video example (MP4): example1fus.mp4, example2fus.mp4

You need a CUDA-compatible driver (see the Nvidia driver site) to use this model with GPU acceleration, and the appropriate codecs for your video format. To process one or more videos, first select which pathways to use (a single pathway, two pathways including the static one, or all three pathways), then select how to store the results (as plain-text data for each frame and/or as a video or as one image per frame).

If you use this code, please cite this paper:
Marat S., Rahman A., Pellerin D., Guyader N., Houzet D.
Improving Visual Saliency by Adding Face Feature Map and Center Bias
Cognitive Computation, 5(1):63-75, 2013

- about the model:
- about the user interface (IHM) or the GPU implementation:

Three-path Visual Saliency Model (TVSM)

The bottom-up algorithm implemented here is inspired by the human visual system and models its processing from the retina to the cells of the visual cortex, as shown in the following figure. The model is divided into three distinct pathways: the static pathway, the dynamic pathway, and the face pathway. (More details: 2013_cognitive_comput.pdf)

Block diagram of the proposed visual saliency model, with three saliency maps dedicated to specific features: static, dynamic, and face. The features are computed in parallel pathways, and each pathway produces its own saliency map: M_s, M_d, and M_f. The maps can then be fused either before or after applying the center model, in order to analyze the influence of the center bias. Here, M_scdcf is the final saliency map, which combines all three features with center bias. (More details: 2013_PhD_Rahman.pdf)
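
The fusion step described in the caption can be illustrated with a minimal Python sketch. It is not the TVSM implementation: it assumes the center model is a 2D Gaussian centred on the frame, that it is applied by pixel-wise multiplication to the static and dynamic maps only (one possible reading of the subscript in M_scdcf), and that fusion is a plain pixel-wise mean. The function names and the sigma_frac parameter are hypothetical and chosen for illustration; the fusion actually used in TVSM is described in the cited paper.

```python
import math

def center_bias(height, width, sigma_frac=0.25):
    """Return a height x width Gaussian centred on the frame, peak value 1.0.

    sigma_frac is a hypothetical parameter: the standard deviation
    expressed as a fraction of each frame dimension.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sy, sx = sigma_frac * height, sigma_frac * width
    return [[math.exp(-0.5 * (((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2))
             for x in range(width)]
            for y in range(height)]

def apply_center(saliency, center):
    """Weight a saliency map pixel-wise by the center model."""
    return [[s * c for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(saliency, center)]

def fuse(*maps):
    """Fuse same-sized maps with a pixel-wise mean (illustrative choice only)."""
    n = len(maps)
    return [[sum(values) / n for values in zip(*rows)]
            for rows in zip(*maps)]

def fuse_scdcf(m_s, m_d, m_f):
    """Apply the center model to the static and dynamic maps, then fuse with the face map."""
    center = center_bias(len(m_s), len(m_s[0]))
    return fuse(apply_center(m_s, center), apply_center(m_d, center), m_f)

# Toy 3 x 3 maps standing in for the outputs of the three pathways.
ones = [[1.0] * 3 for _ in range(3)]
m_scdcf = fuse_scdcf(ones, ones, ones)
```

With uniform input maps, the fused value is highest at the frame center and falls off toward the corners, which is the effect the center bias is meant to capture. Fusing before applying the center model would amount to calling apply_center on the output of fuse instead.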


Rahman A.
Face perception in videos: Contributions to a visual saliency model and its implementation on GPUs
PhD thesis, University of Grenoble, France, 2013

Rahman A., Houzet D., Pellerin D., Marat S., Guyader N.
Parallel implementation of a spatio-temporal visual saliency model
Journal of Real-Time Image Processing (JRTIP), Special Issue on Parallel Computing for Real-Time Image Processing, 6: 3-14, 2011

Marat S., Ho Phuoc T., Granjon L., Guyader N., Pellerin D., Guerin-Dugue A.
Modelling spatio-temporal saliency to predict gaze direction for short videos
International Journal of Computer Vision (IJCV), 82(3):231-243, 2009

Other References:

Rahman A., Pellerin D., Houzet D.
Influence of number, location and size of faces on gaze in video
Journal of Eye Movement Research, 7(2):5, 1-11, 2014

Rahman A., Houzet D., Pellerin D.
Visual Saliency Model on Multi-GPU
GPU Computing Gems Emerald Edition, pp. 451-472, 2011