ABSTRACT
This paper presents a real-time framework for computationally tracking the objects a user visually attends to while navigating interactive virtual environments. In addition to conventional bottom-up (stimulus-driven) features, the framework uses top-down (goal-directed) contexts to predict human gaze. The framework first builds feature maps from preattentive features such as luminance, hue, depth, size, and motion. The feature maps are then integrated into a single saliency map using the center-surround difference operation. This pixel-level bottom-up saliency map is converted to an object-level saliency map using the item buffer. Finally, the top-down contexts are inferred from the user's spatial and temporal behaviors during interactive navigation and are used to select the most plausibly attended object among the candidates produced in the object saliency map. The computational framework was implemented on the GPU and computed a 256 × 256 saliency map in 5.68 ms, fast enough for interactive virtual environments. A user experiment was also conducted to evaluate the prediction accuracy of the visual attention tracking framework against actual human gaze data. The attained accuracy was well supported by the theory of human cognition for visually identifying single and multiple attended targets, especially owing to the addition of top-down contextual information. The framework can be used for perceptually based rendering, such as providing depth-of-field effects and managing level of detail in virtual environments, without employing an expensive eye tracker.
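The bottom-up part of the pipeline described above can be illustrated with a small CPU sketch. It is not the paper's GPU implementation: the box-filter approximation of the center-surround difference, the equal-weight averaging of feature maps, the per-object mean used for aggregation, and all function names (`box_mean`, `center_surround`, `pixel_saliency`, `object_saliency`) are assumptions made for illustration. The top-down context step is omitted.

```python
from collections import defaultdict


def box_mean(img, radius):
    """Mean over each pixel's (2r+1) x (2r+1) neighbourhood, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def center_surround(feature, center_r=0, surround_r=2):
    """Assumed form: |fine-scale mean - coarse-scale mean| per pixel."""
    center = box_mean(feature, center_r)
    surround = box_mean(feature, surround_r)
    return [[abs(c - s) for c, s in zip(crow, srow)]
            for crow, srow in zip(center, surround)]


def pixel_saliency(feature_maps):
    """Combine feature maps with equal weights (an assumption, not the paper's rule)."""
    responses = [center_surround(f) for f in feature_maps]
    h, w = len(responses[0]), len(responses[0][0])
    n = len(responses)
    return [[sum(r[y][x] for r in responses) / n for x in range(w)]
            for y in range(h)]


def object_saliency(saliency, item_buffer, background_id=0):
    """Average pixel saliency per object ID stored in the item buffer."""
    sums, counts = defaultdict(float), defaultdict(int)
    for srow, irow in zip(saliency, item_buffer):
        for s, obj in zip(srow, irow):
            if obj != background_id:
                sums[obj] += s
                counts[obj] += 1
    return {obj: sums[obj] / counts[obj] for obj in sums}


# Toy 5x5 scene: a dim background, one bright pixel (object 1),
# and two background-coloured pixels belonging to object 2.
luminance = [[0.2] * 5 for _ in range(5)]
luminance[2][2] = 1.0

item_buffer = [[0] * 5 for _ in range(5)]
item_buffer[2][2] = 1
item_buffer[0][0] = item_buffer[0][1] = 2

scores = object_saliency(pixel_saliency([luminance]), item_buffer)
most_salient = max(scores, key=scores.get)  # object 1, the bright pixel
```

In this toy scene the bright object stands out from its surround and therefore receives the highest object-level score, while the object that matches the background scores low. Further feature maps (hue, depth, size, motion) would be passed in the same list as `luminance`.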