ABSTRACT
Human visual perception can recognize a wide range of targets under challenging conditions, but has limited throughput. Machine vision and automatic content analytics can process images at high speed, but suffer from inadequate recognition accuracy for general target classes. In this paper, we propose a new paradigm to explore and combine the strengths of both systems. A single-trial EEG-based brain-computer interface (BCI) subsystem is used to detect objects of interest of arbitrary classes from an initial subset of images. The EEG detection outcomes are used as input to a graph-based pattern mining subsystem to identify, refine, and propagate the labels to retrieve relevant images from a much larger pool. The combined strategy is unique in its generality, robustness, and high throughput. It has great potential for advancing the state of the art in media retrieval applications. We have evaluated the proposed system and demonstrated significant performance gains with multiple and diverse image classes over several data sets, including images from the Internet (Caltech 101) and remote sensing images. We also present insights learned from the experiments and discuss future research directions.
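The label-propagation step summarized above can be illustrated with a minimal sketch. It is not the system's actual algorithm: it assumes a generic scheme in the style of manifold ranking, in which initial relevance scores (standing in for EEG detection outcomes) are diffused over a Gaussian-kernel similarity graph built from image features. The function name `propagate_scores`, the parameters `sigma`, `alpha`, and `iterations`, and the toy feature vectors are all hypothetical.

```python
import math


def propagate_scores(features, initial_scores, sigma=1.0, alpha=0.5, iterations=50):
    """Diffuse initial relevance scores over a similarity graph of images.

    Illustrative assumption: iterates f <- alpha * S f + (1 - alpha) * y,
    where S is the symmetrically normalized affinity matrix and y holds
    the initial scores.
    """
    n = len(features)
    # Gaussian-kernel affinity between every pair of feature vectors (no self-loops).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist_sq = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                weights[i][j] = math.exp(-dist_sq / (2.0 * sigma ** 2))
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degrees = [sum(row) for row in weights]
    normalized = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            scale = degrees[i] * degrees[j]
            if scale > 0.0:
                normalized[i][j] = weights[i][j] / math.sqrt(scale)
    # Iterative propagation, anchored to the initial scores at every step.
    scores = list(initial_scores)
    for _ in range(iterations):
        scores = [
            alpha * sum(normalized[i][j] * scores[j] for j in range(n))
            + (1.0 - alpha) * initial_scores[i]
            for i in range(n)
        ]
    return scores


# Toy data: two well-separated clusters of three "images" each.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
# Hypothetical detector output: only image 0 was flagged as a target.
initial_scores = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

scores = propagate_scores(features, initial_scores)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy setup the single flagged image lends score to its near neighbors, so unlabeled images from the same cluster rank above those from the distant cluster. A larger `alpha` lets scores spread further from the initial detections, while a smaller one keeps the result closer to the detector output.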