ABSTRACT
We describe a learning-based method for low-level vision problems—estimating scenes from images. We generate a synthetic world of scenes and their corresponding rendered images, modeling their relationships with a Markov network. Bayesian belief propagation allows us to efficiently find a local maximum of the posterior probability for the scene, given an image. We call this approach VISTA—Vision by Image/Scene TrAining. We apply VISTA to the “super-resolution” problem (estimating high frequency details from a low-resolution image), showing good results. To illustrate the potential breadth of the technique, we also apply it in two other problem domains, both simplified. We learn to distinguish shading from reflectance variations in a single image under particular lighting conditions. For the motion estimation problem in a “blobs world”, we show figure/ground discrimination, solution of the aperture problem, and filling-in arising from application of the same probabilistic machinery.