Automating camera management for lecture room environments
Source
Conference on Human Factors in Computing Systems
Proceedings of the SIGCHI conference on Human factors in computing systems
Seattle, Washington, United States
Pages: 442 - 449
Year of Publication: 2001
ISBN: 1-58113-327-8
Authors
Qiong Liu, Collaboration and Multimedia Systems Group, Microsoft Research, One Microsoft Way, Redmond, WA
Yong Rui, Collaboration and Multimedia Systems Group, Microsoft Research, One Microsoft Way, Redmond, WA
Anoop Gupta, Collaboration and Multimedia Systems Group, Microsoft Research, One Microsoft Way, Redmond, WA
J. J. Cadiz, Collaboration and Multimedia Systems Group, Microsoft Research, One Microsoft Way, Redmond, WA
ABSTRACT
Given rapid improvements in network infrastructure and streaming-media technologies, a large number of corporations and universities are recording lectures and making them available online for anytime, anywhere access. However, producing high-quality lecture videos is still labor intensive and expensive. Fortunately, recent technology advances are making it feasible to build automated camera management systems to capture lectures. In this paper we report on our design, implementation and study of such a system. In contrast to previous work, which has tended to be technology centric, we started with interviews with professional video producers and used their knowledge and expertise to create video production rules. We then targeted technology components that allowed us to implement a substantial portion of these rules, including the design of a virtual video director. The system's performance was compared to that of a human operator via a user study. Results suggest that our system's quality is close to that of a human-controlled system. In fact, most remote audience members could not tell whether the video was produced by a computer or a person.
INDEX TERMS
Primary Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.1 Multimedia Information Systems
Subjects: Video (e.g., tape, disk, DVI)
Additional Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.1 Multimedia Information Systems
Subjects: Audio input/output
I. Computing Methodologies
I.4 IMAGE PROCESSING AND COMPUTER VISION
I.4.8 Scene Analysis
Subjects: Tracking
K. Computing Milieux
K.3 COMPUTERS AND EDUCATION
General Terms: Design, Human Factors, Management, Measurement, Performance, Theory
Keywords: automated camera management, sound source localization, speaker tracking, video production rules, virtual video director