ABSTRACT
Applications that use sensor-based estimates face a fundamental tradeoff between true positives and false positives when examining the reliability of these estimates, a tradeoff that the straightforward notion of accuracy describes inadequately. To address this tradeoff, this paper examines the use of Receiver Operating Characteristic (ROC) curve analysis, a method that has a long history but is under-appreciated in the human-computer interaction research community. We present the fundamentals of ROC analysis, the use of the A' statistic to compute the area under an ROC curve, and the equivalence of A' to the Wilcoxon statistic. We then present several case studies, framed in the context of our work on human interruptibility, demonstrating how ROC analysis can yield better results than analyses based on accuracy. These case studies compare sensor-based estimates with human performance, optimize a feature selection process for the area under the ROC curve, and examine end-user selection of a desirable tradeoff.
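The equivalence between the area under an ROC curve and the Wilcoxon statistic mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the function names and the example scores are hypothetical, and the code assumes that a higher score means the estimator is more confident the example is positive. One function computes the area as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half, and the other integrates the empirical ROC curve with the trapezoid rule.

```python
from itertools import product


def auc_pairwise(pos, neg):
    """Wilcoxon-style estimate: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as 1/2."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def roc_points(pos, neg):
    """Empirical ROC curve as (false positive rate, true positive rate)
    points, obtained by sweeping a threshold from high to low."""
    points = [(0.0, 0.0)]
    for t in sorted(set(pos) | set(neg), reverse=True):
        tpr = sum(p >= t for p in pos) / len(pos)
        fpr = sum(n >= t for n in neg) / len(neg)
        points.append((fpr, tpr))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as sorted points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores, for illustration only.
positives = [0.9, 0.8, 0.6, 0.55]
negatives = [0.7, 0.5, 0.4, 0.3]

area_from_pairs = auc_pairwise(positives, negatives)
area_from_curve = auc_trapezoid(roc_points(positives, negatives))
print(area_from_pairs, area_from_curve)
```

On these hypothetical scores both computations give 0.875, because 14 of the 16 positive-negative pairs are ordered correctly. Unlike accuracy, neither computation depends on choosing a single decision threshold.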