ABSTRACT
We are developing technology for generating English textual summaries of time-series data in three domains: weather forecasts, gas-turbine sensor readings, and hospital intensive care data. Our weather-forecast generator is currently operational and is used daily by a meteorological company. We generate summaries in three steps: (a) selecting the most important trends and patterns to communicate; (b) mapping these patterns onto words and phrases; and (c) generating actual texts based on these words and phrases. In this paper we focus on step (a), selecting the information to communicate, and describe how we perform it using modified versions of standard data analysis algorithms such as segmentation. The modifications arose out of empirical work with users and domain experts, and can all be regarded as applications of the Gricean maxims of Quality, Quantity, Relevance, and Manner, which describe how a cooperative speaker should behave in order to help a hearer correctly interpret a text. The Gricean maxims are perhaps a key element in adapting data analysis algorithms to communicate information effectively to human users, and should be considered by other researchers working on this problem.
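To make step (a) concrete, the following is a minimal sketch of a standard bottom-up piecewise-linear segmentation, the kind of unmodified data analysis algorithm the abstract refers to. It is not the authors' implementation and does not include the Gricean modifications the paper describes. The function names, the error measure (largest deviation from the line joining a segment's end points), the threshold, and the sample wind speeds are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # inclusive start and end indices into the series


def interpolation_error(values: List[float], start: int, end: int) -> float:
    """Largest vertical distance between the data and the straight line
    joining the points at `start` and `end`."""
    if end - start < 2:
        return 0.0  # no interior points, so the line fits exactly
    slope = (values[end] - values[start]) / (end - start)
    return max(
        abs(values[i] - (values[start] + slope * (i - start)))
        for i in range(start + 1, end)
    )


def bottom_up_segments(values: List[float], max_error: float) -> List[Segment]:
    """Greedily merge adjacent segments while the merged line stays
    within `max_error` of every data point it covers."""
    if len(values) < 2:
        return []
    # Start from the finest segmentation: one segment per pair of neighbours.
    segments = [(i, i + 1) for i in range(len(values) - 1)]
    while len(segments) > 1:
        # Cost of merging each adjacent pair of segments.
        costs = [
            interpolation_error(values, segments[k][0], segments[k + 1][1])
            for k in range(len(segments) - 1)
        ]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break  # every remaining merge would exceed the error budget
        segments[best:best + 2] = [(segments[best][0], segments[best + 1][1])]
    return segments


# Hypothetical hourly wind speeds (made-up numbers, not from the paper).
wind_speed = [10.0, 12.0, 14.0, 16.0, 16.0, 16.0, 16.0, 13.0, 10.0, 7.0]
trends = bottom_up_segments(wind_speed, max_error=0.5)
print(trends)  # [(0, 3), (3, 6), (6, 9)]
```

On this sample the series is reduced to three trends (a rise, a steady period, and a fall), which are the kind of units that steps (b) and (c) would then put into words. A larger `max_error` yields fewer, coarser segments.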