ABSTRACT
The contents of documents can be characterized not just by subject matter or topic, but also in terms of more thematic or even stylistic attributes, such as level of detail, emotional vs. unemotional presentation, personal vs. impersonal voice, positive or negative attitude or "spin," and theoretical vs. practical presentation. As these examples imply, such attributes often form continuous dimensions along which documents can be distinguished. In the best case, the appropriate place on these dimensions -- i.e., which documents will be of most value to the user at the current time -- can be determined from the user's current task, prior knowledge, etc. Does the user need more detail, or less, for example? In the current state of the art, it is difficult to compute this automatically. Instead, we propose that users should be enabled to view these dimensions using themometers and to navigate them using themostats -- controls that offer users a choice of documents on the same topic, but with differing levels of detail, positive or negative spin, personal or impersonal voice, etc. The dimensions along which we can provide this kind of control must be easily computable. We have been able to characterize documents with respect to a number of these dimensions using simple statistical measures as well as specialized dictionaries, and we describe these methods herein.