Semantic anomaly detection in online data sources
Source International Conference on Software Engineering archive
Proceedings of the 24th International Conference on Software Engineering table of contents
Orlando, Florida
SESSION: Technical papers: dynamic program analysis table of contents
Pages: 302 - 312  
Year of Publication: 2002
ISBN:1-58113-472-X
Authors
Orna Raz  Carnegie Mellon University, Pittsburgh PA
Philip Koopman  Carnegie Mellon University, Pittsburgh PA
Mary Shaw  Carnegie Mellon University, Pittsburgh PA
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/581339.581378

ABSTRACT

Much of the software we use for everyday purposes incorporates elements developed and maintained by someone other than the developer. These elements include not only code and databases but also dynamic data feeds from online data sources. Although everyday software is not mission critical, it must be dependable enough for practical use. This is limited by the dependability of the incorporated elements.

It is particularly difficult to evaluate the dependability of dynamic data feeds, because they may be changed by their proprietors as they are used. Further, the specifications of these data feeds are often even sketchier than the specifications of software components.

We demonstrate a method of inferring invariants about the normal behavior of dynamic data feeds. We use these invariants as proxies for specifications to perform on-going detection of anomalies in the data feed. We show the feasibility of our approach and demonstrate its usefulness for semantic anomaly detection: identifying occasions when a dynamic data feed is delivering unreasonable values, even though its behavior may be superficially acceptable (i.e., it is delivering parsable results in a timely fashion).
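
The general idea of inferring invariants from a feed's normal behavior and then checking new values against them can be illustrated with a minimal sketch. The abstract does not say which inference technique the paper uses, so the sketch below assumes a simple stand-in: one numeric range per field, set to the mean plus or minus k sample standard deviations of the observed values. The names `RangeInvariant`, `infer_invariants`, and `detect_anomalies`, the threshold `k`, and the stock-quote records are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Dict, Iterable, List, Mapping


@dataclass(frozen=True)
class RangeInvariant:
    """Hypothetical invariant: a field's value lies within [low, high]."""
    field: str
    low: float
    high: float

    def holds(self, record: Mapping[str, float]) -> bool:
        # Assumes the record contains the field; a missing field raises KeyError.
        return self.low <= record[self.field] <= self.high


def infer_invariants(
    training: Iterable[Mapping[str, float]], k: float = 3.0
) -> List[RangeInvariant]:
    """Infer one range per field as mean +/- k sample standard deviations."""
    columns: Dict[str, List[float]] = {}
    for record in training:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)

    invariants = []
    for field, values in columns.items():
        centre = mean(values)
        # stdev needs at least two points; fall back to a zero-width range.
        spread = stdev(values) if len(values) > 1 else 0.0
        invariants.append(
            RangeInvariant(field, centre - k * spread, centre + k * spread)
        )
    return invariants


def detect_anomalies(
    record: Mapping[str, float], invariants: Iterable[RangeInvariant]
) -> List[str]:
    """Return the fields whose inferred invariant the record violates."""
    return [inv.field for inv in invariants if not inv.holds(record)]


# Made-up observations standing in for "normal" feed behavior.
history = [
    {"price": 100.0, "volume": 1000.0},
    {"price": 101.0, "volume": 1100.0},
    {"price": 99.0, "volume": 900.0},
    {"price": 100.5, "volume": 1050.0},
]
invariants = infer_invariants(history)

# A plausible record passes; a parsable but unreasonable price is flagged.
print(detect_anomalies({"price": 100.2, "volume": 1020.0}, invariants))  # []
print(detect_anomalies({"price": 0.20, "volume": 1020.0}, invariants))   # ['price']
```

Both test records are well formed and would parse without error; only the second violates an inferred invariant, which is the distinction between superficially acceptable and semantically reasonable output that the abstract draws.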


