ABSTRACT
Programmers often omit input validation when inputs can appear in many different formats or when validation criteria cannot be precisely specified. To enable validation in these situations, we present a new technique that puts valid inputs into a consistent format and that identifies "questionable" inputs which might be valid or invalid, so that these values can be double-checked by a person or a program. Our technique relies on the concept of a "tope", which is an application-independent abstraction describing how to recognize and transform values in a category of data. We present our definition of topes and describe a development environment that supports the implementation and use of topes. Experiments with web application and spreadsheet data indicate that using our technique improves the accuracy and reusability of validation code and also improves the effectiveness of subsequent data cleaning such as duplicate identification.
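The idea of recognizing values, flagging questionable ones, and putting valid ones into a consistent format can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Tope`, `recognize`, `reformat`, and `classify` are hypothetical, the use of a score between 0 and 1 is an assumption of this sketch, and the phone-number formats and the area-code rule are invented purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


def _score_area_code(area: str) -> float:
    # Illustrative soft constraint (an assumption): an area code starting
    # with 0 or 1 is unusual, so the value is questionable rather than invalid.
    return 0.5 if area[0] in "01" else 1.0


def isa_dashed(s: str) -> float:
    m = re.fullmatch(r"(\d{3})-\d{3}-\d{4}", s)
    return _score_area_code(m.group(1)) if m else 0.0


def isa_paren(s: str) -> float:
    m = re.fullmatch(r"\((\d{3})\) \d{3}-\d{4}", s)
    return _score_area_code(m.group(1)) if m else 0.0


def paren_to_dashed(s: str) -> str:
    d = re.sub(r"\D", "", s)  # keep digits only
    return f"{d[:3]}-{d[3:6]}-{d[6:]}"


@dataclass
class Tope:
    # One recognizer per format; each returns a score in [0, 1].
    formats: Dict[str, Callable[[str], float]]
    # Transformations between formats, keyed by (source, target).
    transforms: Dict[Tuple[str, str], Callable[[str], str]]

    def recognize(self, value: str) -> Tuple[str, float]:
        # Pick the format whose recognizer gives the highest score.
        name = max(self.formats, key=lambda n: self.formats[n](value))
        return name, self.formats[name](value)

    def reformat(self, value: str, target: str) -> Tuple[str, float]:
        source, score = self.recognize(value)
        if score == 0.0 or source == target:
            return value, score  # nothing to transform
        return self.transforms[(source, target)](value), score


def classify(score: float) -> str:
    if score == 1.0:
        return "valid"
    return "invalid" if score == 0.0 else "questionable"


phone = Tope(
    formats={"dashed": isa_dashed, "paren": isa_paren},
    transforms={("paren", "dashed"): paren_to_dashed},
)
```

In this sketch, an application would call `phone.reformat(value, "dashed")` to obtain a consistently formatted value together with its score, then pass the score to `classify` to decide whether the value needs to be double-checked by a person or a program.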