Sanitization's slippery slope: the design and study of a text revision assistant
Source
ACM International Conference Proceeding Series
Proceedings of the 5th Symposium on Usable Privacy and Security
Mountain View, California
SESSION: Tools
Article No. 13  
Year of Publication: 2009
ISBN:978-1-60558-736-3
Authors
Richard Chow  PARC
Ian Oberst  Oregon State University
Jessica Staddon  PARC
Sponsors
Carnegie Mellon CyLab
Google
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1572532.1572550

ABSTRACT

For privacy reasons, sensitive content may be revised before it is released. The revision often consists of redaction, that is, the "blacking out" of sensitive words and phrases. Redaction has the side effect of reducing the utility of the content, often so much that the content is no longer useful. Consequently, government agencies and others are increasingly exploring the revision of sensitive content as an alternative to redaction that preserves more content utility. We call this practice sanitization. In a sanitized document, names might be replaced with pseudonyms and sensitive attributes might be replaced with hypernyms. Sanitization adds to redaction the challenge of determining what words and phrases reduce the sensitivity of content. We have designed and developed a tool to assist users in sanitizing sensitive content. Our tool leverages the Web to automatically identify sensitive words and phrases and quickly evaluates revisions for sensitivity. The tool, however, does not identify all sensitive terms and mistakenly marks some innocuous terms as sensitive. This is unavoidable because of the difficulty of the underlying inference problem and is the main reason we have designed a sanitization assistant as opposed to a fully-automated tool. We have conducted a small study of our tool in which users sanitize biographies of celebrities to hide the celebrity's identity both with and without our tool. The user study suggests that while the tool is very valuable in encouraging users to preserve content utility and can preserve privacy, this usefulness and apparent authoritativeness may lead to a "slippery slope" in which users neglect their own judgment in favor of the tool's.
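
The difference between redaction and sanitization described in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: the replacement table, the sample sentence, and the function names below are hypothetical and written by hand, whereas the tool described in the paper uses the Web to identify sensitive terms.

```python
import re

# Hypothetical, hand-written replacement table (an assumption for this
# sketch). The paper's tool identifies sensitive terms using the Web,
# which is not reproduced here.
REPLACEMENTS = {
    "Jane Doe": "Person A",      # name -> pseudonym
    "neurosurgeon": "doctor",    # sensitive attribute -> hypernym
    "Springfield": "a city",     # place -> hypernym
}


def _pattern(terms):
    # Try longer terms first so multi-word phrases win over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b")


def redact(text, terms):
    """Black out every sensitive term with a fixed marker."""
    return _pattern(terms).sub("[REDACTED]", text)


def sanitize(text, replacements):
    """Replace every sensitive term with its less specific substitute."""
    return _pattern(replacements).sub(lambda m: replacements[m.group(0)], text)


text = "Jane Doe is a neurosurgeon in Springfield."
redacted = redact(text, REPLACEMENTS)
sanitized = sanitize(text, REPLACEMENTS)
print(redacted)
print(sanitized)
```

In this toy case the redacted sentence keeps almost no usable content, while the sanitized one still reads as a sentence about a doctor in a city. Deciding whether such substitutes actually hide the subject's identity is the inference problem the abstract describes, and the sketch does not attempt it.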


