Facilitating discovery on the private web using dataset digests
Source: International Conference on Information Integration and Web-based Applications and Services
Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Linz, Austria
WORKSHOP SESSION: iiWAS 2008 workshops: RED 2008
Pages: 451-455
Year of Publication: 2008
ISBN: 978-1-60558-349-5
Authors
Peter Mork  The MITRE Corporation, McLean, VA
Ken Smith  The MITRE Corporation, McLean, VA
Barbara Blaustein  The MITRE Corporation, McLean, VA
Chris Wolf  The MITRE Corporation, McLean, VA
Keri Sarver  The MITRE Corporation, McLean, VA
Sponsor
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1497308.1497391

ABSTRACT

Whereas strategies for discovering content on the surface web are commonplace, similar strategies for the private web are nonexistent. In this paper we first establish a formal framework for advertising the existence of private web resources that subsumes many existing summarization strategies based on succinct statistical summaries (which we call digests). We then investigate the tradeoff between the data owners' desires to minimize disclosure and the searchers' desires to minimize query error, demonstrating that our techniques are superior to k-anonymity. Finally, we show that our techniques for summarization do, in fact, make it possible to discover private web data resources.

