| Language support for processing distributed ad hoc data |
Source
|
International Conference on Principles and Practice of Declarative Programming
Proceedings of the 11th ACM SIGPLAN conference on Principles and practice of declarative programming
Coimbra, Portugal
SESSION: Distribution
Pages 243-254
Year of Publication: 2009
ISBN: 978-1-60558-568-0
Authors
Kenny Q. Zhu, Shanghai Jiao Tong University, Shanghai, China
Daniel S. Dantas, Princeton University, Princeton, NJ, USA
Kathleen Fisher, AT&T Labs Research, Florham Park, NJ, USA
Limin Jia, University of Pennsylvania, Philadelphia, PA, USA
Yitzhak Mandelbaum, AT&T Labs Research, Florham Park, NJ, USA
Vivek Pai, Princeton University, Princeton, NJ, USA
David Walker, Princeton University, Princeton, NJ, USA
ABSTRACT
This paper presents the design, theory and implementation of Gloves, a domain-specific language that allows users to specify the provenance (the derivation history starting from the origins), syntax and semantic properties of collections of distributed data sources. In particular, Gloves specifications indicate where to locate desired data, how to obtain it, when to get it or to give up trying, and what format it will be in on arrival. The Gloves system compiles such specifications into a suite of data-processing tools including an archiver, a provenance tracking system, a database loading tool, an alert system, an RSS feed generator and a debugging tool. In addition, the system generates description-specific libraries so that developers can create their own applications. Gloves also provides a generic infrastructure so that advanced users can build new tools applicable to any data source with a Gloves description. We show how Gloves may be used to specify data sources from two domains: CoMon, a monitoring system for PlanetLab's 800+ nodes, and Arrakis, a monitoring system for an AT&T web hosting service. We show experimentally that our system can scale to distributed systems the size of CoMon. Finally, we provide a denotational semantics for Gloves and use this semantics to prove two important theorems. The first shows that our denotational semantics respects the typing rules for the language, while the second demonstrates that our system correctly maintains the provenance.
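As a rough illustration of the four questions a description answers (where the data lives, how to obtain it, when to fetch or give up, and what format to expect), the following Python sketch models a feed description and a result that carries its provenance. It is not Gloves syntax, which this abstract does not show; `FeedSpec`, `Item`, `run_once`, the example URLs and the fake fetcher are all hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class FeedSpec:
    """Hypothetical stand-in for one feed description (not Gloves syntax)."""
    sources: Tuple[str, ...]        # where to locate the data
    period_s: int                   # when to fetch: intended gap between rounds (unused by run_once)
    timeout_s: float                # when to give up on a single fetch
    parse: Callable[[str], object]  # what format the data arrives in


@dataclass(frozen=True)
class Item:
    """A fetched value paired with where and when it came from."""
    source: str
    scheduled_at: int
    value: Optional[object]  # None when the fetch or the parse failed
    error: Optional[str]     # reason for the failure, if any


def run_once(spec: FeedSpec,
             fetch: Callable[[str, float], str],
             now: int) -> List[Item]:
    """Perform one fetch round, recording provenance for successes and failures."""
    items = []
    for src in spec.sources:
        try:
            raw = fetch(src, spec.timeout_s)  # how to obtain it is delegated to `fetch`
            items.append(Item(src, now, spec.parse(raw), None))
        except Exception as exc:
            items.append(Item(src, now, None, f"{type(exc).__name__}: {exc}"))
    return items


def fake_fetch(url: str, timeout_s: float) -> str:
    """Assumed fetcher for the demo: one node answers, the other times out."""
    if "down" in url:
        raise TimeoutError(f"no reply within {timeout_s}s")
    return "load=0.42"


def parse_load(raw: str) -> float:
    """Assumed record format for the demo: a single `load=<float>` line."""
    key, _, val = raw.partition("=")
    if key != "load":
        raise ValueError(f"unexpected record: {raw!r}")
    return float(val)


spec = FeedSpec(
    sources=("http://node1.example/stats", "http://down.example/stats"),
    period_s=300,
    timeout_s=10.0,
    parse=parse_load,
)
items = run_once(spec, fake_fetch, now=0)
```

Keeping failed fetches in the result, instead of dropping them, is what would let downstream tools such as an archiver or an alert system report which source failed and when.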