ABSTRACT
Traditionally, types describe the internal data manipulated by programs. To accommodate the variety of desired data structures, language designers and type theorists have developed a wide variety of types and type constructors. But not all useful data is in programs; in fact, enormous amounts of it sit on disks or stream by on wires in a dizzying array of encodings and formats. It turns out that many of the types developed for internal data can be used to describe external data: tuples, records, unions, options, and lists come to mind as obvious examples. Perhaps more surprisingly, recursive types, singletons, functions, parametric polymorphism, and dependent types are relevant as well. Using types to describe external data leads naturally to the insight that we can reuse the same type to define an internal data structure and to generate parsing and printing functions to map between the two representations. The PADS project [1] has exploited this idea, building data description languages based on the type structure of C (PADS/C [3]) and of ML (PADS/ML [5]), and exploring the theoretical basis for such languages with the Data Description Calculus (DDC) [4]. Other groups have also leveraged this insight, most closely in the work on DataScript [2] and PacketTypes [6]. Continuing the analogy, it turns out that other concepts from the types world are also relevant to ad hoc data processing, including generic programming, type inference, type isomorphisms, and subtyping.

In this talk, I will describe the domain of ad hoc data processing and explain how types enable precise descriptions of such data. I will then explore the question of type inference, describing quantitative techniques we are currently developing to construct a description of ad hoc data given example instances.
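The idea that one type-like description can yield both a parser and a printer can be illustrated with a minimal sketch. This is not PADS syntax or any PADS API: the names `Desc`, `Int`, `Lit`, `Seq`, and `IPV4` are hypothetical, chosen only for illustration, and the sketch covers just integers, singletons (fixed literals), and tuples.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

DIGITS = "0123456789"


class Desc:
    """Hypothetical base class: a description of an external format."""

    def parse(self, text: str, pos: int) -> Tuple[Any, int]:
        raise NotImplementedError

    def render(self, value: Any) -> str:
        raise NotImplementedError


class Int(Desc):
    """A non-negative decimal integer."""

    def parse(self, text, pos):
        end = pos
        while end < len(text) and text[end] in DIGITS:
            end += 1
        if end == pos:
            raise ValueError(f"expected a digit at offset {pos}")
        return int(text[pos:end]), end

    def render(self, value):
        return str(value)


@dataclass
class Lit(Desc):
    """A singleton: one fixed string that carries no information."""

    token: str

    def parse(self, text, pos):
        if not text.startswith(self.token, pos):
            raise ValueError(f"expected {self.token!r} at offset {pos}")
        return None, pos + len(self.token)

    def render(self, value):
        return self.token


@dataclass
class Seq(Desc):
    """A tuple: fields in order; literals are omitted from the in-memory value."""

    fields: List[Desc]

    def parse(self, text, pos):
        values = []
        for field in self.fields:
            value, pos = field.parse(text, pos)
            if not isinstance(field, Lit):
                values.append(value)
        return tuple(values), pos

    def render(self, value):
        remaining = iter(value)
        return "".join(
            f.render(None if isinstance(f, Lit) else next(remaining))
            for f in self.fields
        )


# One description, used in both directions.
IPV4 = Seq([Int(), Lit("."), Int(), Lit("."), Int(), Lit("."), Int()])

value, end = IPV4.parse("192.168.0.1", 0)
text = IPV4.render(value)
```

Here `value` is the in-memory tuple and `text` is the external form regenerated from it. A real data description language would also need unions, options, lists, dependencies between fields, and error recovery, none of which this sketch attempts.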
REFERENCES
[1] PADS project.
[2]
[3]
[4] Kathleen Fisher, Yitzhak Mandelbaum, David Walker, The next 700 data description languages, Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.2-15, January 11-13, 2006, Charleston, South Carolina, USA
[5] Yitzhak Mandelbaum, Kathleen Fisher, David Walker, Mary Fernandez, Artem Gleyzer, PADS/ML: a functional data description language, Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, January 17-19, 2007, Nice, France
[6] Peter J. McCann, Satish Chandra, Packet types: abstract specification of network protocol messages, Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, p.321-333, August 28-September 01, 2000, Stockholm, Sweden