2.1.3 What Is Meant by Data?
- View
- Revisions

We began the rational-number implementation in section 2.1.1 by implementing the rational-number operations add-rat
, sub-rat
, and so on in terms of three unspecified procedures: make-rat
, numer
, and denom
. At that point, we could think of the operations as being defined in terms of data objects — numerators, denominators, and rational numbers — whose behavior was specified by the latter three procedures.
But exactly what is meant by data? It is not enough to say “whatever is implemented by the given selectors and constructors.” Clearly, not every arbitrary set of three procedures can serve as an appropriate basis for the rational-number implementation. We need to guarantee that, if we construct a rational number x from a pair of integers n and d, then extracting the numer
and the denom
of x and dividing them should yield the same result as dividing n by d. In other words, make-rat
, numer
, and denom
must satisfy the condition that, for any integer n and any non-zero integer d, if x is (make-rat n d
), then
In fact, this is the only condition make-rat
, numer
, and denom
must fulfill in order to form a suitable basis for a rational-number representation. In general, we can think of data as defined by some collection of selectors and constructors, together with specified conditions that these procedures must fulfill in order to be a valid representation.[5]
This point of view can serve to define not only “high-level” data objects, such as rational numbers, but lower-level objects as well. Consider the notion of a pair, which we used in order to define our rational numbers. We never actually said what a pair was, only that the language supplied procedures cons
, car
, and cdr
for operating on pairs. But the only thing we need to know about these three operations is that if we glue two objects together using cons
we can retrieve the objects using car
and cdr
. That is, the operations satisfy the condition that, for any objects x
and y
, if z
is (cons x y)
then (car z)
is x
and (cdr z)
is y
. Indeed, we mentioned that these three procedures are included as primitives in our language. However, any triple of procedures that satisfies the above condition can be used as the basis for implementing pairs. This point is illustrated strikingly by the fact that we could implement cons
, car
, and cdr
without using any data structures at all but only using procedures. Here are the definitions:
(define (cons x y)
(define (dispatch m)
(cond ((= m 0) x)
((= m 1) y)
(else (error "Argument not 0 or 1 -- CONS" m))))
dispatch)
(define (car z) (z 0))
(define (cdr z) (z 1))
This use of procedures corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a valid way to represent pairs is to verify that these procedures satisfy the condition given above.
The subtle point to notice is that the value returned by (cons x y)
is a procedure — namely the internally defined procedure dispatch
, which takes one argument and returns either x
or y
depending on whether the argument is 0 or 1. Correspondingly, (car z)
is defined to apply z
to 0. Hence, if z
is the procedure formed by (cons x y)
, then z
applied to 0 will yield x
. Thus, we have shown that (car (cons x y))
yields x
, as desired. Similarly, (cdr (cons x y))
applies the procedure returned by (cons x y)
to 1, which returns y
. Therefore, this procedural implementation of pairs is a valid implementation, and if we access pairs using only cons
, car
, and cdr
we cannot distinguish this implementation from one that uses “real” data structures.
The point of exhibiting the procedural representation of pairs is not that our language works this way (Scheme, and Lisp systems in general, implement pairs directly, for efficiency reasons) but that it could work this way. The procedural representation, although obscure, is a perfectly adequate way to represent pairs, since it fulfills the only conditions that pairs need to fulfill. This example also demonstrates that the ability to manipulate procedures as objects automatically provides the ability to represent compound data. This may seem a curiosity now, but procedural representations of data will play a central role in our programming repertoire. This style of programming is often called message passing, and we will be using it as a basic tool in chapter 3 when we address the issues of modeling and simulation.
Exercises
Comments
Post new comment