Monday, February 3, 2014

SICP - Compound Data and Data Abstraction

Due to ongoing technical issues with my work computer, I'm shifted gears from .NET and back to the MIT lecture series. Today's lecture dealt with data abstraction.  I found this to be an interesting lecture, particularly the discussion of constructors and selectors. I couldn't help but mentally connect it constructors and properties in a .NET context.  Some more LISP today though I didn't get a chance to try and run it as I haven't installed the LISP interpreter on my laptop, and I am not in the mood to deal with installing it in linux. If I still don't have my PC back from IT tomorrow I might think about installing it.



Also interesting was the bit in the last five minutes or so that dealt with defining pairs (CONS, CAR, CDR) strictly in terms of a procedure, effectively making pairs out of "nothing".

Kept my notes in Geany today, whoop.  Here is the note dump:

Compound Data

So far we have used abstraction to create a layered system. We use procedures and we don't really care about how these procedures are implemented.

Task of building things is separated from task of implementing the parts.

Large systems will use many layers of abstraction.

Example: system that does arithmetic on rational numbers... Fraction arithmetic

(MAKE-RAT N D)
RETURN A {RATIONAL}

(NUMER {RATIONAL})
RETURNS A NUMERATOR

(DENOM {RATIONAL})
RETURNS A DENOMINATOR

(DEFINE (+RAT X Y)
(MAKE-RAT
(+ (* (NUMER X)(DENOM Y))
(* (NUMER Y)(DENOM X)))
(* (DENOM X)(DENOM Y))))

(DEFINE (*RAT X Y)
(MAKE-RAT
(* (NUMER X)(NUMER Y))
(* (DENOM X)(DENOM Y))))

At this point, we haven't acutally defined the {RATIONAL} data object.

MAKE-RAT is a constructor
NUMER and DENOM are selectors

The reason to do this is because it allows us to think of rational numbers in the same way that we think of them mathematically. Breaking them down makes everything more complicated. By abstracting the data concept of the rational number, we can program in the same was that we think.

In LISP, the glue for creating compound data objects is the list structure.

CONS takes two arguements x and y and returns a pair, first part is x and second part is y. 

CAR returns first part of pair, CDR returns second part of pair.

(CONS 2 3) 

    |
  [ | ]
  /   \
 2     3

Box and pointer notation. Beginning to think this is going to be the building block of binary trees and lists and such...

(DEFINE (MAKE-RAT N D)
(CONS N D))

(DEFINE (NUMER X)
(CAR X))

(DEFINE (DENOM X)
(CDR X))

1/2 + 1/4 = ?
(DEFINE A (MAKE-RAT 1 2))
(DEFINE B (MAKE-RAT 1 4))
(DEFINE ANS (+RAT A B))
(NUMER ANS) -> 6
(DENOM ANS) -> 8

Our procedure does not yet simplify the answer. {UUUGH FFS WHY IS THE VIDEO CRAPPING OUT?!?!?!?!?!?}

Could assume that reducing to simplest terms is Georges problem, implement in MAKE-RAT:
(DEFINE (MAKE-RAT N D)
   (LET ((G (GCD N D)))
(CONS (/ N G)
     (/ D G))))
     
Rational number arithmetic was implemented using pairs, with a layer of abstraction between the pairs and the arithmetic functions. The MAKE-RAT constructure and the two selectors were the implementation of this abstraction layer.  Separated the use of the objects from the representation of the objects. Methodology is called data abstraction.

Possible to implement +RAT without data abstraction:

(DEFINE (+RAT X Y)
(CONS (+ (* (CAR X)(CDR Y))
    (* (CAR Y)(CDR X)))
 (* (CDR X)(CDR Y))))
 
This ignores the problem of reducing to simplest terms. The POINT of seperating use from representation is naming.  Nowhere in the non-abstracted system does there actually exist the idea of a rational number. 

When data is abstracted, it can be represented different ways. If a system is more abstract, it is more flexible.


Designing all your code before implementing is probably not the axiom of anyone who has implemented large systems.

LET vs DEFINE : LET establishes a local name (variable).

Another example, Points in a plain:

(DEFINE (MAKE-VECTOR X Y)(CONS X Y))
(DEFINE (XCOR P)(CAR P))
(DEFINE (YCOR P)(CDR P))

We can represent line segments in much the same way

(DEFINE (MAKE-SEG P Q)(CONS P Q))
(DEFINE (SEG-START S)(CAR S))
(DEFINE (SEG-END S)(CDR S))

(DEFINE (MIDPOINT S)
(LET ((A (SEG-START S))
 (B (SEG-END S)))
(MAKE-VECTOR
(AVERAGE (XCOR A)(XCOR B))
(AVERAGE (YCOR A)(YCOR B)))))

(DEFINE (LENGTH S)
(LET 
 ((DX (- (XCOR (SEG-END S))
 (XCOR (SEG-START S))))
  (DY (- (YCOR (SEG-END S))
 (YCOR (SEG-START S)))))
 (SQRT (+ (SQUARE DX)
  (SQUARE DY)))))

Layered system:
  
SEGMENTS
---------------------------------------
abstraction: MAKE-SEG etc.
---------------------------------------
VECTORS
---------------------------------------
abstraction: MAKE-VECTOR, XCOR, YCOR
---------------------------------------
PAIRS

SEGMENT
           |
         [ | ]
        /     \
   VECTOR      VECTOR
     |            |
   [ | ]        [ | ]
   /   \        /   \
  1     2      2     3

 Closure: The means of combination in a system are such that when things are combined, those combinations can then be combined in the same way. Pairs of pairs or arrays of arrays.

 Non-abstracted LENGTH is a big hairy mess with lots of CAR and CDR calls, is much harder to read and understand. A change in the implementation of Points creates a huge problem with code maintenance. Lol @ punching George in the mouth...

 Question about objects more complex than pairs. Next lecture will have discussion of more complex constructions. Oy vey the Bach synthesizer music is making my eyes bleed...

 Abstract data. In our rational number case, we don't care how MAKE-RAT, DENOM, and NUMER are implemented, all we care about is that the three procedures can fulfill the axiom:

 if x = (MAKE-RAT N D)
then 
(NUMER N)      N   
---------  =  ---
(DENOM D)      D


 We can build pairs out of nothing at all:
 AXIOM OF PAIRS:
 For any x and y
(car (cons x y)) is x
(cdr (cons x y)) is y


 (DEFINE (CONS A B)
(LAMBDA (PICK)
(COND ((= PICK 1) A)
 ((= PICK 2) B))))
 
(DEFINE (CAR X)(X 1))
(DEFINE (CDR X)(X 2))

In order to show that this representation is accurate, all we have to show is that is satisfies the axiom.

(CAR (CONS 37 49))
|
(CAR (LAMBDA (PICK)
(COND ((= PICK 1) 37)
 ((= PICK 2) 49))))
|
((LAMBDA (PICK)
(COND ((= PICK 1) 37)
     ((= PICK 2) 49)))
1)
|
(COND ((= 1 1) 37)
 ((= 1 2) 49))
|
37

CONS, CAR, and CDR are primatives, but they could actually be implemented as above. The point is that this demonstrates the blurred line between data and procedures. Will see that more and more.

No comments:

Post a Comment