Easy Extend



                               


                       
                           Author: Kay Schluehr
                           Date: 04.05.2006
                           Version: 1.0
                                                                          
                                                             

                                    

Overview

Gallery is an EasyExtend sketch pad. It is used to experiment with the system and introduces a couple of new and almost unrelated
expressions/statements that extends Python in a straightforward manner.

1. repeat - statement

    The inevitable repeat statement. Maybe do-while would have been the better choice?
    But anyway here is the "Hello World" of EasyExtend.

2. on - statement

    Contraction of certain classes of if-statements. It is a showcase of a typical comp.lang.python requirement.

3.  Chainlets and the switch - statement

    The switch statement is another classic but the semantics of Gallerys switch might be surprising because
    it is not imperative but supports OO. A new lightweight object type the so called chainlet is introduced and it's
    usage is compared to more conventional OO solutions.

Used Gallery Grammar for Python2.4

1. repeat - statement

An implementation of the classical repeat-statement known e.g. from Pascal. The repeat-statement bears not much surprises.
Opposed to a while statement the conditional expression follows the suite. The body/suite of the repeat-statement is evaluated
until the test of the until-clause returns true.

Syntax: 

repeat_stmt
::= 'repeat' ':' suite 'until' ':' (NEWLINE INDENT test NEWLINE DEDENT | test NEWLINE )

Translation:

The translation to Python is straightforward:


repeat:
    SUITE
until:
    TEST
 

    ==>
while 1:
    SUITE
    if TEST:
        break

Examples:

gal> x = 0
gal> repeat:
....     x+=1
.... until:
....     x==10
....
gal> x
10

The until-clause is allowed to be inlined :

gal> x = 0
gal> repeat:
....     x+=1
.... until: x==10
....
gal> x
10



2. on - statement

It happens quite often that programmer have to type statements like the following:


var = dictionary.get(key)
if var:
    SUITE
else:
    SUITE


  This becomes tedious if several of such statements get nested. The on-statetement provides a simple shortcut:


on var = dictionary.get(key):
    SUITE
else:
    SUITE


Note that the assignment produces a side-effect and var will always be accessible outside the on-statements scope.

Syntax:  

on_stmt
::= 'on' NAME '=' test ':' suite ['else' ':' suite]

Translation: 

The on-statement transforms like:



on NAME = test:
    SUITE1
else:
    SUITE2


==>   

NAME = test
if NAME:  
    SUITE
else:
    SUITE

Example:

gal> on n0 = find_node(tree, symbol.stmt):
....     on n1 = find_node(n0, symbol.flow_stmt):
....         print n1


Using the semantics of on- and repeat statements it is easy to see how they can be combined and how they do fit with other Python statements.

d = {2:5, 3:9}
x = 0
repeat:
    on m = d.get(x):
        x = m
        break
    x+=1
until: x == 3

assert x == 5



3. Chainlets and the switch-statement

The introduction of a switch-statement was motivated by the search for a programming language construct that supports chainlets naturally. Both the repeat-statement and the on-statement introduced in the previous chapters were trivial excercises. Chainlets on the other hand do not only have a real use and show benefit in certain programming tasks but their implementation also demonstrates a very common and important theme in EasyExtend: chainlets are implemented in a separated Python module that can be imported independently. EasyExtend just implements a syntactical wrapper and passes chainlets into the builtins such as they were a primitive type. Well, they actually are - mostly.

3.1 Chainlets

                                                                


One of the most basic concepts in programming is that of imperative control expressed by if- or switch-statements. They seem to be so simple and natural that it needed quite some time to notice their flaws. The most severe drawback is that they refer to the identity of an object under control. If you write:


if language == Python_2_4:
   BLOCK

with a numerical constant Python_2_4 you see immediately that the condition is evaluated to False whenever you or your customer uses a different Python release. In C programs you often set integer defines like:

                  #define Python_2_3      1
                  #define Python_2_4      2
                  #define Python_2_4_1    4
                       ....

and combine them to extend the condition:


if language & (Python_2_4_2|Python_2_5):
   BLOCK

But you actually have to extend this condition wherever you want to deal with the new release in the existing program instead of adapting the program in just those places where the new release makes a difference. What we need here is an ordering scheme. When release numbers are totally ordered and backwards compatibility among releases is strictly preserved you might rewrite your condition


if language <= Python_2_4_2:
   BLOCK

and everything works fine. But often you can't do this and each version could be a bifurkation point where a new development branch takes off. Then you get a tree or an even more complicated structure when some features vary independently and run through an own release sequence.

Now OO suggested to abandon those handy control structures in favour for more robust but also more heavyweight solutions. Everything is organized in classes and instead of writing a conditional as above you might write a dispatch_on_language() method or alike that is called with the correct language class. A class representing Python_2_5 is derived from a Python_2_4_2 class and if you really have to change the behaviour in Python_2_5 compared to Python_2_4_2 you overwrite the dispatch_on_language() method in your Python_2_5 class. Finally you have just to instantiate the object of the right class and let client objects operate on an abstract representation of your object. Those clients aren't interested in the particular language version. All they have to know is how and when to call the dispatch_on_language method. This OO approach while elegant is sometimes to heavyweight and programmers unlikely change their architecture ( if they created any ;-) for very tiny changes. So a mixture of OO and imperative code is the consequence.

From a technical point of view a method call on an object calls either the method implemented by the objects class or it propagates along the chain of parent classes to the root class ( in a single inheritance simplification ) and calls the first method of the same name and signature it finds on the way up. This is the basic ordering scheme we are looking for. We have generalized our #defines to classes and encapsulated the BLOCKs within methods. But after getting the structural idea why not stripping down the whole class chain and doing a restart with what we already had
 

if language <= Python_2_4_2:
   BLOCK

but now giving the conditional expression consisting of the variable language, the constant Python_2_4_2 and the comparison '<='  a different interpretation and representation than we did in the naive, though runtime efficient C-ish approach?

3.1  Definition

A chainlet is a constant object without internal structure that is organized in hierarchies. A chainlet hierarchy mimicks a class-hierarchy on an object level. Therefore chainlet hierarchies can be implemented by any OO language even if it lacks classes as first order types like C++/Java/C# etc. While a chainlet has no own properties it can be combined to built new chainlets using the operators | ( alternative ) and * ( product ).

Example:

gal> A = Chainlet()
gal> B = A()
gal> B<A            # A is parent of B
True
gal> C = A()
gal> C<A           
# A is parent of C
True
gal> C<=B          
# The pair (B,C) is not element of the parental order relationship
False
gal> B<=C
False


The parental relationship is expressed using the "<" or "<=" operator respectively.
__lt__( chainlet)
Returns True if the calling chainlet is derived from the chainlet passed into __lt__ otherwise it returns False. The parental relationship expressed by__lt__ is irreflexive, antisymmetric and transitive. In particular we have: if A < B and B < C then A < C. The __lt__ relation establishes a partial order on the set of chainlets.
__le__( chainlet)
Like __lt__ but being reflexive: A <=A is always True.

A chainlet has a select method used to create a selector from a chainlet. In languages with restricted polymorphism of switch like C++ the selector must have a different type than a chainlet. It has to be either an integer type, a float type or a char type. Values of other types cannot be passed into a switch. In Python we don't need to restrict the switch arguments but can use chainlets directly.

select( *chainlets)
Seeks for the best approximation of the calling chainlet in the passed tuple of chainlets and returns the approximation. The approximation is evaluated according to the partial order on chainlets. If S and T are chainlets with S<=T and we have a chainlet S' with S<=S'<=T than S' is a better approximation to S than T. If no approximation could be found for the calling chainlet the select method returns None.


Example:

gal> PY = Chainlet()
gal> PY_1X = PY()
gal> PY_24 = PY()
gal> PY_25 = PY_24()
gal> PY_25.select(PY, PY_1X)
PY
gal> PY_25.select(PY,PY_24)
PY_24
gal> PY_24.select(PY_25)  # returns None
gal>


3.2  Chainlet combinations 

A chainlet implements the methods __or__ and __mul__. While the first method can be used to create alternatives the second operator is used to create conjunctions.

__or__( chainlet)
Associative and commutative operator | on chainlets ( "alternative" ). For chainlets A, B and C following rules hold:

           C < A|B  <=>  C < A or C < B
                A|B < C  <=>  A < C or B < C

          The approximation quality of a chainlet alternative is the best approximation of one of it's parts.

__mul__( chainlet)
Associative and non-commutative operator * on chainlets ("product"). The product is used to express an independent specialization - a combination of different features. If some product A*B should be compared with a chainlet C without product structure one assumes that C = C*Top | Top*C where Top is the root chainlet in a chainlet hierarchy. We state following rules:
                                
          A*B < C    <=>  A < C or B < C
                C < A*B  <=>  C = C1*C2 and C1 < A and C2 < B               
         A|B < C*D  <=>  A < C*D or B < C*D

3.3  Why chainlets - revisiting a refactoring example

Chainlets support a kind of "neo-structured" programming without dismissing the value of change. Instead of replacing switches by whole class diagrams you generalize conditional expressions using chainlets. We want to revisit an example given Martin Fowler in his famous refactoring book ( Refactoring, Addison-Wesley 1st edition (June 28, 1999)). The section we revisit here is called "Replacing condition with polymorphism" which is related  to our topic and provides following initial example, translated into Python:


def get_charge(days_rented):
    result = 0
    price_code = get_price_code()
    if price_code == MOVIE.REGULAR:
        result+=2
        if days_rented>2:
            result+= (days_rented-2)*1.5
    elif price_code == MOVIE.NEW_RELEASE:
        result+= days_rented*3
    elif
price_code == MOVIE.CHILDRENS:
        result+=1.5
        if days_rented>3:
            result+= (days_rented-3)*1.5
    return result
       

Instead of refactoring this function into 4 classes ( a Price class and derived classes ChildrenPrice, NewReleasePrice, and RegularPrice ) as suggest by Fowler et al we start with the observation of the common pattern in the charge-algorithm:


def get_charge(days_rented, start, barrier, factor = 1.5):
    result = start      
    if days_rented>barrier:
        result+=(days_rented-barrier)*factor
    return result
     

This algorithm could be slightly extended to incorporate the case handled by a NEW_RELEASE:


def get_charge(days_rented, start, barrier, factor = 1.5):
    result = start      
    if barrier:
        if days_rented>barrier:
            result+=(days_rented-barrier)*factor
    else:
        result+=2*factor
    return result
     

Instead of numerical price codes we are using chainlets instead. Now think about a new price code category for students. Students are most likely not children but
in case of price charging they might be treated as such. We can either adapt the current chainlet selections yet in each place where they are used in the application code but this is not superiour to the classical imperative solution:

  
code = price_code.select(MOVIE.REGULAR, MOVIE.CHILDRENS, MOVIE.NEW_RELEASE)     

                        ====>

code = price_code.select(MOVIE.REGULAR, MOVIE.CHILDRENS|MOVIE.STUDENTS, MOVIE.NEW_RELEASE)     

or we introduce a STUDENTS chainlet derived from a CHILDRENS chainlet where no transformation is needed at all. The select() method has to be nowhere adapted:


STUDENTS = CHILDREN()     

An even better solution would be a reclassification.


REDUCTION   = PRICE_CODE()
STUDENTS    = REDUCTION()
CHILDREN    = REDUCTION()
PUPILS      = REDUCTION()
PENSIONEERS = REDUCTION()

Now you can use all kinds of combinations and special cases:


code = price_code.select(MOVIE.REDUCED)     
code = price_code.select(MOVIE.REDUCED, MOVIE.PUPILS|MOVIE.STUDENTS)     
code = price_code.select(MOVIE.REDUCED, MOVIE.CHILDREN, MOVIE.PENSIONEERS)     
...

A switch over the code chainlet is robust against the introduction of a new chainlet that is derived from any of the chainlets we have introduced before. We just have to adapt those cases where the new chainlet really makes a difference.

3.4   switch

We now define a chainlet aware switch. If the selector expression that is passed into a switch is a chainlet the behaviour of the switch is adapted so that the select method of the chainlet is executed with the selectors of the case branches as arguments. Finally the correct case branch ( if any ) is selected and evaluated. Instead of a default case branch the switch statement contains an optional else branch which is executed if no chainlet matches. If the switch selector is no chainlet but e.g. an int the selector will be compared to the case selectors directly using the comparison operator '=='.

Syntax:  

switch_stmt
::= 'switch' expr ':' NEWLINE INDENT case_stmt DEDENT ['else' ':' suite]
case_stmt ::= 'case' expr ':' suite ('case' expr ':' suite)*

Translation
:
switch EXPR:
    case EXPR1:
          SUITE1
    case EXPR2:
          SUITE2
else:
    SUITE3



==>   
if isChainlet(EXPR):
    SELECT = EXPR.select(EXPR1,EXPR2)
else:
    SELECT = EXPR
if SELECT == EXPR1:
    SUITE1
elif SELECT == EXPR2:
    SUITE2
else:
    SUITE3

Note that the switch statement uses an expr instead of a test grammar node. It's usefull with respect to direct value comparisons in a switch and causes the parser raising a syntax error, if the switch is defined like this:


switch x:
    case x in range(100):
        SUITE
    ....


Chainlets are not particular fast. Each switch needs linear effort in the number of chainlets that have to be compared in the select method. Personally I've never used them in tight loops but mostly for configuration purposes in application programming. The execution model introduced here might be a starting point to find and implement them more efficiently.

Examples for switch:

Using integers as switch selectors. The switch does not perform a select( ):

gal> x = 0; y = 5
gal> switch x:
....    case y-4:
....        print 1
....    case y-5:
....        print 2
.... else:
....    print 3
2
gal>


Using chainlets as switch selectors:

switch code:
    case MOVIE.REGULAR:
       
charge = get_charge(days_rented, 2, 2)
    case MOVIE.REDUCED:
       
charge = get_charge(days_rented, 1.5, 3)
    case MOVIE.NEW_RELEASE:
       
charge = get_charge(days_rented, 0, 0, 3)
else:
    raise ValueError, "Unknown price code %s"%code