Module eetoken
This module defines the EEToken class. The EEToken class is essentially a namespace for all kinds
of token definitions and some tokenizer flags (NoNumber, NoName, ...) that are meant for
so-called "tiny" fibers (small languages defined within the EasyExtend framework that neither extend
Python nor define sets of the aforementioned tokens). Tokens are defined as triples:

    cls.TOKEN-NAME = (TOKEN-ID, TOKEN-VALUE, TOKEN-TYPE)

The token.py module of the standard library defines only TOKEN-NAME = TOKEN-ID correspondences, whereas
parser modules create TOKEN-NAME = TOKEN-VALUE correspondences.
Parser modules such as PyGen.py or DFAParser.py, as well as the tokenizer and cst modules, typically access
ids directly at module level using token names: token.NAME, token.STRING, .... To preserve
this interface, EEToken defines the important method gen_token. It takes a single parameter, namely
the fiber-specific token.py module. It generates all these token constants with the correct ids, as well as
the dictionaries tok_name and TOKEN_MAP that were previously used to look up token names from token ids
and token values from names.
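
The behaviour described above can be illustrated with a minimal sketch. It is not the EasyExtend implementation: SketchToken, the token ids, and the loop over class attributes are assumptions made for illustration, and the direction of TOKEN_MAP (name to value) follows the wording of the paragraph above.

```python
import types

# Token-type tags; the values here are illustrative placeholders.
SPECIAL = "SPEC"
OPERATOR = "OP"


class SketchToken:
    """Hypothetical stand-in for EEToken; only the triple layout comes from the text."""

    PLUS = (14, "+", OPERATOR)
    MY_TOKEN = (100, "%$=..", SPECIAL)

    @classmethod
    def gen_token(cls, module):
        """Export NAME = ID constants plus tok_name and TOKEN_MAP into `module`."""
        tok_name = {}   # token id   -> token name
        token_map = {}  # token name -> token value (assumed direction)
        for name, triple in vars(cls).items():
            # Treat every uppercase attribute holding a 3-tuple as a token definition.
            if name.isupper() and isinstance(triple, tuple) and len(triple) == 3:
                tok_id, tok_value, _tok_type = triple
                setattr(module, name, tok_id)
                tok_name[tok_id] = name
                token_map[name] = tok_value
        module.tok_name = tok_name
        module.TOKEN_MAP = token_map
        return module


# A fresh module object stands in for the fiber-specific token.py module.
fiber_token = types.ModuleType("fiber_token")
SketchToken.gen_token(fiber_token)
```

After the call, client code can keep writing fiber_token.MY_TOKEN or fiber_token.tok_name[100], which is the module-level interface that parser modules rely on.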
Another important aspect of the gen_token implementation is its fiber-space support: node ids are
automatically lifted into the range [FIBER_OFFSET, FIBER_OFFSET+512].
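
One plausible way to read this lifting is as a shift of the local id by the fiber's offset. The sketch below assumes this interpretation; the function name lift, the modulo arithmetic, and the offset value 1024 are not taken from the source.

```python
FIBER_OFFSET = 1024  # hypothetical offset; each fiber would have its own
RANGE_SIZE = 512     # width of the id range named in the text


def lift(node_id, fiber_offset=FIBER_OFFSET):
    """Map an id into the fiber's range (assumed rule: local id plus offset)."""
    # Reducing modulo RANGE_SIZE first keeps an already lifted id unchanged,
    # provided the offset is a multiple of RANGE_SIZE.
    return fiber_offset + node_id % RANGE_SIZE
```

With these assumptions, lift(100) yields 1124, and lifting 1124 again leaves it unchanged, so ids from different fibers never collide as long as their offsets differ by at least 512.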
The EEToken class is intended to be subclassed in the token.py module of each fiber, in the following manner:

    class FiberToken(EEToken):
        def __new__(cls):
            # some token definitions...
            cls.MY_TOKEN = (100, '%$=..', SPECIAL)
            # other token definitions ...
            return cls

The remaining capitalized class methods Name, Funny, StrPrefix, ... are used by eetokpattern.py.
Each returns a list of the TOKEN-VALUEs of a given TOKEN-TYPE, ordered by size so that longer
TOKEN-VALUEs precede shorter ones.
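
The ordering matters when the values are joined into a regular-expression alternation, since Python's re module tries alternatives from left to right and stops at the first that matches. The sketch below shows the idea; the token table, the helper values_of_type, and the use of a regex alternation are illustrative assumptions, not the code of eetokpattern.py.

```python
import re

OPERATOR = "OP"
LBRA = "LBRA"

# Hypothetical token table in the (TOKEN-ID, TOKEN-VALUE, TOKEN-TYPE) layout.
TOKENS = {
    "PLUS": (14, "+", OPERATOR),
    "PLUSEQUAL": (37, "+=", OPERATOR),
    "DOUBLESTAREQUAL": (47, "**=", OPERATOR),
    "LPAR": (7, "(", LBRA),
}


def values_of_type(tokens, tok_type):
    """Return the TOKEN-VALUEs of one TOKEN-TYPE, longest first."""
    values = [value for (_tok_id, value, typ) in tokens.values() if typ == tok_type]
    return sorted(values, key=len, reverse=True)


operators = values_of_type(TOKENS, OPERATOR)

# Longest-first order lets "+=" win over its own prefix "+".
operator_pattern = re.compile("|".join(re.escape(op) for op in operators))
```

Here operators is ['**=', '+=', '+'], and operator_pattern.match("+=1").group() returns '+='. With the shorter value listed first, the same input would match only '+'.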

_ = ' THIS-NAME '
SPECIAL = ' SPEC '
OPERATOR = ' OP '
LBRA = ' LBRA '
RBRA = ' RBRA '
TOKENIZE_PY = ' TOK '
NUM = ' NUM '
FAT_MODE = ' FAT '
STR_PREFIX = ' STR_PREFIX '