CyberArmy University | Open Source Institute | CyberArmy Intelligence & Security | CyberArmy Privacy Watch Initiative

[Programming] Design of Lexical Analyzers



    [Programming] Design of Lexical Analyzers [View] [Reply] [Top]
    Posted by Author Rae On 2007-04-29 10:02:33
    View and vote on the article here: Design of Lexical Analyzers


    Design of Lexical Analyzers

    Category
    Programming
    Summary
    Body
    Design of Lexical Analyzers


    Generally the representation of the design model of any tool in computing is done with the help of a flowchart. The flowchart actually is nothing but a diagrammatic representation of an algorithm. To graphically represent lexical analyzers, we change our notation in flowcharts slightly, and the resulting chart is called a 'state transition diagram' or simply 'state diagram' or 'transition diagram'.

    The design of the lexical analyzer is like this :-
    • Upon start, it moves to a state 0.
    • - If we encounter a letter, we move to state 1.
    • Now, if we encounter a letter or a digit, we remain in state 1.
    • If we encounter a delimiter, we move to state 2, the termination state.
    The condition imposed in this design is that the first token must be a letter, and not a digit. Then only the analyzer moves to state 1, where the following characters can be letters or digits. The terminating character or delimiting character must not be a letter or a digit, but may be a special character like ;.

    To make this lexical analyzer, we have to first define and code a primitive function GETCHAR, which returns the next character.

    Program Pseudo Code for State 0 :-
    C = GETCHAR()
    IF C IS Letter
       THEN GOTO State 1
    ELSE
       FAIL()
    FAIL() is the error handling routine and 'C IS Letter' checks if variable C is a letter and if it is, returns true.

    Program Pseudo Code for State 1 :-
    C = GETCHAR()
    IF C IS Letter OR Digit
       THEN GOTO State 1
    ELSE 
       IF C IS Delimiter
          THEN GOTO State 2
    ELSE
       FAIL()
    State 2 indicates that an identifier has been found. As the delimiter is not a part of the identifier, we must retract the look ahead pointer one character. We can use '*' to indicate states on which input retraction must take place. If the identifier is not already in the symbol table, we install it there.

    Program Pseudo Code for State 2 :-
    RETRACT()
    return(id,INSTALL())
    The return function returns to the parser the pair of integer code for an identifier, and a pointer to the symbol table returned by the function INSTALL().

    References :-
    • Elementry Discrete Mathematics by C L Liu
    • Theory of Computer Science by E V Krishnamurthy
    • Compilers : Construction and Design by Aho, Sethi and Ullman
    • System Programming by John Donovan
    Written By Rae (3 February 2005)

    Rae is a member of CAU Knowledge-Bank Tutorial Writers


    This article was imported from the CyberArmy University site. (original author: Rae)
     


CyberArmy::Forum v0.6
Generated In 0.01903 seconds


About Us | Privacy Policy | Mission Statement | Help