Open Source Institute | CyberArmy Intelligence & Security | CyberArmy Services & Projects

[Programming] Design of Lexical Analyzers


[Reply] [View by Thread] [Help]
[Back To Article Discussion Forum]

Posted by Author Rae On 2007-04-29 10:02:33




View and vote on the article here: Design of Lexical Analyzers


Design of Lexical Analyzers

Category
Programming
Summary
Body
Design of Lexical Analyzers


Generally the representation of the design model of any tool in computing is done with the help of a flowchart. The flowchart actually is nothing but a diagrammatic representation of an algorithm. To graphically represent lexical analyzers, we change our notation in flowcharts slightly, and the resulting chart is called a 'state transition diagram' or simply 'state diagram' or 'transition diagram'.

The design of the lexical analyzer is like this :-
  • Upon start, it moves to a state 0.
  • - If we encounter a letter, we move to state 1.
  • Now, if we encounter a letter or a digit, we remain in state 1.
  • If we encounter a delimiter, we move to state 2, the termination state.
The condition imposed in this design is that the first token must be a letter, and not a digit. Then only the analyzer moves to state 1, where the following characters can be letters or digits. The terminating character or delimiting character must not be a letter or a digit, but may be a special character like ;.

To make this lexical analyzer, we have to first define and code a primitive function GETCHAR, which returns the next character.

Program Pseudo Code for State 0 :-
C = GETCHAR()
IF C IS Letter
   THEN GOTO State 1
ELSE
   FAIL()
FAIL() is the error handling routine and 'C IS Letter' checks if variable C is a letter and if it is, returns true.

Program Pseudo Code for State 1 :-
C = GETCHAR()
IF C IS Letter OR Digit
   THEN GOTO State 1
ELSE 
   IF C IS Delimiter
      THEN GOTO State 2
ELSE
   FAIL()
State 2 indicates that an identifier has been found. As the delimiter is not a part of the identifier, we must retract the look ahead pointer one character. We can use '*' to indicate states on which input retraction must take place. If the identifier is not already in the symbol table, we install it there.

Program Pseudo Code for State 2 :-
RETRACT()
return(id,INSTALL())
The return function returns to the parser the pair of integer code for an identifier, and a pointer to the symbol table returned by the function INSTALL().

References :-
  • Elementry Discrete Mathematics by C L Liu
  • Theory of Computer Science by E V Krishnamurthy
  • Compilers : Construction and Design by Aho, Sethi and Ullman
  • System Programming by John Donovan
Written By Rae (3 February 2005)

Rae is a member of CAU Knowledge-Bank Tutorial Writers


This article was imported from the CyberArmy University site. (original author: Rae)


There are no replies to this post yet.



Guest:
Subject:
Message:
Signature:
Optional Image Link:
http://

CyberArmy::Forum v0.6
Generated In 0.05803 seconds


About Us | Privacy Policy | Mission Statement | Help