My Weblog

Blog about programming and math

While Interpreter

Now a days I am trying to learn about static program analysis and started reading the Principal of Program Analysis. In order to understand the concepts, the book introduces a small programming language While.

I thought about writing interpreter for While to improve my Haskell skills. The very first task is in compilation/interpretation is breaking the source code into keywords, identifiers, operators, numbers and other tokens, known as lexical analysis. These tokens are passed to syntax analysis phase to combine the tokens into well formed expressions, statements and programs according to the grammar. The output of syntax analysis is abstract syntax tree which gives structural representation of of the input. Once we have abstract syntax tree, we can interpret or manipulate it for code generation. If you are interested in code generation then see Stephen Dielh blog.

The grammar for while program is

Grammar for expression
a  ::= x | n | - a | a opa a
b  ::= true | false | not b | b opb b | a opr a
opa ::= + | - | * | /
opb ::= and | or
opr ::= > | < 

Grammar for statements
S  ::= x := a | skip | S1; S2 | ( S ) | if b then S1 else S2 | while b do S 

The very first task is to define abstract syntax tree and Haskell algebraic data type makes this task almost trivial.

module AST ( Opa (..), Opb (..), Opr (..), AExpr (.. ) , BExpr ( .. ), 
             Stmt ( .. )  ) where



data Opa = Add
         | Sub
         | Mul
         | Div
         deriving Show

data Opb = And 
         | Or 
         deriving Show

data Opr = Greater 
         | Less 
         deriving Show


data AExpr = Var String
           | Num Integer
           | Neg AExpr
           | ABin Opa AExpr AExpr
           deriving Show

data BExpr = Con Bool
           | Not BExpr
           | BBin Opb BExpr BExpr
           | AL Opr AExpr AExpr
           deriving Show

data Stmt = List [ Stmt ]
          | Assing AExpr  AExpr
          | If BExpr Stmt Stmt
          | While BExpr Stmt
          | Skip
          deriving Show

The next task is design the lexical analyzer. We will use Parsec for this purpose.

module Lexer where
import Text.Parsec
import qualified Text.Parsec.Token as T
import Text.Parsec.Language ( emptyDef )
import Text.Parsec.String ( Parser )


lexer :: T.TokenParser ()
lexer = T.makeTokenParser emptyDef 
            {
                T.commentStart = "{-"
              , T.commentEnd = "-}"
              , T.reservedOpNames = [ "+", "-", "*", "/", ":=", ">", "<",
                                      "not", "and", "or" ]
              , T.reservedNames = [ "if", "then", "else", "while", "do", "skip",
                                  "true", "false" ]

            }


identifier :: Parser String
identifier = T.identifier lexer

whiteSpace :: Parser ()
whiteSpace = T.whiteSpace lexer

reserved :: String -> Parser ()
reserved = T.reserved lexer

reservedOp :: String -> Parser ()
reservedOp = T.reservedOp lexer

parens :: Parser a -> Parser a
parens = T.parens lexer

integer :: Parser Integer
integer = T.integer lexer

semi :: Parser String
semi =  T.semi lexer

semiSep :: Parser a -> Parser [ a ]
semiSep = T.semiSep lexer

Now lexical analyzer will assist the parser for creating the abstract syntax tree. A while program is list of statements and statement consists of expressions. We can build parse the expression by Parsec expression builder by giving the table of operators and associativity.

module Parser ( whileParser ) where

import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String ( Parser )
import Control.Applicative  hiding ( (<|>) )
import Lexer
import AST

aTable = [  [   Prefix ( Neg <$ reservedOp "-" )                ]  
          , [   Infix  ( ABin Mul <$ reservedOp "*" ) AssocLeft
              , Infix  ( ABin Div <$ reservedOp "/" ) AssocLeft ]
          , [   Infix  ( ABin Add <$ reservedOp "+" ) AssocLeft
              , Infix  ( ABin Sub <$ reservedOp "-" ) AssocLeft ]
         ]



bTable = [  [  Prefix ( Not  <$ reservedOp "not" )               ]
          , [  Infix  ( BBin And <$ reservedOp "and" ) AssocLeft ] 
          , [  Infix  ( BBin Or  <$ reservedOp "or"  ) AssocLeft ]
         ] 
      
aExpression :: Parser AExpr
aExpression = buildExpressionParser aTable aTerm where 
         aTerm =  parens aExpression 
              <|> Var <$> identifier
              <|> Num <$> integer


bExpression :: Parser BExpr
bExpression = buildExpressionParser bTable bTerm where 
         bTerm =  parens bExpression 
              <|> (  Con True   <$ reserved "true"  ) 
              <|> (  Con False  <$ reserved "false" )
              <|> try (  AL Greater <$>  ( aExpression  <* reserved ">" ) 
                                    <*>    aExpression )
              <|> (  AL Less    <$>  ( aExpression  <* reserved "<" ) 
                                <*>    aExpression ) 


  

whileParser :: Parser Stmt
whileParser = whiteSpace *> stmtParser <* eof where 
            stmtParser :: Parser Stmt
            stmtParser =  parens stmtParser 
                      <|> List <$> sepBy stmtOne semi
            stmtOne :: Parser Stmt
            stmtOne =  ( Assing <$> ( Var <$> identifier ) 
                                <*> ( reserved ":=" *> aExpression ) )
                   <|> ( If <$> ( reserved "if" *> bExpression <* reserved "then" ) 
                            <*>   stmtParser 
                            <*> ( reserved "else" *> stmtParser ) )
                   <|> ( While <$> ( reserved "while" *> bExpression <*  reserved "do" ) 
                               <*>   stmtParser )
                   <|> ( Skip <$ reserved "nop" ) 

We have abstract syntax tree so we can interpret our program. You can think of a program as collection of commands which manipulates some memory location.

module Interpreter ( evalProgram )  where

import qualified Data.Map as M
import AST

type Store = M.Map String Integer

evalA :: AExpr -> Store -> Integer
evalA ( Var v ) s  =  M.findWithDefault 0 v s
evalA ( Num n ) _ = n
evalA ( Neg e ) s = - ( evalA e  s ) 
evalA ( ABin Add e1 e2 ) s = evalA e1 s + evalA e2 s
evalA ( ABin Sub e1 e2 ) s = evalA e1 s - evalA e2 s
evalA ( ABin Mul e1 e2 ) s = evalA e1 s * evalA e2 s
evalA ( ABin Div e1 e2 ) s = div ( evalA e1 s ) ( evalA e2 s )


evalB :: BExpr -> Store -> Bool
evalB ( Con b ) _ = b 
evalB ( Not e ) s = not ( evalB e s )
evalB ( BBin And e1 e2 ) s = ( && ) ( evalB e1 s )  ( evalB e2 s ) 
evalB ( BBin Or  e1 e2 ) s = ( || ) ( evalB e1 s )  ( evalB e2 s )
evalB ( AL Greater e1 e2 ) s = ( evalA e1 s ) > (  evalA e2 s )
evalB ( AL Less e1 e2 ) s = ( evalA e1 s ) < ( evalA e2 s )    

interpret :: Stmt -> Store -> Store
interpret ( Assing ( Var v ) expr ) s =  M.insert v ( evalA expr s )  s
interpret ( List [] ) s = s
interpret ( List ( x : xs ) ) s = interpret ( List xs ) ( interpret x s )
interpret ( If e st1 st2 ) s 
          | evalB e s = interpret st1 s
          | otherwise = interpret st2 s
interpret ( While e st ) s 
    | not t = s   
    | otherwise =  interpret ( While e st ) w 
    where
     t = evalB e s
     w = interpret st s

evalProgram :: Stmt -> Store
evalProgram st = interpret st M.empty

Now every thing is complete so let’s add main and write some while program.

module Main where
import System.Environment
import Text.Parsec
import Parser
import Interpreter




main = do 
     ( file : _ ) <- getArgs
     program <- readFile file
     case parse whileParser "" program of 
        Left e -> print e >> fail "Parse Error"
        Right ast -> print  ( evalProgram  ast )

While program to compute the greatest common divisor

Mukeshs-MacBook-Pro:whileinterpreter mukeshtiwari$ cat Gcd.while 
a := 10 ;
b := 100  ;
while ( b > 0 ) do 
 (
	t := b ;
    	b := a - ( a / b ) * b ;
	a := t 
)

Factorial program

Mukeshs-MacBook-Pro:whileinterpreter mukeshtiwari$ cat Fact.while 
x := 10 ;
y := x ;
z := 1 ;
while ( y > 1 )
  do 
   ( 
      z := z * y ;
      y := y - 1 
   );
y := 0 

Since our language doesn’t have IO instruction so we will have to see which variable store the result. In gcd program, the variable t stores the result and variable z stores the factorial of number.

Mukeshs-MacBook-Pro:src mukeshtiwari$ ghc -fforce-recomp Main.hs
[1 of 5] Compiling AST              ( AST.hs, AST.o )
[2 of 5] Compiling Lexer            ( Lexer.hs, Lexer.o )
[3 of 5] Compiling Interpreter      ( Interpreter.hs, Interpreter.o )
[4 of 5] Compiling Parser           ( Parser.hs, Parser.o )
[5 of 5] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
Mukeshs-MacBook-Pro:src mukeshtiwari$ ./Main ../Gcd.while 
fromList [("a",10),("b",0),("t",10)]
Mukeshs-MacBook-Pro:src mukeshtiwari$ ./Main ../Fact.while 
fromList [("x",10),("y",0),("z",3628800)]

I am not expert in either Haskell or compiler so if you have any comments then please let me know. The complete source code on github.

Advertisements

January 30, 2014 Posted by | Haskell, Programming | , , , , , | Leave a comment