By atmostmediocre


2019-02-11 18:56:40 8 Comments

After reading this article on writing a Brainfuck interpreter in Haskell and achieving awful performance with it (for example, mandelbrot generating 37.5 minutes on an old Intel Atom powered laptop) I've decided to write a compiler for it.

I picked NASM for my output language, because that is the only assembly language I somewhat understand, and after some problems (resolved in this StackOverflow question) I've finally arrived with a working compiler!

I would like get more feedback about good coding practices in Haskell more than optimal Assembly generation, because I am interested more in the former language (and also because programs generated by my compiler are quite fast, the same mandelbrot set generator took only 7s 24ms to execute on the same old laptop, so I feel like it is good enough, although there is a lot of room for improvement).

Also, please note that I have a very hard time understanding monads.

Data types

Starting with the data types, the most important one is BfCommand.

data BfCommand = GoLeft Int     -- <
               | GoRight Int    -- >
               | Add Int        -- +
               | Sub Int        -- -
               | LoopL Int      -- [
               | LoopR Int      -- ]
               | WriteChar      -- .
               | ReadChar       -- ,
               | BfConst Int    -- ???
               deriving (Eq, Show)

The only thing I do not like about this type, is the BfConst constructor. It just does not feel like it belongs to this data type, because it was added later (to properly represent [-] or [+], which sets selected cell's value to 0). It is just awkward.

What would be the correct and elegant way of doing something like this?

Parsing text into instructions

newtype BfSource = BfSource [BfCommand] deriving (Show)

parseBf :: String -> BfSource
parseBf =
    optimiseBf . BfSource .
    pairLoops [] [] . countLoopLs 0 .
    reduceConsts . mapMaybe char2bfc
  where
    char2bfc :: Char -> Maybe BfCommand
    char2bfc '<' = Just $ GoLeft 1
    char2bfc '>' = Just $ GoRight 1
    char2bfc '+' = Just $ Add 1
    char2bfc '-' = Just $ Sub 1
    char2bfc '[' = Just $ LoopL 0
    char2bfc ']' = Just $ LoopR 0
    char2bfc '.' = Just WriteChar
    char2bfc ',' = Just ReadChar
    char2bfc _   = Nothing
    countLoopLs :: Int -> [BfCommand] -> [BfCommand]
    countLoopLs _ [] = []
    countLoopLs n (LoopL _:bs) = LoopL n : countLoopLs (n + 1) bs
    countLoopLs n (b:bs) = b : countLoopLs n bs
    reduceConsts :: [BfCommand] -> [BfCommand]
    reduceConsts [] = []
    reduceConsts (LoopL _:Sub 1:LoopR _:bs) = BfConst 0 : reduceConsts bs
    reduceConsts (LoopL _:Add 1:LoopR _:bs) = BfConst 0 : reduceConsts bs
    reduceConsts (b:bs) = b : reduceConsts bs

How to rewrite countLoopLs and reduceConsts in a more elegant way? If not for the pattern matching, I would have done it with a fold.

Controlling the flow of the program

To pair the loops correctly I used a stack approach:

pairLoops :: [Int] -> [BfCommand] -> [BfCommand] -> [BfCommand]
pairLoops _      q []           = reverse q
pairLoops st     q (LoopL x:bs) = pairLoops (x:st) (LoopL x : q) bs
pairLoops (s:st) q (LoopR _:bs) = pairLoops st (LoopR s : q) bs
pairLoops st     q (b:bs)       = pairLoops st (b : q) bs

The main flaw here is the lack of syntax error checking, but I did not really know how to add it.

Optimising

This compiler makes very simple optimisations:

  • Grouping equal elements (+++ is represented as Add 3)
  • Reducing excluding operators (+++-- is represented as Add 1)
  • Turning [-] and [+] into BfConst 0

optimiseBf does exactly that (excluding the last point, this is done by reduceConsts).

optimiseBf :: BfSource -> BfSource
optimiseBf (BfSource bs) =
    if bs /= obs
        then optimiseBf (BfSource obs)
        else BfSource obs
  where
    obs = opthelper bs
    opthelper :: [BfCommand] -> [BfCommand]
    opthelper []  = []
    opthelper [x] = [x]
    opthelper (x:y:xs) =
        let r        = reduceBf x y
            single   = fromOne r
            (s1, s2) = fromTwo r
         in case r of
                Zero      -> opthelper xs
                (One _)   -> single : opthelper xs
                (Two _ _) -> s1 : opthelper (s2 : xs)

This function groups and throws out excluding instructions. Doing it with this if

if bs /= obs
    then optimiseBf (BfSource obs)
    else BfSource obs

is probably a horrible way to do it, but I could not think of another one.

Reducing individual instructions is done using

reduceBf :: BfCommand -> BfCommand -> TwoOrLess BfCommand
data TwoOrLess a = Zero
                 | One a
                 | Two a a
                 deriving (Show, Eq)

There is not much to talk about reduceBf because it is just hardcoded rules.

Generating assembly

The final stage. This one was surprisingly easy to do (except the loops part).

bf2asm :: Handle -> BfCommand -> IO ()
bf2asm handle (GoLeft x) = hPutStrLn handle $
    "    " ++
        if x == 1
            then "dec rcx"
            else "sub rcx, " ++ show x
bf2asm handle (GoRight x) = hPutStrLn handle $
    "    " ++
        if x == 1
            then "inc rcx"
            else "add rcx, " ++ show x
bf2asm handle (Add x) =
    mapM_ (hPutStrLn handle)
        [ "    mov al, [rcx]"
        , "    " ++
            if x == 1
                then "inc al"
                else "add al, " ++ show x
        , "    mov [rcx], al"
        ]
bf2asm handle (Sub x) =
    mapM_ (hPutStrLn handle)
        [ "    mov al, [rcx]"
        , "    " ++
            if x == 1
                then "dec al"
                else "sub al, " ++ show x
        , "    mov [rcx], al"
        ]
bf2asm handle (LoopL x) =
    mapM_ (hPutStrLn handle)
        [ "_LS" ++ show x ++ ":"
        , "    mov al, [rcx]"
        , "    test al, al"
        , "    jz _LE" ++ show x
        ]
bf2asm handle (LoopR x) =
    mapM_ (hPutStrLn handle)
        [ "    jmp _LS" ++ show x
        , "_LE" ++ show x ++ ":"
        ]
bf2asm handle WriteChar = hPutStrLn handle "    call _printChar"
bf2asm handle ReadChar  = hPutStrLn handle "    call _readChar"
bf2asm handle (BfConst x) =
    mapM_ (hPutStrLn handle)
        [ "    " ++
            if x == 0
                then "xor al, al"
                else "mov al, " ++ show x
        , "    mov [rcx], al"
        ]

Is it better to provide a Handle and write to it like I did, or maybe this function should just create Strings?


If the need arises, feel free to look at the code as a whole here.

1 comments

@Gurkenglas 2019-02-11 22:25:44

BfSource isn't Brainfuck, it's an intermediate language that's easier to work with. Don't worry about BfConst more than about Add's Int parameter. You can reuse the Const identifier, this is not a library.

type BfSource = [BfCommand]

parseBf = optimiseBf . mapAccumL pairLoops []
  . mapAccumL countLoopLs 0 . mapMaybe char2bfc

...
char2bfc '[' = Just $ LoopL undefined -- a hack should look like one
...

countLoopLs n (LoopL _) = (n+1, LoopL n)
countLoopLs n b = (n, b)

pairLoops :: [Int] -> BfCommand -> ([Int], BfCommand)
pairLoops st     (LoopL x) = (x:st, LoopL x)
pairLoops (s:st) (LoopR _) = (st, LoopR s)
pairLoops st     b         = (st, b)

optimiseBf :: BfSource -> BfSource
optimiseBf = head . head . filter ((>1) . length) . group
  . iterate (unfoldr $ uncons . reduceBf)

reduceBf :: [BfCommand] -> [BfCommand]
reduceBf (Add a : Add b : bs) = Add (a + b) : bs
...
reduceBf (LoopL _ : Add 1 : LoopR _ : bs) = BfConst 0 : bs

bf2asm :: Handle -> BfCommand -> IO ()
bf2asm handle = hPutStrLn handle . \case
  GoLeft x -> "    " ++ if x == 1 then "inc rcx" else "add rcx, " ++ show x
  ...
  Add x -> unlines
    [ "    mov al, [rcx]"
    , "    " ++ if x == 1 then "inc al" else "add al, " ++ show x
    , "    mov [rcx], al"
    ]
  ...

Sub is superfluous. Just replace Sub 1 with Add (-1). So is one of GoLeft or GoRight.

Related Questions

Sponsored Content

1 Answered Questions

[SOLVED] Brainfreeze: A Brainfuck compiler in C

1 Answered Questions

2 Answered Questions

[SOLVED] Brainfuck to Python compiler

2 Answered Questions

[SOLVED] Interactive Brainfuck interpreter in Haskell

2 Answered Questions

2 Answered Questions

[SOLVED] Brainfuck to x86 Assembly Compiler

1 Answered Questions

[SOLVED] Python basic arithmetic compiler

1 Answered Questions

[SOLVED] A BrainF*ck-ish compiler in C

1 Answered Questions

[SOLVED] Printing prime numbers in Assembly

  • 2015-01-15 16:14:33
  • user1720897
  • 5059 View
  • 7 Score
  • 1 Answer
  • Tags:   primes assembly

Sponsored Content