Tuesday | 03 DEC 2024
[ next ]

Cratfing Interpreters - Interpreter

Title:
Date: 2020-05-16
Tags:  

This is just a stream of consciousness as I worked through the first half ofCrafting Interpreters using TypeScript.

It is a great book and I recommend it whole heartedly!

http://craftinginterpreters.com/

I'm going to try following this book using javascript, specifically typescript.I don't think it'll be too bad because I am familar with typescript and I shouldbe able to at least port over the logic.

After working through the scanner it is actually amazing how much of the code issimilar to Java which the book follows. This may be because they are bothderived from C or it could be that the java in JAVAscript is really coming outnow. I didn't notice before that Java is very similar. I could have done it inJava but oh well. For the most part you should be able to just use the examplesto write it in Typescript.

I wrote the Abstract Syntax Tree classes and this was really simple. The bookwrote a script to do it but I wasn't sure what I was making so I did it by hand.I also learned about Interfaces and the Visitor paradigm this chapter.Interfaces are okay, they seem to be sort of contract enforced and I think itcould be skipped maybe. But the Visitor paradigm is very cooool. I really likeworking out the logic and tracing out what its doing and looking at it now, itis a very clean way of implementing the ASTPrinter class. All the print logic isone place and this also means any future methods needs to only go in methodspecific class. It's a strange differention but I could see it being useful inthe future. The tree structure comes from the fact that the AST classes all takein more Expr type in their nodes which is very cool. Literal is a terminalnode. I don't fully see it as a tree but it definitely is. Just not intuitive.

As I work through this the vim addon YouCompleteMe is so friggin useful andTypescript is also. i was surprised that the code I was writing was working sofar without me having to run it and I finally figured it out. The book isfantastic but doing a straight copy and paste should have caused me all sorts ofissues. The reason I was able to write so much without really having to run itto test it is because my editor was throwing red flags. Stuff that was cominglater down in a chapter would cause red highlights for errors and I would stubthings out. Being able to match types and make sure what I was trying to pass inand out of methods was termendously helpful. I imagine this in pure javascriptwould be quite an experience!

Learned how to do exception handling and how to create my Error classes and howto catch them. Very neat and very helpful for debugging.

Scanning

Very cool! I really liked this and it really got at what I thought happed. Thescanner goes character by character and it does a giant case statement to createtokens for all the things it sees. Whats cool is the advance method and peekmethod and how characters are consumed. Alot of really cool logic where it isvery simple looking at it but I went about it the wrong way before. Really smartthat i followed the book instead of doing my own thing as it became clear whycertain things were done the way they are. One example is the isAtEnd functionthat I was going to skip and inline the check, however what I didn't know wasthat this function is used everywhere. Advance method is so wonderful. I learnedalot about tokenizing in this chapter

Abstract Syntax Tree

Very cool. Very simple and implements the visitor paradigm which is pretty neatand its somethign I can see being useful. It is pretty clear how they are treesand the work I did with trees just last week are already coming into play. idon't intuively see the AST as trees unfortunately, I keep expecting it to bemore intutive but I think I just need to let it sit. I am starting to see therecursive nature of it and the building blocks of traversing it.

Parsing

This uses alot of the core logic of the advance, peek, isAtEnd functionality togo through the tokens. Very cool. I really liked that. The big different is theway the tree gets built and how the precdence rules are encoded. it's actuallyvery cool and I feel like I'm close to understand it. When it first starts, itpasses everything to the general case and from there it gets more and morespecific or moves up the precedence chart. This means that by layering thefunction calls within each other then you can automatically get precdence ratherthan whatever the hell I did when I was trying to do stuff like this. Thecoolest part was grouping where once you hit a ( you trigger a grouping andbuild a brand new expression. Everythings a tree but this is even more tree.

Syntax errors are cool, once it hits an invalid case, the parser works its wayback up and discards enough tokens to get it back to a point where it cancontinue parsing. Very neat.

Very fucking cool. I got the parser working with expressions and its very neat.It is very clear what the ensted functions are doing. First thing it does isgoes and find all the low end stuff such as picking up addition then it looksfor multiplication and on and this builds up a tree and its very clear lispstyle. Mind blowing stuff. Fuck.

I want to do the challenges but they may be a touch out of my reach. I alsothink the ternary operator would be easier once I can test my code. A bigdifference in the way the book is laid out is that never does he run the code. Iwould run and debug as I go through but in this case he seems to have a map thatwe are following. it's neat but It does require me to sort of run it andmentally run through things to see whats happening. I'm pretty happy that I canmentally run through the scanner and parsing logic.

Evaluating

Fuuuucck. This was a cool chapter, not as heavy as the scanning or parsing. Thischapter was actually pretty easy compared to the previous two. This was allabout evaluating stuff and it went smoothly. There are probably some issues withthe fact that I'm using javascript but so far everything is still holding andworking the way I and I think the book expects. What's crazy is I have a fullyworking calculator now! However now I don't fully grasp what's happening. ithink I need ruminate on this for awhile before it all makes sense. I'm happy Ihave something though. Very happy. Very neat. Scanning -> AST -> Parsing ->Evaluating

Multiline source isn't working and I'm not sure why. The coolest thing is beingable to debug the language. I got multiline files working. I needed to looparound the parser to have it go all the way to the end of the tokens. Verycoool. Had a minor bug with quotations where the last one was missing. Had atypo I think in my slice. Adding strings is wrong. Had to slice the trailingquotation on the left and the leading quotation on the right.

Implemented challenge 2 where I added in support for "test" + 3 = "test3". Veryneat. I can't seem to generalize it to javascript quite yet. I logically get itbut I think i have disconnect between the AST class and the visitor class,really i should think of the AST as containing the interpreter code and I thinkthat make it more clear. When i ask the ast to interpret it then goes in andfinds that it has 2 nodes on it, one is a string and the other is a number. itthem smoshes them together and gives me a string back, this is the evaluation. Iwonder if javascript does the same thing.

Divide by 0 works and spits out Infinity. Which is cool. This is javascriptsreturn;

Statements

Wired up the print statement and plain expression. Very friggin cool. What'samazing is that it works without running the code constantly which is still avery new and different way to develop. Not sure how I feel about it.Unfortunately it broke my repl and now I need a print statement to get my replto work and semicolons to end things off or throws an error. I think i want tofix that before I move to variables but this is already shaping up quite a bit.It's not fair. i don't intuit it enough. But is it so cool. I kind of want to dothis with BASIC and make a repl. That could be so fucknig cool.

I added in a isrepl variable to the interpreter and now it will let me skipsemicolons and output expressions in the repl. Very cool to be messing with theguys of the language and its so easy.

Variables

    assignment() :Expr {
        const expr: Expr = this.equality();
        
        if(this.match(TokenType.EQUAL)) {
            let equals :Token = this.previous();
            const value :Expr = this.assignment();
            
            if(expr instanceof Variable) {
                const name :Token = expr.name;
                return new Assign(name, value);
            }
            
            this.error(equals, "Invalid assignment target.");
        }
        
        return expr;
    }

Very fucking cool. The first line of this function basically begins processing,during assignment this will fall through to the button and get IDENTIFIER tokentype and will return back a new variable class object. We thenn check if thenext token is an equals, if it is we are going to evaluate the right hand side,this.assignment could be this.expression because it needs to read in the nexttoken. If the expr was indeed a variable type then it will do the assignment. Ifit wasnt equals as the next token then it treaks the identifier as an expressionwhich will go and retrieve the expression from the environment through theVariable class and through the visitVariable method it will reach theenvironment.

Mind blowing. I have implemented global variables. It is pretty big and I can'tkeep everything in my head but I'm happy with it so far. Very neat and I imaginehacking away it will eventually make things a bit more clear and obvious.Amazing. The best part is that I was mentally running the code and it didn'tmake sense, something wasn't quite adding up and I figured out that there was abug. I then checked the book again and found that indeeed, I had forgotten acrucial step in the assignment logic and that was causing me issues. This bookis definitely testing my thinking.

Scope

A linked list of environments fuuuuck. Environments point to their parents andassignment and and shadowing of variables is innate to the structure of walkingup the tree. Very cool and very nice.

Scopes were simpler to implement and damn is it a function of its structure.Very neat. I really wish i could come up with this on my own.

Lol the challenge in this chapter was to get the repl working again. That makesme feel good.

Control Flow

Implementing Ifs was straighforward there is a pattern to adding stuff to thelanguage. First is to create the AST class, then in the parser we implement thethe AST builder function as part of the chain that alreadt exists. In theinterpreter we need to implement the visit method to evaluate the AST. Verysimple but very powerful. And Simple!

I fixed a bug where I wasn't making tokens for the || and &&. Very nice to beable to debug effectively and I think I have some inkling of whats going onunder the hood. The best part is that I took a quick glance at Go and I wasthinking of it interms of what I learned. They didn't have the semicolons. itried to screw around with the number of parameters going into functions andsuch. Very cool. Strange thing about Lox seems to be 0 is not falsey beucae onlynull and false are falsey. I may change this.

Implemented the while was dead easy now that I can see the pattern. and I canread the grammer diagrams i think. I ended up implementing while myself and itmatched up with the book which is to say that the book is really good at makingthis all simple and intutive enough that I can feel myself around. It has areally good structure for learning I think.

Brilliant. I got for loops working in terms of while and it's pretty cool. I raninto a bug with having my greater and less than signs mixed up but it was quickto find and fix. The repl is very helpful because I test aspects of the languagequickly.

I know the answers to the challenges because of the SICP class which is verynice. With first class functions and dispatch you can use use functions toevaluate true and false and pass those as keywords to dispatch. So you wouldwrite an IF function that dispatches 2 functions passed on some boolean. Loopscan be recreated using recursion and base cases. The big thing I'd like to addis i++. That would be so cool. That and the ternary operator.

Functions

Relatively straight to get the parsing done. Interpret is far more work. Stillsimple but it does require thinking and fudging things as Java is different fromTypescript in the way interfaces work. But I'm getting it done slowly. I'm gonigto be pretty excited if this actually works as I have no idea what I'm doingwith the interfaces and just going with my gut.

There are 2 halves to functions, there is the implementation part where we setup the parser to handle () and make function calls. The second part is thedefining of functions by doing function name.

Functions are a helluva flow. Lets see if i can get this out. So when we firstprocess a lox file it tokenises everything and so it take a function declarationand creates the AST for it and also adds it to the environment. Next when wetokenize the rest of the script any calls to a function will count as avariable, this means that when we go to evaluate it it will go intoVisitVariable and come out with a function instead. This will then spin into theCall AST which will call visitCall() which will then execute the function astfrom before except it will be ina brand new environment. This very much theouroboros. Very fucking cool.

In the scanner we split up every so function, name, (, args, ) then in the astwe build up an AST for the function declaration by matching against the functionkeyword. We then also process all way to an ending left brace. With this wecreate an entry in the environment. Next time we see name() we can then.

Incorrect! The Call AST triggers right before the Variable one and that is whatcauses it to be considered a call and triggers a visitCall in the future. So notas Ouroboros but damn is confusing. But oh so cool.

I implemented i++ because that is way too useful in a for loop. There were 2ways I could think but I couldn't figure out which was the correct or betterway. The first way was to add a DOUBLE_PLUS to the scanner and then parse it anddesugar it the same way as FOR and WHILE. The other way that feels easier was tomodify the addition method and check there if the next token was a an operator.If it was then I just desugar it right then and there and return a new Assignobject. This works. But I'm sure there could be all sorts of issues and bugs. Ithink the first option might be the cleaner way but this is certainly thequicker way. Or maybe not.

Wrote a fibonacci calculator using recursion and damn is it slow. Really slow! Itried to use caching but I don't have objects yet. I don't have proof closureworks yet but this is pretty cool to be this far into a programming language.

Scopes

Very cool bug with scopes and how binding variables introduces weird scopingissues.

The example can be seen in the closure example.

  var a = "global";
{
    function showA() {
        print a;
    }
    
    showA();
    var a = "block";
    showA();
}

The answer then is to go through before hand an find all the variables. This iscalled a variable resolution pass and is before the interpretation but after theparsing. I don't quite understand it yet.Mind isn't full in it because this kind of a boring chapter. It looks like ithas sommething to do with processing a stack of environments with varablesinside them.

All variables get setup during the resolver stage and we make a stack of scopewhere each expr resides, we keep the distance from the global environment in amap and we attach this map to the interpreter. When it looks up a variable itwill first check this map to find out how far it should go to find a variable.This way in the closure above when it runs showA the second time, it will go tolook up a when printing and it will know that variable was bound in the globalspace rather than looking at its immediate scope. It is a very cooool way to doresolution. I need to think about it some more but it is clean.

The resolver step is pretty neat but I need to walk through it more deeply.

So we begin. Pass in the interpreter and run resolve(statements) againsteverything that came back from the parser. At this point we have no scopes.While running the statements, we have a few things that will trigger scopecreation, functions and blocks. These two things will add maps to form in astack and any variables declared inside those things will get added to thosemaps. The ultimate goal of the resolver is to run resolveLocal on variables andassignments so that the interpreter can write down, where in the stack thosevariables came from. The interpreter holds this in the locals variable. Now whenit goes to get a variable value or do an assignment it will go to the localsobject with the expression it is looking up and it get the depth from it. onceit has the depth it will then go into the environment and travel up it to getthe correct value. The reason this works is that expression is really areference so when it looks up that reference it will find the depth for thatreference. AKA in the above snippet it runs showA. inside when it does the printa it goes to find that a. The reference to this a is different from thereference to a that comes after. the a that comes after would be a differentreference and it would have the environment as the block. The function's ahowever would have an environment pointing to the global space.

The resolver is also semantic analysis and we can do stuff like checking forvariables being declared but not used and for incorrect return statements. Veryneat. Very boring.

Classes

A class is a way to package data and the code that uses it and it is whatallowed people to write massive amounts of code. You have a constructor thatlets you build instances of that object, you have fields that hold data and youhave methods that act on those fields.

A class without method is just a map. Neat idea.

Very big. Debugged it and found abug in how I was checking for nulls and holycow was it a mess to debug. Didn't take me long but trying to juggle and walkthrough the code was intense. I'm going to have to think. But damn is it cool.Very cool how it works. Classes are magical. Everything kind of flowed but thisis something else.

The closest thing so far is how the keyword plays into all this. At this pointinstances can't hold onto themselves and use their own fields. The keword thisis basically a way to reference itself but in a closure so it stays attached tothe instance.

Without a this, then methods inside can't access fields. this is variabledefined by the system when an instance is made. Amazing. Now methods willtrigger an extra environment that only contains this. Then each method willbuild its own scope with the parent being the one with this in it. Pretty colland pretty neat. This happens in the resolver and this needs to happen in theinterpreter. When we go to get an instance of an object to run a method, theinterpreter needs to create an extra environment containing the this to doeverything properly.

Fucking magical. Fuck. It works.

A constructor basically looks for an init function to run and then binds it tothe instance and then calls that method immediately. If there is no initfunction then there is no constructor! If there is an init then it will runimmediately. So I'm guessing even without a constructor we can duplicate it bysimply running a method at the beginning. We just need to remember.

Very stupid bug where I forgot to return the instance. Very brutal to debug aprogramming language. Because i don't fully grasp it but I slowly made my way tothe error starting at the very beginning. Very fun and very frustrating becauseI don't know where to start.

Inheritance

Simple but very powerful. The way super methods are done are gorgeously simpleand very cool. The super keyword is similar to this but for an entire class.Basically when a super is hit, it creates a closure where super will referencethe parent class imeddiately that way class chaining works as it should.

This and super are based on closures and that is so friggin cool. I kind of wishthere was no resolver bewcause a direct connection from parser to interpreter Ifeel was better and more intuive. Thinking about it though the resolver is fine,it does basically the same thing as the interpreter but only does theenvironment building and calcualtes the number of steps it has to go.

Conclusion

Very good book! I really enjoyed it. Overall I learned alot and I got reallycomfortable with Typescript. 2100 lines of code whereas the book mentions 1klines. I'm not sure why mine is so bloated but could be the javascript. Eitherway I'm pretty happy that I went through this first part and finished it. Thelexer and parser were probably the coolest and the ones I understand the most.The resolver and interpreter are much harder to figure out. I wonder if thevisitor paradigm is the best way to do this. I think having the methods insidetheir AST would make it more intuitive and easier to grasp mentally.