
Yacc Shaving - Stage 2

Title:
Date: 2022-08-27
Tags:  

I've now finished reading and implementing stage 2. This stage adds simple single-character variables and lets the program keep going after it hits an error during parsing or lexing. It covers pages 242-245.
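As a rough illustration of the single-character variables, here is a minimal standalone C sketch of the idea: one storage slot per lowercase letter, indexed by the letter's offset from 'a'. The names `vars`, `var_index`, `set_var` and `get_var` are hypothetical and do not appear in the yacc source, and the sketch assumes the letters a to z are contiguous, as they are in ASCII.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical variable store: one slot per letter 'a'..'z'.
   Static storage, so every variable starts out as 0. */
static float vars[26];

/* Map a one-letter variable name to its slot.
   Assumes 'a'..'z' are contiguous (true for ASCII). */
static int var_index(int name)
{
    assert(islower(name));
    return name - 'a';
}

/* Store a value and return it, so assignments can be chained. */
float set_var(int name, float value)
{
    return vars[var_index(name)] = value;
}

/* Read the current value of a variable. */
float get_var(int name)
{
    return vars[var_index(name)];
}
```

Returning the stored value from `set_var` is what lets an assignment be used as an expression, so something like `a = b = 3` can work.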

The full code including the exercises:

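%{
#include <math.h>   /* fmod(), pow() */
#include <stdio.h>  /* printf() used in the grammar actions */
float mem[26];      /* one slot per variable 'a'..'z' */
%}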
%union {
    float val;
    int index;
}
%token <val> NUMBER
%token <index> VAR
%type   <val>   expr
%right '='
%left '+' '-'
%left '*' '/'
%left '%'
%right '^'  /* exponentiation is conventionally right-associative: 2^3^2 == 2^(3^2) */
%left UNARYMINUS
%%
list: 
    | list ';'
    | list expr ';' { 
        mem['p'-'a'] = $2;
        printf("s => \t%.8g\n",$2); 
    }
    | list '\n'
    | list expr '\n' { 
        mem['p'-'a'] = $2;
        printf("s => \t%.8g\n",$2); 
    }
    | list error '\n' { yyerrok; }
    ;
expr:   NUMBER          { $$ = $1; }
    | VAR               { $$ = mem[$1]; }
    | VAR '=' expr      { $$ = mem[$1] = $3; }
    | '+' expr          %prec UNARYMINUS { $$ = $2; }
    | '-' expr          %prec UNARYMINUS { $$ = -$2; }
    | expr '+' expr     { $$ = $1 + $3; }
    | expr '-' expr     { $$ = $1 - $3; }
    | expr '*' expr     { $$ = $1 * $3; }
    | expr '/' expr     { 
            if ($3 == 0.0) execerror("Division by 0.", "");
            $$ = $1 / $3; 
    }
    | expr '%' expr     { $$ = fmod($1,$3); }
    | expr '^' expr     { $$ = pow($1,$3); }
    | '(' expr ')'     { $$ = $2; }
    ;
%%

#include <stdio.h>
#include <ctype.h>
#include <math.h>
#include <signal.h>
#include <setjmp.h>

jmp_buf begin;
char *progname;
int lineno = 1;

main(argc, argv) 
    char *argv[];
{
    int fpecatch();
    progname = argv[0];
    setjmp(begin);
    signal(SIGFPE, fpecatch);
    yyparse();
}

execerror(s,t) 
    char *s, *t;
{
    warning (s, t);
    longjmp(begin, 0);
}

fpecatch()
{
    execerror("floating point exception", (char*)0);
}

yylex()
{
    int c;
    while ((c=getchar()) == ' ' || c == '\t');
    if (c == EOF) return 0;

    if (c == '.' || isdigit(c)) {
        ungetc(c,stdin);
        scanf("%f",&yylval.val);
        return NUMBER;
    }

    if (islower(c)) {
        yylval.index = c - 'a';
        return VAR;
    }

    if (c == '\n') 
    {
        lineno++;
    }
    return c;
}

yyerror(s) 
    char *s;
{
    warning(s, (char *) 0);
}

warning(s,t)
    char *s, *t;
{
    fprintf(stderr, "%s: %s", progname, s);
    if (t) fprintf(stderr, " %s", t);
    fprintf(stderr, " near line %d\n", lineno);
}

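The error handling is quite cool: main saves a restart point with setjmp, and when execerror is called for a runtime error it uses longjmp to jump back to that point, and parsing continues. Syntax errors take a different route: the error rule in the grammar skips ahead to the next newline, and yyerrok puts the parser back into its normal state. The other cool thing is that the grammar actions are plain C, so that is where you check for division by 0 or do things like save the last computation (here, into the variable p).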
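To see the setjmp/longjmp recovery on its own, outside of yacc, here is a minimal sketch in modern C. The names `restart`, `exec_error`, `checked_div` and `run_divisions` are hypothetical stand-ins, not part of the code above; the sketch only assumes the same shape: a save point, an error routine that jumps back to it, and a division that checks for 0 first.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

static jmp_buf restart;   /* hypothetical save point, like `begin` in the listing */

/* Report an error and jump back to the save point. Never returns. */
static void exec_error(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    longjmp(restart, 1);
}

/* Divide, bailing out through exec_error() on a zero divisor. */
static double checked_div(double a, double b)
{
    if (b == 0.0)
        exec_error("division by 0");
    return a / b;
}

/* Compute out[i] = a[i] / b[i] for each i, skipping items that fail.
   Returns the number of divisions that were abandoned. */
int run_divisions(const double *a, const double *b, int n, double *out)
{
    /* volatile: both are changed after setjmp() and read after longjmp() */
    volatile int i = 0;
    volatile int errors = 0;

    assert(n >= 0);
    if (setjmp(restart) != 0) {   /* we got here via longjmp() */
        errors++;
        i++;                      /* skip the item that failed */
    }
    for (; i < n; i++)
        out[i] = checked_div(a[i], b[i]);
    return errors;
}
```

Calling `run_divisions` with the divisors 3, 0, 3 fills in the first and third results, leaves the second slot untouched, and returns 1. The locals are `volatile` because C only guarantees their values after a longjmp if they are declared that way.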

Very cool little addition, but I'm looking forward to the next stage where we implement proper variable names. I imagine that is what opens the door to calling functions by name. I already pulled in fmod and pow, but I can only reach them through the single-character operators % and ^. The goal is to be able to write something like x = fmod(1.0,2.0); which is basically the grammar of C, but built into my own little custom language.
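I haven't read the next stage yet, so this is only a guess at the general shape: calling functions by name probably comes down to a table that maps a name to a function pointer. A minimal sketch, where `binary_fn`, `builtins` and `lookup_builtin` are hypothetical names and `int_pow` and `larger` are hand-written stand-ins used so the example does not depend on the math library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Any built-in taking two doubles; fmod() and pow() have this shape too. */
typedef double (*binary_fn)(double, double);

/* Stand-in for pow(), limited to whole, non-negative exponents. */
static double int_pow(double base, double exponent)
{
    double result = 1.0;
    long n = (long)exponent;

    assert(exponent >= 0.0);
    while (n-- > 0)
        result *= base;
    return result;
}

/* Second stand-in so the table has more than one entry. */
static double larger(double a, double b)
{
    return a > b ? a : b;
}

struct builtin {
    const char *name;
    binary_fn   fn;
};

/* Hypothetical table of functions callable by name. */
static const struct builtin builtins[] = {
    { "pow", int_pow },
    { "max", larger  },
};

/* Return the function registered under name, or NULL if there is none. */
binary_fn lookup_builtin(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof builtins / sizeof builtins[0]; i++)
        if (strcmp(builtins[i].name, name) == 0)
            return builtins[i].fn;
    return NULL;
}
```

With `<math.h>` included and the program linked with `-lm`, the table entries could point at the real `fmod` and `pow` instead. The lexer would then look up each identifier it reads and hand the matching pointer to the grammar.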

Side note: I left off a semicolon in the %{ %} block at the top, and that caused some weird errors. I'm guessing that block of code gets put into the generated C code verbatim. I'll need to look at the file that yacc generates at some point. IBM has some good documentation on yacc that describes how yacc and its output file work.