
Writing a Markdown Renderer

Title:
Date: 2023-08-21
Tags:  pick

I'm currently working on a project to get my blog running entirely in Pick, and I chose scarletdme to do it in. It's a GPL version of OpenQM, which is itself a reimplementation of Pick.

I've already finished the HTTP.SERVER and built a templating language for it that I call Basic Template Language. This post is my thoughts on writing the markdown renderer. I write my blog posts in markdown and I want them rendered just in time, which means I won't be caching the generated output or generating it in advance. My reasoning is that I want any slowness to be felt so that I'm pushed to optimize it further. I probably won't feel it, as even a bad method of parsing markdown will be fast for a single page. Still worth doing.

I first took a look at CommonMark, as it is well specified and comes with more than 600 tests that I could program against. This would also let me support markdown in general and allow others to use my subroutine. The routine I use right now to convert markdown to HTML is very limited: I can't use lists and I can't have markdown within other blocks. It is very much a line-oriented rendering routine. My goal is to make something a bit better.

I started off by importing the tests into BASIC and writing a tester that would let me specify what to test and run each subsection directly. This was a pain because the JSON format wasn't native to BASIC, so I had to deal with the newlines, tabs and trimming of spaces myself. I also realized that my JSON.PARSE function is fundamentally incorrect and I'll need to deal with that at some point.
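To give a feel for what such a tester does, here is a minimal sketch in Python rather than BASIC. It is not the tester described above. It assumes each test is a JSON object with `markdown`, `html`, `example` and `section` fields, which is how the CommonMark test dump is laid out as far as I know. The two sample entries are hand-written stand-ins with made-up example numbers, and `run_tests` and `headings_only` are hypothetical names.

```python
import json

# Hand-written stand-ins for entries in the CommonMark test dump.
# The example numbers are illustrative, not the real ones from the spec.
SPEC_JSON = r'''
[
  {"markdown": "# foo\n", "html": "<h1>foo</h1>\n",
   "example": 1, "section": "ATX headings"},
  {"markdown": "*foo bar*\n", "html": "<p><em>foo bar</em></p>\n",
   "example": 2, "section": "Emphasis and strong emphasis"}
]
'''


def run_tests(tests, render, section=None):
    """Run every test, or only those in `section`.

    Returns two lists of example numbers: (passed, failed).
    """
    passed, failed = [], []
    for test in tests:
        if section is not None and test["section"] != section:
            continue
        actual = render(test["markdown"])
        (passed if actual == test["html"] else failed).append(test["example"])
    return passed, failed


def headings_only(markdown):
    """Hypothetical renderer that only understands one-line ATX headings."""
    line = markdown.rstrip("\n")
    level = len(line) - len(line.lstrip("#"))
    if 1 <= level <= 6 and line[level:level + 1] == " ":
        return f"<h{level}>{line[level:].strip()}</h{level}>\n"
    return f"<p>{line}</p>\n"


tests = json.loads(SPEC_JSON)
print(run_tests(tests, headings_only, section="ATX headings"))
print(run_tests(tests, headings_only))
```

Filtering by section is what makes it possible to work through one part of the spec at a time, while running with no filter shows how much of the whole suite still fails.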

I started with the headers and slowly banged through the tests. This was very much a slog of writing code, testing, and writing some more. I hated it. I don't think TDD is for me because it focuses too much on the details. I also think I started the wrong way. I should have written the program first, at least a small one, and then begun testing it. Instead I started with a blank page and had that fail. I think I saw this described as the method to use at some point, but it made programming deeply unfun. For a project that I'm doing for entertainment more than anything, this was a bad sign.

I ended up implementing a few different subsections and the CommonMark parser was getting to a usable state. However, I wasn't dealing with blocks; I had only dealt with the easy inline stuff. Once I got to the blocks, I realized I really wasn't having fun and killed the project.

Luckily I had the energy to start again immediately. Usually once a project fails, I lose the will to work on it and let it languish for a bit before coming back. This time around, I quickly got back into it and worked on writing a markdown parser more holistically. I eyeballed the CommonMark spec and began working from it. I implemented the parts that I considered important and this flew by. I definitely didn't do it correctly, which is probably why I had fun doing it.

There is a lesson here somewhere, though I don't think it's a good one.

My RENDER.MARKDOWN routine takes about 1700 ms to generate 550 posts, or roughly 3 ms per post. This is about double the time my current routine takes. However, the new routine makes recursive calls and handles a much larger subset of markdown, so I can live with it. The render time of a single page will only be a few milliseconds and will be dwarfed by the network.

Now that I have a working markdown renderer I can turn my attention to the glue work of getting everything wired up in scarletdme!

This post has no code but below is my routine. Hopefully it's simple enough to follow. The core idea is to first process any characters that trigger blocks and build those blocks. Then you can call RENDER.MARKDOWN recursively on the blocks. The bottommost call will handle the inline characters and generate HTML.
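As a rough illustration of that blocks-first, recurse-later idea, here is a minimal sketch in Python rather than BASIC. It is not the RENDER.MARKDOWN routine itself. The names `render_markdown`, `render_inline` and `starts_block` are assumptions made for the example, and it only covers headings, block quotes, flat `- ` lists, paragraphs and a little inline markup.

```python
import html
import re

HEADING = re.compile(r"^(#{1,6}) +(.*)$")


def render_inline(text):
    """Bottom of the recursion: escape HTML, then apply inline markup."""
    text = html.escape(text, quote=False)
    text = re.sub(r"`([^`]+)`", r"<code>\1</code>", text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


def starts_block(line):
    """True if the line opens a block other than a plain paragraph."""
    return (line.startswith(">") or line.startswith("- ")
            or HEADING.match(line) is not None)


def render_markdown(text):
    """Group lines into blocks, recursing into block quotes."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        heading = HEADING.match(line)
        if not line.strip():
            i += 1
        elif line.startswith(">"):
            # Collect the quoted run, strip the marker, and render the
            # contents as markdown in their own right (the recursive call).
            inner = []
            while i < len(lines) and lines[i].startswith(">"):
                inner.append(re.sub(r"^> ?", "", lines[i]))
                i += 1
            body = render_markdown("\n".join(inner))
            out.append("<blockquote>\n" + body + "</blockquote>\n")
        elif line.startswith("- "):
            # Flat list: each item is a single line of inline text.
            items = []
            while i < len(lines) and lines[i].startswith("- "):
                items.append("<li>" + render_inline(lines[i][2:]) + "</li>\n")
                i += 1
            out.append("<ul>\n" + "".join(items) + "</ul>\n")
        elif heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{render_inline(heading.group(2))}</h{level}>\n")
            i += 1
        else:
            # Paragraph: runs until a blank line or the start of another block.
            para = []
            while i < len(lines) and lines[i].strip() and not starts_block(lines[i]):
                para.append(lines[i])
                i += 1
            out.append("<p>" + render_inline(" ".join(para)) + "</p>\n")
    return "".join(out)


source = "# Title\n\n> quoted *text*\n> - item one\n> - item **two**\n\nplain `code` here"
print(render_markdown(source))
```

The block quote is the only container that recurses in this sketch, but it is enough to show why a list inside a quote works without any special handling: once the `>` markers are stripped, the inner text is just markdown again. A purely line-oriented renderer has no place to do that.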

Code