I wrote my code search engine’s indexer to build the inverted index in memory and write out all the leaves only at the very end. On a small code base that was fine, but once I wanted to index many different code bases I hit Node’s memory limit. I had decided to keep it in memory at the beginning because I thought writing to the database as I went might add too much time. Now I got the chance to try it.
I rewrote the saving part of the indexer to save things as the tokens were created rather than waiting for the whole index to be built. It worked, but it was significantly slower. So much slower that I couldn’t even finish a run for one code base; I gave up and just increased the memory limit.
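To make the trade-off concrete, here is a minimal sketch of the two approaches. It is not my actual indexer: `FakeStore`, `tokenize`, `indexInMemory` and `indexStreaming` are hypothetical names, and the store is an in-memory stand-in that only counts how many writes a real database would have received.

```javascript
// Hypothetical stand-in for the database: it only records rows and counts writes.
class FakeStore {
  constructor() {
    this.rows = new Map();
    this.writes = 0;
  }
  // One write per (token, document) pair.
  append(token, docId) {
    this.writes += 1;
    if (!this.rows.has(token)) this.rows.set(token, []);
    this.rows.get(token).push(docId);
  }
  // One write per token, carrying its full posting list.
  putAll(token, docIds) {
    this.writes += 1;
    this.rows.set(token, [...docIds]);
  }
}

// Deliberately naive tokenizer (an assumption, just for the example).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Approach 1: build the whole inverted index in memory, write it out at the end.
function indexInMemory(docs, store) {
  const index = new Map();
  for (const [docId, text] of docs) {
    for (const token of new Set(tokenize(text))) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(docId);
    }
  }
  for (const [token, docIds] of index) store.putAll(token, docIds);
}

// Approach 2: write each posting as soon as its token is produced.
function indexStreaming(docs, store) {
  for (const [docId, text] of docs) {
    for (const token of new Set(tokenize(text))) {
      store.append(token, docId);
    }
  }
}

const docs = [
  ['a.js', 'function add(a, b) { return a + b }'],
  ['b.js', 'function sub(a, b) { return a - b }'],
];

const s1 = new FakeStore();
indexInMemory(docs, s1);
const s2 = new FakeStore();
indexStreaming(docs, s2);
console.log('in-memory writes:', s1.writes, 'streaming writes:', s2.writes);
```

Both stores end up with the same rows, but on this toy input the in-memory version issues 6 writes (one per unique token) while the streaming version issues 10 (one per token per file). That gap grows with the size of the code base, which is roughly why the second version was so much slower for me.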
Throw more hardware at the problem! It’s a bad thing to do, as writing the indexer in a much better way would give me a better base to work with in the future, but for now I think I’m happy with it. I just want it working.
The command to increase Node’s memory limit is as follows:
> node --max-old-space-size=8192 index.js
Voila! The value is in megabytes, so Node’s heap now has roughly 8 GB to work with.
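If you want to double check that the flag was picked up, a small sketch like the one below prints the heap limit V8 is actually running with. It uses Node’s built-in `v8` module; `heapLimitMiB` is just a helper name I made up for the example.

```javascript
const v8 = require('v8');

// Report V8's current heap size limit, converted from bytes to MiB.
function heapLimitMiB() {
  const bytes = v8.getHeapStatistics().heap_size_limit;
  return Math.round(bytes / (1024 * 1024));
}

console.log(`heap limit: ${heapLimitMiB()} MiB`);
```

Run with `--max-old-space-size=8192`, this should print a number in the neighbourhood of 8192. As far as I understand it, the reported limit can come out slightly higher than the flag value because it also covers parts of the heap other than old space.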