Thursday | 21 NOV 2024

HTML to Epub in 30 Lines

Date: 2023-03-17

With Mozilla's Readability package and the epub-gen package, it is very easy to take a web page and generate an EPUB from it.

The libraries used:

pnpm install axios @mozilla/readability jsdom epub-gen

The code:

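const axios = require('axios');
const { Readability } = require('@mozilla/readability');
const { JSDOM } = require('jsdom');
const Epub = require('epub-gen');

async function makeBook(url) {
    try {
        // Fetch the raw HTML; passing url to JSDOM lets it resolve relative links.
        const response = await axios.get(url);
        const doc = new JSDOM(response.data, { url });

        // parse() returns null when Readability cannot find an article.
        const article = new Readability(doc.window.document).parse();
        if (!article) {
            throw new Error(`Could not extract an article from ${url}`);
        }

        // epub-gen starts generating in the constructor; await its promise
        // so that failures end up in the catch block below.
        await new Epub({
            output: 'some.epub',
            title: article.title,
            author: article.siteName,
            content: [
                {
                    title: article.title,
                    data: article.content
                },
            ]
        }).promise;
    } catch (err) {
        console.error(err);
    }
}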
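
The function above writes every book to some.epub, so a second run overwrites the first. One possible way around that is to derive the file name from the article title. The sketch below is only an illustration: epubFileName is a hypothetical helper, not part of epub-gen or Readability, and it keeps only ASCII letters and digits, so a title made up entirely of other characters falls back to "untitled".

```javascript
// Hypothetical helper: turn an article title into a safe .epub file name.
// Only a-z and 0-9 survive; every other run of characters becomes one dash.
function epubFileName(title) {
    const slug = String(title || '')
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')  // collapse unsupported characters into a dash
        .replace(/^-+|-+$/g, '');     // trim leading and trailing dashes
    // Fall back to a fixed name when nothing usable is left.
    return `${slug || 'untitled'}.epub`;
}
```

To use it, the output option in makeBook could become output: epubFileName(article.title).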

This function generates an EPUB from a URL. Readability is a best-effort library, so some web pages will look odd or get cut off in strange places, but I can live with that.
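
Since the extraction is best effort, it may be worth checking the result before building a book from it. The guard below is a rough sketch, not something Readability provides: isUsableArticle and its minChars threshold of 500 are assumptions made for illustration. It relies only on the content and textContent fields of the object that parse() returns, and on parse() returning null when it finds no article.

```javascript
// Hypothetical guard: decide whether a Readability result is worth converting.
// minChars is an assumed threshold, chosen arbitrarily for this sketch.
function isUsableArticle(article, minChars = 500) {
    // parse() yields null when no article could be extracted.
    if (!article || typeof article.content !== 'string') {
        return false;
    }
    // textContent is the extracted article with the HTML tags removed.
    const text = (article.textContent || '').trim();
    return text.length >= minChars;
}
```

Calling it right after reader.parse() and skipping the page when it returns false would keep near-empty extractions off the device. It does nothing about pages that parse but look strange.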

This is part of my project to push articles to my Kindle.