From HTML to DOM Nodes

Every once in a while, I need to turn HTML strings into DOM nodes. The mechanisms for doing so are nuanced and less than obvious.

Having HTML but wanting DOM nodes is reasonably common in my little universe, whether for simple templating, transclusion or a host of other use cases.

Injecting HTML into a DOM node seems fairly straightforward at first: That’s what innerHTML (or even outerHTML) is for! References to any particular DOM node can then be obtained via the usual DOM-traversal APIs:

let tmp = document.createElement("div");
tmp.innerHTML = "<p>hello world</p>";
let message = tmp.firstChild;

Even disregarding security concerns, this might not be what we want, though: Parsing within a <div> implies that our HTML string contains only flow content, so metadata content (think <head> or <title>) might lead to unexpected results. This particular implementation also assumes that our HTML string contains exactly one root node and no leading whitespace, which might not always be the case; otherwise firstChild is either a text node or merely the first of several siblings.
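If the HTML string might contain several top-level nodes, one option is to collect all of them rather than just firstChild. The following is a minimal sketch, not a definitive implementation: htmlToNodes is a hypothetical name, and the doc parameter only exists so the function can be exercised without a global document.

```javascript
// Hypothetical helper: parse an HTML string in the context of a <div>
// and return every top-level node instead of only the first one.
// `doc` is an assumption for testability; it defaults to the global document.
function htmlToNodes(html, doc = document) {
    let tmp = doc.createElement("div");
    tmp.innerHTML = html;
    // childNodes is a live collection, so copy it into a plain array
    return Array.from(tmp.childNodes);
}

// usage (in a browser):
// let [greeting, farewell] = htmlToNodes("<p>hello</p><p>goodbye</p>");
```

Note that the result includes text nodes, so whitespace between elements shows up as well; using tmp.children instead would limit the result to elements.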

We can avoid worrying about content categories by employing the built-in parser’s parseFromString method, which always returns a proper document:

let parser = new DOMParser();
let doc = parser.parseFromString("<p>hello world</p>", "text/html");
let message = doc.body.firstChild;

While <script> elements will not be executed here (for somewhat arcane reasons related to document.write), the aforementioned security concerns still apply: There are other ways for injected markup to cause JavaScript to be evaluated (inline event-handler attributes such as onerror, for instance) once the resulting nodes end up in the live document.
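To illustrate how the parser sorts content into the appropriate place, here is a small sketch that returns the resulting <head> and <body> nodes separately. parseHTML and its Parser parameter are hypothetical names introduced for this example; Parser merely defaults to the built-in DOMParser.

```javascript
// Hypothetical helper: parse a complete or partial HTML document and
// return the nodes that ended up in <head> and <body>, respectively.
// `Parser` is an assumption for testability; it defaults to DOMParser.
function parseHTML(html, Parser = DOMParser) {
    let doc = new Parser().parseFromString(html, "text/html");
    return {
        head: Array.from(doc.head.childNodes),
        body: Array.from(doc.body.childNodes)
    };
}

// usage (in a browser):
// let { head, body } = parseHTML("<title>hello</title><p>hello world</p>");
// head holds the <title> element, body holds the <p> element
```

Nodes obtained this way belong to the parsed document; appending them elsewhere moves them into the target document.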

Now, sometimes we do want to execute <script> elements within our HTML string (e.g. if we’re perpetrating crimes of transclusion). For that we can employ the somewhat obscure createContextualFragment API:

let fragment = document.createRange().createContextualFragment(`
<p>hello world</p>
<script>console.log("hello world");</script>
`.trim());

This works pretty much like the innerHTML approach above – with all the drawbacks – except <script> tags are indeed executed as soon as the respective element is added to the document (e.g. via document.body.append(fragment)).
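As far as I understand, a range created via document.createRange() starts at the document itself, so the markup is parsed as if it were inside <body>. If the content only makes sense within a specific parent (table rows, say), the range can be pointed at that parent first. A minimal sketch under that assumption; appendHTML and its doc parameter are hypothetical names, not part of any API:

```javascript
// Hypothetical helper: parse `html` relative to `target` and append the
// result, so that context-sensitive markup (e.g. <tr> within a <tbody>)
// is parsed in the context of its future parent.
// `doc` is an assumption for testability; it defaults to the global document.
function appendHTML(target, html, doc = document) {
    let range = doc.createRange();
    // make `target` the context element for fragment parsing
    range.selectNodeContents(target);
    let fragment = range.createContextualFragment(html.trim());
    // any <script> elements in the fragment run once they are inserted
    target.append(fragment);
    return target;
}

// usage (in a browser):
// appendHTML(document.querySelector("tbody"), "<tr><td>hello world</td></tr>");
```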

If we want to avoid worrying about content categories while also executing <script> tags, we might combine both approaches:

let tmp = new DOMParser().parseFromString(`
<head>
    <title>hello world</title>
</head>
<body>
    <h1>hello world</h1>
    <script>console.log("hello world");</script>
</body>
`.trim(), "text/html");
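// swap in the new document's contents; scripts inserted this way remain inert
document.documentElement.innerHTML = tmp.documentElement.innerHTML;

// re-create each <script> via createContextualFragment so that it is executed
for (let node of document.querySelectorAll("script")) {
    let fragment = document.createRange().createContextualFragment(node.outerHTML);
    node.replaceWith(fragment);
}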

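(Turns out node.replaceWith(node.cloneNode(true)) does not suffice here: As far as I can tell, a cloned <script> inherits the original’s “already started” flag, so it is never executed either.)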

In this example, we’re replacing the entire document. Well, almost: Transferring <html> attributes between old and new document is left as an exercise for the reader.
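For what it’s worth, one way to approach that exercise might look like the sketch below. transferAttributes is a hypothetical helper, and it assumes that attributes missing from the new document should be removed from the old one, which may or may not be desirable in a given scenario.

```javascript
// Hypothetical helper: make `target` carry exactly the attributes of `source`.
function transferAttributes(source, target) {
    // drop attributes which the new document does not declare
    // (getAttributeNames returns a snapshot, so removing while iterating is safe)
    for (let name of target.getAttributeNames()) {
        if (!source.hasAttribute(name)) {
            target.removeAttribute(name);
        }
    }
    // copy or overwrite everything the new document does declare
    for (let name of source.getAttributeNames()) {
        target.setAttribute(name, source.getAttribute(name));
    }
}

// usage (in a browser, continuing the example above):
// transferAttributes(tmp.documentElement, document.documentElement);
```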