Does Node.js have a built-in way to diff two strings?

javascript
Ethan Jackson

I know there are third-party libraries for generating diffs in JavaScript, but I'm wondering if recent versions of Node.js include a native way to compare two strings and show their differences.

Is there a built-in API for this?

Answer

Yes. Since v22.15.0 and v23.11.0, Node.js natively exposes util.diff, which:

compares two string or array values and returns an array of difference entries. It uses the Myers diff algorithm to compute minimal differences, which is the same algorithm used internally by assertion error messages.

If the values are equal, an empty array is returned.
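Because the description above says arrays are accepted as well as strings, one way to get a line-level diff is to split both texts on newlines first. The following is a minimal sketch, not an official recipe: `diffLines` is a hypothetical helper name, and the feature check is there because runtimes older than the versions listed above do not have `util.diff`. The API may also still be marked experimental in your version, so check the documentation for it. Dynamic `import()` is used only so the snippet runs in both CommonJS and ES module files.

```javascript
// Hypothetical helper: diff two texts line by line, or return null
// when the current Node.js runtime does not provide util.diff.
async function diffLines(before, after) {
  const util = await import('node:util');
  if (typeof util.diff !== 'function') {
    return null; // runtime predates util.diff
  }
  // Each array element (here, one line) is compared as a single unit.
  return util.diff(before.split('\n'), after.split('\n'));
}

diffLines('a\nb\nc', 'a\nx\nc').then((entries) => {
  console.log(entries ?? 'util.diff is not available in this Node.js version');
});
```

On a runtime that has `util.diff`, the entries should hold whole lines instead of single characters.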

Example Usage

import { diff } from 'node:util';

console.log(diff('hello', 'h3llo'));

Which produces output like:

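[ [ 0, 'h' ], [ 1, 'e' ], [ -1, '3' ], [ 0, 'l' ], [ 0, 'l' ], [ 0, 'o' ] ]

  • Each entry is [op, value]:
    • 0 = unchanged (present in both values)
    • 1 = inserted (present only in the first argument, here 'e')
    • -1 = deleted (present only in the second argument, here '3')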
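To make the op codes concrete, here is a small sketch that post-processes the entries from the example output above. `rebuild` and `coalesce` are hypothetical helpers, not part of Node.js, and the entries are pasted in as a literal so the snippet runs on any Node.js version. It assumes, going by that output, that 1 marks values found only in the first argument and -1 marks values found only in the second.

```javascript
// Entries copied from the example output for diff('hello', 'h3llo').
const entries = [
  [0, 'h'], [1, 'e'], [-1, '3'], [0, 'l'], [0, 'l'], [0, 'o'],
];

// Hypothetical helper: rebuild both inputs from the diff entries.
function rebuild(entries) {
  let first = '';
  let second = '';
  for (const [op, value] of entries) {
    if (op >= 0) first += value;  // 0 and 1 belong to the first input
    if (op <= 0) second += value; // 0 and -1 belong to the second input
  }
  return { first, second };
}

// Hypothetical helper: merge neighbouring entries that share an op,
// which is easier to read than one entry per character.
function coalesce(entries) {
  const runs = [];
  for (const [op, value] of entries) {
    const last = runs[runs.length - 1];
    if (last && last[0] === op) {
      last[1] += value;
    } else {
      runs.push([op, value]);
    }
  }
  return runs;
}

console.log(rebuild(entries));  // { first: 'hello', second: 'h3llo' }
console.log(coalesce(entries)); // [ [ 0, 'h' ], [ 1, 'e' ], [ -1, '3' ], [ 0, 'llo' ] ]
```

Rebuilding both inputs is a quick sanity check that you are reading the op codes the right way round.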

Edge case example: __proto__, emoji, and UTF-8

In this older question, the asker pointed out that many older JavaScript diff libraries fail on certain edge cases, such as strings containing "__proto__", emoji, or non-Latin text.

These issues were often due to libraries treating input as object keys, which could trigger prototype pollution or encoding bugs.
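The object-key problem is easy to reproduce without any diff library. The sketch below is illustrative only and is not taken from a specific package: it records tokens in a plain object, which silently loses "__proto__" because assigning to that key goes through the inherited prototype setter, and then repeats the exercise with a Map, which treats every key as an ordinary value.

```javascript
const tokens = ['a', '__proto__', 'b'];

// Fragile: a plain object used as a set of seen tokens.
// Assigning a non-object to '__proto__' is ignored by the setter,
// so no own property is ever created for that token.
const seenObject = {};
for (const token of tokens) {
  seenObject[token] = true;
}

// Safer: a Map has no special keys.
const seenMap = new Map();
for (const token of tokens) {
  seenMap.set(token, true);
}

console.log(Object.keys(seenObject)); // [ 'a', 'b' ]
console.log([...seenMap.keys()]);     // [ 'a', '__proto__', 'b' ]
```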

Node.js’s util.diff() avoids this entirely by comparing raw values (strings or arrays) and safely handles a wide range of inputs:

import { diff } from 'node:util';

const a = '👋 __proto__ こんにちは';
const b = '👋 __proto__! こんばんは';

console.log(diff(a, b));

Which produces output like:

[
  [ 0, '\ud83d' ], [ 0, '\udc4b' ], [ 0, ' ' ],
  [ 0, '_' ], [ 0, '_' ], [ 0, 'p' ], [ 0, 'r' ], [ 0, 'o' ], [ 0, 't' ], [ 0, 'o' ], [ 0, '_' ], [ 0, '_' ],
  [ -1, '!' ], [ 0, ' ' ],
  [ 0, 'こ' ], [ 0, 'ん' ], [ 1, 'に' ], [ 1, 'ち' ], [ -1, 'ば' ], [ -1, 'ん' ], [ 0, 'は' ]
]

A few things worth noting:

  • "__proto__" is treated as plain text, avoiding prototype pollution
  • Emoji like 👋 are split into UTF-16 code units (e.g. '\ud83d', '\udc4b') but still diffed correctly
  • Japanese kana each fit in a single UTF-16 code unit, so they appear as whole characters in the diff

ℹ️ util.diff() works at the UTF-16 code unit level, not on full Unicode grapheme clusters. This is fine for most practical cases, but characters outside the Basic Multilingual Plane, such as most emoji, may appear split into two surrogate entries in the diff result.
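If split surrogates are a problem, one possible workaround is to pre-split the strings yourself and pass arrays, since the quoted description says arrays are accepted. The sketch below only shows the three ways of splitting and does not call `util.diff`, so it runs on any recent Node.js. `toGraphemes` is a hypothetical helper built on the standard `Intl.Segmenter`, and whether the resulting diff suits your use case is an assumption to verify.

```javascript
// Waving hand plus a skin-tone modifier, then a space and five kana.
const text = '\u{1F44B}\u{1F3FD} こんにちは';

// UTF-16 code units: each emoji code point becomes two surrogates.
const codeUnits = text.split('');

// Unicode code points: string iteration keeps surrogate pairs together.
const codePoints = Array.from(text);

// Hypothetical helper: user-perceived characters (grapheme clusters).
function toGraphemes(value) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return Array.from(segmenter.segment(value), (part) => part.segment);
}
const graphemes = toGraphemes(text);

console.log(codeUnits.length);  // 10
console.log(codePoints.length); // 8
console.log(graphemes.length);  // 7

// With util.diff available, arrays like these could be passed in place
// of the raw strings, e.g. diff(toGraphemes(a), toGraphemes(b)).
```

Splitting by grapheme keeps an emoji and its modifier together as one element, at the cost of an extra pass over each string.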
