Unicode / UTF-8 Inspector

Inspect any character: code point, decimal, HTML entity, JS escape, UTF-8 bytes, UTF-16 units. Handles emoji and surrogate pairs correctly.

Common use cases

Code points above U+FFFF lie outside the Basic Multilingual Plane (BMP). In UTF-16 they need a surrogate pair (2 × 16 bits = 4 bytes), and in UTF-8 they also take 4 bytes.
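To make the arithmetic concrete, here is a minimal sketch of how a code point above U+FFFF maps to a UTF-16 surrogate pair and to 4 UTF-8 bytes. The helper names toSurrogatePair, toUtf8Bytes and hex are hypothetical and chosen for illustration; they are not taken from this tool's source.

```javascript
// Illustrative helpers (hypothetical names), valid only for U+10000..U+10FFFF.
function assertSupplementary(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("expected a code point in U+10000..U+10FFFF");
  }
}

// UTF-16: subtract 0x10000, then split the remaining 20 bits into two halves.
function toSurrogatePair(codePoint) {
  assertSupplementary(codePoint);
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

// UTF-8 4-byte form: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
function toUtf8Bytes(codePoint) {
  assertSupplementary(codePoint);
  return [
    0xF0 | (codePoint >> 18),
    0x80 | ((codePoint >> 12) & 0x3F),
    0x80 | ((codePoint >> 6) & 0x3F),
    0x80 | (codePoint & 0x3F),
  ];
}

const hex = (n) => n.toString(16).toUpperCase();

const cp = 0x1F600; // U+1F600 GRINNING FACE
console.log(toSurrogatePair(cp).map(hex)); // [ 'D83D', 'DE00' ]
console.log(toUtf8Bytes(cp).map(hex));     // [ 'F0', '9F', '98', '80' ]
```

In practice the built-ins String.fromCodePoint and TextEncoder do this work; the manual version only shows where the 4 bytes come from in each encoding.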

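JavaScript's .length returns the number of UTF-16 code units, so a character above U+FFFF counts as 2. Array.from(str).length, which this tool uses, counts code points instead. One visible character (a grapheme cluster) can still span several code points, for example an emoji ZWJ sequence or a letter followed by a combining mark.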
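The three ways of counting can be compared side by side. The sketch below assumes a runtime that provides Intl.Segmenter (recent browsers and Node.js 16 or later); the function name countAll is hypothetical and is not part of this tool.

```javascript
// Illustrative comparison (hypothetical helper name): code units vs. code points vs. graphemes.
function countAll(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: str.length,                           // UTF-16 code units
    codePoints: Array.from(str).length,              // Unicode code points
    graphemes: [...segmenter.segment(str)].length,   // user-perceived characters
  };
}

console.log(countAll("a"));
// { codeUnits: 1, codePoints: 1, graphemes: 1 }

console.log(countAll("\u{1F600}")); // grinning face, outside the BMP
// { codeUnits: 2, codePoints: 1, graphemes: 1 }

console.log(countAll("e\u0301")); // "e" + combining acute accent
// { codeUnits: 2, codePoints: 2, graphemes: 1 }

console.log(countAll("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}")); // family emoji (ZWJ sequence)
// { codeUnits: 8, codePoints: 5, graphemes: 1 }
```

Array.from(str).length is enough to avoid splitting surrogate pairs; Intl.Segmenter is the option to reach for when the count must match what a reader sees on screen.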