Unicode / UTF-8 Inspector

Inspect any character: code point, decimal, HTML entity, JS escape, UTF-8 bytes, UTF-16 units. Handles emoji and surrogate pairs correctly.
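As a rough illustration of the fields listed above, here is a minimal sketch of how they could be derived in JavaScript. The function name `inspect` and the exact output formats (for example `U+XXXX` and `\u{...}`) are assumptions made for this example, not the tool's actual implementation.

```javascript
// Hypothetical helper: describe the first code point of a string.
function inspect(str) {
  const cp = str.codePointAt(0);            // full code point, not just one UTF-16 unit
  const ch = String.fromCodePoint(cp);      // the character for that code point
  const hex = cp.toString(16).toUpperCase();
  const toHex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

  return {
    codePoint: "U+" + hex.padStart(4, "0"),
    decimal: cp,
    htmlEntity: "&#x" + hex + ";",
    jsEscape: "\\u{" + hex + "}",
    // TextEncoder always encodes to UTF-8.
    utf8: Array.from(new TextEncoder().encode(ch), (b) => toHex(b, 2)),
    // ch.length is 1 for BMP characters and 2 for a surrogate pair.
    utf16: Array.from({ length: ch.length }, (_, i) => toHex(ch.charCodeAt(i), 4)),
  };
}

console.log(inspect("\u{1F389}"));
```

For U+1F389 this reports four UTF-8 bytes (F0 9F 8E 89) and two UTF-16 code units (D83C DF89).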


Common use cases

Frequently asked questions

Why is "🎉" 4 UTF-8 bytes but 2 UTF-16 code units?

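Code points above U+FFFF lie outside the Basic Multilingual Plane (BMP). UTF-16 encodes them as a surrogate pair: two 16-bit code units (2×16 bits = 4 bytes). UTF-8 encodes them in 4 bytes. Both encodings use 4 bytes here; the counts differ because a UTF-8 code unit is 1 byte and a UTF-16 code unit is 2 bytes.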
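The arithmetic behind that answer can be sketched in a few lines. The helper names `toSurrogatePair` and `utf8Length` are made up for this example; the formulas follow the standard UTF-16 and UTF-8 encoding rules.

```javascript
// Split a supplementary code point (U+10000..U+10FFFF) into its UTF-16 surrogate pair.
function toSurrogatePair(cp) {
  if (cp < 0x10000 || cp > 0x10ffff) {
    throw new RangeError("code point is not in the supplementary range");
  }
  const v = cp - 0x10000;            // 20-bit value
  const high = 0xd800 + (v >> 10);   // top 10 bits
  const low = 0xdc00 + (v & 0x3ff);  // bottom 10 bits
  return [high, low];
}

// Number of bytes UTF-8 needs for a given code point.
function utf8Length(cp) {
  if (cp < 0x80) return 1;
  if (cp < 0x800) return 2;
  if (cp < 0x10000) return 3;
  return 4;
}

const [high, low] = toSurrogatePair(0x1f389);
console.log(high.toString(16), low.toString(16)); // d83c df89
console.log(utf8Length(0x1f389));                 // 4
```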

What's the difference between length and visible characters?

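JavaScript .length returns the number of UTF-16 code units. Array.from(str).length, which this tool uses, counts code points, so an emoji outside the BMP counts once. A visible character (grapheme cluster) can still span several code points, such as a letter followed by a combining mark or emoji joined by zero-width joiners; counting those requires Intl.Segmenter.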
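To make the distinction concrete, here is a small sketch comparing three ways of counting. The function name `counts` is hypothetical, and the grapheme count assumes a runtime that provides `Intl.Segmenter` (current browsers and recent Node.js versions).

```javascript
// Compare UTF-16 code units, code points, and grapheme clusters for a string.
function counts(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: str.length,                          // UTF-16 code units
    codePoints: Array.from(str).length,             // string iteration is by code point
    graphemes: [...segmenter.segment(str)].length,  // user-perceived characters
  };
}

console.log(counts("\u{1F389}")); // party popper emoji
console.log(counts("e\u0301"));   // "e" followed by a combining acute accent
// Family emoji: three emoji joined by two zero-width joiners (U+200D).
console.log(counts("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"));
```

The party popper is 2 code units but 1 code point. The accented "e" is 2 code points but 1 grapheme, which is where code point counting and visible-character counting part ways.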

Related tools