Common use cases
- Debug character encoding issues
- Find the code point of an emoji
- Verify UTF-8 byte sequences
- Decode mojibake
Frequently asked questions
Why is an emoji such as "😀" 4 UTF-8 bytes but 2 UTF-16 code units?
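Code points above U+FFFF lie outside the Basic Multilingual Plane (BMP). UTF-16 represents each of them as a surrogate pair (2 × 16 bits = 4 bytes), and UTF-8 encodes each of them in 4 bytes. The byte totals match; only the unit counts differ, because a UTF-16 code unit is 2 bytes wide and a UTF-8 code unit is 1 byte wide.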
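The unit counts can be checked directly in JavaScript. The following is a minimal sketch, not this tool's actual code: encodingSizes is a hypothetical helper name, and it assumes a runtime that provides the standard TextEncoder (current browsers and Node.js).

```javascript
// Hypothetical helper (not part of this tool): report how many units
// a string occupies in UTF-8 and in UTF-16.
function encodingSizes(str) {
  // TextEncoder always encodes to UTF-8, so the array length is the byte count.
  const utf8Bytes = new TextEncoder().encode(str).length;
  // String.prototype.length counts UTF-16 code units.
  const utf16Units = str.length;
  // Format the first code point in the usual U+XXXX notation.
  const hex = str.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  return {
    codePoint: "U+" + hex,
    utf8Bytes,
    utf16Units,
    utf16Bytes: utf16Units * 2, // each UTF-16 code unit is 2 bytes
  };
}

// U+1F600 is above U+FFFF, so it needs a surrogate pair in UTF-16.
const astral = encodingSizes("\u{1F600}");
// U+20AC (euro sign) is inside the BMP, for contrast.
const bmp = encodingSizes("\u20AC");

console.log(astral); // utf8Bytes: 4, utf16Units: 2, utf16Bytes: 4
console.log(bmp);    // utf8Bytes: 3, utf16Units: 1, utf16Bytes: 2
```

The BMP character is included only as a contrast case: it fits in a single UTF-16 code unit, while the code point above U+FFFF takes two.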
What's the difference between length and visible characters?
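JavaScript's .length returns the number of UTF-16 code units. Array.from(str).length, which this tool uses, counts code points instead, so a character above U+FFFF counts once rather than twice. Neither is a true count of visible characters (grapheme clusters): a base letter followed by a combining accent, a flag, or a ZWJ emoji sequence spans several code points but displays as one character. Counting graphemes requires segmentation, for example with Intl.Segmenter.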
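The three ways of counting can be compared side by side. This is a sketch under stated assumptions rather than this tool's implementation: lengths is a hypothetical helper name, and the grapheme count assumes a runtime that supports Intl.Segmenter (recent browsers and Node.js 16 or later).

```javascript
// Hypothetical helper (not part of this tool): count a string three ways.
function lengths(str) {
  // Segment by grapheme cluster using the runtime's default locale.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: str.length,                                // UTF-16 code units
    codePoints: Array.from(str).length,                   // Unicode code points
    graphemes: Array.from(segmenter.segment(str)).length, // user-perceived characters
  };
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single "é".
const accented = lengths("e\u0301");
// One code point above U+FFFF: two code units, one code point.
const emoji = lengths("\u{1F600}");

console.log(accented); // { codeUnits: 2, codePoints: 2, graphemes: 1 }
console.log(emoji);    // { codeUnits: 2, codePoints: 1, graphemes: 1 }
```

The accented example is the case where counting code points still overcounts: only the grapheme count matches what a reader sees on screen.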