Unicode / UTF-8 Converter

Convert text to Unicode code points, UTF-8 bytes, UTF-16 encoding, and character names. Decode U+XXXX code points to characters. Supports emoji, CJK, Arabic, Cyrillic.

Understanding Unicode and Character Encoding

Unicode is a universal character set that assigns a unique code point to each character across the world's writing systems. Legacy encodings are tiny by comparison: ASCII defines 128 characters and Latin-1 defines 256. Unicode's code space has room for 1,114,112 code points (U+0000 through U+10FFFF), enough to cover emoji, CJK ideographs, Arabic, Cyrillic, and historical scripts. Only a fraction of that space, on the order of 150,000 characters, has been assigned so far.

Code points are written as U+ followed by four to six hexadecimal digits. For example, U+0041 is the Latin letter A, U+4E2D is 中 (a CJK ideograph), and U+1F600 is 😀 (grinning face). The Unicode standard also defines character names, properties, and normalization rules.
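As a rough illustration of the U+XXXX notation, the Python sketch below converts characters to code point labels and back. The helper names `to_code_point` and `from_code_point` are made up for this example and are not part of any library.

```python
def to_code_point(ch: str) -> str:
    """Format a single character as U+XXXX (at least four hex digits)."""
    return f"U+{ord(ch):04X}"


def from_code_point(label: str) -> str:
    """Parse a 'U+XXXX' label back into the character it names."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected U+XXXX, got {label!r}")
    return chr(int(label[2:], 16))


# Python strings iterate by code point, so the emoji counts as one item.
for ch in "A中😀":
    print(ch, to_code_point(ch))

print(from_code_point("U+4E2D"))
```

Padding to four digits with `:04X` matches the usual convention; code points above U+FFFF simply print with five or six digits.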

UTF-8: The Dominant Web Encoding

UTF-8 is the most common encoding on the web, in part because it is backward-compatible with ASCII: the first 128 code points (U+0000–U+007F) encode as single bytes identical to ASCII. Code points 128–2047 (U+0080–U+07FF) use 2 bytes, 2048–65535 (U+0800–U+FFFF) use 3 bytes, and code points above 65535 (U+10000–U+10FFFF) use 4 bytes.
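The byte-length ranges above can be sketched as a hand-written encoder. This is only an illustration: `utf8_encode` is a hypothetical helper, and in practice Python's built-in `str.encode("utf-8")` does the same job. The loop cross-checks the two.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point to UTF-8 by hand (illustrative helper)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:        # 0-127: one byte, identical to ASCII
        return bytes([cp])
    if cp < 0x800:       # 128-2047: two bytes
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 2048-65535: three bytes
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # Above 65535: four bytes
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


for ch in "Aé中😀":
    encoded = utf8_encode(ord(ch))
    # Cross-check the hand-rolled result against the built-in codec.
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ').upper()} (length {len(encoded)})")
```

Each continuation byte carries six payload bits behind a `10` prefix, which is why the range boundaries fall at 2^7, 2^11, and 2^16.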

This variable-length design means English text stays compact while the encoding still covers the full Unicode repertoire. UTF-8 is the default for HTML, JSON, and most modern APIs. A byte order mark (BOM, the bytes EF BB BF) is optional in UTF-8 and rarely needed, because UTF-8 is a byte sequence with no byte-order ambiguity.
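A small Python sketch of both points, using sample strings chosen for this example: ASCII-only text costs one byte per character, and the optional BOM is just three extra bytes at the front. Python's `utf-8-sig` codec is used here as one way to write and strip the BOM.

```python
import codecs

english = "Hello, world"
mixed = "Hello, 世界 😀"

# ASCII-only text: character count and UTF-8 byte count are equal.
print(len(english), len(english.encode("utf-8")))

# Non-ASCII text: each CJK ideograph here takes 3 bytes, the emoji takes 4.
print(len(mixed), len(mixed.encode("utf-8")))

# "utf-8-sig" prepends the BOM on encode and removes it (if present) on decode.
with_bom = english.encode("utf-8-sig")
print(with_bom[:3] == codecs.BOM_UTF8)
print(with_bom.decode("utf-8-sig") == english)
```

Decoding BOM-prefixed bytes with plain `utf-8` leaves a U+FEFF character at the start of the string, which is a common source of stray invisible characters.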

UTF-16 and UTF-32 Encoding

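UTF-16 uses 16-bit code units. Code points in the Basic Multilingual Plane (U+0000–U+FFFF) use one unit; code points above it (U+10000–U+10FFFF) use a surrogate pair, which is a high surrogate in the range D800–DBFF followed by a low surrogate in DC00–DFFF. JavaScript strings are sequences of UTF-16 code units, so a string's length counts code units rather than characters. UTF-32 uses exactly 4 bytes per code point, trading space for a fixed-width encoding, and is rarely used outside specialized contexts.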
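The surrogate-pair arithmetic can be sketched in a few lines of Python (3.9 or later for the `list[int]` annotation). The helper `utf16_units` is hypothetical and written for this example; the loop checks it against Python's built-in `utf-16-be` codec and confirms the fixed 4-byte width of UTF-32.

```python
def utf16_units(cp: int) -> list[int]:
    """Return the UTF-16 code units for one code point (illustrative helper)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x10000:
        return [cp]                      # BMP: a single code unit
    offset = cp - 0x10000                # 20-bit value split into two halves
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return [high, low]


for ch in "A中😀":
    units = utf16_units(ord(ch))
    # Cross-check against the built-in big-endian codec (which adds no BOM).
    packed = b"".join(u.to_bytes(2, "big") for u in units)
    assert packed == ch.encode("utf-16-be")
    # UTF-32 always spends 4 bytes per code point.
    assert len(ch.encode("utf-32-be")) == 4
    print(ch, " ".join(f"{u:04X}" for u in units))
```

The two-unit result for 😀 is the reason an emoji like this reports a length of 2 in languages whose strings are indexed by UTF-16 code unit.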

Character Names and Properties

The Unicode standard assigns a unique name to nearly every assigned character; control codes are the main exception and have no formal name. For example, U+0041 is 'LATIN CAPITAL LETTER A' and U+00A9 is 'COPYRIGHT SIGN'. These names help identify characters when the glyph isn't displayed or when debugging encoding issues. Unicode also defines properties such as script, general category (letter, digit, punctuation), and case mapping.
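For a quick look at names and properties, Python's standard `unicodedata` module exposes the data from whichever Unicode version ships with the interpreter. The sample characters below are chosen for illustration only.

```python
import unicodedata

for ch in "A©5,":
    # name() raises ValueError for unnamed characters unless a default is given.
    name = unicodedata.name(ch, "<no name>")
    # category() returns the two-letter general category, e.g. Lu = uppercase letter.
    category = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} {name} [{category}]")

# Names also work in reverse: look a character up by its name.
print(unicodedata.lookup("COPYRIGHT SIGN"))

# Case mapping can change length: the uppercase of ß is the two letters SS.
print("ß".upper())
```

The category codes group into broad classes by first letter: `L` for letters, `N` for numbers, `P` for punctuation, `S` for symbols.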
