UTF8 Converter

Convert text to UTF-8 byte formats or decode UTF-8 byte sequences back into readable text. Use it to inspect Unicode, emoji, API payloads, and encoding bugs.

Last updated: May 30, 2026

What is UTF8 Converter?

UTF-8 is the standard character encoding used by the web, JSON, APIs, source files, databases, and most modern software. It stores each ASCII character in one byte and uses two to four bytes for every other Unicode character, including accented letters, symbols, CJK text, and emoji.

This UTF8 converter gives you both directions in one tool. Encode text to see its hex, decimal, binary, percent-encoded, and escaped byte forms. Decode byte input to validate and recover UTF-8 text. It is designed for developers debugging payloads, teaching encoding concepts, and confirming exactly how text is represented at the byte level.

How to Use UTF8 Converter

Choose "Text to UTF-8 bytes" to encode normal text.

Paste text and review the hex, decimal, binary, percent-encoded, and escaped byte formats.

Switch to "UTF-8 bytes to text" when you need to decode byte input.

Select auto-detect or a specific input format for decoding.

Copy the output for code, docs, tests, or debugging notes.

Common Use Cases

Inspecting how emoji and accented characters are represented as UTF-8 bytes.
Decoding API or log payloads that contain hex, percent, or escaped UTF-8 bytes.
Validating that byte input is legal UTF-8 before passing it to an application.
Preparing byte-level examples for documentation, tests, or protocol specs.
Debugging mojibake and character encoding issues in web apps.
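
To make the mojibake case concrete, here is a minimal Python sketch of how it typically arises: UTF-8 bytes get decoded with the wrong codec. Latin-1 is only an assumed example of the wrong codec (in practice it is often Windows-1252 or something else), and the variable names are illustrative.

```python
# Mojibake sketch: UTF-8 bytes read back with the wrong codec.
# Assumption: the wrong codec is Latin-1; real cases may involve other codecs.
original = "caf\u00e9"                      # "café" with a precomposed é (U+00E9)

wire_bytes = original.encode("utf-8")       # é becomes the two bytes C3 A9
garbled = wire_bytes.decode("latin-1")      # each byte is read as its own character

# If no data was lost, reversing the wrong step recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(wire_bytes.hex(" ").upper())
print(garbled)
print(repaired)
```

Seeing the two bytes C3 A9 rendered as two separate characters is the usual sign that text was encoded as UTF-8 but decoded as a single-byte encoding.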

Example Input and Output

The text "Hello", a space, and the 😀 emoji (U+1F600) encode to ten UTF-8 bytes: six single-byte ASCII characters followed by one four-byte sequence.

Text input

Hello 😀

UTF-8 byte formats

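Hex: 48 65 6C 6C 6F 20 F0 9F 98 80
Decimal: 72 101 108 108 111 32 240 159 152 128
Binary: 01001000 01100101 01101100 01101100 01101111 00100000 11110000 10011111 10011000 10000000
Percent encoded: %48%65%6C%6C%6F%20%F0%9F%98%80
Escaped: \x48\x65\x6C\x6C\x6F\x20\xF0\x9F\x98\x80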
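
The example above can be reproduced outside the tool. The following Python sketch is not the tool's own implementation. The function name `utf8_formats` and the dictionary keys are hypothetical, chosen only to mirror the output labels.

```python
def utf8_formats(text: str) -> dict:
    """Return the UTF-8 bytes of `text` rendered in five textual formats."""
    data = text.encode("utf-8")
    return {
        "hex": " ".join(f"{b:02X}" for b in data),
        "decimal": " ".join(str(b) for b in data),
        "binary": " ".join(f"{b:08b}" for b in data),
        # Every byte is escaped here, including ASCII letters.
        "percent": "".join(f"%{b:02X}" for b in data),
        "escaped": "".join(f"\\x{b:02X}" for b in data),
    }


formats = utf8_formats("Hello \U0001F600")  # "Hello " followed by U+1F600
for name, value in formats.items():
    print(f"{name}: {value}")
```

Note that this percent form escapes every byte. A URL-oriented encoder such as `urllib.parse.quote` leaves unreserved ASCII characters unescaped, so its output would look different for the same text.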

Privacy

All UTF-8 conversion runs in your browser. Text and byte payloads are not sent to a server.

Code points and bytes

Use the code point and byte counts to spot emoji, characters built from combining marks, and other multi-byte text that can push a value past a storage or API limit measured in bytes.
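
As a rough illustration of why the two counts diverge, this Python sketch compares a precomposed character with the same visible character built from a base letter plus a combining mark. The helper name `counts` is hypothetical.

```python
import unicodedata


def counts(text: str) -> tuple:
    """Return (code points, UTF-8 bytes) for `text`."""
    return len(text), len(text.encode("utf-8"))


composed = "\u00e9"      # é as one code point (U+00E9)
decomposed = "e\u0301"   # e followed by a combining acute accent (U+0301)

print(counts(composed))    # (1, 2)
print(counts(decomposed))  # (2, 3)

# NFC normalization folds the pair back into the single precomposed code point.
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized == composed)  # True
```

Both strings render as the same glyph, yet they differ in code point count and byte length, which is the kind of discrepancy the counters help surface.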

Frequently Asked Questions

What is UTF-8?

UTF-8 is a variable-width Unicode encoding. ASCII characters use one byte, while other Unicode characters use two, three, or four bytes.
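
A short Python sketch shows the variable width in practice. The four sample characters are picked only because they fall in the one-, two-, three-, and four-byte ranges.

```python
# One sample character from each UTF-8 length class.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]  # A, é, 中, 😀

lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    lengths.append(len(encoded))
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s) -> {encoded.hex(' ').upper()}")
```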

Why are characters and bytes different?

A JavaScript string's length counts UTF-16 code units, not UTF-8 bytes, so the two numbers often differ. Every non-ASCII character, including emoji and non-Latin scripts, needs two to four bytes in UTF-8.
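
The three ways of measuring the same string can be compared side by side. This Python sketch approximates JavaScript's `length` by counting UTF-16 code units. The helper name `utf16_code_units` is hypothetical.

```python
def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's String.length reports."""
    return len(text.encode("utf-16-le")) // 2


text = "Hello \U0001F600"  # "Hello " followed by the 😀 emoji

code_units = utf16_code_units(text)     # the emoji takes a surrogate pair
code_points = len(text)                 # Python's len() counts code points
utf8_bytes = len(text.encode("utf-8"))  # the emoji takes four bytes

print(code_units, code_points, utf8_bytes)  # 8 7 10
```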

What byte input formats can I decode?

The decoder accepts hex bytes, decimal bytes, percent-encoded bytes, and \xNN escaped bytes. Auto-detect handles common formats.
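
For readers who want to script the same conversions, here is a minimal Python sketch of parsing the four input formats into raw bytes. It does not attempt auto-detection and is not the tool's parser. The function name `parse_bytes` and the format labels are assumptions made for illustration.

```python
import re
from urllib.parse import unquote_to_bytes


def parse_bytes(text: str, fmt: str) -> bytes:
    """Parse a textual byte listing into bytes. `fmt` must be given explicitly."""
    if fmt == "hex":
        values = [int(tok, 16) for tok in text.split()]
    elif fmt == "decimal":
        values = [int(tok, 10) for tok in text.split()]
    elif fmt == "percent":
        # Unescaped characters are kept as their own bytes.
        return unquote_to_bytes(text)
    elif fmt == "escaped":
        # Anything that is not a \xNN escape is ignored in this sketch.
        values = [int(h, 16) for h in re.findall(r"\\x([0-9A-Fa-f]{2})", text)]
    else:
        raise ValueError(f"unknown format: {fmt}")
    return bytes(values)  # raises ValueError if a value is outside 0..255


inputs = {
    "hex": "48 65 6C 6C 6F 20 F0 9F 98 80",
    "decimal": "72 101 108 108 111 32 240 159 152 128",
    "percent": "%48%65%6C%6C%6F%20%F0%9F%98%80",
    "escaped": r"\x48\x65\x6C\x6C\x6F\x20\xF0\x9F\x98\x80",
}
decoded = {fmt: parse_bytes(raw, fmt).decode("utf-8") for fmt, raw in inputs.items()}
print(decoded)
```

All four inputs describe the same ten bytes, so each one decodes to the same text.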

Does the decoder validate UTF-8?

Yes. Invalid byte sequences are reported as errors instead of silently replacing characters.
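
The difference between strict validation and silent replacement can be sketched in Python, whose built-in decoder supports both behaviors. This illustrates the concept only and says nothing about how the tool is implemented. The function name `decode_strict` is hypothetical.

```python
def decode_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any invalid sequence."""
    return data.decode("utf-8", errors="strict")


valid = bytes([0xF0, 0x9F, 0x98, 0x80])  # complete four-byte sequence
truncated = bytes([0xF0, 0x9F, 0x98])    # same sequence with the last byte missing

print(decode_strict(valid))

error_offset = None
try:
    decode_strict(truncated)
except UnicodeDecodeError as exc:
    error_offset = exc.start
    print(f"invalid UTF-8 at byte offset {exc.start}: {exc.reason}")

# A lossy decoder substitutes U+FFFD and hides the problem.
lossy = truncated.decode("utf-8", errors="replace")
print("\ufffd" in lossy)
```

Strict mode is usually the safer choice for debugging, because the error points at the offending byte offset instead of leaving a replacement character in the output.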

Is my text uploaded?

No. UTF-8 conversion runs locally in your browser.

Related Tools

UTF8 Decode

Decode UTF-8 byte sequences from hex, decimal, percent-encoded, or escaped input into readable text.

UTF-8 to ASCII Converter

Convert UTF-8 text to its raw byte values. Encodes every character as a UTF-8 byte sequence and outputs the decimal, hex, or octal value of each byte.

Hex to UTF8

Decode hexadecimal byte values as UTF-8 text. Supports compact hex, spaced bytes, and 0x-prefixed values.

String to Hex

Convert strings into hexadecimal UTF-8 byte values. Choose space, comma, or newline separators and copy the hex output.

Byte to String

Convert decimal byte values into readable strings. Decode UTF-8 bytes or use Latin-1 mode for direct byte mapping.

Calculate String Length

Count string characters, Unicode code points, UTF-8 bytes, words, lines, paragraphs, sentences, and whitespace.