Text to Unicode

Convert text to Unicode HTML entities and back

No uploads — conversion happens locally in your browser

Last updated: January 26, 2026
Frank Zhao - Creator

Introduction / overview

Text to Unicode converts ordinary text into Unicode decimal HTML entities (also called numeric character references), such as &#65; for the letter A. It also converts those entities back into readable text.

Think of it as a reversible “label maker” for characters: each character becomes a number n, wrapped as &#n;.

Who is this useful for?

  • Developers debugging strings inside HTML, templates, and logs.
  • Writers/ops teams sanitizing snippets for documentation or tickets.
  • Anyone who needs a fast “what numbers are in this string?” sanity check.

If you frequently transform text for sharing or debugging, you might also like our Text to ASCII binary and Case Converter.

How to use / quick start

1. Type or paste your text

Use the “Text to Unicode” box. The output updates instantly.

2. Copy the Unicode output

Click “Copy unicode to clipboard”. For the input Hi, you’ll get a string shaped like &#72;&#105;.

3. Convert back when needed

Paste a string containing entities into “Unicode to Text”. Every &#digits; sequence becomes a character.

4. Share or reset safely

Use Share/Reset at the bottom. If the input contains sensitive content, avoid sharing links that include results.

How to interpret the output

Each entity is a decimal number. In this tool, the number is a JavaScript UTF-16 code unit value (what charCodeAt returns), so a single visible emoji may become two entities.
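To make the "two entities for one emoji" point concrete, here is a small sketch in plain JavaScript. The helper name codeUnits is made up for this illustration and is not taken from the tool's source.

```javascript
// Hypothetical helper: list the UTF-16 code unit values of a string.
function codeUnits(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    units.push(text.charCodeAt(i));
  }
  return units;
}

console.log(codeUnits("A"));         // [ 65 ]
console.log(codeUnits("\u{1F600}")); // [ 55357, 56832 ]
```

The grinning-face emoji (U+1F600) is stored as a surrogate pair, which is why it yields two numbers and therefore two entities.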

Step-by-step examples

Example 1: "Hi" → Unicode entities

Input text: Hi. The tool converts each character into &#n;.

code(H) = 72, code(i) = 105
Hi ⇒ &#72;&#105;

In the tool, type Hi in the first box and copy the output.
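The same conversion can be sketched in a few lines of JavaScript. This is an approximation of the behavior described on this page, not the tool's actual code, and textToEntities is an illustrative name.

```javascript
// Sketch of the encoding step: one decimal entity per UTF-16 code unit.
function textToEntities(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += "&#" + text.charCodeAt(i) + ";";
  }
  return out;
}

console.log(textToEntities("Hi")); // &#72;&#105;
```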

Example 2: entities → text (mixed input)

Input Unicode string: Ticket: &#72;&#105;! (only the matching entities are replaced).

Ticket: &#72;&#105;! ⇒ Ticket: Hi!

Practical takeaway: you can paste an entire HTML snippet and the tool will replace every decimal entity it recognizes.
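A minimal sketch of this decoding behavior, assuming the strict pattern is "&#", one or more digits, then ";". The function name entitiesToText and the exact regular expression are assumptions, not the tool's published source.

```javascript
// Sketch of the decoding step: replace each decimal entity with its character
// and leave all other text untouched.
function entitiesToText(input) {
  return input.replace(/&#(\d+);/g, (match, digits) =>
    String.fromCharCode(Number(digits))
  );
}

console.log(entitiesToText("Ticket: &#72;&#105;!")); // Ticket: Hi!
```

One caveat if the tool really relies on String.fromCharCode: that function keeps only the low 16 bits of its argument, so a code point-style entity above 65535 would not decode to the character you might expect.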

Real-world examples / use cases

1) Preparing a safe HTML demo snippet

Background: you want to show a string in HTML without it being interpreted as markup.

Input: A&B

Result: &#65;&#38;&#66;

How to use it: paste the output into your HTML content area to display the original characters.

2) Debugging “invisible” characters

Background: a string fails to match because it contains a non-breaking space.

Input: text that visually looks like Hello World.

Result: a normal space appears as &#32;, while a non-breaking space appears as &#160;, so the difference is visible in the numbers.

How to use it: compare the entity numbers to confirm what character is actually present.
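If you prefer to script this check, the sketch below flags every code unit outside printable ASCII. The helper findUnusualUnits is hypothetical, and the 32 to 126 range is just one reasonable definition of "ordinary" characters.

```javascript
// Illustrative helper (not part of the tool): report characters that are
// not plain printable ASCII, with their index and decimal code unit.
function findUnusualUnits(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 32 || code > 126) {
      hits.push({ index: i, code });
    }
  }
  return hits;
}

const plain = "Hello World";       // regular space, code 32
const sneaky = "Hello\u00A0World"; // non-breaking space, code 160

console.log(findUnusualUnits(plain));  // []
console.log(findUnusualUnits(sneaky)); // [ { index: 5, code: 160 } ]
```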

3) Sharing a snippet in a support ticket

Background: you need to paste a string that keeps breaking formatting in a ticket.

Input: your problematic line

Result: a compact entity string you can paste anywhere

How to use it: include the entity output plus a short note that it is decimal HTML entities.

4) Quick readability checks for encoding pipelines

Background: you are moving text through systems with uncertain encoding behavior.

Input: a short sample phrase

Result: stable numeric representation you can compare before/after

How to use it: run “text → unicode” on the source and destination to spot changes.
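The before/after comparison can also be automated. The sketch below, with the made-up helper name firstDifference, reports the first position where two strings differ at the code unit level. The sample strings are chosen for illustration: they look identical on screen but use a precomposed é in one case and e plus a combining accent in the other.

```javascript
// Hypothetical helper: find the first position where two strings differ
// at the UTF-16 code unit level. Returns null when they are identical.
function firstDifference(before, after) {
  const length = Math.max(before.length, after.length);
  for (let i = 0; i < length; i++) {
    // Use null to mark "past the end of this string".
    const a = i < before.length ? before.charCodeAt(i) : null;
    const b = i < after.length ? after.charCodeAt(i) : null;
    if (a !== b) {
      return { index: i, before: a, after: b };
    }
  }
  return null;
}

const source = "caf\u00E9";       // precomposed é (code 233)
const destination = "cafe\u0301"; // e (101) followed by a combining acute accent (769)

console.log(firstDifference(source, source));      // null
console.log(firstDifference(source, destination)); // { index: 3, before: 233, after: 101 }
```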

For other ways to represent strings, try Text to ASCII binary for byte-ish representations, or Base64 String Encoder/Decoder for compact transport.

Common scenarios / when to use

Template debugging

Convert text to entities to see exactly what characters your template is outputting.

Strange whitespace

Spot non-breaking spaces or odd punctuation by comparing their numeric values.

Documentation snippets

Share strings safely in docs and tickets without worrying about rendering rules.

Quick obfuscation

Lightly obscure text for casual sharing (not security). Use encryption for real secrecy.

Pipeline verification

Compare the entity output before/after copying through systems to catch silent changes.

HTML entity decoding

Turn decimal entities back into readable text for quick inspection.

When it might not be the right tool

  • If you need real secrecy, use Encrypt / Decrypt Text instead.
  • If you need Unicode code points like U+1F600, this tool won’t output that format.

Tips & best practices

Practical tips

  • If you’re dealing with HTML, remember this tool uses decimal entities (&#...;), not hexadecimal ones (&#x...;).
  • For long strings, convert a short “probe” substring first to confirm the format you need.
  • If you’re investigating emoji, expect two numbers for most of them (any character above U+FFFF), because JavaScript strings are sequences of UTF-16 code units.

Pro tip: if the decoded output looks odd, check whether your input actually contains semicolons: &#65; works, but &#65 will not be recognized by this tool.
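A quick way to see which inputs qualify is to test them against a strict decimal pattern. The regular expression below is an assumption about what "strict" means here ("&#", digits, ";"), and countDecimalEntities is a hypothetical helper.

```javascript
// Assumed pattern: "&#", one or more decimal digits, then ";".
const DECIMAL_ENTITY = /&#(\d+);/g;

// Count how many substrings of the input match the assumed pattern.
function countDecimalEntities(input) {
  const matches = input.match(DECIMAL_ENTITY);
  return matches ? matches.length : 0;
}

console.log(countDecimalEntities("&#65;"));  // 1
console.log(countDecimalEntities("&#65"));   // 0 (missing semicolon)
console.log(countDecimalEntities("&#x41;")); // 0 (hexadecimal form)
```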

Calculation method / formula explanation

The implementation intentionally stays simple. It mirrors the behavior of a common reference tool: split the input into JavaScript “characters” (UTF-16 code units), then wrap each code unit value as a decimal HTML entity.

Entity(c) = "&#" + charCodeAt(c) + ";"

(decimal numeric character reference per UTF-16 code unit)

Reverse conversion (decoding)

The tool searches for a strict decimal entity pattern and replaces it with the corresponding character.

&#n; ⇒ fromCharCode(n)

Variables: c is one UTF-16 code unit, and n is a decimal integer.
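Taken together, the two formulas should round-trip: decoding an encoded string gives back the original. The sketch below checks that property on a few samples. The names encode and decode, and the regular expression, are illustrative assumptions and not the tool's source.

```javascript
// Encoding formula: one decimal entity per UTF-16 code unit.
// split("") breaks a string into individual code units.
const encode = (text) =>
  text.split("").map((unit) => `&#${unit.charCodeAt(0)};`).join("");

// Decoding formula: each "&#n;" becomes String.fromCharCode(n).
const decode = (input) =>
  input.replace(/&#(\d+);/g, (match, n) => String.fromCharCode(Number(n)));

const samples = ["Hi", "A&B", "caf\u00E9", "\u{1F600}", "&#65;"];
for (const sample of samples) {
  console.log(decode(encode(sample)) === sample); // true for every sample
}
```

The emoji survives because its two surrogate code units are decoded one after the other and end up adjacent again. The literal text &#65; also survives, since its ampersand is itself encoded as &#38; before decoding.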

Related concepts / background

Code points vs code units

Unicode assigns a code point to each character (like U+0041 for “A”). JavaScript strings are stored as UTF-16, which means some characters (notably many emoji) use two 16-bit code units. This tool outputs decimal entities per code unit, not per code point.
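The difference is easy to observe with standard JavaScript string methods. In the sketch below, toUPlus is a hypothetical formatter added only to show the code point view; this tool does not produce that format.

```javascript
const smiley = "\u{1F600}"; // one code point, U+1F600

console.log(smiley.length);             // 2 (UTF-16 code units)
console.log(Array.from(smiley).length); // 1 (code points)
console.log(smiley.charCodeAt(0));      // 55357 (high surrogate)
console.log(smiley.charCodeAt(1));      // 56832 (low surrogate)
console.log(smiley.codePointAt(0));     // 128512 (0x1F600)

// Hypothetical formatter for U+XXXX notation, iterating by code point.
function toUPlus(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

console.log(toUPlus("A\u{1F600}")); // U+0041 U+1F600
```

A code point-based converter would emit a single entity, &#128512;, for this emoji, whereas a code unit-based one emits &#55357;&#56832;.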

HTML numeric character references

HTML supports numeric references like &#169; for ©. They can be decimal or hex. This tool focuses on the decimal form because it’s easy to compare and copy.

If your goal is a byte-level representation for debugging encoding boundaries, use Text to ASCII binary (for ASCII-range strings) or Base64 for transport.

Frequently asked questions (FAQs)

Why does an emoji turn into two entities?

Because this tool encodes UTF-16 code units. Some characters are represented as a surrogate pair, which means two 16-bit units, so you’ll see two decimal entities.

Does this tool output “U+XXXX” code points?

No. The output format is decimal HTML entities like &#65;.

Will it decode hexadecimal entities like &#x41;?

Not in this tool. It only recognizes the decimal pattern &#digits;.

What if my entities are missing semicolons?

Then they won’t be decoded. For example, &#65; works, but &#65 does not.

Is this safe for secrets or passwords?

It’s reversible and not encryption. For secrets, use Encrypt / Decrypt Text with a key you control.

Can I decode entities inside a larger text?

Yes. The decoder replaces all matching entities in the input. Non-matching text stays the same.

Limitations / disclaimers

Limitations to keep in mind

  • Output is based on UTF-16 code units, so it may differ from code point-based tools.
  • The decoder only handles decimal entities of the form &#digits;.
  • This tool is for convenience and debugging and does not replace security, legal, or compliance guidance.

External references / sources

Note: this page uses decimal entities for readability and compatibility with the tool’s output format.

If you’re looking for code-focused transformations, you may also like our Hash Text tool for fingerprints and comparisons.
