epitometool

HTML entities encoder / decoder

Encode and decode HTML entities like &amp; and &#39;.

Quick start

How to encode and decode HTML entities

Paste text on one side, get the safely-escaped (or unescaped) version on the other.

  1. Step 1
    Pick a strategy

    Minimal escapes only the five XSS-risk characters. Named uses pretty entities (&copy;, &euro;). Numeric and Hex escape every non-ASCII character as &#NN; or &#xNN;.

  2. Step 2
    Paste your text

    Drop HTML, markup, code or any string into the left pane. Decode mode accepts every entity your browser supports.

  3. Step 3
    Copy the result

    Hit Copy or press ⌘/Ctrl+Enter. Use Swap ⇄ to round-trip back without re-pasting.

In-depth guide

Encode and decode HTML entities safely

HTML entity encoding replaces characters that have special meaning in HTML (or that aren't part of ASCII) with named references like &copy; or numeric references like &#169;. Done right, it's the simplest defence against cross-site scripting. Done wrong, it's the source of the oddest display bugs on the web.

The five-character minimum

For safely inserting user-provided text into HTML, you only need to escape five characters:

  • & → &amp; (must be replaced first, otherwise the & in entities produced by later substitutions gets escaped again)
  • < → &lt;
  • > → &gt;
  • " → &quot; (only matters inside double-quoted attributes)
  • ' → &#39; (only matters inside single-quoted attributes; we emit the numeric form for HTML4 safety)

That's the "Minimal" strategy in the encoder above. It's idiomatic for any modern templating pipeline and matches what frameworks like React, Vue and Astro produce automatically.
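As a rough sketch of the five substitutions above (the name escapeHtmlMinimal is hypothetical and this is not the tool's actual code), chained replacements make the ordering rule visible: & is handled first so that the & inside &lt; or &quot; is not escaped a second time.

```javascript
// Illustrative sketch only; escapeHtmlMinimal is a hypothetical name.
function escapeHtmlMinimal(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first, before any entity is produced
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;"); // numeric form instead of &apos;
}

console.log(escapeHtmlMinimal(`<a href="x">Tom & Jerry's</a>`));
// prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Moving the & replacement to the end would turn the output of the earlier steps (&lt;, &quot;) into &amp;lt; and &amp;quot;, which is the double-escaping the list warns about.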

Named, numeric or hex?

Beyond the five-character minimum, three styles are interchangeable:

  • Named: &copy;, &euro;, &mdash;. Most readable, but the named list is finite (about 250 entries in HTML4, over 2,000 in HTML5); characters without a name fall back to numeric.
  • Numeric (decimal): &#169;. Works for the entire Unicode range. Slightly less readable but universally supported.
  • Hex: &#xA9;. Same coverage as decimal; preferred when you want code points to match Unicode tables verbatim.

Pick named for human-readable HTML, numeric for legacy email tooling, hex for documentation that lists Unicode code points alongside characters.
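The decimal and hex styles can be sketched in a few lines. The function name encodeNonAscii and its style parameter are assumptions made for illustration; the tool's own implementation may differ. The loop iterates by code point, so characters outside the Basic Multilingual Plane become a single reference.

```javascript
// Illustrative sketch; encodeNonAscii and its "style" parameter are hypothetical.
function encodeNonAscii(text, style = "decimal") {
  let out = "";
  // for...of walks the string by code point, keeping surrogate pairs together.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 128) {
      out += ch; // ASCII passes through unchanged
    } else if (style === "hex") {
      out += `&#x${cp.toString(16).toUpperCase()};`;
    } else {
      out += `&#${cp};`;
    }
  }
  return out;
}

console.log(encodeNonAscii("© 2024"));     // prints: &#169; 2024
console.log(encodeNonAscii("€5", "hex"));  // prints: &#x20AC;5
console.log(encodeNonAscii("😀", "hex"));  // prints: &#x1F600;
```

This sketch leaves the five markup characters alone; in practice it would be combined with the minimal escapes.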

Decoding without an entity table

Most JavaScript entity-decode libraries ship a lookup table of every named entity, roughly 100 KB in size. We use a short browser trick instead:

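function decodeEntities(input) {
  // A detached <textarea> decodes entities but never turns markup into elements.
  const ta = document.createElement("textarea");
  ta.innerHTML = input;
  return ta.value;
}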

This delegates to the browser's actual HTML parser, so every entity it knows (named, numeric, hex) works automatically. Setting innerHTML on a textarea rather than a generic element avoids any script execution: the content of a <textarea> is parsed as text, so tags such as <script> or <img onerror=…> never become elements.
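The textarea trick needs a DOM. For environments without one (Node scripts, for instance), numeric references can still be decoded without a table. The following is a minimal sketch under that assumption; decodeNumericEntities is a hypothetical helper, and it deliberately leaves named entities such as &copy; untouched.

```javascript
// Illustrative sketch; decodeNumericEntities is a hypothetical helper.
// It handles only &#NN; and &#xNN; references, not named entities.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Simplification: out-of-range values are left as-is instead of throwing.
    // (A real HTML parser would substitute U+FFFD here.)
    if (cp > 0x10ffff) return match;
    return String.fromCodePoint(cp);
  });
}

console.log(decodeNumericEntities("&#169; &#xA9; &#x1F600; &copy;"));
// prints: © © 😀 &copy;
```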

HTML escaping is contextual

Escaping HTML is not enough inside <script>, <style>, event handlers, or javascript: URLs. Each context has its own escaping rules. If you control the markup, use a templating engine that handles all of them.

For most application code, your framework already escapes user content correctly when you interpolate ({value} in React/Astro, {{ value }} in Vue, ${value} in template literals via tagged-template helpers). Manual escaping is reserved for the small number of places where you assemble raw HTML strings, for example server-rendered emails, RSS feeds, or static-site generators emitting one-off HTML snippets.
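To make the tagged-template idea concrete, here is a minimal sketch of such a helper. The names html and escapeHtml are assumptions for illustration, not a specific library's API: the literal parts of the template are trusted as written, and every interpolated value is escaped.

```javascript
// Illustrative sketch; html and escapeHtml are hypothetical names.
function escapeHtml(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Tag function: literal chunks pass through, interpolated values are escaped.
function html(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += escapeHtml(value) + strings[i + 1];
  });
  return out;
}

const comment = `<img src=x onerror="alert(1)">`;
console.log(html`<p class="comment">${comment}</p>`);
// prints: <p class="comment">&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</p>
```

A plain template literal without the tag performs no escaping at all, which is why the helper matters.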

Frequently asked questions

Which encoding strategy should I pick?

For escaping user content into HTML, 'Minimal' is almost always the right answer: it escapes only `& < > " '`, which is enough to neutralise XSS in element content and quoted attribute values while keeping the output readable. Use 'Named' if you want pretty entities like &copy; and &euro;. Use 'Numeric' / 'Hex' when you need to escape every non-ASCII character (e.g. for ancient email clients).

Are these escapes safe to insert into HTML attributes?

Yes, as long as the attribute value is quoted. The Minimal set escapes the two quote characters that can break out of double- or single-quoted attribute values (`"`, `'`) plus `< > &`. Make sure the attribute itself is properly quoted (`<a href="…">`, not `<a href=…>`); in an unquoted value, a space is enough to start a new attribute.
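A small sketch of why the quotes matter, using hypothetical helpers escapeAttr and spanWithTitle: inside a quoted attribute the escaped value cannot close the quote, whereas a value made only of letters, spaces and = passes through the escaper unchanged and would inject a new attribute if the quotes were omitted.

```javascript
// Illustrative sketch; escapeAttr and spanWithTitle are hypothetical names.
function escapeAttr(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// The attribute value is always wrapped in double quotes.
function spanWithTitle(title, label) {
  return `<span title="${escapeAttr(title)}">${escapeAttr(label)}</span>`;
}

console.log(spanWithTitle(`" onmouseover="alert(1)`, "hi"));
// prints: <span title="&quot; onmouseover=&quot;alert(1)">hi</span>

// Nothing here needs escaping, so only the surrounding quotes keep it contained.
console.log(escapeAttr("x onmouseover=alert(1)"));
// prints: x onmouseover=alert(1)
```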

Can I decode &#xNN; hex entities and &#NN; decimal entities?

Yes — and named entities too (&nbsp;, &copy;, &Aacute;, &hellip;, every one the browser knows). Decoding uses the browser's HTML parser via the `textarea.innerHTML` trick, so coverage matches whatever your browser supports.

Why is &apos; sometimes wrong?

`&apos;` is HTML5 but was not defined in HTML4. For maximum compatibility (especially with old email clients and PDF tools), we encode the apostrophe as `&#39;` — the numeric form, which is valid everywhere.

Does this tool support emoji and CJK characters?

Yes. Both encoding and decoding are Unicode-aware (we iterate by code point so surrogate pairs stay intact). Emoji, CJK, accents and combining marks all round-trip correctly.

Is escaping HTML enough to prevent XSS?

It's the foundation, but context matters. HTML-escaping is correct for element bodies and attribute values. It is NOT sufficient inside `<script>`, `<style>`, event handlers or `javascript:` URLs — those need their own escaping rules. For most app code, template engines and frameworks handle this for you.
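As one example of a context with different rules: the content of a `<script>` element is not entity-decoded, so HTML-escaping data placed there would corrupt it rather than protect it. A commonly used approach for embedding JSON is to replace `<` with its JavaScript escape instead. The sketch below assumes that approach; jsonForScript is a hypothetical helper, not part of this tool.

```javascript
// Illustrative sketch; jsonForScript is a hypothetical helper.
function jsonForScript(data) {
  // JSON.stringify leaves "<" untouched, so a "</script>" inside a string
  // would close the tag early. "\u003c" is the same character to a JSON parser.
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

const payload = { note: "</script><script>alert(1)" };
const safe = jsonForScript(payload);

console.log(safe.includes("<"));                     // prints: false
console.log(JSON.parse(safe).note === payload.note); // prints: true
```

The escaped string contains no `<`, yet parses back to the original data.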
