csvjson

HTML Entity Encode / Decode

Encode special HTML characters to entities (& < >) or decode entities back to readable text. Handles named, decimal (&#60;), and hex (&#x3C;) entities.

When you need this

Preventing XSS in web output

Any user-supplied content rendered in HTML must have special characters encoded. < becomes &lt;, > becomes &gt;, & becomes &amp;. Skipping this step allows script injection attacks.
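As a minimal sketch of this step, the snippet below escapes untrusted text before placing it inside markup. It uses `html.escape` from Python's standard library; the `render_comment` helper and the `<p>` wrapper are hypothetical names chosen for illustration, not part of this tool.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap untrusted text in a paragraph, escaping special characters first."""
    # html.escape replaces & < > and, by default (quote=True), " and ' as well.
    return f"<p>{html.escape(user_text)}</p>"


print(render_comment('<script>alert("hi")</script>'))
# <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```

Escaping at the point of output, rather than when the data is stored, keeps the stored text in its original form and avoids double encoding later.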

Displaying code samples on a page

When you want to show HTML source code on a web page, the < and > characters must be encoded or the browser parses them as real tags. Encoding lets the browser display the literal characters.

Debugging email templates

HTML emails often arrive with doubly encoded entities (for example, &amp;amp; where &amp; was intended) or garbled special characters. Decoding shows the original intended text and helps identify where the encoding was applied incorrectly.
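One way to check how many times a string was encoded is to decode it repeatedly until it stops changing. The sketch below assumes Python's `html.unescape`; the `decode_fully` helper and its `max_rounds` limit are hypothetical and only illustrate the idea.

```python
import html


def decode_fully(text: str, max_rounds: int = 5) -> tuple[str, int]:
    """Decode entities until the text is stable; return (text, rounds used)."""
    rounds = 0
    while rounds < max_rounds:
        decoded = html.unescape(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
        rounds += 1
    return text, rounds


print(decode_fully("Tom &amp;amp; Jerry"))
# ('Tom & Jerry', 2)
```

A round count of 2 or more suggests the text was encoded more than once somewhere in the pipeline. Use this for diagnosis only: text that legitimately discusses entities (such as a literal "&amp;lt;") would be over-decoded.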

CMS and rich text editors

WordPress, Drupal, and similar CMS platforms store content with HTML entities. When extracting content via API, you may need to decode entities to get the plain text representation.
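A rough sketch of turning CMS markup into plain text, assuming Python's standard `html.parser.HTMLParser`. The `TextExtractor` class, the `to_plain_text` helper, and the sample fragment are hypothetical; a real API response will have its own shape.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, dropping tags and decoding entities."""

    def __init__(self) -> None:
        # convert_charrefs=True makes the parser decode entities in text data.
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def to_plain_text(fragment: str) -> str:
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


print(to_plain_text("<p>Caf&eacute; &amp; Bar &#8211; open</p>"))
# Café & Bar – open
```

If the content contains no tags and only entities, `html.unescape` alone is enough.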

Example

Encoding HTML for display in a code block

Input (HTML)
<script>alert("XSS & injection")</script>
<img src="x" onerror="steal(document.cookie)">
Output (encoded — safe to render)
&lt;script&gt;alert(&quot;XSS &amp; injection&quot;)&lt;/script&gt;
&lt;img src=&quot;x&quot; onerror=&quot;steal(document.cookie)&quot;&gt;

Encoded output renders as literal text in the browser — the script tags and event handlers are displayed, not executed.
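The example above can be reproduced with Python's `html.escape`, shown here as an illustration rather than as this tool's implementation. The `<pre><code>` wrapper is an assumption about how the encoded text would be placed on a page.

```python
import html

source = (
    '<script>alert("XSS & injection")</script>\n'
    '<img src="x" onerror="steal(document.cookie)">'
)

encoded = html.escape(source)
code_block = f"<pre><code>{encoded}</code></pre>"
print(code_block)

# Decoding reverses the step and returns the original markup.
assert html.unescape(encoded) == source
```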

Frequently asked questions

What are HTML entities?

HTML entities are sequences used to represent characters that have special meaning in HTML or that aren't easily typed. They start with & and end with ;. Named entities use a name: &amp; represents & and &lt; represents <. Numeric entities use the character's code point in decimal or hex: &#60; and &#x3C; both represent <.
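A quick check that the three forms are equivalent, using Python's `html.unescape` as a stand-in decoder:

```python
import html

forms = ["&lt;", "&#60;", "&#x3C;"]  # named, decimal, hex
decoded = [html.unescape(entity) for entity in forms]

for entity, char in zip(forms, decoded):
    print(entity, "->", char)

# All three decode to the same character, and 60 == 0x3C == ord("<").
assert set(decoded) == {"<"}
assert ord("<") == 60 == 0x3C
```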

Which characters must be encoded in HTML?

The minimum required: & → &amp;, < → &lt;, > → &gt;. In attribute values: " → &quot; (for double-quoted attributes) or ' → &#39; (for single-quoted). &apos; also works for ' in HTML5, but it was not defined in HTML 4, so the numeric form is the safer choice. Non-ASCII characters don't need encoding in UTF-8 HTML documents but can optionally be encoded as numeric entities.
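The difference between text content and attribute values can be sketched with the `quote` parameter of Python's `html.escape`. The `attr` helper is hypothetical and exists only for this illustration.

```python
import html


def attr(name: str, value: str) -> str:
    """Build a double-quoted attribute with its value escaped."""
    # quote=True (the default) also escapes " and ', which matters in attributes.
    return f'{name}="{html.escape(value, quote=True)}"'


print(attr("title", 'Say "hi" & <wave>'))
# title="Say &quot;hi&quot; &amp; &lt;wave&gt;"

print(html.escape("it's"))                # it&#x27;s  (numeric form for ')
print(html.escape('"ok"', quote=False))   # "ok"  (quotes left alone in text content)
```

Note that Python emits the hex numeric form `&#x27;` for the single quote rather than a named entity.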

What's the difference between named and numeric entities?

Named entities (&amp;, &copy;, &euro;) are human-readable, but the browser must look them up in a table, so only standardized names work. Numeric entities (&#38; for decimal, &#x26; for hex) refer to a character by its Unicode code point, so they work for any character and need no name table, though they are less readable.

Should I encode all non-ASCII characters?

No, not in modern HTML. If your document declares UTF-8 (which Next.js and most frameworks do by default), you can use non-ASCII characters directly: é, ü, →, ©. Encoding them as &eacute; or &#233; is optional and adds visual noise. Encode when you're generating HTML for a legacy system that doesn't handle UTF-8.
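For the legacy case, one possible approach in Python is the built-in `xmlcharrefreplace` error handler, which replaces every character the target encoding cannot represent with a decimal numeric entity. The sample string is made up for illustration.

```python
import html

text = "caf\u00e9 \u2192 \u00a9"  # "café → ©"

# Encode to ASCII, turning each non-ASCII character into &#NNN;
legacy_safe = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(legacy_safe)
# caf&#233; &#8594; &#169;

# The result decodes back to the original text.
assert html.unescape(legacy_safe) == text
```

This handles only the non-ASCII characters; & < > still need the usual escaping beforehand when the text is untrusted.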

Does this tool encode spaces as &nbsp;?

Regular spaces do not need encoding, and the example above leaves them unchanged. &nbsp; is the entity for the non-breaking space (U+00A0), which is a different character from a regular space (U+0020). Use &nbsp; only when you specifically need to prevent line breaks or preserve multiple spaces (which HTML normally collapses to one). If you see &nbsp; appearing unexpectedly, the input probably contains a non-breaking space; replace it with a raw space character.
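The two kinds of space are different characters, which can be checked with Python's standard library. This shows Python's behavior only and may differ from this tool's; the `normalize_spaces` helper is hypothetical.

```python
import html

NBSP = "\u00a0"  # non-breaking space, what &nbsp; decodes to

assert html.unescape("&nbsp;") == NBSP
assert NBSP != " "                       # not the same as a regular space (U+0020)
assert html.escape("a b") == "a b"       # Python's html.escape leaves spaces alone


def normalize_spaces(text: str) -> str:
    """Replace non-breaking spaces with regular spaces."""
    return text.replace(NBSP, " ")


print(normalize_spaces(html.unescape("10&nbsp;km")))
# 10 km
```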