HTML Entity Encoder/Decoder Guide: Escape Characters for Safe Web Display
If you have ever tried to display the less-than sign < in an HTML page and found that the browser interpreted it as the start of a tag instead of showing it as text, you have encountered the problem that HTML entities solve. HTML entities are a standardized way to represent characters that would otherwise be misinterpreted by the browser or that cannot be typed directly.
Understanding when and how to use them is essential for anyone who builds web pages, handles user-generated content, or processes HTML in code.
What Are HTML Entities?
HTML entities are special text sequences that represent characters in HTML source code. They always start with an ampersand (&) and end with a semicolon (;). Between them is either a named reference (like amp for ampersand) or a numeric code.
For example, to display the literal text <strong> on a web page without the browser treating it as a tag, you write:
&lt;strong&gt;
The browser reads the entities, converts them back to the characters they represent, and displays them as text rather than interpreting them as HTML markup.
Common HTML Entity Reference Table
| Character | Named Entity | Numeric Entity | Description |
|---|---|---|---|
| & | &amp; | &#38; | Ampersand |
| < | &lt; | &#60; | Less-than sign |
| > | &gt; | &#62; | Greater-than sign |
| " | &quot; | &#34; | Double quote |
| ' | &apos; | &#39; | Single quote / apostrophe |
| (non-breaking space) | &nbsp; | &#160; | Non-breaking space |
| © | &copy; | &#169; | Copyright symbol |
| ® | &reg; | &#174; | Registered trademark |
| ™ | &trade; | &#8482; | Trademark symbol |
| — | &mdash; | &#8212; | Em dash |
| – | &ndash; | &#8211; | En dash |
| € | &euro; | &#8364; | Euro sign |
| £ | &pound; | &#163; | Pound sign |
Named Entities vs. Numeric Entities
Named entities like &amp; and &copy; are easier to read and remember. However, not all characters have named equivalents; only a subset defined by the HTML specification has names.
Numeric entities can represent any Unicode character by its code point. They come in two forms:
- Decimal: &#169; (the # followed by the decimal Unicode code point)
- Hexadecimal: &#xA9; (the #x followed by the hex code point)
Both forms work in all browsers. Named entities are preferred where they exist because they are self-documenting. Use numeric entities for characters that have no named form.
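As a rough illustration, both numeric forms can be derived from a character's code point. The sketch below uses Python; the helper names `decimal_entity`, `hex_entity`, and `named_entity` are made up for this example, while `html.entities.codepoint2name` is the standard-library table of HTML 4 entity names (it does not cover every name in the current HTML specification).

```python
from html.entities import codepoint2name
from typing import Optional


def decimal_entity(ch: str) -> str:
    """Return the decimal numeric reference for a single character."""
    return f"&#{ord(ch)};"


def hex_entity(ch: str) -> str:
    """Return the hexadecimal numeric reference for a single character."""
    return f"&#x{ord(ch):X};"


def named_entity(ch: str) -> Optional[str]:
    """Return the named reference if the HTML 4 table has one, else None."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else None


print(decimal_entity("©"))  # &#169;
print(hex_entity("©"))      # &#xA9;
print(named_entity("©"))    # &copy;
```

All three outputs decode to the same copyright character in a browser.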
When Encoding Is Necessary
Displaying Code and Markup as Text
If you are writing a tutorial, a code documentation page, or any content that shows HTML, XML, or other markup as visible text, you must encode the angle brackets and ampersands. Without encoding, the browser interprets the markup and either renders it or silently discards it.
Handling User-Generated Content
This is the most security-critical use case. When a user submits text through a form — a comment, a username, a product review — and you render that text back onto an HTML page, you must encode it. If you do not, a user who submits <script>alert('XSS')</script> as their name could inject executable JavaScript into your page.
HTML entity encoding converts < to &lt; and > to &gt;, so the browser displays the literal text rather than executing a script tag. This is called output encoding and is a core defense against Cross-Site Scripting (XSS) attacks.
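A minimal sketch of output encoding in Python, assuming the page is assembled from strings on the server. The function name `render_greeting` is hypothetical; `html.escape` is the standard-library call, and it writes the apostrophe as the hexadecimal reference &#x27;.

```python
import html


def render_greeting(username: str) -> str:
    """Build an HTML fragment, escaping the untrusted value on output."""
    # html.escape replaces &, <, > and (by default) both quote characters.
    return f"<p>Hello, {html.escape(username)}!</p>"


malicious = "<script>alert('XSS')</script>"
print(render_greeting(malicious))
# <p>Hello, &lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;!</p>
```

The submitted name now shows up on the page as visible text and never reaches the browser as a script element.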
Ensuring Correct Rendering Across Platforms
Some characters, particularly curly quotation marks, dashes, and typographic symbols, can turn into garbled text when a document is saved or served with the wrong character encoding. Because entities are written using only ASCII characters, they render correctly regardless of the page's character encoding declaration.
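One way to produce such encoding-independent output in Python is the built-in `xmlcharrefreplace` error handler, which swaps every character that ASCII cannot represent for a decimal numeric reference. The helper name `to_ascii_html` is an assumption for this sketch. Note that it leaves <, >, and & untouched, so it complements escaping and does not replace it.

```python
def to_ascii_html(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric reference."""
    # The "xmlcharrefreplace" handler emits &#NNNN; for each character
    # that the target encoding (ASCII here) cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("Price: €5 – “special” offer"))
# Price: &#8364;5 &#8211; &#8220;special&#8221; offer
```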
When Decoding Is Useful
You need to decode HTML entities when:
- Processing scraped HTML: content scraped from web pages often contains entities that you want as plain text for storage or further processing
- Reading encoded API responses: some APIs return HTML-encoded strings (especially older RSS feeds) that need decoding before display
- Text comparison: comparing &amp; and & will not match; you need to decode first
- Generating plain text from HTML: when stripping HTML tags to produce plain text, entities should also be decoded
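The decoding cases above can be sketched with Python's standard `html.unescape`, which handles named, decimal, and hexadecimal references. The sample strings are invented for illustration.

```python
import html

scraped = "Fish &amp; Chips &#8211; &pound;5 &lt;today only&gt;"
print(html.unescape(scraped))
# Fish & Chips – £5 <today only>

# Comparing an encoded string with its plain form fails until it is decoded.
print("Tom &amp; Jerry" == "Tom & Jerry")                 # False
print(html.unescape("Tom &amp; Jerry") == "Tom & Jerry")  # True
```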
A Worked Example: From Raw Text to Encoded HTML
Suppose a user submits this comment:
Great article! Use <strong> for bold & <em> for italics. It's "easy" to learn.
After HTML entity encoding, it becomes safe to insert into your page:
Great article! Use &lt;strong&gt; for bold &amp; &lt;em&gt; for italics. It&#39;s &quot;easy&quot; to learn.
When the browser renders the encoded version, the user sees the original text exactly as they typed it — angle brackets, ampersands, quotes and all — displayed as text content, not interpreted as HTML.
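The same round trip can be reproduced in a few lines of Python as a sanity check. This is only a sketch using the standard `html` module. Python writes the apostrophe as the hexadecimal reference &#x27;, which is equivalent to the decimal form &#39;.

```python
import html

comment = 'Great article! Use <strong> for bold & <em> for italics. It\'s "easy" to learn.'

encoded = html.escape(comment)
print(encoded)
# Great article! Use &lt;strong&gt; for bold &amp; &lt;em&gt; for italics. It&#x27;s &quot;easy&quot; to learn.

# Decoding restores exactly what the user typed.
decoded = html.unescape(encoded)
print(decoded == comment)  # True
```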
Encode or Decode HTML Entities Instantly
Paste your text and encode or decode HTML entities with one click — free, no login required.
How to Use the HTML Entity Encoder/Decoder Tool
- Open the HTML Entity Encoder/Decoder tool.
- Paste your text into the input area.
- Choose Encode to convert special characters to entities, or Decode to convert entities back to characters.
- The result appears in the output area. Copy it for use in your HTML document, code, or database.
The Relationship Between Entity Encoding and XSS
Cross-Site Scripting (XSS) is one of the most common web vulnerabilities, and HTML entity encoding is one of the primary defenses against it. The attack works by injecting malicious HTML or JavaScript into a page that other users then view. If your application renders user input as raw HTML, an attacker can inject a <script> tag or an event handler like onload="stealCookies()".
Encoding the output means the browser never interprets user input as markup — it always treats it as text data. Modern web frameworks (React, Vue, Angular, Django, Rails) encode HTML output automatically by default, which is why injecting raw HTML in these frameworks requires explicit opt-in with methods like dangerouslySetInnerHTML in React or v-html in Vue.
If you are building with plain HTML and server-side templating, you are responsible for encoding all user-supplied values before inserting them into HTML. This encoder tool is useful for manual encoding during development and testing, and for understanding what the encoded form of a string should look like.
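For plain server-side templating, one possible approach is to route every substitution through a single helper so that no value reaches the page unescaped. The sketch below uses Python's `string.Template`; the `render` helper, the `PAGE` template, and the field name `display_name` are all hypothetical. It assumes values land only in element content or double-quoted attributes, which are the contexts entity encoding is meant for.

```python
import html
from string import Template

# Hypothetical template: the value appears in an attribute and in element content.
PAGE = Template('<span class="author" title="$display_name">$display_name</span>')


def render(template: Template, **values: object) -> str:
    """Escape every supplied value, then substitute it into the template."""
    safe = {key: html.escape(str(value), quote=True) for key, value in values.items()}
    return template.substitute(safe)


print(render(PAGE, display_name='Ann "The <b>Boss</b>" O\'Neil'))
```

Because quotes are escaped as well, the value cannot close the title attribute early and smuggle in an event handler.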
Summary
HTML entities represent characters that have special meaning in HTML or that are otherwise difficult to type directly. The five most critical entities to know are &amp;, &lt;, &gt;, &quot;, and &#39;. Encoding user-generated content before rendering it in HTML is a fundamental security practice that prevents Cross-Site Scripting attacks. Decoding is useful when processing scraped HTML or encoded API responses. The encoder/decoder tool above handles both directions instantly.