The PHP htmlentities() function converts all applicable characters to HTML entities. It is similar to the htmlspecialchars() function, except that htmlentities() translates every character that has an HTML character entity equivalent, not only the handful of special characters.
To convert HTML entities back to characters, use the html_entity_decode() function.
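The round trip between the two functions can be sketched as follows. This is a minimal illustration, not a canonical recipe: the sample string and variable names are made up, and the flags and encoding are passed explicitly so that the result does not depend on the PHP version or configuration.

```php
<?php
// "\xC3\xA9" is the UTF-8 byte sequence for the letter e with an acute accent.
$original = "Caf\xC3\xA9 <b>menu</b> & prices";

// Encode every character that has a named entity.
$encoded = htmlentities($original, ENT_QUOTES, "UTF-8");
echo $encoded, "\n"; // Caf&eacute; &lt;b&gt;menu&lt;/b&gt; &amp; prices

// Decode with the same flags and encoding to get the original back.
$decoded = html_entity_decode($encoded, ENT_QUOTES, "UTF-8");
var_dump($decoded === $original); // bool(true)
```

Using the same flags and encoding on both sides is what makes the round trip lossless here.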
flags
Optional. Specifies how to handle quotes, invalid code unit sequences, and the document type. The available flag constants are:
ENT_COMPAT: Converts double quotes and leaves single quotes alone.
ENT_QUOTES: Converts both double and single quotes.
ENT_NOQUOTES: Leaves both double and single quotes unconverted.
ENT_IGNORE: Silently discards invalid code unit sequences instead of returning an empty string. Using this flag is discouraged, as it may have security implications.
ENT_SUBSTITUTE: Replaces invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_DISALLOWED: Replaces invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise).
ENT_HTML401: Handle code as HTML 4.01.
ENT_XML1: Handle code as XML 1.
ENT_XHTML: Handle code as XHTML.
ENT_HTML5: Handle code as HTML 5.
The default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 as of PHP 8.1.0. Earlier versions default to ENT_COMPAT | ENT_HTML401.
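As a rough illustration of the three quote-handling flags, the sketch below encodes the same string with each of them. The sample string and variable names are invented for this example. Flags are combined with the bitwise OR operator, and the encoding is passed explicitly.

```php
<?php
$str = "\"Tom\" & 'Jerry'";

// Double quotes converted, single quotes left alone.
$compat = htmlentities($str, ENT_COMPAT | ENT_HTML401, "UTF-8");
// Both kinds of quotes converted.
$quotes = htmlentities($str, ENT_QUOTES | ENT_HTML401, "UTF-8");
// Neither kind of quote converted.
$noquotes = htmlentities($str, ENT_NOQUOTES | ENT_HTML401, "UTF-8");

echo $compat, "\n";   // &quot;Tom&quot; &amp; 'Jerry'
echo $quotes, "\n";   // &quot;Tom&quot; &amp; &#039;Jerry&#039;
echo $noquotes, "\n"; // "Tom" &amp; 'Jerry'
```

The ampersand is converted in all three cases. Only the treatment of the quote characters differs.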
encoding
Optional. A string that specifies which character set to use. Supported character sets include UTF-8 (the default) and the following:
ISO-8859-1: (Aliases - ISO8859-1) - Western European, Latin-1.
ISO-8859-5: (Aliases - ISO8859-5) - Little-used Cyrillic charset (Latin/Cyrillic).
ISO-8859-15: (Aliases - ISO8859-15) - Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).
cp1252: (Aliases - Windows-1252, 1252) - Windows specific charset for Western European.
KOI8-R: (Aliases - koi8-ru, koi8r) - Russian.
BIG5: (Aliases - 950) - Traditional Chinese, mainly used in Taiwan.
GB2312: (Aliases - 936) - Simplified Chinese, national standard character set.
BIG5-HKSCS: Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS: (Aliases - SJIS, SJIS-win, cp932, 932) - Japanese.
EUC-JP: (Aliases - EUCJP, eucJP-win) - Japanese.
MacRoman: Charset that was used by Mac OS.
'': An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale, in this order. It is not recommended.
If omitted, encoding defaults to the value of the default_charset configuration option, which is "UTF-8" unless it has been changed.
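The effect of the encoding parameter can be sketched by feeding the same accented letter in two different byte encodings. This is only an illustration with made-up variable names. It assumes the input bytes are written as hexadecimal escapes, so the result does not depend on how the script file itself is saved.

```php
<?php
$latin1 = "caf\xE9";     // the accented e as a single ISO-8859-1 byte
$utf8   = "caf\xC3\xA9"; // the same letter as a two-byte UTF-8 sequence

// When the declared encoding matches the data, both give the same entity.
$fromLatin1 = htmlentities($latin1, ENT_QUOTES, "ISO-8859-1");
$fromUtf8   = htmlentities($utf8, ENT_QUOTES, "UTF-8");
echo $fromLatin1, "\n"; // caf&eacute;
echo $fromUtf8, "\n";   // caf&eacute;

// Declaring the wrong encoding treats each UTF-8 byte as its own character.
$mismatch = htmlentities($utf8, ENT_QUOTES, "ISO-8859-1");
echo $mismatch, "\n";   // caf&Atilde;&copy;
```

The last call shows why the encoding argument should describe the actual bytes in the string rather than the desired output.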
double_encode
Optional. If set to false, PHP will not encode existing HTML entities. The default is true, which converts everything.
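A short sketch of the difference, using an invented input string that already contains one entity:

```php
<?php
$str = "Tom &amp; Jerry <3";

// Default (double_encode = true): the existing &amp; is encoded again.
$twice = htmlentities($str, ENT_QUOTES, "UTF-8");
echo $twice, "\n"; // Tom &amp;amp; Jerry &lt;3

// double_encode = false: the existing entity is left as it is.
$once = htmlentities($str, ENT_QUOTES, "UTF-8", false);
echo $once, "\n";  // Tom &amp; Jerry &lt;3
```

Setting double_encode to false can be handy when the input may already be partly encoded, although it also means a literal "&amp;" typed by a user is not preserved as text.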
Return Value
Returns the encoded string. If the input string contains an invalid code unit sequence within the given encoding, an empty string is returned, unless either the ENT_IGNORE or the ENT_SUBSTITUTE flag is set.
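The empty-string behavior can be sketched as follows. The input is a made-up string containing a lone byte "\xE9", which is not valid UTF-8, and the flags are passed explicitly so that the result does not depend on version-specific defaults.

```php
<?php
// "\xE9" on its own is an incomplete UTF-8 sequence.
$bad = "Caf\xE9 menu";

// Without ENT_IGNORE or ENT_SUBSTITUTE the whole result is an empty string.
$strict = htmlentities($bad, ENT_COMPAT, "UTF-8");
var_dump($strict === ""); // bool(true)

// With ENT_SUBSTITUTE the invalid byte becomes U+FFFD ("\xEF\xBF\xBD" in UTF-8).
$safe = htmlentities($bad, ENT_COMPAT | ENT_SUBSTITUTE, "UTF-8");
var_dump($safe === "Caf\xEF\xBF\xBD menu"); // bool(true)
```

Checking for an empty result on non-empty input is one way to detect that the input did not match the declared encoding.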
Example:
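The example below shows the usage of the htmlentities() function.
<?php
$str = "A 'quote' is <b>bold</b>";
//returns: A 'quote' is &lt;b&gt;bold&lt;/b&gt;
//(ENT_COMPAT is passed explicitly; as of PHP 8.1.0 the default flags also convert single quotes)
echo htmlentities($str, ENT_COMPAT);
echo "\n";
//returns: A &#039;quote&#039; is &lt;b&gt;bold&lt;/b&gt;
echo htmlentities($str, ENT_QUOTES);
?>
The output of the above code will be:
A 'quote' is &lt;b&gt;bold&lt;/b&gt;
A &#039;quote&#039; is &lt;b&gt;bold&lt;/b&gt;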
Example: Usage of ENT_IGNORE
The ENT_IGNORE flag silently discards invalid code unit sequences instead of returning an empty string. Consider the example below: