The PHP get_html_translation_table() function returns the translation table that htmlspecialchars() and htmlentities() use internally.
Note: Special characters can be encoded in several ways. For example, " can be encoded as &quot;, &#34; or &#x22;. This function returns only the form used by htmlspecialchars() and htmlentities().
Syntax
get_html_translation_table(table, flags, encoding)
Parameters
table
Optional. Specify which table to return. Possible values are:
HTML_SPECIALCHARS: Translates only the characters that have special meaning in HTML markup (&, ", ', < and >) and need HTML-encoding to be shown properly on an HTML page.
HTML_ENTITIES: Translates all characters that have a named HTML entity and need HTML-encoding to be shown properly on an HTML page.
The default is HTML_SPECIALCHARS.
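As a rough illustration of the difference between the two tables, the sketch below fetches both and compares their sizes. The flags and encoding are passed explicitly (an assumption made here so the result does not depend on the PHP version or on default_charset), and the variable names are only illustrative.

```php
<?php
// Pass flags and encoding explicitly so the result does not depend on defaults.
$flags = ENT_QUOTES | ENT_HTML401;

$special  = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');
$entities = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');

// HTML_SPECIALCHARS covers only the characters with special meaning in markup.
foreach ($special as $char => $entity) {
    echo $char, ' => ', $entity, PHP_EOL;
}

// HTML_ENTITIES also covers characters that have a named entity,
// so it is a much larger table.
echo count($special), ' entries vs. ', count($entities), ' entries', PHP_EOL;
```

When run from the command line, the loop prints each special character next to its entity. In a browser the output would itself need escaping to be readable.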
flags
Optional. Specify which quotes the table will contain as well as which document type the table is for. The available flag constants are:
ENT_COMPAT: Table contains entities for double-quotes, but not for single-quotes.
ENT_QUOTES: Table contains entities for both double and single quotes.
ENT_NOQUOTES: Table contains entities for neither single nor double quotes.
ENT_HTML401: Table for HTML 4.01.
ENT_XML1: Table for XML 1.
ENT_XHTML: Table for XHTML.
ENT_HTML5: Table for HTML 5.
The default is ENT_COMPAT | ENT_HTML401 before PHP 8.1.0, and ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 from PHP 8.1.0 onward.
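The effect of the quote flags and the document type can be checked directly. The following is a minimal sketch, assuming UTF-8 and the HTML_SPECIALCHARS table; the variable names are illustrative only.

```php
<?php
// Fetch the HTML_SPECIALCHARS table under each quote flag.
$compat   = get_html_translation_table(HTML_SPECIALCHARS, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$quotes   = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$noquotes = get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

// Report whether each table has an entry for the double and the single quote.
$tables = ['ENT_COMPAT' => $compat, 'ENT_QUOTES' => $quotes, 'ENT_NOQUOTES' => $noquotes];
foreach ($tables as $name => $table) {
    printf(
        "%-12s double: %-3s single: %s\n",
        $name,
        isset($table['"']) ? 'yes' : 'no',
        isset($table["'"]) ? 'yes' : 'no'
    );
}

// The document type changes which entity is used for the single quote.
$html5 = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $quotes["'"], ' vs. ', $html5["'"], PHP_EOL;
```

Passing the flags explicitly, as done here, keeps the result the same across PHP versions whose defaults differ.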
encoding
Optional. A string that specifies which character set to use. Supported character sets include:
ISO-8859-1: (Aliases - ISO8859-1) - Western European, Latin-1.
ISO-8859-5: (Aliases - ISO8859-5) - Little used Cyrillic charset (Latin/Cyrillic).
ISO-8859-15: (Aliases - ISO8859-15) - Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).
cp1252: (Aliases - Windows-1252, 1252) - Windows specific charset for Western European.
KOI8-R: (Aliases - koi8-ru, koi8r) - Russian.
BIG5: (Aliases - 950) - Traditional Chinese, mainly used in Taiwan.
GB2312: (Aliases - 936) - Simplified Chinese, national standard character set.
BIG5-HKSCS: Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS: (Aliases - SJIS, SJIS-win, cp932, 932) - Japanese.
EUC-JP: (Aliases - EUCJP, eucJP-win) - Japanese.
MacRoman: Charset that was used by Mac OS.
'': An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale, in this order. It is not recommended.
If omitted, encoding defaults to the value of the default_charset configuration option, which is "UTF-8" unless it has been changed.
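The encoding argument determines how the keys of the returned table are encoded. The sketch below, which assumes the HTML_ENTITIES table and the character é as a sample, looks up the same entity in a UTF-8 table and an ISO-8859-1 table; the variable names are illustrative.

```php
<?php
$flags = ENT_QUOTES | ENT_HTML401;

$utf8   = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');
$latin1 = get_html_translation_table(HTML_ENTITIES, $flags, 'ISO-8859-1');

// "é" is two bytes in UTF-8 but a single byte (0xE9) in ISO-8859-1.
$eAcuteUtf8   = "\u{E9}"; // UTF-8 encoded (requires PHP 7 or later)
$eAcuteLatin1 = "\xE9";   // ISO-8859-1 encoded

echo $utf8[$eAcuteUtf8], PHP_EOL;     // entity found under the UTF-8 key
echo $latin1[$eAcuteLatin1], PHP_EOL; // same entity, different key bytes

// The Latin-1 table has no entry for the UTF-8 byte sequence.
var_dump(isset($latin1[$eAcuteUtf8]));
```

A table built for one encoding is therefore only suitable for strings in that same encoding.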
Return Value
Returns the translation table as an array, with the original characters as keys and entities as values.
Example:
The example below shows the usage of the get_html_translation_table() function.
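One possible usage sketch follows. It assumes UTF-8 input and the ENT_QUOTES | ENT_HTML401 flags, and the sample string and variable names are made up for illustration. It applies the table with strtr(), compares the result with htmlspecialchars(), and then reverses the mapping with array_flip().

```php
<?php
$flags    = ENT_QUOTES | ENT_HTML401;
$encoding = 'UTF-8';

$table = get_html_translation_table(HTML_SPECIALCHARS, $flags, $encoding);

$raw = '<a href="index.php">Tom & Jerry\'s page</a>';

// strtr() with an array replaces each key (character) with its value (entity).
$encoded = strtr($raw, $table);
echo $encoded, PHP_EOL;

// With matching arguments, htmlspecialchars() should give the same result.
var_dump($encoded === htmlspecialchars($raw, $flags, $encoding));

// Flipping the table maps entities back to characters.
$decoded = strtr($encoded, array_flip($table));
var_dump($decoded === $raw);
```

In practice htmlspecialchars() and htmlspecialchars_decode() are the more direct choice; working with the table is mainly useful for inspecting or customizing the mapping.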