$Id: encoding.txt 1759 2008-01-28 22:56:08Z mjs $

HOW TO ENCODE/DECODE VARIOUS STRINGS

HTML (NAMED) ENTITIES

PHP

htmlentities($s, ENT_QUOTES, $inputcharset);
html_entity_decode($s, ENT_QUOTES, $outputcharset);
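
Python

A rough Python counterpart to the PHP calls above, as a sketch rather than a drop-in equivalent. It uses only the standard library (html.unescape and html.entities.codepoint2name); the helper names encode_named and decode_named are made up for this example. Unlike ENT_QUOTES, it leaves the single quote alone, because the HTML 4 table it relies on has no named entity for it.

```python
import html
from html.entities import codepoint2name


def encode_named(s):
    """Replace every character that has an HTML 4 entity name with &name;."""
    out = []
    for ch in s:
        name = codepoint2name.get(ord(ch))
        out.append("&%s;" % name if name else ch)
    return "".join(out)


def decode_named(s):
    """Resolve named (and numeric) character references back to characters."""
    return html.unescape(s)


encoded = encode_named('café & "bar" <b>')
decoded = decode_named(encoded)
print(encoded)
print(decoded)
```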

XML (NUMERIC) ENTITIES

PHP

?

If you can get away with UTF-8, though, you can use html_entity_decode() as above.
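
Python

The PHP entry above is still an open question. For comparison, one way to get numeric entities in Python is the xmlcharrefreplace error handler, sketched here with standard-library calls only. The helper names encode_numeric and decode_numeric are hypothetical, and the markup characters are escaped first with xml.sax.saxutils.escape so the output is safe to embed in XML text.

```python
import html
from xml.sax.saxutils import escape


def encode_numeric(s):
    """Escape &, <, > and turn every non-ASCII character into &#NNN;."""
    return escape(s).encode("ascii", "xmlcharrefreplace").decode("ascii")


def decode_numeric(s):
    """Resolve numeric (and named) character references back to characters."""
    return html.unescape(s)


encoded = encode_numeric("pöstal & co")
decoded = decode_numeric(encoded)
print(encoded)
print(decoded)
```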

URI

Perl

use URI::Escape;

uri_escape($s);
uri_unescape($s);

PHP

urlencode($s);
urldecode($s);

JavaScript

encodeURI(s); // does not encode question mark (?)
decodeURI(s); // does not decode %3F back to a question mark (?)

encodeURIComponent(s); // encodes question mark (?) as %3F
decodeURIComponent(s); // decodes %3F back to a question mark (?)

escape(s); // deprecated
unescape(s); // deprecated
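
Python

The encodeURI / encodeURIComponent distinction above can be approximated in Python with the safe parameter of urllib.parse.quote. This is only a sketch: the sample string and the chosen safe set are assumptions for illustration, and the JavaScript functions additionally leave a few characters such as ! * ' ( ) unescaped, which quote does not do by default.

```python
from urllib.parse import quote, unquote

s = "a b?c=d&e/f"

# Roughly like encodeURIComponent: no reserved character is treated as safe,
# so the question mark is escaped.
component = quote(s, safe="")

# Roughly like encodeURI: URI delimiters are left alone,
# so the question mark survives.
whole = quote(s, safe=";/?:@&=+$,#")

print(component)
print(whole)
print(unquote(component))
```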

MIME/EMAIL HEADERS (RFC 2047)

Perl

use MIME::WordDecoder;

unmime($s); # decodes to ISO-8859

Python

http://docs.python.org/lib/module-email.header.html

>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
[('p\xf6stal', 'iso-8859-1')]
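
The session above shows only the decoding half, and its output has the Python 2 form. Under Python 3, decode_header returns bytes for encoded words, so a further step is needed to get text. A minimal round-trip sketch, using only email.header from the standard library (the variable names are just for illustration):

```python
from email.header import Header, decode_header, make_header

raw = "=?iso-8859-1?q?p=F6stal?="

# In Python 3 each part is a (bytes, charset) pair for encoded words.
parts = decode_header(raw)

# make_header reassembles the parts; str() yields the decoded text.
text = str(make_header(parts))

# Encoding direction: build an RFC 2047 encoded word from text.
encoded = Header("pöstal", "iso-8859-1").encode()
roundtrip = str(make_header(decode_header(encoded)))

print(parts)
print(text)
print(encoded)
```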

PHP

iconv_mime_encode()
iconv_mime_decode()

MIME BODY

PUNYCODE (DOMAIN NAMES)
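
Python

A possible entry for this section, offered as a sketch only. Python ships an "idna" codec that applies Punycode label by label and adds the xn-- prefix, plus a bare "punycode" codec for a single label. The codec follows the older IDNA 2003 rules (RFC 3490), and the domain below is a made-up example.

```python
domain = "bücher.example"

# Unicode domain name -> ASCII-compatible form (each label gets xn-- if needed).
ascii_form = domain.encode("idna").decode("ascii")

# ASCII-compatible form -> Unicode domain name.
unicode_form = ascii_form.encode("ascii").decode("idna")

# Bare Punycode for one label, without the xn-- prefix.
label = "bücher".encode("punycode").decode("ascii")

print(ascii_form)
print(unicode_form)
print(label)
```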