Discussion:
AddDefaultCharset considered harmful (was: Mojibake on my Web pages)
Paul Deuter
2003-09-25 21:18:09 UTC
Here is a link which describes how some hackers use
%XX and %uXXXX url encoding to mask a malicious request
or to get around an IDS product.

http://www.cgisecurity.com/contrib/hd_spring_2002.pdf
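
For illustration, here is a minimal Python sketch of the evasion idea: a filter that matches on the raw request misses a pattern that a later decoding step restores. The helper names and sample paths are hypothetical, and treating every escape as a single code point is a simplification, not how any particular server or IDS works.

```python
import re

# Matches the non-standard %uXXXX form first, then the ordinary %XX form.
_ESCAPE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def decode_once(text: str) -> str:
    """Decode %XX and %uXXXX escapes in a single pass (hypothetical helper).

    Simplification: each escape is mapped directly to one code point.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def naive_filter_blocks(path: str) -> bool:
    """Signature check on the raw request, as a careless filter might do."""
    return "../" in path


def normalizing_filter_blocks(path: str) -> bool:
    """Same signature check, applied after decoding the escapes."""
    return "../" in decode_once(path)


# Made-up sample requests; all three name the same path once decoded.
plain = "/scripts/../secret.txt"
hex_encoded = "/scripts/%2e%2e%2fsecret.txt"
u_encoded = "/scripts/%u002e%u002e%u002fsecret.txt"

for request in (plain, hex_encoded, u_encoded):
    print(request, naive_filter_blocks(request), normalizing_filter_blocks(request))
```

The raw check only catches the plain form, while the check after decoding catches all three.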

-Paul

-----Original Message-----
From: Martin Duerst [mailto:***@w3.org]
Sent: Thursday, September 25, 2003 1:32 PM
To: Doug Ewell; Unicode Mailing List
Subject: AddDefaultCharset considered harmful (was: Mojibake on my Web
pages)


Hello Doug, others,

Here is my most probable explanation:
Adelphia recently upgraded to Apache 2.0. The core config file (httpd.conf)
as distributed contains an entry
AddDefaultCharset iso-8859-1
which does what you have described. They probably adopted this
because the comment in the config file suggests that it's important.
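
To see why a server-wide default overrides what the page itself declares, here is a rough Python sketch of the precedence a browser applies: a charset parameter in the HTTP Content-Type header wins over a META declaration inside the document. The function names are made up for illustration, and the model ignores browser heuristics and user overrides.

```python
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(content_type: str, meta_charset: Optional[str]) -> Optional[str]:
    """Simplified precedence: HTTP header first, then the page's META tag."""
    from_header = header_charset(content_type)
    if from_header is not None:
        return from_header
    return meta_charset.lower() if meta_charset else None


# With a server default in place, the page's own UTF-8 label is ignored.
print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))

# Without a charset parameter in the header, the META label is used.
print(effective_charset("text/html", "UTF-8"))
```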

I have just filed a bug with bugzilla, asking that this default
setting be removed or commented out, and the comment fixed, at
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23421. You may
want to vote for that bug.

I have also commented on a related bug that I found, at
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14513.

I suggest you tell your Internet provider:
1) that they change to AddDefaultCharset Off
(or simply comment this out)
2) that they make sure you get FileInfo permission in your directories,
so that you can apply the settings you know are correct.

The comment in the config file contains mostly very strange statements:
#
# Specify a default charset for all pages sent out. This is
# always a good idea and opens the door for future internationalisation
# of your web site, should you ever want it. Specifying it as
# a default does little harm; as the standard dictates that a page
# is in iso-8859-1 (latin1) unless specified otherwise i.e. you
# are merely stating the obvious. There are also some security
# reasons in browsers, related to javascript and URL parsing
# which encourage you to always set a default char set.
#
AddDefaultCharset ISO-8859-1
If anybody knows something about these security issues, please
tell me (any mention of security issues usually has a strong hold
on webmasters, for good reasons).


Regards, Martin.
Apologies in advance to anyone who visits my Web site and sees garbage
characters, a.k.a. "mojibake." It isn't my fault.
Adelphia is currently having a character-set problem with their HTTP
servers. Apparently they are serving all pages as ISO 8859-1 even if
they are marked as being encoded in another character set, such as
UTF-8.
If you manually change the encoding in your browser to UTF-8, or
download the page and display it as a local file, everything looks fine
because Adelphia's server is no longer calling the shots. Their tech
support people acknowledge that the problem is at their end and said
they would look into it.
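
The effect can be reproduced in a few lines of Python. This is only an illustration with a made-up sample string: the bytes on the wire are correct UTF-8, and the garbage appears only because they are interpreted as ISO 8859-1.

```python
# Sample text chosen for illustration; any non-ASCII text behaves the same way.
text = "naïve café"

# What the page author saved and what the server sends: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# What the browser shows when told the bytes are ISO 8859-1.
garbled = utf8_bytes.decode("iso-8859-1")

# Overriding the encoding in the browser amounts to redoing the decoding step.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(garbled)
print(repaired)
```

Each non-ASCII character takes two bytes in UTF-8, so it shows up as two ISO 8859-1 characters, while the ASCII letters are unaffected.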
I understand that having the "Unicode Encoded" logo on my page next to
these garbage characters may not reflect well on Unicode, especially to
newbies. I'm considering putting a disclaimer at the top of my pages,
but I'm waiting to see how quickly they solve the problem.
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/

To Unsubscribe, send a blank message to: unicode-***@yahooGroups.com

This mailing list is just an archive. The instructions to join the true Unicode List are on http://www.unicode.org/unicode/consortium/distlist.html


Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
j***@spin.ie
2003-09-26 09:05:40 UTC
Post by Paul Deuter
Here is a link which describes how some hackers use
%XX and %uXXXX url encoding to mask a malicious request
or to get around an IDS product.
http://www.cgisecurity.com/contrib/hd_spring_2002.pdf
I wish hackers would give better references. This doesn't give proper credit to rain.forest.puppy for his work on that hole, and rain.forest.puppy in turn didn't give a proper reference to the security warnings already published about UTF-8 (which unfairly made it look as though the flaw was in UTF-8 rather than in the way UTF-8 encoded in IRIs was being transcoded).

That particular issue doesn't really involve the character set that documents are labelled as using, though that would bring other issues. However, the issues that do arise here will stem either from a faulty implementation of a transcoder (which a default charset setting won't affect, since the cracker will label things in whatever way suits their exploit) or from misidentified data, and this default setting misidentifies data and could possibly introduce new issues.
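
As an illustration of what a faulty transcoder can look like, here is a small Python sketch based on overlong UTF-8 forms, one well-known class of decoder bug. That this is the exact hole discussed above is my assumption, and the lenient decoder is a made-up example rather than code from any real product.

```python
def lenient_decode_two_byte(b1: int, b2: int) -> str:
    """Faulty two-byte UTF-8 decoder: combines the bits without any checks."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


def strict_decode_ok(data: bytes) -> bool:
    """Report whether Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xC0 0xAF is an overlong (invalid) two-byte spelling of "/".
overlong_slash = bytes([0xC0, 0xAF])

print(repr(lenient_decode_two_byte(*overlong_slash)))
print(strict_decode_ok(overlong_slash))
```

A filter that looks for "/" before such a lenient decoder runs never sees the character, which is a problem in the decoder and not in UTF-8 itself.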

A flip side to security issues of this sort is that Unicode can sometimes make it more difficult to exploit buffer overflows, as the code being used to overflow the buffer is transcoded from a legacy encoding to Unicode before the smash (it doesn't make it harder to overflow the buffer, but it makes it harder to do so in a way that runs the code you want to run). See <http://www.phrack.org/show.php?p=61&a=11>.
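
A short Python sketch of why the transcoding step gets in the way. The input bytes are arbitrary placeholders, and using ISO 8859-1 as the legacy encoding and UTF-16LE as the target are my assumptions for the example; other code pages would alter some byte values further.

```python
# Arbitrary placeholder bytes standing in for attacker-supplied input.
payload = bytes([0x41, 0x90, 0xEB, 0x10])

# The application converts legacy text to UTF-16 before copying it.
widened = payload.decode("iso-8859-1").encode("utf-16-le")

print(payload.hex(" "))
print(widened.hex(" "))
```

Every original byte ends up followed by a zero byte, so the buffer can still be overrun, but what lands in memory is no longer the byte sequence that was sent.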





