
ت R0XX0R!

It turns out that Persian has a character ت called "teh". I plan to use it often; it's ت lEEt. (The word "lEEt" means "elite" and is rendered "1334" in haxxor, at least by me.) The phrase "ت 1334" should look like "" if everything is working well.

BTW, try to type "ت 1334" in a Firefox text field for loads of 1334 fun. Get it to look right in a comment attached to this message. Seriously, try it: no peeking with "view source". The implementation of mixing left-to-right Latin text with text from a right-to-left Arabic language seems kind of wacky. Key observation: "arabic" digits are language-independent.
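One way to see why the digits wander is to look at the Unicode bidirectional class of each character in the phrase. Here is a minimal sketch in Python using only the standard `unicodedata` module; the variable names are just for illustration.

```python
import unicodedata

phrase = "\u062a 1334"  # ARABIC LETTER TEH, a space, then four digits

# Look up the bidirectional class of every character in the phrase.
classes = [unicodedata.bidirectional(ch) for ch in phrase]

for ch, cls in zip(phrase, classes):
    print(f"U+{ord(ch):04X} {cls}")
```

The teh is class AL (an Arabic letter, strongly right-to-left), the space is WS (neutral whitespace), and the digits are EN (European number, a weak type). Weak and neutral characters take their direction from context, so digits typed after an Arabic letter get pulled into the right-to-left run.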

UTF-8 is awesome, but I don't think it's quite ready to save the world yet.

[Note: I have edited the above for clarity several times since posting. Hopefully it's easier to understand now.] Fob

Comments

So very very close to being 1337...

alt+1334 appears to be giving me the digit "6". Is that what you were referring to as wacky fun? :)

(oh, and captchas that require cookies enabled are even more evil than other captchas ;) )

"alt+1334"? The challenge is to get the 6-character (the number of characters is just a coincidence) phrase "ت 1334"—that's a funny-looking Arabic character followed by a space and then 4 digits—typed into a browser text box here and displayed properly in the comment preview. This may be easier if you're typing at IE—I have no idea. For Firefox 1.5.0.1, it's pretty hard.

Is it possible you're not seeing the funny-looking Arabic character? If you're seeing a funny-looking Unicode symbol for a missing character, that's not as good as having an Arabic font, but it's OK. If you're just seeing " 1334", then your browser and/or OS is failing horribly to be internationalized.

I only have the one captcha module, and I didn't write it. :) I promise I won't do anything evil with cookies. Believe me, I would love to just turn the $#@! captchas off, but I've gotten spammed one too many times…

Same as the previous poster: in Windows Notepad, ALT+1334 gives me a "6",
but in Firefox it acts as ALT-LEFTARROW.
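A likely explanation for the "6", assuming the legacy Windows Alt-code behavior where a number typed without a leading zero is reduced to its low byte before being looked up. That behavior is an assumption here, not something checked on this particular setup.

```python
alt_code = 1334

# Assumed legacy behavior: only the low byte of the typed number is used.
low_byte = alt_code % 256

# Byte values below 128 are plain ASCII, so chr() gives the resulting character.
print(low_byte, chr(low_byte))
```

Under that assumption 1334 collapses to 54, which is the ASCII code for the digit "6".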

other unicode pages claim U+1334 is ETHIOPIC SYLLABLE PHEE
http://www.fileformat.info/info/unicode/char/1334/index.htm

an ARABIC TEH is
http://www.fileformat.info/info/unicode/char/fe95/index.htm
there are various initial/isolated/terminal forms
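The code points mentioned here can be checked against Python's bundled Unicode database. This is a small sketch using the standard `unicodedata` module. Note that U+FE95 is the isolated presentation form, while the plain letter is U+062A.

```python
import unicodedata

# Print the official character name for each code point under discussion.
for code_point in (0x1334, 0x062A, 0xFE95):
    print(f"U+{code_point:04X} {unicodedata.name(chr(code_point))}")

# Compatibility normalization folds the presentation form back to the base letter.
base = unicodedata.normalize("NFKC", "\ufe95")
print(f"U+{ord(base):04X}")
```

So the hexadecimal code point U+1334 has nothing to do with the four decimal digits "1334" in the phrase, and the presentation form normalizes to the ordinary ARABIC LETTER TEH.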

Right. The challenge is to type "TEH LEET", but using the Arabic character "ت" for TEH and the digits "1334" for LEET. It should read properly left-to-right in the resulting comment.

Have fun!

"ت 1334"

On Linux, whether Pango is enabled in the Mozilla build matters.
Pango is an international text display/input library.
On my Fedora Core 5 system, where firefox-1.5 has Pango enabled and mozilla-1.7.12 does not, I can easily cut-&-paste with Firefox but not with Mozilla.

I'm not sure exactly what bug, if any, should be filed? Note that your comment was rendered as "LEET TEH" rather than "TEH LEET" in both the title and the body, at least on Firefox 1.5.

I'm not sure whether Pango is actually correctly inputting / rendering mixed-direction text in this situation or not. I could probably figure it out with a couple hours of research. Maybe I'll try it when I get a moment sometime in the next few days.

"<bdo dir="ltr">&#xfe95; 1334</bdo>" is "ﺕ 1334"

"<span dir="ltr">&#xfe95;</span> 1334" is "ﺕ 1334"

"<span style="direction: ltr; unicode-bidi: bidi-override;">&#xfe95; 1334</span>" is "ﺕ 1334"

See ﺕ HTML spec. But even better:

"&#x202d;&#xfe95; 1334&#x202c;" is "‭ﺕ 1334‬" outside of HTML, such as in ﺕ subject line or on ﺕ console

(Intentional use of character references so that ﺕ HTML code doesn't render ﺕ ﺕ wrong direction.)

I actually had used <span>, having not known about <bdo>.

I'm not quite getting the proliferation of entities in the ﺕ last technique?

Very nice!

&#x202d; is ﺕ "LEFT-TO-RIGHT OVERRIDE" character.
&#x202c; is ﺕ "POP DIRECTIONAL FORMATTING" character.
From ﺕ HTML spec:

Authors may also use special Unicode characters to override the bidirectional algorithm -- LEFT-TO-RIGHT OVERRIDE (202D) or RIGHT-TO-LEFT OVERRIDE (hexadecimal 202E). The POP DIRECTIONAL FORMATTING (hexadecimal 202C) character ends either bidirectional override.

Also, ﺕ direction overrides can either go around ﺕ whole phrase as in ﺕ first post, or just around ﺕ ﺕ: "&#x202D;&#xfe95;&#x202C; 1334" is "‭ﺕ‬ 1334" as well.
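Outside HTML, the same override can be applied in code by wrapping the text in the two control characters. This is a minimal Python sketch; `force_ltr` is a hypothetical helper name, not part of any library.

```python
import html

LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def force_ltr(text: str) -> str:
    """Wrap text so a bidi-aware renderer lays it out strictly left to right."""
    return f"{LRO}{text}{PDF}"


whole = force_ltr("\ufe95 1334")              # override around the whole phrase
letter_only = force_ltr("\ufe95") + " 1334"   # override around the letter only

# The character references from the comment decode to the same string.
decoded = html.unescape("&#x202d;&#xfe95; 1334&#x202c;")
print(decoded == whole)
```

The control characters are invisible, so both strings are two characters longer than what is displayed. That matters if the text is later measured or compared.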

Please note that 1334 in h4xx0r is 'leea', and not 'leet', which is written '1337'.

…d00dez u suxxor r u jkn 4 1337?

Thanks much for the correction!

Safari is ت 1337.

Safari (KHTML-based) does the weird thing you describe:

1. Type "Safari is "
2. Edit > Special Characters...
3. Search for "TEH"
4. Select "ARABIC LETTER TEH" (not TTEH, TTEHEH, TEHEH, or TEH WITH RING :) )
5. Click Insert. The I-bar cursor is now split: top half at the end of ت, bottom half at the start of ت.
6. Type a space, it's appended and cursor returns to normal.
7. Type "1". Whoa! The number inserts before ت and the cursor remains at the end.
8. Same for "337". Only non-space, non-numerals like "." break the RTL entry.

Of course, this is exactly the entry behavior expected for Arabic text. Ain't i18n fun?
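The display order seen in the steps above can be reproduced with a heavily simplified version of the Unicode bidirectional rules. This sketch is not the full algorithm (UAX #9): it assumes a left-to-right paragraph with no embeddings or overrides, and treats every character that is not a letter or digit as a plain neutral. `visual_order` is a hypothetical name.

```python
import unicodedata


def visual_order(text: str) -> str:
    """Simplified bidi reordering for a left-to-right paragraph (sketch only)."""
    classes = [unicodedata.bidirectional(ch) for ch in text]
    n = len(classes)

    # Weak types: a digit after an Arabic letter acts as an Arabic number (AN);
    # a digit after a left-to-right letter (or at the start) acts like L.
    last_strong = "L"
    for i, cls in enumerate(classes):
        if cls in ("L", "R", "AL"):
            last_strong = cls
        elif cls == "EN":
            if last_strong == "AL":
                classes[i] = "AN"
            elif last_strong == "L":
                classes[i] = "L"
    classes = ["R" if cls == "AL" else cls for cls in classes]

    def direction(cls: str) -> str:
        # Numbers count as right-to-left when resolving neutrals.
        return "R" if cls in ("R", "AN", "EN") else "L"

    # Neutrals: a run becomes R only if both neighbours are right-to-left;
    # otherwise it takes the paragraph direction (L). Text edges count as L.
    resolved_types = {"L", "R", "AN", "EN"}
    i = 0
    while i < n:
        if classes[i] in resolved_types:
            i += 1
            continue
        j = i
        while j < n and classes[j] not in resolved_types:
            j += 1
        before = direction(classes[i - 1]) if i > 0 else "L"
        after = direction(classes[j]) if j < n else "L"
        classes[i:j] = ["R" if before == after == "R" else "L"] * (j - i)
        i = j

    # Levels in a left-to-right paragraph: L stays 0, R gets 1, numbers get 2.
    levels = [{"L": 0, "R": 1, "AN": 2, "EN": 2}[cls] for cls in classes]

    # Reverse each contiguous run at a given level or higher, highest level first.
    chars = list(text)
    for level in (2, 1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            levels[i:j] = levels[i:j][::-1]
            i = j
    return "".join(chars)


typed = "Safari is \u062a 1337."
shown = visual_order(typed)
```

Here `typed` is in the order the keys were pressed and `shown` is in left-to-right display order: the digits land to the left of the teh, and the trailing "." falls back to the end of the line, which matches steps 7 and 8.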