<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Gavin Panella &#187; twitter</title>
	<atom:link href="http://gavinpanella.com/blog/tag/twitter/feed/" rel="self" type="application/rss+xml" />
	<link>http://gavinpanella.com/blog</link>
	<description></description>
	<lastBuildDate>Mon, 06 Jul 2009 09:27:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Twitter API encoding is somewhat bonkers</title>
		<link>http://gavinpanella.com/blog/2008/12/18/twitter-api-encoding-is-somewhat-bonkers/</link>
		<comments>http://gavinpanella.com/blog/2008/12/18/twitter-api-encoding-is-somewhat-bonkers/#comments</comments>
		<pubDate>Thu, 18 Dec 2008 14:57:44 +0000</pubDate>
		<dc:creator>Gavin Panella</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://gavinpanella.com/blog/?p=52</guid>
		<description><![CDATA[Take a look at the Encoding section in the Twitter API docs: The Twitter API supports UTF-8 encoding. Please note that angle brackets (&#8220;&#60;&#8221; and &#8220;&#62;&#8221;) are entity-encoded to prevent Cross-Site Scripting attacks for web-embedded consumers of JSON API output. The resulting encoded entities do count towards the 140 character limit. Does anyone notice the [...]]]></description>
			<content:encoded><![CDATA[<p>Take a look at the <a href="http://apiwiki.twitter.com/REST+API+Documentation#Encoding">Encoding</a> section in the Twitter API docs:</p>
<blockquote><p>The Twitter API supports UTF-8 encoding. Please note that angle brackets (&#8220;&lt;&#8221; and &#8220;&gt;&#8221;) are entity-encoded to prevent Cross-Site Scripting attacks for web-embedded consumers of JSON API output. The resulting encoded entities <strong>do</strong> count towards the 140 character limit.</p></blockquote>
<p>Does anyone notice the weirdness there? Apart from the <a href="http://php.net/magic_quotes">MAGIC_QUOTES</a> smell.</p>
<p>If I were feeling pathological, I could tweet a message of 140 characters, all from the Unicode code-point range U+10000 to U+10FFFF. Each of those takes four bytes in UTF-8, so I think that would end up as 560 bytes. And I think that would be all fine with Twitter. Which is another way of saying that Twitter would, I assume, be happy to exceed 140 <em>bytes</em> for a message if it were written in, say, Japanese.</p>
<p>By contrast, while on my pathological holiday from good sense, I would only be able to tweet a message of 35 angle brackets &#8211; 140 characters, and 140 bytes in UTF-8, once each bracket is entity-encoded &#8211; because the encoded angle-brackets count toward the number of <em>characters</em>. Seems a bit backwards, doesn&#8217;t it?</p>
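<p>The arithmetic in the last two paragraphs can be checked with a short Python sketch. It assumes that only the two angle brackets are entity-encoded, as the quoted docs describe; <code>encode_brackets</code> and <code>utf8_len</code> are hypothetical helpers of mine, not anything from Twitter.</p>

```python
import html

# Assumption: only the two angle brackets are entity-encoded,
# which is all that the quoted docs mention.
BRACKETS = str.maketrans({"<": html.escape("<"), ">": html.escape(">")})


def encode_brackets(text):
    """Entity-encode angle brackets only (hypothetical stand-in for the API)."""
    return text.translate(BRACKETS)


def utf8_len(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


astral = "\U00010000" * 140   # 140 characters from beyond the Basic Multilingual Plane
brackets = "<" * 35           # 35 angle brackets as typed

astral_stats = (len(astral), utf8_len(astral))
stored = encode_brackets(brackets)
bracket_stats = (len(brackets), utf8_len(brackets), len(stored))

print(astral_stats)    # (140, 560)
print(bracket_stats)   # (35, 35, 140)
```

<p>So, under that assumption, 560 bytes of astral characters fit, while 35 bytes of angle brackets already use up the whole allowance.</p>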
<p>Does anyone know the reasoning here? Or are the docs at fault?</p>
<p>Back to the angle-bracket quoting. Just as the PHP folk are finally ending their own embarrassing journey through that silliness, it looks to me like Twitter are now making a similar mistake. JSON should safely encapsulate angle-brackets, so perhaps I don&#8217;t understand the problem that they are trying to solve?</p>
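<p>To illustrate what I mean by JSON safely encapsulating angle-brackets, here is a minimal Python sketch using only the standard library. The sample text is made up, and escaping at the point of output is my suggestion for what a web-embedded consumer could do, not something the docs describe.</p>

```python
import html
import json

tweet = "1 < 2 > 0"   # made-up sample text containing both angle brackets

# Serialise and parse again: the brackets need no special treatment in JSON.
wire = json.dumps({"text": tweet})
received = json.loads(wire)["text"]
round_trips = received == tweet

# A consumer that embeds the text in a web page can escape it at that point.
for_html = html.escape(received, quote=False)

print(round_trips)   # True
```

<p>The text survives the JSON round trip unchanged, and only the consumer that actually writes HTML needs to do any escaping.</p>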
<p>One more question: what if I tweet &#8220;&amp;gt;&#8221;? When using the API, can that be distinguished from a &#8220;&gt;&#8221;?</p>
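<p>A small Python sketch of why I suspect the answer is no. It assumes, as the docs imply, that the ampersand is left alone and only angle brackets are encoded; <code>encode_brackets</code> is a hypothetical stand-in for whatever Twitter really does.</p>

```python
import html

# Assumption: ampersands are left alone and only angle brackets are encoded.
BRACKETS = str.maketrans({"<": html.escape("<"), ">": html.escape(">")})


def encode_brackets(text):
    """Entity-encode angle brackets only (hypothetical stand-in for the API)."""
    return text.translate(BRACKETS)


plain_bracket = ">"
literal_entity = html.escape(">")   # the four characters: ampersand, g, t, semicolon

# Both inputs come out identical, so a consumer cannot tell them apart.
ambiguous = encode_brackets(plain_bracket) == encode_brackets(literal_entity)

# Escaping the ampersand as well, as html.escape does, keeps them distinct.
distinct = html.escape(plain_bracket) != html.escape(literal_entity)

print(ambiguous, distinct)   # True True
```

<p>If the real API also encodes the ampersand then the ambiguity goes away, which is exactly the part I have not tested.</p>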
<p>(You might have noticed that I&#8217;ve so far been too lazy to experiment with all this stuff; I just wanted to write it down before I forgot. I&#8217;ll add a comment if I get the time to play.)</p>
]]></content:encoded>
			<wfw:commentRss>http://gavinpanella.com/blog/2008/12/18/twitter-api-encoding-is-somewhat-bonkers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
