<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RemoteRoot &#187; Projects</title>
	<atom:link href="http://www.remoteroot.net/category/projects/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.remoteroot.net</link>
	<description>The wired world</description>
	<lastBuildDate>Tue, 29 Jul 2008 14:16:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Reading Facebook E-Mail Image Captchas</title>
		<link>http://www.remoteroot.net/2007/11/17/reading-facebook-e-mail-image-capchas/</link>
		<comments>http://www.remoteroot.net/2007/11/17/reading-facebook-e-mail-image-capchas/#comments</comments>
		<pubDate>Sat, 17 Nov 2007 22:40:32 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://www.remoteroot.net/2007/11/17/reading-facebook-e-mail-image-capchas/</guid>
		<description><![CDATA[Facebook stops automatic e-mail harvesting by saving each e-mail address on a profile as an image. It is possible to read some of the images with a certain level of accuracy, but using OCR alone just isn&#8217;t worth the effort required.
Using some fuzzy matching, it&#8217;s possible to get a rough list of addresses for a [...]]]></description>
			<content:encoded><![CDATA[<p>Facebook stops automatic e-mail harvesting by saving each e-mail address on a profile as an image. It is possible to read some of the images with a certain level of accuracy, but using OCR alone just isn&#8217;t worth the effort required.</p>
<p>Using some fuzzy matching, it&#8217;s possible to get a rough list of addresses for a domain, but manual verification is needed for each address found.</p>
<p>The scripts below train GOCR on Facebook images and then attempt to pick out addresses matching a certain domain from a directory of images.</p>
<p>The scripts are available here: <a href="http://www.remoteroot.net/wp-content/uploads/2008/02/ocr-fb-trainpl.txt" title="Facebook E-mail Image OCR - Training Script.">Training Script</a> and <a href="http://www.remoteroot.net/wp-content/uploads/2008/02/ocr-fbpl.txt" title="Facebook E-mail Image OCR - Matching Script.">Matching Script</a>. You&#8217;ll need <a href="http://jocr.sourceforge.net/">GOCR</a> installed, <a href="http://search.cpan.org/dist/String-Approx/">String::Approx</a>, and the ability to ignore silly Perl.</p>
<p>First, download a selection of Facebook e-mail images. We&#8217;ll use these with the training script to give GOCR something to go on.</p>
<p>Then run the matching script on the images you wish to convert. It&#8217;ll do some fuzzy matching if you give it a domain to look for.</p>
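<p>As a rough illustration of the fuzzy-matching step, here is a minimal Python sketch. It is not the matching script linked above (that one is Perl and uses String::Approx); it swaps in Python&#8217;s standard difflib instead. The function names, the sample strings and the 0.8 threshold are all assumptions made up for this example, and it expects the OCR output to already be available as plain text lines.</p>

```python
from difflib import SequenceMatcher


def domain_similarity(ocr_text, domain):
    """Return a 0..1 similarity between the OCR'd domain part and the target."""
    text = ocr_text.strip().lower()
    # Take everything after the last '@'. OCR may misread or drop the '@',
    # so fall back to comparing the tail of the string in that case.
    _, sep, tail = text.rpartition("@")
    candidate = tail if sep else text[-len(domain):]
    return SequenceMatcher(None, candidate, domain.lower()).ratio()


def match_domain(ocr_lines, domain, threshold=0.8):
    """Keep OCR lines whose domain part is close to the target domain.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return [line.strip() for line in ocr_lines
            if domain_similarity(line, domain) >= threshold]


if __name__ == "__main__":
    # Hypothetical OCR output with typical misreads ('1' for 'l', 'rn' for 'm').
    sample = ["joe@examp1e.com", "anna@exarnple.com", "bob@other.org"]
    for hit in match_domain(sample, "example.com"):
        print(hit)
```

<p>Anything this returns is only a candidate. Each address still needs checking by hand against the original image.</p>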
<p>If I can improve this, I&#8217;ll try to automate it all a little more and work out some stats.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.remoteroot.net/2007/11/17/reading-facebook-e-mail-image-capchas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Where-fi</title>
		<link>http://www.remoteroot.net/2007/07/19/where-fi/</link>
		<comments>http://www.remoteroot.net/2007/07/19/where-fi/#comments</comments>
		<pubDate>Thu, 19 Jul 2007 14:00:56 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Where-Fi]]></category>

		<guid isPermaLink="false">http://www.remoteroot.net/2007/07/19/where-fi/</guid>
		<description><![CDATA[We&#8217;ve been gathering more data for the Where-fi service. In under three hours of driving around a sub-section of Reading, we collected more than 2,000 access points.
This broke the architecture we had for displaying them, so it&#8217;s currently down.
]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve been gathering more data for the Where-fi service. In under three hours of driving around a sub-section of Reading, we collected more than 2,000 access points.</p>
<p>This broke the architecture we had for displaying them, so it&#8217;s currently down.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.remoteroot.net/2007/07/19/where-fi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
