Facebook stops automatic e-mail harvesting by rendering each e-mail address on a profile as an image. It is possible to read some of the images with a certain level of accuracy, but OCR alone isn’t reliable enough to be worth the effort.
Using some fuzzy matching, it’s possible to get a rough list of addresses for a domain, but each address found still needs manual verification.
The scripts below can be used to train GOCR on Facebook images, and can then attempt to pick out addresses matching a certain domain from a directory of images.
The scripts are Training Script here and Matching Script here. You’ll need GOCR installed, String::Approx, and the ability to ignore silly Perl.
First, download a selection of Facebook e-mail images; we’ll use these with the training script to give GOCR something to go on.
Then run the matching script on the images you wish to convert; it’ll do some fuzzy matching if you give it a domain to look for.
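To give a feel for the fuzzy matching step, here is a rough sketch in Python. It is not the Perl matching script itself (that uses String::Approx); it swaps in `difflib.SequenceMatcher` from the Python standard library to illustrate the same idea. The function name `match_domain`, the 0.8 threshold and the sample strings are all made up for illustration, and it assumes the OCR at least got the `@` right.

```python
from difflib import SequenceMatcher


def match_domain(ocr_text, domain, threshold=0.8):
    """Compare the domain part of an OCR'd address against a known domain.

    Returns (candidate_address, score) when the similarity is at or above
    `threshold`, otherwise None. The threshold is an arbitrary starting
    point, not a tuned value.
    """
    # OCR output often contains stray spaces or newlines; drop them.
    text = "".join(ocr_text.split())

    # Assumption: the "@" survived OCR. Split on the last one found.
    local, at, found = text.rpartition("@")
    if not at or not local or not found:
        return None

    # Similarity ratio between 0.0 and 1.0, ignoring case.
    score = SequenceMatcher(None, found.lower(), domain.lower()).ratio()
    if score < threshold:
        return None

    # Keep the OCR'd local part as-is and substitute the known domain.
    return f"{local}@{domain}", score


if __name__ == "__main__":
    # Hypothetical OCR output with "l" read as "1" and "o" read as "0".
    print(match_domain("jsmith@examp1e.c0m", "example.com"))
    # A clearly different domain should be rejected.
    print(match_domain("jsmith@hotmail.com", "example.com"))
```

Only the domain gets corrected here; the local part is whatever the OCR produced, which is why every hit still needs checking by hand.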
If I can improve this, I’ll try and automate it all a little more and work out some stats.