31.03.2006, v. 1.00 Initial revision.
01.04.2006, v. 1.01 Added more words to the scanlist.
02.04.2006, v. 1.02 Check return values of the netpbm utils.
03.04.2006, v. 1.1 Remove the eval function and replace it with parsed_metadata(). Now we can track errors and count the words we found. The plugin now detects forged content type entries.
03.04.2006, v. 1.11 Add a check for suspect pictures and add some score for it. There are new pics going around with obfuscated content, so OCR scanners are useless again :-(
04.04.2006, v. 1.2 The GIF module from Image::ExifTool doesn't recognize GIFs without a color table as valid pics and just skips them. The result is a matching SPAMPIC_BROKEN_GIF entry, which is wrong. You should definitely patch your Image::ExifTool installation with the provided patch at http://antispam.imp.ch/patches/patch-GIF-Colortable
04.04.2006, v. 1.2.1 Don't scan small pictures, not even for header parsing, as it seems Image::ExifTool again has problems with this. Fix the size calculations.
04.04.2006, v. 1.2.2 Count the non-standard Image::ExifTool failures as soft errors and add NONSTD_ tests for them.
08.04.2006, v. 1.3 Many more words to scan for. Added a second method to scan jpeg pics, which helps with pics that have a white font and a lot of noisy distortion. Rename the SUSPECT_ tests and lower their scores.
08.04.2006, v. 1.3.1 Added two other jpeg scan methods which give a higher match probability.
12.04.2006, v. 1.4 Added a timeout (default 10 seconds) and changed the scan methods. Now we also scan normalized pnm files; this seems to help a lot on some jpegs. Removed some debug statements.
12.04.2006, v. 1.4.1 Add a scanlimit: only scan a limited number of images. Fix a logic error: really redirect all error output to stderr. I implemented this some time ago, but only now does it work.
12.04.2006, v. 1.4.2 Rename some vars to make them more logical, add new spamwords. Add some perldoc documentation. Change minpixratio_ocr to 4000 as there are more and more suspect pics around.
14.04.2006, v. 1.5 Important change: ignore raw pnm files if parsing has failed or gocr dumped core (yes, this can happen; I'll soon post a fix for gocr).
14.04.2006, v. 1.6 Important change: alter the whole plugin to use pipes and kill stalled pids after we have left the 'helper_run_mode'. Added three count rules to count alphanumeric chars.
14.04.2006, v. 1.6.1 Sort out identical chars. Moirés and other patterns are often found in pictures, and after an OCR scan they show up as repeated chars of the same type, which is not really a sign of words. Added some examples for the ALPHA rules.
06.06.2006, v. 1.6.2 Fix typo: pngtpnm -> pngtopnm. Now png pictures finally work too.
09.06.2006, v. 1.7 Add rules against multiple small pictures in HTML mails, where OCR is almost useless.
03.09.2006, v. 1.8 Add support for animated gifs, mostly contributed by Romeo Benzoni. Thanks a lot! Add ~10 new rules. Important: you now need p5-Imager and libungif support.
08.09.2006, v. 1.9 Handle broken gif pictures and try to fix them if possible. I've fixed some of the regexes and added a lot of new rules to match the recent spams.
21.10.2006, v. 2.0 Catch the recent image spam with combined pictures and transparent backgrounds, or images which have different offsets. Try to catch all those tricks together.
22.10.2006, v. 2.1 Composed anims were not combined correctly. Fix this issue.
26.10.2006, v. 2.2 Catch recent spampics with underline colors. Reorganize the plugin a bit. Fix logic error introduced in v. 2.1.
17.11.2006, v. 3.0 Add fuzzy string support, but still match full and simple regex matches directly. Add a maximum score up to which OCR is still done, to prevent useless picture scans. The wordlist is now a simple array at the top of the config. Important: you now need the Perl module String::Approx. A lot of the new features have been borrowed from the Fuzzy OCR Plugin (thanks, Christian!)
01.12.2006, v. 3.1 Changed ocrtext_minpixels_ocr to need only 20000 pixel pictures.
Changed priority to 100, allowing metatests which did not work previously. Added ocrtext_pwords, a list of positive words which give negative counts. It is left almost empty, since releasing this information would give spammers a new opportunity.
12.06.2007, v. 3.2 Added a fixed version for SpamAssassin 3.2.