perl-Mail-SpamAssassin-Plugin-ocrtext-3.2-3mdv2010.0.noarch.rpm

31.03.2006, v. 1.00
 
Initial revision

 
01.04.2006, v. 1.01
 
Added more words to scanlist


02.04.2006, v. 1.02
 
Check return values of netpbm utils

 
03.04.2006, v. 1.1

Remove the eval function and replace
it with parsed_metadata(). Now we can
track errors and count the words we found.
The plugin now detects forged content type
entries.


03.04.2006, v. 1.11

Add a check for suspect pictures and add a
score for it. There are new pics going around
with obfuscated content, so OCR scanners are useless
again :-(


04.04.2006, v. 1.2

The GIF module from Image::ExifTool doesn't recognize
GIFs without a color table as valid pics and just skips them.
The result is a matching SPAMPIC_BROKEN_GIF entry, which is
wrong. You should definitely patch your Image::ExifTool installation
with the provided patch at http://antispam.imp.ch/patches/patch-GIF-Colortable


04.04.2006, v. 1.2.1

Don't scan small pictures, not even for header parsing, as it
seems Image::ExifTool again has problems with them. Fix the
size calculations.


04.04.2006, v. 1.2.2

Count the non-standard Image::ExifTool failures as soft errors
and add NONSTD_ tests for them.


08.04.2006, v. 1.3

Many more words to scan for. Added a second method to scan jpeg
pics, which helps with pics that have a white font and a lot of noisy
distortion. Rename the SUSPECT_ tests and lower the scores for
them.


08.04.2006, v. 1.3.1

Added two other jpeg scan methods which give a higher match
probability.


12.04.2006, v. 1.4

Added a timeout (default 10 seconds) and changed the scan methods.
We now also scan normalized pnm files, which seems to help a lot
on some jpegs. Removed some debug statements.
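
A minimal sketch of the timeout idea, in Python rather than the plugin's
own Perl. The helper name run_with_timeout and its behaviour on failure
are assumptions for illustration, not the plugin's actual code; only the
10 second default comes from the entry above.

```python
import subprocess
import sys

def run_with_timeout(cmd, data=b"", timeout=10.0):
    """Run an external tool, feeding `data` on stdin.

    Returns the tool's stdout as bytes, or None if the tool could not
    be started, exceeded the timeout, or exited with a non-zero status.
    (Hypothetical helper; the real plugin is written in Perl.)
    """
    try:
        result = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,  # the child is killed when this expires
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout

if __name__ == "__main__":
    # Stand-in for an OCR helper: a child process that just prints text.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

A caller would treat a None result as a soft error and move on to the
next image instead of blocking the whole mail scan.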


12.04.2006, v. 1.4.1

Add a scanlimit, only scan a limited number of images.
Fix a logic error: really redirect all error output to
stderr. I implemented this some time ago, but only now does it work.


12.04.2006, v. 1.4.2

Rename some vars to make them more logical, add new spamwords.
Add some perldoc documentation. Change minpixratio_ocr to
4000 as there are more and more suspect pics around.


14.04.2006, v. 1.5

Important change. Ignore raw pnm files if parsing has failed
or gocr dumped core (yes, this can happen, I'll soon post
a fix for gocr).


14.04.2006, v. 1.6

Important change. Alter the whole plugin to use pipes and
kill stalled pids after we leave the 'helper_run_mode'. Added
three count rules to count alphanumeric chars.


14.04.2006, v. 1.6.1

Sort out identical chars. Moiré effects and patterns are often found in
pictures, and after an OCR scan they show up as repeated chars of the same
type, which is not really a sign of words. Added some examples about the ALPHA rules.
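
The idea of sorting out identical chars can be sketched as follows. This
is an illustration in Python, not the plugin's Perl code; the function
names and the min_run threshold of 3 are assumptions.

```python
import re

def collapse_runs(text, min_run=3):
    """Collapse any run of `min_run` or more identical chars to one char.

    OCR noise from patterns tends to look like 'iiiiii' or 'lllll';
    ordinary double letters such as the 'oo' in 'look' are left alone
    when min_run is 3 (an assumed threshold).
    """
    pattern = r"(.)\1{%d,}" % (min_run - 1)
    return re.sub(pattern, r"\1", text, flags=re.DOTALL)

def count_alnum(text, min_run=3):
    """Count alphanumeric chars after repeated-char noise is collapsed."""
    return sum(1 for ch in collapse_runs(text, min_run) if ch.isalnum())

if __name__ == "__main__":
    print(collapse_runs("iiiiii lllll look"))
    print(count_alnum("iiiiii lllll"))
```

Counting after the collapse keeps a page of moiré noise from looking
like a page full of text to a count rule.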


06.06.2006, v. 1.6.2

Fix typo: pngtpnm -> pngtopnm. Now png pictures finally work too.


09.06.2006, v. 1.7

Add rules against multiple small pictures in HTML mails where
OCR is almost useless.


03.09.2006, v. 1.8

Add support for animated gifs. Mostly contributed by Romeo Benzoni.
Thanks a lot! Add ~10 new rules.

Important: You now need p5-Imager and libungif support.


08.09.2006, v. 1.9

Handle broken gif pictures and try to fix them if possible. I've
fixed some of the regexes and added a lot of new rules to match
the recent spams.


21.10.2006, v. 2.0

Catch the recent image spam with combined pictures and transparent
backgrounds, or images which have different offsets. Try to catch all
of those tricks together.


22.10.2006, v. 2.1

Composed anims were not combined correctly. Fix this issue.


26.10.2006, v. 2.2

Catch recent spampics with underline colors. Reorganize the plugin a bit.
Fix logic error introduced in v. 2.1.


17.11.2006, v. 3.0

Add fuzzy string support, but still match full and simple regex
matches directly. Add a maximum score above which OCR is skipped, to
prevent useless picture scans. The wordlist is now a simple array at
the top of the config.

Important: You now need the Perl module String::Approx.

A lot of the new features have been borrowed from the
Fuzzy OCR Plugin (thanks, Christian!)
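
A rough sketch of the two-stage matching described in this entry: try a
direct match first, then fall back to a fuzzy match. The plugin uses
the Perl module String::Approx; the Python code below swaps in a plain
edit-distance search instead, and the name fuzzy_find and the max_errors
parameter are assumptions for illustration.

```python
def fuzzy_find(word, text, max_errors=1):
    """Return True if `word` occurs in `text` with at most `max_errors`
    edits (insertions, deletions or substitutions).

    Stage 1 is a direct substring test; stage 2 is an approximate
    substring search using the classic dynamic-programming scheme in
    which a match may start at any position in the text.
    """
    word = word.lower()
    text = text.lower()
    if word in text:  # direct match, no fuzzy work needed
        return True

    # prev[i] = fewest edits to turn word[:i] into some substring of
    # the text that ends at the position processed so far.
    prev = list(range(len(word) + 1))
    if prev[-1] <= max_errors:
        return True
    for ch in text:
        cur = [0]  # an empty prefix matches anywhere at no cost
        for i, wch in enumerate(word, start=1):
            cost = 0 if wch == ch else 1
            cur.append(min(prev[i] + 1,          # extra char in text
                           cur[i - 1] + 1,       # char missing in text
                           prev[i - 1] + cost))  # match or substitution
        if cur[-1] <= max_errors:
            return True
        prev = cur
    return False

if __name__ == "__main__":
    print(fuzzy_find("viagra", "buy v1agra now"))
```

Allowing more errors catches more OCR misreads but also raises the risk
of false hits on short words, so a small max_errors is the safer choice.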


01.12.2006, v. 3.1

Changed ocrtext_minpixels_ocr so that pictures need only 20000 pixels.
Changed priority to 100, allowing metatests which did not work
previously.
Added ocrtext_pwords, a list of positive words which give negative
counts. It is left almost empty, since releasing this information would
give spammers a new opportunity.
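
How positive words could offset spam words can be sketched like this.
It is a Python illustration only; the function name word_score, the
whitespace tokenising and the one-point weights are assumptions and not
the plugin's real scoring.

```python
def word_score(text, spam_words, positive_words):
    """Count spam-word hits and subtract positive-word hits.

    Every occurrence of a spam word adds 1 and every occurrence of a
    positive word subtracts 1 (assumed weights).
    """
    tokens = text.lower().split()
    hits = sum(tokens.count(w.lower()) for w in spam_words)
    offsets = sum(tokens.count(w.lower()) for w in positive_words)
    return hits - offsets

if __name__ == "__main__":
    print(word_score("cheap pills invoice", ["cheap", "pills"], ["invoice"]))
```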


12.06.2007, v. 3.2

Added a fixed version for SpamAssassin 3.2.