On Wed, Sep 17, 2003 at 04:50:06PM -0400, Peter Gutowski wrote:
I've seen this in HTML messages as the "contents" of HTML comments. After looking at it for a bit, my assumption was that, since the HTML comments usually started in the middle of words, the intent was to mask the presence of potentially "flagging" words or phrases from spam-catching software, i.e. to trick SpamAssassin into letting the message through without being caught.
Yes. Those, specifically, are meant to obfuscate the message so that both bayes and rules have a hard time catching the words unless they know to strip HTML comments (SpamAssassin + HTML::Parser do a good job of that, fyi.)

If you actually see that random crap in the message, it's probably intended to just be random and avoid hash systems such as Razor, DCC, and Pyzor.

Sometimes, part of the "random crap" is actually encoded data, typically the email address of the recipient. Spammers have been seen to use rot13, other rot## (aka generic Caesar ciphers), and other single-substitution ciphers. I actually add rules to the local SpamAssassin configuration that look for the rot13 versions of domains I receive mail for, along with the EMAIL_ROT13 default rule (I believe that's new in 2.60).

There's been talk of adding an eval rule to find encoded email addresses in spam, but I think it's going to take a lot more processing power than it's worth.

-- 
Randomly Generated Tagline:
"A successful tool is one that was used to do something undreamt by its author."
        - Stephen C. Johnson
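
P.S. For anyone who wants to play with the two tricks described above, here is a rough Python sketch. It is not how SpamAssassin does it (that is Perl, with HTML::Parser doing the real HTML work). The function names and the example.com domain are made up for illustration, and the regex is only a crude stand-in for a proper HTML parser.

```python
import codecs
import re
from typing import List

# Crude approximation: matches <!-- ... --> non-greedily, across newlines.
# A real HTML parser handles malformed comments far better than this.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(html: str) -> str:
    """Remove HTML comments so words split by them are rejoined."""
    return HTML_COMMENT.sub("", html)


def rot13(text: str) -> str:
    """Rotate ASCII letters by 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")


def find_rot13_domains(body: str, domains: List[str]) -> List[str]:
    """Return the local domains whose rot13 form appears in the body.

    `domains` is a hypothetical list of domains you receive mail for.
    """
    lowered = body.lower()
    return [d for d in domains if rot13(d.lower()) in lowered]


if __name__ == "__main__":
    # A word broken up by a comment, as seen in obfuscated HTML spam.
    print(strip_html_comments("Vi<!-- qzx9 -->agra"))

    # A recipient address hidden in rot13 inside apparent random text.
    hidden = rot13("user@example.com")
    print(find_rot13_domains("xk7q " + hidden + " p0zz", ["example.com"]))
```

This only covers plain rot13. Checking every rot## shift or arbitrary substitution ciphers multiplies the work per message, which is the processing-cost concern with a general eval rule.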