Spam sorting (was Re: Install assistance needed in Berkeley area)

Nick Moffitt nick@zork.net
Mon, 5 Jan 2004 18:00:43 -0800


begin  Tony Godshall  quotation:
> Unfortunately Bayesian filtering has become a bit less
> effective lately (more spam gets through) as spammers are
> using random misspellings and garbage words to evade it.  But
> hopefully someone will come up with a fix for that (perhaps
> grouping rarely-seen words together rather than ranking them
> separately).

	Nope.  You just don't get enough spam.  Bayesian filtering
eventually flags those nonsense words as spammy words, and you don't
need to worry.  Remember, and repeat after me:  The more spam you GET,
the less you have to READ!
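
	A rough sketch of why volume helps, assuming a simple per-token
estimate with add-one smoothing.  The class and method names here are
made up for illustration; this is not SpamAssassin's actual scoring
code, just the general idea of a token drifting toward "spammy" as
more spam containing it gets learned:

```python
from collections import Counter


class TokenStats:
    """Hypothetical per-token counter; not any real filter's internals."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def spam_probability(self, token):
        # Add-one smoothing (an assumption of this sketch): a token the
        # filter has never seen sits at a neutral 0.5.
        s = (self.spam_counts[token] + 1) / (self.n_spam + 2)
        h = (self.ham_counts[token] + 1) / (self.n_ham + 2)
        return s / (s + h)


stats = TokenStats()
before = stats.spam_probability("xqzvb")  # garbage word, never seen

for _ in range(5):
    stats.learn("cheap pills xqzvb", is_spam=True)
    stats.learn("lunch on thursday", is_spam=False)

after = stats.spam_probability("xqzvb")
print(before, after)
```

	The garbage word starts out neutral and moves toward spam only
once the filter has seen it in enough spam, which is the point: a
filter that is fed little spam has little to go on.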

> I only use manual training.  Supposedly it's best to use
> approximately equal volumes of spam and non-spam in the
> training.  SpamAssassin's FAQ (I think) recommends training
> with 1000+ messages each (spam and non).  Also, it's easy to
> reverse a mistake, as using sa-learn --ham reverses the
> effect of sa-learn --spam and vice versa.

	That's why your Bayesian filter isn't adapting.  You're not
letting it learn.
	
	If it makes a mistake, correct it.  But don't keep it in the
dark all the time!
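
	The "let it learn, then fix its mistakes" loop can be sketched
like this.  The train() and correct() helpers are hypothetical names
for illustration only, and the bookkeeping is far cruder than what a
real filter does; the idea is just that a correction undoes the wrong
training and applies the right one:

```python
from collections import Counter

# Token counts per class; a stand-in for a real filter's database.
spam_counts = Counter()
ham_counts = Counter()


def train(text, as_spam, sign=1):
    """Add (sign=1) or remove (sign=-1) one message's tokens."""
    target = spam_counts if as_spam else ham_counts
    for tok in set(text.lower().split()):
        target[tok] += sign
        if target[tok] <= 0:
            del target[tok]  # drop tokens that are no longer counted


def correct(text, was_spam):
    """Undo a wrong classification and learn the message the right way."""
    train(text, was_spam, sign=-1)
    train(text, not was_spam, sign=1)


# The filter files a message as ham on its own and trains on that guess.
msg = "cheap pills xqzvb"
train(msg, as_spam=False)

# A human spots the mistake and corrects it.
correct(msg, was_spam=False)

print(sorted(spam_counts), sorted(ham_counts))
```

	After the correction the message's tokens count only toward
spam, as if the filter had gotten it right the first time.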


-- 
"Forget the damned motor car and build cities for lovers and friends."
	-- Lewis Mumford

end