Mail-SpamAssassin
lib/Mail/SpamAssassin/Plugin/Bayes.pm
  $msgatime = $now if ($msgatime > $now);

  my @touch_tokens;
  my $tinfo_spammy = $permsgstatus->{bayes_token_info_spammy} = [];
  my $tinfo_hammy = $permsgstatus->{bayes_token_info_hammy} = [];

  my %tok_strength = map( ($_, abs($pw{$_}->{prob} - 0.5)), @pw_keys);
  my $log_each_token = (would_log('dbg', 'bayes') > 1);

  # now take the most significant tokens and calculate probs using
  # Robinson's formula.

  @pw_keys = sort { $tok_strength{$b} <=> $tok_strength{$a} } @pw_keys;

  if (@pw_keys > N_SIGNIFICANT_TOKENS) { $#pw_keys = N_SIGNIFICANT_TOKENS - 1 }

  my @sorted;
  my $score;
  foreach my $tok (@pw_keys) {
    next if $tok_strength{$tok} <
                $Mail::SpamAssassin::Bayes::Combine::MIN_PROB_STRENGTH;
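The excerpt above ranks tokens by how far their probability lies from the neutral value 0.5, keeps the strongest ones, and skips any that fall below a minimum strength. The following is a minimal standalone sketch of that selection step only, not the plugin's code. The constant values, the `significant_tokens` helper, and the sample probabilities are assumptions made for illustration; the real constants are defined elsewhere in Mail::SpamAssassin and are not shown in the excerpt.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's constants; the real values are
# defined elsewhere in Mail::SpamAssassin and are not part of the excerpt.
use constant N_SIGNIFICANT_TOKENS => 3;
use constant MIN_PROB_STRENGTH    => 0.3;

# Return the tokens whose probability lies furthest from 0.5, strongest first.
sub significant_tokens {
    my ($pw) = @_;    # hashref: token => { prob => P }

    # Strength is the distance of the token's probability from neutral (0.5).
    my %tok_strength = map { ($_ => abs($pw->{$_}{prob} - 0.5)) } keys %$pw;

    # Sort strongest first; break ties by name so the order is deterministic.
    my @keys = sort { $tok_strength{$b} <=> $tok_strength{$a} or $a cmp $b }
               keys %tok_strength;

    # Keep at most N_SIGNIFICANT_TOKENS entries.
    $#keys = N_SIGNIFICANT_TOKENS - 1 if @keys > N_SIGNIFICANT_TOKENS;

    # Drop tokens that are too close to neutral to be informative.
    return grep { $tok_strength{$_} >= MIN_PROB_STRENGTH } @keys;
}

# Illustrative per-token probabilities (made up for this example).
my %pw = (
    viagra  => { prob => 0.99 },
    meeting => { prob => 0.05 },
    the     => { prob => 0.52 },
    free    => { prob => 0.85 },
    hello   => { prob => 0.40 },
);

my @top = significant_tokens(\%pw);
print join(', ', @top), "\n";    # prints "viagra, meeting, free"
```

The sketch filters with `grep` after truncation, which has the same effect as the `next if` guard inside the loop in the excerpt: both strongly spammy and strongly hammy tokens survive, while near-neutral ones are discarded.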
lib/Mail/SpamAssassin/Plugin/TxRep.pm
1. B<Scoring> - although AWL tracks the number of messages received from each
sender, it does not take that count into account in any way when calculating
the corrective score for a new message. For example, if a sender previously
sent a single ham message with a score of -5 and then sends a second one with
a score of +10, AWL will issue a corrective score that pulls the result towards
-5. With the default C<auto_welcomelist_factor> of 0.5, the resulting score
would be only 2.5. It would be exactly the same even if the sender had
previously sent 1,000 messages with an average of -5. TxRep tries to take
maximal advantage of the collected data and adjusts the final score not only
by the mean reputation score stored in the database, but also according to the
number of messages already seen from the sender. You can see the exact formula
in the section L</C<txrep_factor>>.
2. B<Learning> - AWL ignores any spam/ham learning. In fact it acts against it,
which often leads to a frustrating situation where a user repeatedly tags all
messages from a given sender as spam (or ham), but for every new message from
that sender AWL adjusts the score back towards the historical average, which
does B<not> include the learned scores. TxRep changes this: every spam/ham
learning is recorded in the reputation database and is therefore taken into
consideration for future email from the respective sender. See the section
L</"LEARNING SPAM / HAM"> for more details.
3. B<Auto-Learning> - in certain situations SpamAssassin may declare a message an
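
The AWL arithmetic from item 1 can be checked with a short sketch. It assumes the correction moves the new score towards the stored mean by the configured factor, which reproduces the 2.5 quoted in the text; `awl_adjusted_score` is a hypothetical helper name, not a function of the plugin. The TxRep formula is not reproduced here because it is defined in the C<txrep_factor> section, which is not part of this excerpt.

```perl
use strict;
use warnings;

# Hypothetical helper: AWL-style correction as described in the text.
# The new score is pulled towards the sender's stored mean by $factor.
# Note that the number of messages behind the mean plays no role.
sub awl_adjusted_score {
    my ($score, $mean, $factor) = @_;
    return $score + ($mean - $score) * $factor;
}

# One earlier ham message at -5, a new message at +10, factor 0.5.
printf "%.1f\n", awl_adjusted_score(10, -5, 0.5);    # prints "2.5"
```

Because the function has no message-count argument, a mean of -5 built from one message and a mean of -5 built from 1,000 messages give the same result, which is the limitation item 1 describes.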
lib/Mail/SpamAssassin/Plugin/TxRep.pm
}

=head1 LEARNING SPAM / HAM

When SpamAssassin is told to learn (or relearn) a given message as spam or
ham, all reputations relevant to the message (email, email_ip, domain, ip, helo)
in both global and user storages will be updated using the C<txrep_learn_penalty>
or the C<txrep_learn_bonus> value, respectively. The new reputation of a given
sender property (email, domain, ...) will be the result of one of the following
formulas:

   new_reputation = old_reputation + learn_penalty
   new_reputation = old_reputation - learn_bonus

The TxRep plugin currently does not track each message individually, hence it
does not detect when you learn the same message repeatedly. It will add/subtract
the penalty/bonus score each time the message is fed to the spam learner.

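The two formulas, and the effect of learning the same message more than once, can be illustrated with a small sketch. The `learn` helper and the penalty and bonus values of 20 are assumptions chosen for the example; they are not taken from the plugin or its defaults.

```perl
use strict;
use warnings;

# Hypothetical helper applying the two formulas from the text:
#   spam: new_reputation = old_reputation + learn_penalty
#   ham:  new_reputation = old_reputation - learn_bonus
sub learn {
    my ($reputation, $as, $penalty, $bonus) = @_;
    return $as eq 'spam' ? $reputation + $penalty : $reputation - $bonus;
}

# Illustrative values only: start at 0 and learn the same message as spam
# three times. Each pass adds the penalty again, since repeats are not detected.
my $rep = 0;
$rep = learn($rep, 'spam', 20, 20) for 1 .. 3;
print "$rep\n";    # prints "60"
```
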
=cut