formula results from the CPAN

formula

Mail-SpamAssassin


lib/Mail/SpamAssassin/Plugin/Bayes.pm view on Meta::CPAN

  $msgatime = $now if ( $msgatime > $now );

  my @touch_tokens;
  my $tinfo_spammy = $permsgstatus->{bayes_token_info_spammy} = [];
  my $tinfo_hammy = $permsgstatus->{bayes_token_info_hammy} = [];

  my %tok_strength = map( ($_, abs($pw{$_}->{prob} - 0.5)), @pw_keys);
  my $log_each_token = (would_log('dbg', 'bayes') > 1);

  # now take the most significant tokens and calculate probs using
  # Robinson's formula.

  @pw_keys = sort { $tok_strength{$b} <=> $tok_strength{$a} } @pw_keys;

  if (@pw_keys > N_SIGNIFICANT_TOKENS) { $#pw_keys = N_SIGNIFICANT_TOKENS - 1 }

  my @sorted;
  my $score;
  foreach my $tok (@pw_keys) {
    next if $tok_strength{$tok} <
                $Mail::SpamAssassin::Bayes::Combine::MIN_PROB_STRENGTH;

lib/Mail/SpamAssassin/Plugin/TxRep.pm view on Meta::CPAN

1. B<Scoring> - although AWL tracks the number of messages received from each
sender, it does not take that count into account in any way when calculating the
corrective score for a new message. For example, if a sender previously sent a single
ham message with a score of -5 and then sends a second one with a score of +10,
AWL will issue a corrective score that pulls the result towards -5. With the default
C<auto_welcomelist_factor> of 0.5, the resulting score would be only 2.5, and it would be
exactly the same even if the sender had previously sent 1,000 messages averaging
-5. TxRep tries to take maximal advantage of the collected data: it adjusts the
final score using not only the mean reputation score stored in the database, but also
the number of messages already seen from the sender. The exact
formula is given in the section L</C<txrep_factor>>.

2. B<Learning> - AWL ignores any spam/ham learning. In fact it works against it, which
often leads to a frustrating situation: a user repeatedly tags all messages from a
given sender as spam (or ham), but for every new message from that sender AWL
adjusts the score back towards the historical average, which does B<not> include
the learned scores. TxRep changes this: every spam/ham learning event is
recorded in the reputation database and is therefore taken into consideration for future
email from the respective sender. See the section L</"LEARNING SPAM / HAM"> for more details.

3. B<Auto-Learning> - in certain situations SpamAssassin may declare a message an
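
The arithmetic in the AWL example of item 1 can be checked with a short calculation. The sketch below assumes the adjustment documented for AWL, in which the final score is the message score regressed towards the sender's mean by C<auto_welcomelist_factor>; the function name C<awl_final_score> is hypothetical and is not part of SpamAssassin.

```perl
use strict;
use warnings;

# Sketch of the AWL adjustment (assumed from the AWL documentation):
#   final = score + (mean - score) * factor
# awl_final_score is an illustrative name, not a SpamAssassin function.
sub awl_final_score {
    my ($score, $mean, $factor) = @_;
    return $score + ($mean - $score) * $factor;
}

# New message scores +10, stored mean is -5, factor is 0.5.
# The number of earlier messages does not appear anywhere in the formula.
printf "%.1f\n", awl_final_score(10, -5, 0.5);   # prints 2.5
```

Because the message count is not an input, one earlier message at -5 and 1,000 earlier messages averaging -5 give the same result, which is the limitation TxRep is meant to address.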

lib/Mail/SpamAssassin/Plugin/TxRep.pm view on Meta::CPAN

}


=head1 LEARNING SPAM / HAM

When SpamAssassin is told to learn (or relearn) a given message as spam or
ham, all reputations relevant to the message (email, email_ip, domain, ip, helo)
in both the global and the user storage are updated using the C<txrep_learn_penalty>
or the C<txrep_learn_bonus> value, respectively. The new reputation of a given sender
property (email, domain, ...) is the result of one of the following
formulas:

   new_reputation = old_reputation + learn_penalty
   new_reputation = old_reputation - learn_bonus

The TxRep plugin currently does not track each message individually, hence it
does not detect when you learn the same message repeatedly. It will add/subtract
the penalty/bonus score each time the message is fed to the spam learner.
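
The two formulas and the effect of repeated learning can be sketched as follows. This is a minimal illustration, not the plugin's code: the helper C<learn> and the values 20/20 are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: apply one learning event to a stored reputation.
# Learning as spam adds the penalty; learning as ham subtracts the bonus.
sub learn {
    my ($reputation, $as, $penalty, $bonus) = @_;
    return $as eq 'spam' ? $reputation + $penalty
                         : $reputation - $bonus;
}

# Illustrative values, not necessarily the configured ones.
my ($penalty, $bonus) = (20, 20);

# Feeding the same message to the learner three times as spam
# applies the penalty three times.
my $rep = 0;
$rep = learn($rep, 'spam', $penalty, $bonus) for 1 .. 3;
print "$rep\n";   # prints 60
```

Since nothing in this scheme identifies a message that was already learned, the reputation keeps moving by the full penalty or bonus on every repeat.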

=cut
