Gentoo Wiki


Complete Virtual Mail Server


Bayesian Filters and Auto White-listing

SpamAssassin can also store the data for the Bayesian filter and the auto-whitelist in our database. At this point, I am still figuring out how to “teach” the Bayesian filter given our setup (SQL-based quarantine, virtual users, etc.), so training will not be covered in the first release of this guide. If you have it figured out, be sure to let me know; for now we will create the tables in anticipation.

{bmichels 2-14-2007 - Has anyone figured out how to use Bayesian filtering in this setup yet?}

{bmichels 8-06-2007 - Finally got around to trying this again. I had scanned several hundred ham and spam emails and had bayes enabled, but never got BAYES_ headers in any emails. Last week I decided to flush the database and start again. It started including bayes headers a couple days later after scanning 300 ham and 120 spam emails.}

We can set up the auto-whitelist (AWL) to track the historical average score of each sender and push the scores of subsequent mails towards that average. This helps reduce the number of false positives. A sender is identified by both the address they sent from and their IP, so spam with forged headers claiming to be from you will not benefit from your history.

Note: The auto-whitelist is not a white-listing system. It is an averaging system intended to help identify non-spam messages. White-lists are covered separately.
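
The averaging idea can be illustrated with a small Python sketch. It is a simplification, not SpamAssassin's actual code: the names AwlEntry, awl_adjust and score_message are hypothetical, the addresses are placeholders, and the factor of 0.5 is assumed to mirror the default of the auto_whitelist_factor setting. The count and totscore fields correspond to the columns of the awl table created below.

```python
from dataclasses import dataclass


@dataclass
class AwlEntry:
    count: int = 0         # messages seen so far from this sender (awl.count)
    totscore: float = 0.0  # sum of their pre-AWL scores (awl.totscore)


def awl_adjust(entry, score, factor=0.5):
    """Pull `score` towards the sender's historical mean, then record it."""
    if entry.count == 0:
        adjusted = score  # no history yet, nothing to average against
    else:
        mean = entry.totscore / entry.count
        adjusted = score + (mean - score) * factor
    # The unadjusted score is what gets added to the history.
    entry.count += 1
    entry.totscore += score
    return adjusted


# History is keyed by address AND IP, so a forged sender coming from
# an unknown IP does not inherit the real sender's good average.
history = {}


def score_message(email, ip, score):
    entry = history.setdefault((email, ip), AwlEntry())
    return awl_adjust(entry, score)


if __name__ == "__main__":
    for _ in range(4):
        score_message("alice@example.org", "192.0.2", 0.5)
    # Known sender, unusually high score: pulled towards the 0.5 average.
    print(score_message("alice@example.org", "192.0.2", 6.0))
    # Same address from an unknown IP: no history, score is unchanged.
    print(score_message("alice@example.org", "198.51.100", 6.0))
```

With four earlier messages scoring 0.5 each, a fifth message scoring 6.0 from the same address and IP is adjusted to 3.25, while the same address seen from a different IP keeps its full 6.0.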

We will start by extending our database to include the new tables we need.

Code: Database Tables
CREATE TABLE bayes_expire (
  id integer NOT NULL default '0',
  runtime integer NOT NULL default '0'
);

CREATE INDEX bayes_expire_idx1 ON bayes_expire (id);
GRANT SELECT, INSERT, UPDATE, DELETE on bayes_expire to amavis;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY  (variable)
);
GRANT SELECT, INSERT, UPDATE, DELETE on bayes_global_vars to amavis;
INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id integer NOT NULL default '0',
  msgid varchar(200) NOT NULL default '',
  flag character(1) NOT NULL default '',
  PRIMARY KEY  (id,msgid)
);
GRANT SELECT, INSERT, UPDATE, DELETE on bayes_seen to amavis;

CREATE TABLE bayes_token (
  id integer NOT NULL default '0',
  token char(5) NOT NULL default '',
  spam_count integer NOT NULL default '0',
  ham_count integer NOT NULL default '0',
  atime integer NOT NULL default '0',
  PRIMARY KEY  (id,token)
);
GRANT SELECT, INSERT, UPDATE, DELETE on bayes_token to amavis;

CREATE TABLE bayes_vars (
  id serial NOT NULL,
  username varchar(255) NOT NULL default '',
  spam_count integer NOT NULL default '0',
  ham_count integer NOT NULL default '0',
  token_count integer NOT NULL default '0',
  last_expire integer NOT NULL default '0',
  last_atime_delta integer NOT NULL default '0',
  last_expire_reduce integer NOT NULL default '0',
  oldest_token_age integer NOT NULL default '2147483647',
  newest_token_age integer NOT NULL default '0',
  PRIMARY KEY  (id)
);
CREATE INDEX bayes_vars_idx1 ON bayes_vars (username);
GRANT SELECT, INSERT, UPDATE, DELETE on bayes_vars, bayes_vars_id_seq to amavis;

CREATE TABLE awl (
  username varchar(100) NOT NULL default '',
  email varchar(255) NOT NULL default '',
  ip varchar(10) NOT NULL default '',
  count bigint default '0',
  totscore float default '0'
);
CREATE UNIQUE INDEX awl_pkey ON awl (username,email,ip);
GRANT SELECT, INSERT, UPDATE, DELETE on awl to amavis;

Once the tables have been created, we need to tell SpamAssassin to access them.

Code: /etc/mail/spamassassin/
#nano /etc/mail/spamassassin/

# The below sample from bug 91430 is an example of using mysql
# for spam filter storage. It works for postgresql too,
# IF the database encoding is SQL_ASCII!

#Tell SpamAssassin to use PostgreSQL for bayes data
bayes_store_module             Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn                  DBI:Pg:dbname=amavis;host=dbServerhostname;port=5432
bayes_sql_username             amavis
bayes_sql_password             $password

#Tell SpamAssassin to use PostgreSQL for AWL data
auto_whitelist_factory         Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn                   DBI:Pg:dbname=amavis;host=dbServerhostname;port=5432
user_awl_sql_username          amavis
user_awl_sql_password          $password

Note: Make sure bayes_store_module matches your database. For example, change the trailing SQL to MySQL for MySQL, or to PgSQL for PostgreSQL.

By default the file is not readable by amavis, but it needs to be so that amavis can preload SpamAssassin. We can make it readable by changing its ownership to amavis with "chown amavis /etc/mail/spamassassin/".

Testing Time

Be sure to restart amavisd to pick up the new changes, then send a message through (spam or clean, it doesn’t matter). After the message has come in, use webmin to check the awl table and verify that an entry for the sender has been added. The column names are fairly self-explanatory, so you should be able to tell whether it all worked.

Per-Recipient White/Black Lists

We have already created the tables we need; all that is left to do is to let amavisd know how to read them and populate some data for testing purposes.

File: /etc/amavisd.conf
(If you want sender white/blacklisting)
$sql_select_white_black_list = 'SELECT wb FROM wblist,mailaddr'.
     ' WHERE (wblist.rid=?) AND (wblist.sid=mailaddr.id)'.
     '   AND (mailaddr.email IN (%k))'.
     ' ORDER BY mailaddr.priority DESC';

For my initial testing I used the following test data, linked to the records I had created earlier. Replace the email address with the valid address you have been sending your test mail from.

File: Test Data
   id:         	1
   priority:  	9

   rid:        	1
   sid:       	1
   wb:         	w

To test, I sent the known spam message from my usual email account to the account I am testing. As expected, because that sender is on the whitelist, it passed straight through, skipping the spam filter.
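
To see what the lookup does, here is a minimal Python sketch that runs an equivalent query against an in-memory SQLite database. Several things are assumptions made for illustration only: SQLite stands in for PostgreSQL, the reduced mailaddr and wblist schemas and the join on wblist.sid = mailaddr.id follow the usual amavisd layout, the addresses are placeholders, and lookup_wb is a hypothetical helper. In amavisd the %k placeholder is expanded into a list of lookup keys for the sender (full address, domain, and so on); the sketch passes that list in explicitly and assumes the first row, the one with the highest priority, decides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mailaddr (id INTEGER PRIMARY KEY, priority INTEGER, email TEXT);
CREATE TABLE wblist (rid INTEGER, sid INTEGER, wb TEXT);
-- Placeholder sender records: one full address, one whole domain.
INSERT INTO mailaddr VALUES (1, 9, 'sender@example.org');
INSERT INTO mailaddr VALUES (2, 5, '@example.org');
-- Recipient 1 whitelists the address but blacklists the rest of the domain.
INSERT INTO wblist VALUES (1, 1, 'w');
INSERT INTO wblist VALUES (1, 2, 'b');
""")


def lookup_wb(conn, rid, sender_keys):
    """Return the wb flag of the highest-priority match, or None."""
    # Only "?" markers are interpolated; the values are bound separately.
    placeholders = ",".join("?" for _ in sender_keys)
    sql = (
        "SELECT wb FROM wblist, mailaddr"
        " WHERE wblist.rid = ? AND wblist.sid = mailaddr.id"
        f" AND mailaddr.email IN ({placeholders})"
        " ORDER BY mailaddr.priority DESC"
    )
    row = conn.execute(sql, [rid, *sender_keys]).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    print(lookup_wb(conn, 1, ["sender@example.org", "@example.org"]))
    print(lookup_wb(conn, 1, ["other@example.org", "@example.org"]))
    print(lookup_wb(conn, 2, ["sender@example.org"]))
```

Because the full address has a higher priority (9) than the domain entry (5), the whitelisted sender wins over the domain-wide blacklist, while other senders from that domain are still blacklisted and a recipient with no entries gets no match.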

Last modified: Sun, 18 May 2008 06:58:00 +0000 Hits: 10,346