If you do not see an answer to your question on this page, please try checking the manual, or use the Contact Form.
QSF's targets are speed, accuracy, and simplicity.
If QSF does not meet your needs, try looking at the resources page for other spam filters, or make a suggestion using the Contact Form.
First, determine where you are going to do your filtering.
Next, work out whether you have procmail installed on the
relevant machine. Running "man procmail" should display its
manual page if it is installed.
If you have procmail, then create / edit your
~/.procmailrc file so it contains the following lines:
:0 wf
| qsf -sra
If you do not have procmail, you may have other alternatives
such as maildrop. Check with your server administrator.
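The check for a delivery agent can also be scripted. The following is a minimal sketch and not part of QSF: the helper name first_available is made up for illustration, and it only tests whether the named commands are on your PATH, not whether your mail server actually uses them.

```shell
#!/bin/sh
# Print the first of the named commands that is found on PATH.
# Returns non-zero if none of them is installed.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Look for procmail first, then maildrop.
if mda=$(first_available procmail maildrop); then
    echo "Found delivery agent: $mda"
else
    echo "Neither procmail nor maildrop found; check with your server administrator."
fi
```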
Next, you need to create a new database so QSF can classify your email. To do this, collect as much recent spam as you can into one mail folder (somewhere between 100 and 2000 messages should be enough). Then collect a similar amount of non-spam in another mail folder.
These mail folders should be in mbox format. Email clients such
as Mutt use it; it is one of the standard
Unix mailbox formats. If, instead, you have your messages as individual
files inside a directory, you can use a command line such as the following
to put all the messages in DIRECTORY into one mbox file:
find DIRECTORY -type f -exec sh -c 'formail < "$1"' sh '{}' ';' >> NEW-MBOX-FILE
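Before training, it may help to check how many messages each mbox file holds. This is a rough sketch rather than anything QSF provides: count_messages is a hypothetical helper, and it assumes body lines beginning with "From " have been escaped as ">From ", as is usual for mbox files.

```shell
#!/bin/sh
# Count messages in an mbox file by counting its "From " separator lines.
count_messages() {
    grep -c '^From ' "$1"
}

# Demo with a tiny two-message mbox in a temporary file.
demo_mbox=$(mktemp)
cat > "$demo_mbox" <<'EOF'
From alice@example.com Mon Jan  1 00:00:00 2024
Subject: first

Hello.

From bob@example.com Mon Jan  1 00:01:00 2024
Subject: second

Hi again.
EOF

echo "messages: $(count_messages "$demo_mbox")"
```

Running count_messages on your spam folder and on your non-spam folder shows whether each falls within the suggested range of 100 to 2000 messages.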
Next, run QSF in training mode on your two mbox folders:
qsf -T spam-folder non-spam-folder
From now on, any incoming mail that QSF thinks is spam should end up with an
X-Spam: YES header and a subject line starting with
[SPAM].
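To confirm that a delivered message was marked, you can look for the header directly. The sketch below assumes the header appears exactly as "X-Spam: YES"; the function name is_marked_spam is made up for illustration. It examines only the header, so the same text appearing in a message body does not count.

```shell
#!/bin/sh
# Succeed if the header of the message on stdin contains "X-Spam: YES".
# sed stops at the first blank line, so only the header is examined.
is_marked_spam() {
    sed '/^$/q' | grep -q '^X-Spam: YES'
}

spam_msg='Subject: [SPAM] cheap pills
X-Spam: YES

body text'

ham_msg='Subject: lunch

X-Spam: YES appears only in the body here'

for msg in "$spam_msg" "$ham_msg"; do
    if printf '%s\n' "$msg" | is_marked_spam; then
        echo "marked as spam"
    else
        echo "not marked"
    fi
done
```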
When training using the -T option, QSF does not just mark all
of the messages in the "spam" folder as spam, and all in the "non-spam"
folder as non-spam. Instead, it goes through each message in each folder and
only changes its database if it "guesses" the message's classification
wrongly. Having tried this on every message, it then restarts the process,
and keeps doing it until the number of messages it gets wrong falls to an
acceptable number.
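The loop described above can be illustrated with a toy script. This is only a sketch of the train-on-error idea and is NOT QSF's real algorithm or database format: the word-weight "database", the scoring rule, the limit of 20 passes, and the names guess, adjust, and train are all assumptions made for the example.

```shell
#!/bin/sh
# Toy train-on-error loop. The "database" is a temp file of "word weight"
# lines; a message is guessed to be spam when its word weights sum above zero.
db=$(mktemp)

# Print "spam" or "ham" for the message text in $1.
guess() {
    printf '%s\n' "$1" | awk -v db="$db" '
        BEGIN { while ((getline line < db) > 0) { split(line, f, " "); w[f[1]] = f[2] } }
        { for (i = 1; i <= NF; i++) sum += w[$i] }
        END { if (sum > 0) print "spam"; else print "ham" }'
}

# Add $2 (+1 or -1) to the weight of every word in the message text $1.
adjust() {
    printf '%s\n' "$1" | awk -v db="$db" -v delta="$2" '
        BEGIN { while ((getline line < db) > 0) { split(line, f, " "); w[f[1]] = f[2] } }
        { for (i = 1; i <= NF; i++) w[$i] += delta }
        END { for (t in w) print t, w[t] }' > "$db.new"
    mv "$db.new" "$db"
}

# Repeat passes over the training file ($1) until a pass makes at most
# $2 wrong guesses, or 20 passes have run. Prints the number of passes.
train() {
    pass=0
    while [ "$pass" -lt 20 ]; do
        pass=$((pass + 1))
        errors=0
        while read -r label text; do
            if [ "$(guess "$text")" != "$label" ]; then
                # Wrong guess: only now is the database changed.
                errors=$((errors + 1))
                if [ "$label" = spam ]; then adjust "$text" 1; else adjust "$text" -1; fi
            fi
        done < "$1"
        if [ "$errors" -le "$2" ]; then break; fi
    done
    echo "$pass"
}

# Four labelled sample messages, one per line: label first, then the text.
samples=$(mktemp)
cat > "$samples" <<'EOF'
spam cheap pills now
spam win cheap prize
ham meeting notes attached
ham lunch now please
EOF

echo "passes needed: $(train "$samples" 0)"
```

Messages that are already guessed correctly never touch the database, which is the point of the approach: the database grows only where the filter is actually wrong.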
Training is done this way to avoid overtraining the database. If too many entries are added to the database at once, it becomes large and inflexible, and it is harder to teach it new things in future.
Although the database format was recently changed to "age" tokens so that overtraining is less of a problem, the initial training process will probably always be done this way to ensure a balanced data set.
The main QSF RPM was built on a Fedora Core system with MySQL 4.0.x installed. If you are running an older version of MySQL, such as that on RHEL 3.0, then installation will fail with this error.
To get around this, either install the "static" RPM, which has MySQL support compiled in statically, or download the source RPM and compile your own version, like this:
rpmbuild --rebuild qsf-1.0.31-1.src.rpm
rpm -Uvh /usr/src/redhat/RPMS/i386/qsf-1.0.31-1.i386.rpm
You will then have a working copy.