QSF 1.1.0
Version 1.1.0 has been released, which includes the planned improvements to
the database pruning algorithm (whereby tokens "age" automatically). The
result seems to be, in general, smaller and more accurate databases.
A few other improvements have been made:
- Several new token types have been added (words with non-alphanumeric start characters, words with too many hyphens in them, ridiculously long words, and HTML FONT tags), and one has been modified (HTML IMG tokens are now only looked at if they refer to an external image, not one embedded in the message).
- The default binary tree database backend has been made significantly faster by using mmap(), where available, instead of read(). This means that the database is mapped directly into memory when it is being read. Although this increases memory usage, the new token ageing and pruning algorithms mean that databases stay small, so the slight memory overhead is far outweighed by the speed improvement.
- Per-user databases (~/.qsfdb) are now given more weight than the global database, by a factor of 10. This means that even if the global database is heavily trained, individual users will be able to see a clear change in classification behaviour when they retrain their own database.
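To illustrate the mmap() versus read() difference mentioned above, here is a minimal sketch for POSIX systems. It is not QSF's actual backend code: the function names checksum_read and checksum_mmap are hypothetical, and a simple byte sum stands in for a real database lookup. Both functions scan the same file; the first copies it through a buffer with read(), the second maps it into memory and accesses it directly.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sum every byte of a file by copying it through a
 * fixed-size buffer with read(). Returns 0 on success, -1 on error. */
int checksum_read(const char *path, uint32_t *sum_out)
{
    unsigned char buf[4096];
    uint32_t sum = 0;
    ssize_t n;
    int fd;

    assert(path != NULL && sum_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Each read() copies data from the kernel into buf. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }
    close(fd);
    if (n < 0)
        return -1;

    *sum_out = sum;
    return 0;
}

/* Hypothetical helper: the same byte sum, but the file is mapped into the
 * process address space and accessed like an ordinary array. */
int checksum_mmap(const char *path, uint32_t *sum_out)
{
    struct stat st;
    uint32_t sum = 0;
    int fd;

    assert(path != NULL && sum_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* Mapping a zero-length file is an error, so skip it. */
    if (st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            sum += p[i];
        munmap(p, len);
    }
    close(fd);

    *sum_out = sum;
    return 0;
}
```

The mapped version avoids the explicit copy into a buffer, which is one plausible source of the speed-up described above; the whole file occupies address space while mapped, which corresponds to the memory overhead. A program that needs the "where available" behaviour could call the mmap() variant first and fall back to the read() variant if it returns -1.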
The QSF web page has been redesigned, so it now occupies a few more pages than before.
Fun facts:
- 30% of traffic to this web site is from robots.
- The most frequently accessed pages are the QSF page and /robots.txt.
- Nearly half of all hits come without a referrer field.
- Almost all "404 not found" errors this site logs are caused by broken clients that can't understand relative links. Most of those are written in Java.
Fun.