
Three non-Bayesian methods of spam filtration: CRM114 at TREC 2007

Resource type
Authors/contributors
Kato, M.; Langeway, J.; Wu, Y.; Yerazunis, W. S.
Title
Three non-Bayesian methods of spam filtration: CRM114 at TREC 2007
Abstract
For the TREC 2007 conference, the CRM114 team considered three non-Bayesian methods of spam filtration in the CRM114 framework: an SVM based on the "hyperspace" feature==document paradigm, a bit-entropy matcher, and substring compression based on LZ77. As a calibration yardstick, we used the well-tested and widely used CRM114 OSB Markov random field system (basically unchanged since 2003). The results show that the SVM is about a factor of two to three more accurate at spam filtering than the OSB system, that substring compression is somewhat more accurate than OSB, and that bit entropy is somewhat less accurate for the TREC 2007 test sets.
Proceedings Title
NIST Special Publication
Date
2007
ISBN
1048776X (ISSN)
Citation Key
katoThreeNonBayesianMethods2007
Archive
Scopus
Language
English
Extra
Journal Abbreviation: NIST Spec. Publ.
Citation
Kato, M., Langeway, J., Wu, Y., & Yerazunis, W. S. (2007). Three non-Bayesian methods of spam filtration: CRM114 at TREC 2007. NIST Special Publication. Scopus. https://www.scopus.com/inward/record.uri?eid=2-s2.0-84873427254&partnerID=40&md5=bb761f80b50620579799ae8125683def