Authorship categorization of public domain literature

Resource type
Authors/contributors
Boran, T., Voss, J., Hossain, S.
Title
Authorship categorization of public domain literature
Abstract
We defined a set of quantifiable features for authorship categorization. We performed our experiments on public domain literature: all books analyzed were obtained in plain text format from Project Gutenberg's online repository of classic books. Using these features, we tested three machine learning algorithms: an Artificial Neural Network, a Naïve Bayes Classifier, and a Support Vector Machine. We found that certain features, such as punctuation and various suffixes, result in higher accuracy. In addition, the Support Vector Machine classifier consistently produced higher accuracies than the other classifiers and appears to be a far superior method for authorship categorization. © 2016 IEEE.
Proceedings Title
Ubiquitous Computing
Publisher
Institute of Electrical and Electronics Engineers Inc.
Date
2016
Pages
1-7
ISBN
9781509014965 (ISBN)
Citation Key
pop00252
Language
English
Extra
1 citation (Crossref) [2023-10-31]
tex.type: Proceedings paper
Citation
Boran, T., Voss, J., & Hossain, S. (2016). Authorship categorization of public domain literature. Ubiquitous Computing, 1–7. https://doi.org/10.1109/UEMCON.2016.7777898