Unika Atma Jaya Library


A Comparative Study Of Machine Learning Methods For Authorship Attribution

By: Jockers, Matthew L.; Witten, Daniela M.

Type: Article from Journal - e-Journal
In collection: Literary and Linguistic Computing vol. 25 no. 2 (Jun. 2010), pages 215-223.
Fulltext: Vol 25, 2, p 215-223.pdf (131.97KB)

Article content: We compare and benchmark the performance of five classification methods, four of which are taken from the machine learning literature, in a classic authorship attribution problem involving the Federalist Papers. Cross-validation results are reported for each method, and each method is further employed in classifying the disputed papers and the few papers that are generally understood to be coauthored. These tests are performed using two separate feature sets: a "raw" feature set containing all words and word bigrams that are common to all of the authors, and a second "pre-processed" feature set derived by reducing the raw feature set to include only words meeting a minimum relative frequency threshold. Each of the methods tested performed well, but nearest shrunken centroids and regularized discriminant analysis had the best overall performances with 0/70 cross-validation errors.
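The two feature sets described in the abstract can be illustrated with a small sketch. This is not the authors' code: the whitespace tokenization, the function names, the toy texts, and the choice to measure relative frequency against the pooled word count of all authors are assumptions made for illustration, and the abstract does not state the actual threshold value.

```python
from collections import Counter


def tokens_and_bigrams(text):
    """Lowercased words plus adjacent word bigrams (whitespace split is an assumption)."""
    words = text.lower().split()
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def raw_feature_set(texts_by_author):
    """Words and bigrams that occur in the texts of every author."""
    per_author = []
    for texts in texts_by_author.values():
        seen = set()
        for text in texts:
            seen.update(tokens_and_bigrams(text))
        per_author.append(seen)
    return set.intersection(*per_author)


def preprocessed_feature_set(texts_by_author, min_rel_freq):
    """Raw features whose pooled relative frequency meets the threshold.

    Relative frequency here is occurrences divided by the total number of
    word tokens across all authors (an assumption, not taken from the paper).
    """
    counts = Counter()
    total_words = 0
    for texts in texts_by_author.values():
        for text in texts:
            counts.update(tokens_and_bigrams(text))
            total_words += len(text.split())
    raw = raw_feature_set(texts_by_author)
    return {f for f in raw if counts[f] / total_words >= min_rel_freq}


# Hypothetical toy corpus, not text from the Federalist Papers.
toy = {
    "A": ["the people of the state", "the state"],
    "B": ["of the people"],
}
print(sorted(raw_feature_set(toy)))
# ['of', 'of the', 'people', 'the', 'the people']
print(sorted(preprocessed_feature_set(toy, 0.3)))
# ['the']
```

In the toy corpus, "state" is dropped from the raw set because only one author uses it, and the 0.3 threshold then keeps only "the", which accounts for 4 of the 10 word tokens.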
