A Comparative Study Of Machine Learning Methods For Authorship Attribution
By: Jockers, Matthew L.; Witten, Daniela M.
Type: Article from Journal - e-Journal
In collection: Literary and Linguistic Computing vol. 25 no. 2 (Jun. 2010), pages 215-223.
Full text: Vol 25, 2, p 215-223.pdf (131.97KB)
Abstract
We compare and benchmark the performance of five classification methods, four of which are taken from the machine learning literature, in a classic authorship attribution problem involving the Federalist Papers. Cross-validation results are reported for each method, and each method is further employed in classifying the disputed papers and the few papers that are generally understood to be coauthored. These tests are performed using two separate feature sets: a "raw" feature set containing all words and word bigrams that are common to all of the authors, and a second "pre-processed" feature set derived by reducing the raw feature set to include only words meeting a minimum relative frequency threshold. Each of the methods tested performed well, but nearest shrunken centroids and regularized discriminant analysis had the best overall performances with 0/70 cross-validation errors.
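The two feature sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions the abstract does not state: whitespace tokenization with lowercasing, relative frequency computed over the whole corpus as a feature's count divided by the total number of word tokens, and an arbitrary threshold of 0.1 chosen only to suit the toy data. The function names and the two-author toy corpus are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams (joined by a space) found in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Count words and word bigrams in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return Counter(tokens + ngrams(tokens, 2))


def raw_feature_set(docs_by_author):
    """Features that occur at least once in the writing of every author."""
    per_author = []
    for docs in docs_by_author.values():
        seen = set()
        for doc in docs:
            seen.update(features(doc))
        per_author.append(seen)
    return set.intersection(*per_author)


def preprocessed_feature_set(docs_by_author, raw, min_rel_freq):
    """Keep raw features whose corpus-wide relative frequency meets the threshold.

    Assumption: relative frequency = feature count / total word tokens.
    """
    totals = Counter()
    n_tokens = 0
    for docs in docs_by_author.values():
        for doc in docs:
            totals.update(features(doc))
            n_tokens += len(doc.split())
    return {f for f in raw if totals[f] / n_tokens >= min_rel_freq}


# Hypothetical toy corpus: two authors, two short documents each.
corpus = {
    "A": ["the people of the state", "upon the whole the plan is good"],
    "B": ["the state of the union", "the plan of the people"],
}

raw = raw_feature_set(corpus)
reduced = preprocessed_feature_set(corpus, raw, min_rel_freq=0.1)
print(sorted(reduced))  # ['of', 'of the', 'the']
```

On this toy corpus the raw set holds nine features shared by both authors, and the threshold keeps only the three most frequent ones. A classifier such as nearest shrunken centroids would then be trained on per-document frequencies of the retained features; that step is not shown here.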