File(s) under permanent embargo
Local n-grams for author identification: notebook for PAN at CLEF 2013
conference contribution
posted on 2013-01-01, 00:00 authored by R Layton, P Watters, Richard Dazeley
Our approach to the author identification task combines existing authorship attribution methods based on local n-grams (LNG) in a weighted ensemble. The approach placed third in this year's competition using a relatively simple scheme that weights each method by its training set accuracy. An LNG model builds a profile for each author, consisting of a list of the character n-grams that best represent that author's writing. The weighted ensemble improved on the accuracy of the underlying method without reducing the speed of the algorithm: the submitted solution was not only near the top of the leaderboard for accuracy but also one of the faster algorithms submitted.
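As an illustration of the ideas in the abstract, the following minimal Python sketch builds character n-gram profiles and combines several profile models by accuracy-weighted voting. It is not the submitted system. The names `char_ngrams`, `profile`, `ProfileModel`, `training_accuracy`, and `ensemble_predict` are hypothetical, and two details are assumptions made only for illustration: a profile is taken to be the set of an author's most frequent n-grams, and similarity is taken to be the number of n-grams two profiles share.

```python
from collections import Counter, defaultdict


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def profile(text, n, size):
    """Assumed profile: the set of the `size` most frequent character n-grams."""
    counts = Counter(char_ngrams(text, n))
    return {gram for gram, _ in counts.most_common(size)}


class ProfileModel:
    """One ensemble member, defined by an n-gram length and a profile size."""

    def __init__(self, n, size):
        self.n = n
        self.size = size
        self.profiles = {}

    def fit(self, texts_by_author):
        # Build one profile per author from all of that author's training texts.
        self.profiles = {
            author: profile(" ".join(texts), self.n, self.size)
            for author, texts in texts_by_author.items()
        }
        return self

    def predict(self, text):
        # Assumed similarity: number of n-grams shared with the author profile.
        query = profile(text, self.n, self.size)
        return max(self.profiles, key=lambda a: len(query & self.profiles[a]))


def training_accuracy(model, texts_by_author):
    """Fraction of training texts that a fitted model attributes correctly."""
    pairs = [(a, t) for a, texts in texts_by_author.items() for t in texts]
    correct = sum(model.predict(t) == a for a, t in pairs)
    return correct / len(pairs)


def ensemble_predict(models, weights, text):
    """Weighted vote: each model adds its weight to the author it predicts."""
    votes = defaultdict(float)
    for model, weight in zip(models, weights):
        votes[model.predict(text)] += weight
    return max(votes, key=votes.get)
```

To use the sketch, one would fit several `ProfileModel` instances with different `n` and `size` values on the same training texts, compute `training_accuracy` for each, and pass those accuracies as `weights` to `ensemble_predict`. Each member only builds a profile and counts shared n-grams, so the ensemble's cost grows linearly with the number of members.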