add text on papers

muenzner 5 years ago
2 changed files with 19 additions and 0 deletions
  1. BIN
      paper/Stamatatos - A Survey of Modern Authorship Attribution Methods.pdf
  2. +19

paper/Stamatatos - A Survey of Modern Authorship Attribution Methods.pdf

+ 19
- 0
tex/bonus_report/Papers.txt

@@ -0,0 +1,19 @@
Author identification has been investigated since the 19th century, so a wide range of approaches exists.
The paper "A Survey of Modern Authorship Attribution Methods" by Stamatatos is a state-of-the-art survey that gives an overview of the main developments in this field and of how they have been evaluated.
Based on that, the paper proposes several measurements and discusses their computational requirements.
The survey helps us find important measurements we can use, describes their possible impact and gives general information about the topic.

For instance, n-grams are evaluated at the word level as well as at the character level.
We expect to get good results by combining character-level n-grams with word and sentence length.
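
The feature combination mentioned above could be sketched roughly as follows. This is only a minimal illustration, not a finished design: the function names, the "ng:" key prefix, the choice of n = 3 and the simple regex-based word and sentence splitting are all assumptions made for this example.

```python
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Relative frequencies of overlapping character n-grams (hypothetical helper)."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def length_features(text):
    """Average word length (characters) and sentence length (words).

    Assumption: sentences end at '.', '!' or '?', and words are runs of
    word characters. A real tokenizer would be more careful.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    avg_sent_len = len(words) / len(sentences) if sentences else 0.0
    return {"avg_word_len": avg_word_len, "avg_sent_len": avg_sent_len}


def extract_features(text, n=3):
    """Merge both feature groups into one dictionary per text."""
    # The "ng:" prefix keeps n-gram keys apart from the length feature names.
    features = {"ng:" + g: f for g, f in char_ngrams(text, n).items()}
    features.update(length_features(text))
    return features
```

The resulting dictionaries could then be turned into vectors for whatever classifier we end up using.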

Another method, which the survey above mentions only briefly, is based on compression models.
Here, the symbols of each training text are encoded and a compression model is generated for it.
For testing, those models are used to compress the input text.
The model that achieves the highest compression rate fits the symbols in the input best, so its training text is most likely written by the same author as the input.
In the evaluation, this approach was compared against SVMs.
Overall, both achieve fairly similar results in identifying authors, but in detail their performance differs from author to author.
Hence, the authors conclude: "This shows that both strategies are complementary to each other and can be combined to build a more reliable identification system."
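
A rough sketch of the compression idea described above, under clearly stated assumptions: the text does not say which compressor is used, so zlib (DEFLATE) from the Python standard library serves as a stand-in here, and "model of a training text" is approximated by compressing the training text together with the unknown text. The function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data at the highest level."""
    return len(zlib.compress(data, 9))


def extra_bytes(training_text: str, unknown_text: str) -> int:
    """Bytes needed to encode the unknown text once the training text is known.

    Assumption: the size difference between compressing (training + unknown)
    and compressing the training text alone approximates how well the
    training text's "model" fits the unknown text.
    """
    train = training_text.encode("utf-8")
    both = train + unknown_text.encode("utf-8")
    return compressed_size(both) - compressed_size(train)


def attribute(training_texts: dict, unknown_text: str) -> str:
    """Return the author whose training text compresses the unknown text best."""
    # Fewest extra bytes corresponds to the highest compression rate.
    return min(training_texts,
               key=lambda author: extra_bytes(training_texts[author], unknown_text))
```

Note that DEFLATE only looks back over a 32 KB window, so with long training texts only their last part would influence the result. A compressor with a larger context may be needed in practice.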

We will implement this method to evaluate it ourselves. Based on our results, we are considering using a combination of different methods.