
Conference

French

ID: 10670/1.8qi0fy

A new measure for the evaluation of topic extraction methods: generalised plausibility

Abstract

National audience.

Methods for automatically extracting topics come from a variety of fields: computational linguistics, natural language processing (NLP), linear algebra, statistics, etc. These dedicated methods can be supplemented by methods adapted from other areas, in particular unsupervised machine learning. The results produced by all these methods take different forms: partitions of documents, probability distributions over words, matrices. This makes it difficult to compare them in a uniform way. In this article, we propose a new quality measure, called ‘generalised plausibility’, that allows different topic extraction methods to be evaluated and thus compared. Results obtained on a corpus of web documents about the 2012 French presidential election, as well as on the Associated Press corpus, show the relevance of the proposed measure.
