Conference
French
ID: <http://hdl.handle.net/2078.1/134874>
Abstract
Pearson’s chi-squared test is probably the most popular statistical test in corpus linguistics, especially where the emphasis is on highlighting linguistic variation between corpora. For a number of years, its use has been challenged because of the large number of rejections of the null hypothesis it produces when applied to large corpora. Oakes and Farrow (Literary and Linguistic Computing, 2007, 22, 85-99) proposed various adaptations of the test to make it more appropriate. Using resampling procedures, this research demonstrates the severity of the problem and the inadequacy of the proposed remedies. This negative conclusion is consistent with the benefits of correspondence analysis, which is probably the most classic approach in textual data analysis for dealing with such issues.
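As a rough illustration of why corpus size matters for this test, the sketch below computes Pearson's chi-squared statistic for a 2x2 table (a word versus all other tokens, in two corpora). It is not the resampling procedure used in this research, nor one of the adaptations proposed by Oakes and Farrow; the function name `chi2_2x2` and all counts are hypothetical, chosen only to show that the statistic grows in proportion to the counts when the relative frequencies stay fixed.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts under independence: row total * column total / n
    expected = [
        row1 * col1 / n, row1 * col2 / n,
        row2 * col1 / n, row2 * col2 / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-squared distribution, 1 degree of freedom, alpha = 0.05
CRITICAL_05 = 3.841

# Hypothetical counts: the word occurs 105 times in 100,000 tokens of
# corpus A and 95 times in 100,000 tokens of corpus B.
small = chi2_2x2(105, 99_895, 95, 99_905)

# Same relative frequencies, corpora ten times larger (assumed counts).
large = chi2_2x2(1_050, 998_950, 950, 999_050)

print(f"small corpora: chi2 = {small:.4f}, reject H0: {small > CRITICAL_05}")
print(f"large corpora: chi2 = {large:.4f}, reject H0: {large > CRITICAL_05}")
```

With these assumed counts the same relative difference is not significant in the smaller corpora but is in the larger ones, which is one way to see how large corpora can lead to very frequent rejections of the null hypothesis.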