Article

English

ID: <http://hdl.handle.net/2078.1/141995>

Normalization of Intonation Contours and the Study of Regional Variation in French

Abstract

In the analysis of prosody, both in its linguistic uses and in its regional or social variation, speaker-dependent prosodic variation is ubiquitous, in particular for pitch range. In large-scale speech corpora such as Rhapsodie, the pitch range of individual speakers varies from 8 to 24 semitones. This paper describes a novel approach to pitch contour normalization and applies it to the study of regional variation of intonation contours in French. The pitch range of a speaker is commonly characterized by two measures (or parameters) (cf. Patterson & Ladd 1999, Ladd 2009, Hirst 2011): the overall average pitch, sometimes called the "key" and often approximated by the median of the pitch values, and the "span", the interval between the upper and lower pitch targets in the speech sample. The normalization of observed pitch contours for such parameters will depend to a large extent on the pitch scale chosen to represent the input data. Various pitch scales have been proposed (Hertz, semitones, OME, mel, Bark and ERB-rate), but studies on the optimal scale for intonation in continuous speech arrive at different conclusions (Hermes & Van Gestel 1991, Nolan 2003). For pitch range normalization, standard statistical techniques may be used, such as the Z-score, which expresses each pitch value as its deviation from the mean divided by the standard deviation (Jassem & Kudela-Dobrogowska 1980). Alternatively, as in the approach proposed here, normalization may be relative to the observed pitch range of the speaker, in which case the measurement or detection of pitch range is decisive. Various approaches use the pitch targets provided by some kind of stylization, such as Momel (De Looze & Hirst 2010) or the tonal-perception-based model of Prosogram (Mertens 2004), or the statistical distribution of syllable-based pitch targets (Mertens 2013).
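To make the quantities above concrete, the following minimal sketch converts Hertz values to semitones, estimates the key and span of a speaker, and computes Z-scores. It is an illustration under stated assumptions, not the procedure of any of the cited works: the function names, the reference frequency, the use of the minimum and maximum as lower and upper bounds of the span, and the sample values are all hypothetical.

```python
import math
import statistics

def hz_to_semitones(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

def key_and_span(pitch_st):
    """Return (key, span) in semitones.

    Assumption: the key is approximated by the median, and the span by the
    distance between the lowest and highest value. Real systems may use
    percentiles or stylized pitch targets instead.
    """
    key = statistics.median(pitch_st)
    span = max(pitch_st) - min(pitch_st)
    return key, span

def z_scores(pitch_st):
    """Express each value as its deviation from the mean in standard deviations."""
    mu = statistics.mean(pitch_st)
    sigma = statistics.stdev(pitch_st)  # sample standard deviation (an assumption)
    return [(p - mu) / sigma for p in pitch_st]

# Hypothetical pitch values for one speaker, in Hz.
f0_hz = [100.0, 200.0, 400.0]
pitch_st = [hz_to_semitones(f, ref_hz=100.0) for f in f0_hz]
key, span = key_and_span(pitch_st)
normalized = z_scores(pitch_st)
```

Because the semitone scale is logarithmic, the two octave steps in the sample are equal intervals in semitones, although they differ by a factor of two in Hertz; this is one way in which the choice of scale affects the result of normalization.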
The stylization simulating tonal perception removes pitch variation below the glissando threshold, while taking into account the impact of spectral and amplitude changes at syllable boundaries, resulting in targets synchronized with the syllabic nuclei. The pitch range normalization results in a scaling of the observed contours such that all contours, irrespective of the speaker who pronounced them, are mapped onto an identical scale varying between 0 and 100%, removing the pitch range variation between speakers. A contour may be characterized as a configuration of pitch targets associated with particular syllables, of which the stressed syllable is the central anchor point. Moreover, the anchor points and the possible configurations of pitch targets are language-specific. In the study of regional variation of prosody, comparisons between contours should take into account this internal structure and temporal alignment of the contour. For French, most studies define the intonation unit as a sequence of syllables ending in a final stressed syllable, and possibly containing an additional initial stress in the preceding syllables. In this case, the pitch contour is the configuration of pitch targets associated with these anchor points. Moreover, since the final stress may carry pitch movements (rise, fall), multiple targets per syllable are required. Given the rich variety of contours and the position of stress on the last syllable in the intonation unit, Carton (1983) restricts the comparison to the final part of the contour, designated as the "clausule", consisting of the last three syllables. In order to obtain an automatic and quantitative analysis of pitch contours in a speech corpus, a procedure was designed which provides time-aligned, normalized pitch targets for each intonation unit. 
It includes the following steps: segmentation into syllable rhymes, pitch stylization using Prosogram, localization of pitch targets for various contour shapes (level, rise, fall), normalization of pitch values and pitch intervals, and finally identification of the intonation groups from the annotation. Given this information, normalized pitch levels and intervals are obtained for the last three syllables (the clausule) of each intonation group. This contour normalization procedure is applied to the study of regional variation of prosody in French. A perceptual approach is used (Bardiaux 2013a, 2013b), in which the criterion for grouping or classifying speech productions is not the geographic origin of the speaker, but rather the judgment by experts about the regional markedness of particular intonation groups or syllables. For speech corpora with speakers of various geographic origins, judgments about marked regional prosody were collected. Moreover, the intonation groups were classified according to the pitch level of the end point of the intonation contour. In this way, sets of contours are obtained with similar shape (end point) and similar markedness judgments. For the contours in a given class (for instance, regionally marked rise contours), the average normalized values at the pitch targets are compared to those of another class (for instance, unmarked rise contours). The article discusses the results obtained in this way.
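The range normalization and the class comparison can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a linear mapping in semitones between a speaker's range bottom and top (how these are detected is not shown), it uses a single pitch target per syllable although the text notes that the final stressed syllable may need several, and all names and values are hypothetical.

```python
from statistics import mean

def normalize_level(pitch_st, bottom_st, top_st):
    """Map a pitch value in semitones onto 0-100% of the speaker's range."""
    span = top_st - bottom_st
    if span <= 0:
        raise ValueError("top_st must be greater than bottom_st")
    return 100.0 * (pitch_st - bottom_st) / span

def normalize_clausule(targets_st, bottom_st, top_st):
    """Return normalized levels and the intervals between successive targets.

    targets_st holds one pitch target per syllable for the last three
    syllables of an intonation group (a simplification).
    """
    levels = [normalize_level(t, bottom_st, top_st) for t in targets_st]
    intervals = [b - a for a, b in zip(levels, levels[1:])]
    return levels, intervals

def class_means(contours):
    """Average normalized levels position by position over a class of contours."""
    return [mean(position) for position in zip(*contours)]

# Two hypothetical speakers with different ranges producing the same shape.
levels_a, intervals_a = normalize_clausule([85.0, 90.0, 100.0], 80.0, 100.0)
levels_b, intervals_b = normalize_clausule([73.0, 76.0, 82.0], 70.0, 82.0)

# Hypothetical classes of already normalized rise contours.
marked = [[25.0, 50.0, 100.0], [35.0, 50.0, 90.0]]
unmarked = [[40.0, 50.0, 70.0], [50.0, 50.0, 80.0]]
difference = [m - u for m, u in zip(class_means(marked), class_means(unmarked))]
```

In this sketch the two speakers end up with identical normalized levels even though their raw semitone values and spans differ, which is the effect the normalization is meant to achieve; the per-position difference between class means is one simple way to compare marked and unmarked contours.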
