Spectrum variation over the Voice Range Profile

Time: Fri 2017-03-17 15.00 - 17.00

Location: Fantum, Lindstedtsvägen 24, 5th floor

Participating: Peter Pabon, TMH and Institute of Sonology, Royal Conservatoire, The Hague, Netherlands

The acoustic spectrum of the voice is highly variable in speech and singing, due to many sources of variation. Averaging techniques, such as the long-time average spectrum (LTAS), are conventionally used to reduce the impact of articulation when uncovering properties of the voice source. However, without accounting for variations that are coupled to fundamental frequency fo and sound level SPL, the LTAS still obscures much information. The spectrum can instead be averaged cell-by-cell on the voice range profile (VRP) plane, resulting in a ‘spectral VRP’. When a sufficient volume of data is collected, some hitherto unseen trends are resolved.

Spectral data over the entire voice range were collected from three categories of subjects: untrained (N=16) and trained (N=12) females as well as trained males (N=7). The phonation type was controlled for throughout, such that the spectral VRPs were separate for chest/modal/M1 and head/falsetto/M2. Subjects produced VRPs of both phonation types, as ensured by a precise protocol. The VRP data contained the intra-subject-averaged narrow-band spectrum in every cell (one semitone × one dB) for that combination of fo and SPL. Then, the spectral VRP data were averaged across subjects, still within categories. No adjustments were made for personal voice range; data from different subjects were simply accumulated per cell.
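The per-cell accumulation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the function names `vrp_cell` and `accumulate` are hypothetical, the semitone numbering is assumed to follow the MIDI convention (A4 = 440 Hz = 69), and spectra are assumed to be averaged in the linear power domain.

```python
import math
from collections import defaultdict


def vrp_cell(fo_hz, spl_db):
    """Map (fo, SPL) to a one-semitone by one-dB cell.

    Assumption: semitones are numbered MIDI-style (A4 = 440 Hz = 69).
    """
    semitone = round(69 + 12 * math.log2(fo_hz / 440.0))
    return semitone, round(spl_db)


def accumulate(frames):
    """Average spectra cell by cell.

    frames: iterable of (fo_hz, spl_db, power_spectrum), where
    power_spectrum is a list of linear power values of fixed length.
    Returns a dict mapping each visited cell to its mean spectrum.
    Assumption: averaging is done on linear power, not on dB values.
    """
    sums = {}
    counts = defaultdict(int)
    for fo_hz, spl_db, spectrum in frames:
        cell = vrp_cell(fo_hz, spl_db)
        if cell not in sums:
            sums[cell] = [0.0] * len(spectrum)
        acc = sums[cell]
        for i, power in enumerate(spectrum):
            acc[i] += power
        counts[cell] += 1
    return {cell: [s / counts[cell] for s in acc] for cell, acc in sums.items()}
```

Frames from several subjects can be passed through the same function without any range normalization, which mirrors the "simply accumulated per cell" approach; cells that no subject visited are just absent from the result.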

From the averaged spectrum sections, eight scalar metrics were derived and computed per cell: (1) absolute level of the fundamental, (2) power ratio of higher harmonics to the fundamental, (3) number of the strongest harmonic NHmax, (4) power ratio of NHmax to all other harmonics, (5) spectrum balance (high-to-low band power ratio), (6) spectrum slope above 3 kHz, (7) spectrum centroid below 2 kHz, and (8) level difference LH2-LH1 between the two lowest harmonics. The results for these eight metrics were then visualized over the VRP plane, both as ensemble averages and per subject. The complete corpus of data is in publication.
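As a rough illustration of the harmonic-based metrics (1), (2), (3), (4) and (8), the sketch below computes them from a list of harmonic levels in dB for one cell. The function name `harmonic_metrics`, the dictionary keys, and the choice to express power ratios in dB are assumptions made for this example; the abstract does not specify how the metrics were scaled.

```python
import math


def harmonic_metrics(levels_db):
    """Compute a few scalar metrics from harmonic levels in dB.

    levels_db[0] is the fundamental (H1), levels_db[1] is H2, and so on.
    Assumption: power ratios are reported in dB.
    """
    if len(levels_db) < 2:
        raise ValueError("need at least two harmonics")
    powers = [10 ** (level / 10) for level in levels_db]
    # Index of the strongest harmonic, converted to a 1-based harmonic number.
    strongest = max(range(len(powers)), key=powers.__getitem__)
    others = sum(powers) - powers[strongest]
    return {
        "fundamental_level_db": levels_db[0],                                 # metric (1)
        "higher_to_fundamental_db": 10 * math.log10(sum(powers[1:]) / powers[0]),  # metric (2)
        "nh_max": strongest + 1,                                              # metric (3)
        "nh_max_to_others_db": 10 * math.log10(powers[strongest] / others),   # metric (4)
        "lh2_minus_lh1_db": levels_db[1] - levels_db[0],                      # metric (8)
    }
```

For example, `harmonic_metrics([60.0, 66.0, 50.0])` reports the second harmonic as the strongest, with LH2-LH1 equal to 6 dB. The band-based metrics (5) to (7) need the full narrow-band spectrum rather than harmonic levels and are not covered here.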

Many interesting general effects emerged, of which only three will be highlighted in the presentation. It will be discussed how each effect shows up in the different metrics, and to what degree it agrees with standard models of voice production. Firstly, the notion that vocal tract resonances have a major influence on voice sound level was not corroborated by the data. Secondly, the level of the second harmonic was often unexpectedly high at low SPL, an observation that could be attributable to non-linear and often subject-specific effects. Thirdly, the spectrum slope continuum appears to be a broken one, where the distribution of energy below 2 kHz is partly independent of the energy buildup around 3 kHz, and where again the roll-off above 3 kHz curves in the opposite direction.