Bibliography on: Formants: Modulators of Communication

Robert J. Robbins is a biologist, an educator, a science administrator, a publisher, an information technologist, and an IT leader and manager who specializes in advancing biomedical knowledge and supporting education through the application of information technology.

RJR Recommended Bibliography on "Formants: Modulators of Communication", created 19 Nov 2018 at 01:34

Wikipedia: A formant, as defined by James Jeans, is a harmonic of a note that is augmented by a resonance. In speech science and phonetics, however, a formant is also sometimes used to mean an acoustic resonance of the human vocal tract. Thus, in phonetics, formant can mean either a resonance or the spectral maximum that the resonance produces. Formants are often measured as amplitude peaks in the frequency spectrum of the sound, using a spectrogram or a spectrum analyzer; in the case of the voice, this gives an estimate of the vocal tract resonances. In vowels spoken with a high fundamental frequency, as in a female or child voice, however, the frequency of the resonance may lie between the widely spaced harmonics and hence no corresponding peak is visible. Because formants are a product of resonance, resonance is affected by the shape and material of the resonating structure, and all animals (humans included) have unique morphologies, formants can add generic (sounds big) and specific (that's Towser barking) information to animal vocalizations.
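
For readers who want to measure formants themselves: the spectral peaks described above are commonly estimated with linear predictive coding (LPC). The Python sketch below is illustrative only and is not taken from any cited paper; the sampling rate, LPC order, and frequency threshold are assumptions chosen for the toy example.

```python
# Minimal LPC-based formant estimation sketch (illustrative; not from any
# cited paper). Autocorrelation-method LPC solved via a Toeplitz system;
# formants are read off the angles of the prediction-polynomial roots.
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_coefficients(frame, order):
    frame = frame * np.hamming(len(frame))
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
    return np.concatenate(([1.0], -a))  # A(z) = 1 - sum_k a_k z^-k

def estimate_formants(signal, sr, order=10, fmin=90.0):
    # Pre-emphasis flattens the downward spectral tilt of the glottal source.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    a = lpc_coefficients(emphasized, order)
    roots = [z for z in np.roots(a) if np.imag(z) > 0]  # one per pole pair
    freqs = sorted(np.angle(z) * sr / (2 * np.pi) for z in roots)
    return [f for f in freqs if f > fmin]  # drop near-DC artifacts

# Toy check: a two-resonance signal should yield roughly 700 and 1200 Hz.
sr = 10_000
t = np.arange(0, 0.05, 1 / sr)
vowel = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
print(estimate_formants(vowel, sr, order=4))
```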

Created with PubMed® Query: formant NOT pmcbook NOT ispreviousversion
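
The query above can be re-run programmatically. A sketch using Biopython's Entrez interface follows (assuming Biopython is installed; the email address is a placeholder that NCBI requires you to replace with your own):

```python
# Sketch: re-running this bibliography's PubMed query via Biopython.
# The query string is the one quoted above; email and retmax are placeholders.
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI requires a real contact address

query = "formant NOT pmcbook NOT ispreviousversion"
handle = Entrez.esearch(db="pubmed", term=query, retmax=20)
record = Entrez.read(handle)
handle.close()

print(record["Count"])   # total number of matching citations
print(record["IdList"])  # PubMed IDs of the first 20 hits
```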

Citations: The Papers (from PubMed®)

RevDate: 2018-11-18

Graf S, Schwiebacher J, Richter L, et al (2018)

Adjustment of Vocal Tract Shape via Biofeedback: Influence on Vowels.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(18)30326-6 [Epub ahead of print].

The study assessed 30 nonprofessional singers to evaluate the effects of vocal tract shape adjustment via increased resonance toward an externally applied sinusoidal frequency of 900 Hz without phonation. The amplification of the sound wave was used as the biofeedback signal, and the intensity and formant positions of the basic vowels /a/, /e/, /i/, /o/, and /u/ were compared before and after a vocal tract adjustment period. After the adjustment period, the intensities for all vowels increased, and the measured changes correlated with the participants' self-perception. The differences between the second formant positions of the vowels and the applied frequency influenced the changes in amplitude and in formant frequencies. The most significant changes in formant frequency occurred with vowels that did not include a formant frequency of 900 Hz, while the increase in amplitude was strongest for vowels with a formant frequency of about 900 Hz.

RevDate: 2018-11-16

Bhat GS, Reddy CKA, Shankar N, et al (2018)

Smartphone based real-time super Gaussian single microphone Speech Enhancement to improve intelligibility for hearing aid users using formant information.

Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2018:5503-5506.

In this paper, we present a Speech Enhancement (SE) technique to improve the intelligibility of speech perceived by Hearing Aid (HA) users, using a smartphone as an assistive device. We use formant frequency information to improve the overall quality and intelligibility of the speech. The proposed SE method is based on a new super-Gaussian joint maximum a posteriori (SGJMAP) estimator. Using a priori information about formant frequency locations, the derived gain function has "tradeoff" factors that allow the smartphone user to customize perceptual preference by controlling the amount of noise suppression and speech distortion in real time. The formant frequency information helps the hearing aid user control the gains over the non-formant frequency band, allowing HA users to attain more noise suppression while maintaining speech intelligibility using a smartphone application. Objective intelligibility measures and subjective results reflect the usability of the developed SE application in noisy real-world acoustic environments.
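
The abstract does not give the SGJMAP gain function itself, so the sketch below is not the paper's estimator; it only illustrates the general idea of a tradeoff-controlled suppression gain that treats formant bands more conservatively. All names, bands, and parameters here are hypothetical.

```python
# Illustrative tradeoff-controlled suppression gain (NOT the paper's SGJMAP
# estimator, which the abstract does not specify). A user-tunable `beta`
# trades noise suppression against speech distortion; gains are kept closer
# to 1 inside assumed formant bands.
import numpy as np

def suppression_gain(noisy_psd, noise_psd, freqs, formant_bands,
                     beta=1.0, floor=0.1):
    """Per-bin gain: larger beta -> stronger suppression outside formants."""
    snr = np.maximum(noisy_psd / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)
    gain = snr / (snr + beta)            # Wiener-like parametric gain
    protect = np.zeros_like(freqs, dtype=bool)
    for lo, hi in formant_bands:         # e.g. [(300, 900), (900, 2500)]
        protect |= (freqs >= lo) & (freqs <= hi)
    gain[protect] = np.maximum(gain[protect], 0.8)  # limit distortion there
    return np.maximum(gain, floor)       # spectral floor reduces musical noise

freqs = np.linspace(0, 8000, 257)
noisy = 2.0 * np.ones_like(freqs)        # toy power spectra
noise = np.ones_like(freqs)
g = suppression_gain(noisy, noise, freqs, [(300, 2500)], beta=2.0)
```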

RevDate: 2018-11-14

Williams D, Escudero P, A Gafos (2018)

Spectral change and duration as cues in Australian English listeners' front vowel categorization.

The Journal of the Acoustical Society of America, 144(3):EL215.

Australian English /iː/, /ɪ/, and /ɪə/ exhibit almost identical average first (F1) and second (F2) formant frequencies and differ in duration and vowel inherent spectral change (VISC). The cues of duration, F1 × F2 trajectory direction (TD) and trajectory length (TL) were assessed in listeners' categorization of /iː/ and /ɪə/ compared to /ɪ/. Duration was important for distinguishing both /iː/ and /ɪə/ from /ɪ/. TD and TL were important for categorizing /iː/ versus /ɪ/, whereas only TL was important for /ɪə/ versus /ɪ/. Finally, listeners' use of duration and VISC was not mutually affected for either vowel compared to /ɪ/.

RevDate: 2018-11-09

Gómez-Vilda P, Gómez-Rodellar A, Vicente JMF, et al (2018)

Neuromechanical Modelling of Articulatory Movements from Surface Electromyography and Speech Formants.

International journal of neural systems [Epub ahead of print].

Speech articulation is produced by the movements of muscles in the larynx, pharynx, mouth and face. Therefore, speech shows acoustic features, such as formants, that are directly related to the neuromotor actions of these muscles. The first two formants are strongly related to jaw and tongue muscular activity. Speech can be used as a simple and ubiquitous signal, easy to record and process, either locally or on e-Health platforms. This fact may open a wide set of applications in the study of functional grading and monitoring of neurodegenerative diseases. A relevant question, in this sense, is how closely speech correlates and neuromotor actions are related. This preliminary study is intended to find answers to this question by using surface electromyographic recordings on the masseter and the acoustic kinematics related to the first formant. It is shown in the study that relevant correlations can be found between the surface electromyographic activity (dynamic muscle behavior) and the positions and first derivatives of the first formant (kinematic variables related to vertical velocity and acceleration of the joint jaw and tongue biomechanical system). As an application example, it is shown that the probability density function associated with these kinematic variables is more sensitive than classical features such as the Vowel Space Area (VSA) or the Formant Centralization Ratio (FCR) in characterizing neuromotor degeneration in Parkinson's Disease.
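
The VSA and FCR mentioned here have standard published definitions: a triangular VSA over the corner vowels /i/, /a/, /u/, and the FCR of Sapir et al. (2010). A small sketch with made-up formant values:

```python
# Classical articulation metrics referenced above, computed from the corner
# vowels /i/, /a/, /u/. Formulas follow the widely used definitions (FCR per
# Sapir et al., 2010); the formant values below are invented examples.

def vowel_space_area(f1, f2):
    """Triangular VSA (Hz^2) from dicts keyed by 'i', 'a', 'u'."""
    return 0.5 * abs(f1['i'] * (f2['a'] - f2['u'])
                     + f1['a'] * (f2['u'] - f2['i'])
                     + f1['u'] * (f2['i'] - f2['a']))

def formant_centralization_ratio(f1, f2):
    """FCR rises above ~1 as the vowel space centralizes (shrinks)."""
    return (f2['u'] + f2['a'] + f1['i'] + f1['u']) / (f2['i'] + f1['a'])

f1 = {'i': 300.0, 'a': 750.0, 'u': 350.0}   # illustrative values (Hz)
f2 = {'i': 2300.0, 'a': 1300.0, 'u': 800.0}
print(vowel_space_area(f1, f2), formant_centralization_ratio(f1, f2))
```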

RevDate: 2018-10-26

Lopes LW, Alves JDN, Evangelista DDS, et al (2018)

Accuracy of traditional and formant acoustic measurements in the evaluation of vocal quality.

CoDAS, 30(5):e20170282 pii:S2317-17822018000500310.

PURPOSE: Investigate the accuracy of isolated and combined acoustic measurements in the discrimination of voice deviation intensity (GD) and predominant voice quality (PVQ) in patients with dysphonia.

METHODS: A total of 302 female patients with voice complaints participated in the study. The sustained /ɛ/ vowel was used to extract the following acoustic measures: mean and standard deviation (SD) of fundamental frequency (F0), jitter, shimmer, glottal to noise excitation (GNE) ratio and the mean of the first three formants (F1, F2, and F3). Auditory-perceptual evaluation of GD and PVQ was conducted by three speech-language pathologists who were voice specialists.

RESULTS: In isolation, only GNE provided satisfactory performance when discriminating between GD and PVQ. Improvement in the classification of GD and PVQ was observed when the acoustic measures were combined. Mean F0, F2, and GNE (healthy × mild-to-moderate deviation), the SDs of F0, F1, and F3 (mild-to-moderate × moderate deviation), and mean jitter and GNE (moderate × intense deviation) were the best combinations for discriminating GD. The best combinations for discriminating PVQ were mean F0, shimmer, and GNE (healthy × rough), F3 and GNE (healthy × breathy), mean F0, F3, and GNE (rough × tense), and mean F0, F1, and GNE (breathy × tense).

CONCLUSION: In isolation, GNE proved to be the only acoustic parameter capable of discriminating between GD and PVQ. There was a gain in classification performance for discrimination of both GD and PVQ when traditional and formant acoustic measurements were combined.

RevDate: 2018-10-23

Grawunder S, Crockford C, Clay Z, et al (2018)

Higher fundamental frequency in bonobos is explained by larynx morphology.

Current biology : CB, 28(20):R1188-R1189.

Acoustic signals, shaped by natural and sexual selection, reveal ecological and social selection pressures [1]. Examining acoustic signals together with morphology can be particularly revealing. But this approach has rarely been applied to primates, where clues to the evolutionary trajectory of human communication may be found. Across vertebrate species, there is a close relationship between body size and acoustic parameters, such as formant dispersion and fundamental frequency (f0). Deviations from this acoustic allometry usually produce calls with a lower f0 than expected for a given body size, often due to morphological adaptations in the larynx or vocal tract [2]. An unusual example of an obvious mismatch between fundamental frequency and body size is found in the two closest living relatives of humans, bonobos (Pan paniscus) and chimpanzees (Pan troglodytes). Although these two ape species overlap in body size [3], bonobo calls have a strikingly higher f0 than corresponding calls from chimpanzees [4]. Here, we compare acoustic structures of calls from bonobos and chimpanzees in relation to their larynx morphology. We found that shorter vocal fold length in bonobos compared to chimpanzees accounted for species differences in f0, showing a rare case of positive selection for signal diminution in both bonobo sexes.

RevDate: 2018-10-23

Niziolek CA, S Kiran (2018)

Assessing speech correction abilities with acoustic analyses: Evidence of preserved online correction in persons with aphasia.

International journal of speech-language pathology [Epub ahead of print].

PURPOSE: Disorders of speech production may be accompanied by abnormal processing of speech sensory feedback. Here, we introduce a semi-automated analysis designed to assess the degree to which speakers use natural online feedback to decrease acoustic variability in spoken words. Because production deficits in aphasia have been hypothesised to stem from problems with sensorimotor integration, we investigated whether persons with aphasia (PWA) can correct their speech acoustics online.

METHOD: Eight PWA in the chronic stage produced 200 repetitions each of three monosyllabic words. Formant variability was measured for each vowel in multiple time windows within the syllable, and the reduction in formant variability from vowel onset to midpoint was quantified.

RESULT: PWA significantly decreased acoustic variability over the course of the syllable, providing evidence of online feedback correction mechanisms. The magnitude of this corrective formant movement exceeded past measurements in control participants.

CONCLUSION: Vowel centring behaviour suggests that error correction abilities are at least partially spared in speakers with aphasia, and may be relied upon to compensate for feedforward deficits by bringing utterances back on track. These proof-of-concept data show the potential of this analysis technique to elucidate the mechanisms underlying disorders of speech production.

RevDate: 2018-10-21

Fazeli M, Moradi N, Soltani M, et al (2018)

Dysphonia Characteristics and Vowel Impairment in Relation to Neurological Status in Patients with Multiple Sclerosis.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(18)30351-5 [Epub ahead of print].

PURPOSE: In this study, we attempted to assess the phonation and articulation subsystem changes in patients with multiple sclerosis compared to healthy individuals using Dysphonia Severity Index and Formant Centralization Ratio with the aim of evaluating the correlation between these two indexes with neurological status.

MATERIALS AND METHODS: A sample of 47 patients with multiple sclerosis and 20 healthy speakers were evaluated. Patients' disease duration and disability were monitored by a neurologist. Dysphonia Severity Index and Formant Centralization Ratio scores were computed for each individual. Acoustic analysis was performed with Praat software; the statistical analysis was run using SPSS 21. To compare multiple sclerosis patients with the control group, the Mann-Whitney U test was used for non-normal data and the independent-samples t test for normal data. A logistic regression was also used to compare the data. Correlation between acoustic characteristics and neurological status was verified using the Spearman correlation coefficient, and linear regression was performed to evaluate the simultaneous effects of the neurological data.

RESULTS: Statistical analysis revealed that a significant difference existed between multiple sclerosis and healthy participants. Formant Centralization Ratio had a significant correlation with disease severity.

CONCLUSION: Multiple sclerosis patients would be differentiated from healthy individuals by their phonation and articulatory features. Scores of these two indexes can be considered as appropriate criteria for onset of the speech problems in multiple sclerosis. Also, articulation subsystem changes might be useful signs for the progression of the disease.

RevDate: 2018-10-19

Brabenec L, Klobusiakova P, Barton M, et al (2018)

Non-invasive stimulation of the auditory feedback area for improved articulation in Parkinson's disease.

Parkinsonism & related disorders pii:S1353-8020(18)30439-5 [Epub ahead of print].

INTRODUCTION: Hypokinetic dysarthria (HD) is a common symptom of Parkinson's disease (PD) which does not respond well to PD treatments. We investigated acute effects of repetitive transcranial magnetic stimulation (rTMS) of the motor and auditory feedback area on HD in PD using acoustic analysis of speech.

METHODS: We used 10 Hz and 1 Hz stimulation protocols and applied rTMS over the left orofacial primary motor area, the right superior temporal gyrus (STG), and over the vertex (a control stimulation site) in 16 PD patients with HD. A cross-over design was used. Stimulation sites and protocols were randomised across subjects and sessions. Acoustic analysis of a sentence reading task performed inside the MR scanner was used to evaluate rTMS-induced effects on motor speech. Acute fMRI changes due to rTMS were also analysed.

RESULTS: The 1 Hz STG stimulation produced significant increases in the relative standard deviation of the 2nd formant (p = 0.019), i.e. an acoustic parameter describing tongue and jaw movements. The effects were superior to the control-site stimulation and were accompanied by increased resting-state functional connectivity between the stimulated region and the right parahippocampal gyrus. The rTMS-induced acoustic changes were correlated with the reading task-related BOLD signal increases of the stimulated area (R = 0.654, p = 0.029).

CONCLUSION: Our results demonstrate for the first time that low-frequency stimulation of the temporal auditory feedback area may improve articulation in PD and enhance functional connectivity between the STG and the cortical region involved in overt speech control.

RevDate: 2018-10-19

Gómez-Vilda P, Galaz Z, Mekyska J, et al (2018)

Vowel Articulation Dynamic Stability Related to Parkinson's Disease Rating Features: Male Dataset.

International journal of neural systems [Epub ahead of print].

Neurodegenerative pathologies such as Parkinson's Disease (PD) show important distortions in speech, affecting fluency, prosody, articulation and phonation. Classically, measurements based on articulation gestures altering formant positions, such as the Vowel Space Area (VSA) or the Formant Centralization Ratio (FCR), have been proposed to measure speech distortion, but these markers are based mainly on static positions of sustained vowels. The present study introduces a measurement based on the mutual information distance among probability density functions of kinematic correlates derived from formant dynamics. An absolute kinematic velocity associated with the position of the jaw and tongue articulation gestures is estimated and modeled statistically. The distribution of this feature may differentiate PD patients from normative speakers during sustained vowel emission. The study is based on a limited database of 53 male PD patients, contrasted to a very selected and stable set of eight normative speakers. In this sense, distances based on Kullback-Leibler divergence seem to be sensitive to PD articulation instability. Correlation studies show statistically relevant relationships between information contents based on articulation instability and certain motor and nonmotor clinical scores, such as freezing of gait or sleep disorders. Remarkably, one of the statistically relevant correlations points to the time elapsed since first diagnosis. These results stress the need to define scoring scales specifically designed for speech disability estimation and monitoring methodologies in degenerative diseases of neuromotor origin.
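
The abstract names Kullback-Leibler divergence between feature distributions without giving estimator details; below is a minimal histogram-based, symmetrised KL sketch, with synthetic data standing in for the kinematic velocity features.

```python
# Histogram-based symmetrised Kullback-Leibler distance between two feature
# distributions, as a stand-in for the paper's divergence between kinematic
# PDFs (the exact estimator is not specified in the abstract).
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    p = p / p.sum(); q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def symmetric_kl(x, y, bins=50):
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    return kl_divergence(p.astype(float), q.astype(float)) \
        + kl_divergence(q.astype(float), p.astype(float))

rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, 5000)   # synthetic "normative" velocities
patient = rng.normal(0.3, 1.6, 5000)   # synthetic "PD" velocities
print(symmetric_kl(control, patient))
```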

RevDate: 2018-10-09

den Ouden DB, Galkina E, Basilakos A, et al (2018)

Vowel Formant Dispersion Reflects Severity of Apraxia of Speech.

Aphasiology, 32(8):902-921.

Background: Apraxia of Speech (AOS) has been associated with deviations in consonantal voice-onset-time (VOT), but studies of vowel acoustics have yielded conflicting results. However, a speech motor planning disorder that is not bound by phonological categories is expected to affect vowel as well as consonant articulations.

Aims: We measured consonant VOTs and vowel formants produced by a large sample of stroke survivors, and assessed to what extent these variables and their dispersion are predictive of AOS presence and severity, based on a scale that uses clinical observations to rate gradient presence of AOS, aphasia, and dysarthria.

Methods & Procedures: Picture-description samples were collected from 53 stroke survivors, including unimpaired speakers (12) and speakers with primarily aphasia (19), aphasia with AOS (12), primarily AOS (2), aphasia with dysarthria (2), and aphasia with AOS and dysarthria (6). The first three formants were extracted from vowel tokens bearing main stress in open-class words, as well as VOTs for voiced and voiceless stops. Vowel space was estimated as reflected in the formant centralization ratio. Stepwise Linear Discriminant Analyses were used to predict group membership, and ordinal regression to predict AOS severity, based on the absolute values of these variables, as well as the standard deviations of formants and VOTs within speakers.

Outcomes and Results: Presence and severity of AOS were most consistently predicted by the dispersion of F1, F2, and voiced-stop VOT. These phonetic-acoustic measures do not correlate with aphasia severity.

Conclusions: These results confirm that AOS affects articulation across the board and does not selectively spare vowel production.

RevDate: 2018-10-02

Baotic A, Garcia M, Boeckle M, et al (2018)

Field Propagation Experiments of Male African Savanna Elephant Rumbles: A Focus on the Transmission of Formant Frequencies.

Animals : an open access journal from MDPI, 8(10): pii:ani8100167.

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the 'rumble', with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles by formant frequencies (the resonance frequencies of the vocal tract), having the most informative power. As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcasted and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant's natural habitat.

RevDate: 2018-10-01

Pabon P, S Ternström (2018)

Feature Maps of the Acoustic Spectrum of the Voice.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(18)30185-1 [Epub ahead of print].

The change in the spectrum of sustained /a/ vowels was mapped over the voice range from low to high fundamental frequency and low to high sound pressure level (SPL), in the form of the so-called voice range profile (VRP). In each interval of one semitone and one decibel, narrowband spectra were averaged both within and across subjects. The subjects were groups of 7 male and 12 female singing students, as well as a group of 16 untrained female voices. For each individual and also for each group, pairs of VRP recordings were made, with stringent separation of the modal/chest and falsetto/head registers. Maps are presented of eight scalar metrics, each of which was chosen to quantify a particular feature of the voice spectrum, over fundamental frequency and SPL. Metrics 1 and 2 chart the role of the fundamental in relation to the rest of the spectrum. Metrics 3 and 4 are used to explore the role of resonances in relation to SPL. Metrics 5 and 6 address the distribution of high-frequency energy, while metrics 7 and 8 seek to describe the distribution of energy at the low end of the voice spectrum. Several examples are observed of phenomena that are difficult to predict from linear source-filter theory, and of the voice source being less uniform over the voice range than is conventionally assumed. These include a high-frequency band-limiting at high SPL and an unexpected persistence of the second harmonic at low SPL. The two voice registers give rise to clearly different maps. Only a few effects of training were observed, in the low-frequency end below 2 kHz. The results are of potential interest in voice analysis, voice synthesis and for new insights into the voice production mechanism.
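
The VRP maps described here amount to averaging a per-frame spectral metric into one-semitone by one-decibel cells. A small sketch of that binning step, assuming f0, SPL, and the metric have already been extracted frame-by-frame by other tools:

```python
# Sketch of voice-range-profile style binning: average a per-frame spectrum
# metric into 1-semitone x 1-dB cells. Assumes frame-wise f0 (Hz), SPL (dB),
# and metric arrays are supplied; reference pitch and SPL range are assumed.
import numpy as np

def vrp_map(f0, spl, metric, f_ref=55.0, spl_range=(40, 110)):
    semitones = np.round(12 * np.log2(f0 / f_ref)).astype(int)
    decibels = np.round(spl).astype(int)
    n_st, lo_db, hi_db = semitones.max() + 1, *spl_range
    total = np.zeros((n_st, hi_db - lo_db))
    counts = np.zeros_like(total)
    for st, db, m in zip(semitones, decibels, metric):
        if 0 <= st < n_st and lo_db <= db < hi_db:
            total[st, db - lo_db] += m
            counts[st, db - lo_db] += 1
    # Cells with no frames become NaN; the rest hold per-cell means.
    return total / np.where(counts == 0, np.nan, counts)
```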

RevDate: 2018-09-30

Kraus MS, Walker TM, Jarskog LF, et al (2018)

Basic auditory processing deficits and their association with auditory emotion recognition in schizophrenia.

Schizophrenia research pii:S0920-9964(18)30542-5 [Epub ahead of print].

BACKGROUND: Individuals with schizophrenia are impaired in their ability to recognize emotions based on vocal cues and these impairments are associated with poor global outcome. Basic perceptual processes, such as auditory pitch processing, are impaired in schizophrenia and contribute to difficulty identifying emotions. However, previous work has focused on a relatively narrow assessment of auditory deficits and their relation to emotion recognition impairment in schizophrenia.

METHODS: We have assessed 87 patients with schizophrenia and 73 healthy controls on a comprehensive battery of tasks spanning the five empirically derived domains of auditory function. We also explored the relationship between basic auditory processing and auditory emotion recognition within the patient group using correlational analysis.

RESULTS: Patients exhibited widespread auditory impairments across multiple domains of auditory function, with mostly medium effect sizes. Performance on all of the basic auditory tests correlated with auditory emotion recognition at the p < .01 level in the patient group, with 9 out of 13 tests correlating with emotion recognition at r = 0.40 or greater. After controlling for cognition, many of the largest correlations involved spectral processing within the phase-locking range and discrimination of vocally based stimuli.

CONCLUSIONS: While many auditory skills contribute to this impairment, deficient formant discrimination appears to be a key skill contributing to impaired emotion recognition as this was the only basic auditory skill to enter a step-wise multiple regression after first entering a measure of cognitive impairment, and formant discrimination accounted for significant unique variance in emotion recognition performance after accounting for deficits in pitch processing.

RevDate: 2018-09-25

Han C, Wang H, Fasolt V, et al (2018)

No clear evidence for correlations between handgrip strength and sexually dimorphic acoustic properties of voices.

American journal of human biology : the official journal of the Human Biology Council [Epub ahead of print].

OBJECTIVES: Recent research on the signal value of masculine physical characteristics in men has focused on the possibility that such characteristics are valid cues of physical strength. However, evidence that sexually dimorphic vocal characteristics are correlated with physical strength is equivocal. Consequently, we undertook a further test for possible relationships between physical strength and masculine vocal characteristics.

METHODS: We tested the putative relationships between White UK (N = 115) and Chinese (N = 106) participants' handgrip strength (a widely used proxy for general upper-body strength) and five sexually dimorphic acoustic properties of voices: fundamental frequency (F0), fundamental frequency's SD (F0-SD), formant dispersion (Df), formant position (Pf), and estimated vocal-tract length (VTL).

RESULTS: Analyses revealed no clear evidence that stronger individuals had more masculine voices.

CONCLUSIONS: Our results do not support the hypothesis that masculine vocal characteristics are a valid cue of physical strength.

RevDate: 2018-09-21

Easwar V, Banyard A, Aiken SJ, et al (2018)

Phase-locked responses to the vowel envelope vary in scalp-recorded amplitude due to across-frequency response interactions.

The European journal of neuroscience [Epub ahead of print].

Neural encoding of the envelope of sounds like vowels is essential to access temporal information useful for speech recognition. Subcortical responses to envelope periodicity of vowels can be assessed using scalp-recorded envelope following responses (EFRs); however, EFR amplitude varies with vowel spectra, and the causal relationship is not well understood. One cause for spectral dependency could be interactions between responses with different phases, initiated by multiple stimulus frequencies. Phase differences can arise from earlier initiation of processing of high frequencies relative to low frequencies in the cochlea. The present study investigated the presence of such phase interactions by measuring EFRs to two naturally spoken vowels (/ε/ and /u/), while delaying the envelope phase of the second formant band (F2+) relative to the first formant (F1) band in 45° increments. At 0° F2+ phase delay, EFRs elicited by the vowel /ε/ were lower in amplitude than the EFRs elicited by /u/. Using vector computations, we found that the lower amplitude of /ε/-EFRs was caused by linear superposition of F1- and F2+-contributions with larger F1-F2+ phase differences (166°) compared to /u/ (19°). While the variation in amplitude across F2+ phase delays could be modeled with two dominant EFR sources for both vowels, the degree of variation was dependent on F1 and F2+ EFR characteristics. Together, we demonstrate that (1) broadband sounds like vowels elicit independent responses from different stimulus frequencies that may be out of phase and affect scalp-based measurements, and (2) delaying higher-frequency formants can maximize EFR amplitudes for some vowels.
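
The amplitude effect of summing two out-of-phase response components can be checked with simple phasor arithmetic. The sketch below reproduces the 166° versus 19° comparison; only the phase differences come from the abstract, the component amplitudes are made up.

```python
# Phasor sketch of the across-frequency interaction described above: two EFR
# components (F1 band, F2+ band) summed with a phase offset. Equal unit
# amplitudes are assumed for illustration.
import numpy as np

def summed_amplitude(a1, a2, phase_deg):
    phi = np.deg2rad(phase_deg)
    return abs(a1 + a2 * np.exp(1j * phi))

a1 = a2 = 1.0
print(summed_amplitude(a1, a2, 19.0))   # /u/-like: nearly in phase, ~2.0
print(summed_amplitude(a1, a2, 166.0))  # /ε/-like: nearly antiphase, ~0.24
```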

RevDate: 2018-09-21

Omidvar S, Mahmoudian S, Khabazkhoob M, et al (2018)

Tinnitus Impacts on Speech and Non-speech Stimuli.

Otology & neurotology : official publication of the American Otological Society, American Neurotology Society [and] European Academy of Otology and Neurotology [Epub ahead of print].

OBJECTIVE: To investigate how tinnitus affects the processing of speech and non-speech stimuli at the subcortical level.

STUDY DESIGN: Cross-sectional analytical study.

SETTING: Academic, tertiary referral center.

PATIENTS: Eighteen individuals with tinnitus and 20 controls without tinnitus matched based on their age and sex. All subjects had normal hearing sensitivity.

INTERVENTION: Diagnostic.

MAIN OUTCOME MEASURES: The effect of tinnitus on the parameters of auditory brainstem responses (ABR) to non-speech (click-ABR), and speech (sABR) stimuli was investigated.

RESULTS: Latencies of click ABR in waves III, V, and Vn, as well as inter-peak latency (IPL) of I to V were significantly longer in individuals with tinnitus compared with the controls. Individuals with tinnitus demonstrated significantly longer latencies of all sABR waves than the control group. The tinnitus patients also exhibited a significant decrease in the slope of the V-A complex and reduced encoding of the first and higher formants. A significant difference was observed between the two groups in the spectral magnitudes, the first formant frequency range (F1) and a higher frequency region (HF).

CONCLUSIONS: Our findings suggest that maladaptive neural plasticity resulting from tinnitus can be subcortically measured and affects timing processing of both speech and non-speech stimuli. The findings have been discussed based on models of maladaptive plasticity and the interference of tinnitus as an internal noise in synthesizing speech auditory stimuli.

RevDate: 2018-09-21

Charlton BD, Owen MA, Keating JL, et al (2018)

Sound transmission in a bamboo forest and its implications for information transfer in giant panda (Ailuropoda melanoleuca) bleats.

Scientific reports, 8(1):12754 pii:10.1038/s41598-018-31155-5.

Although mammal vocalisations signal attributes about the caller that are important in a range of contexts, relatively few studies have investigated the transmission of specific types of information encoded in mammal calls. In this study we broadcast and re-recorded giant panda bleats in a bamboo plantation, to assess the stability of individuality and sex differences in these calls over distance, and to determine how the acoustic structure of giant panda bleats degrades in this species' typical environment. Our results indicate that vocal recognition of the caller's identity and sex is not likely to be possible when the distance between the vocaliser and receiver exceeds 20 m and 10 m, respectively. Further analysis revealed that the F0 contour of bleats was subject to high structural degradation as it propagated through the bamboo canopy, making the measurement of mean F0 and F0 modulation characteristics highly unreliable at distances exceeding 10 m. The most stable acoustic features of bleats in the bamboo forest environment (lowest % variation) were the upper formants and the overall formant spacing. The analysis of amplitude attenuation revealed that the fifth and sixth formants are more prone to decay than the other frequency components of bleats; however, the fifth formant still remained the most prominent and persistent frequency component over distance. Paired with previous studies, these results show that giant panda bleats have the potential to signal the caller's identity at distances of up to 20 m and reliably transmit sex differences up to 10 m from the caller, and suggest that information encoded by F0 modulation in bleats could only be functionally relevant during close-range interactions in this species' natural environment.

RevDate: 2018-09-20

Ward RM, DG Kelty-Stephen (2018)

Bringing the Nonlinearity of the Movement System to Gestural Theories of Language Use: Multifractal Structure of Spoken English Supports the Compensation for Coarticulation in Human Speech Perception.

Frontiers in physiology, 9:1152.

Coarticulation is the tendency for speech vocalization and articulation, even at the phonemic level, to change with context, and compensation for coarticulation (CfC) reflects the striking human ability to perceive phonemic stability despite this variability. A current controversy centers on whether CfC depends on contrast between formants of a speech-signal spectrogram-specifically, contrast between offset formants concluding context stimuli and onset formants opening the target sound-or on speech-sound variability specific to the coordinative movement of speech articulators (e.g., vocal folds, postural muscles, lips, tongues). This manuscript aims to encode that coordinative-movement context in terms of speech-signal multifractal structure and to determine whether speech's multifractal structure might explain the crucial gestural support for any proposed spectral contrast. We asked human participants to categorize individual target stimuli drawn from an 11-step [ga]-to-[da] continuum as either of the phonemes "GA" or "DA." Three groups each heard a specific type of context stimulus preceding the target stimuli: real-speech [al] or [aɹ], sine-wave tones at the third-formant offset frequency of either [al] or [aɹ], or simulated-speech contexts [al] or [aɹ]. Here, simulating speech contexts involved randomizing the sequence of relatively homogeneous pitch periods within the vowel sound [a] of each [al] and [aɹ]. Crucially, simulated-speech contexts had the same offset formants and extremely similar vowel formants as real-speech contexts and, to additional naïve participants, sounded identical to them. However, randomization distorted the original speech-context multifractality, and effects of spectral contrast following speech only appeared after regression modeling of trial-by-trial "GA" judgments controlled for context-stimulus multifractality. Furthermore, simulated-speech contexts elicited faster responses (as tone contexts do) and weakened known biases in CfC, suggesting that spectral contrast depends on the nonlinear interactions across multiple scales that articulatory gestures express through the speech signal. Traditional mouse-tracking behaviors, measured as participants moved their computer-mouse cursor to register their "GA"-or-"DA" decisions with mouse-clicks, suggest that listening to speech leads the movement system to resonate with the multifractality of context stimuli. We interpret these results as shedding light on a new multifractal terrain upon which to found a better understanding of how movement systems shape the way speech perception makes use of acoustic information.

RevDate: 2018-09-19

Hu XJ, Li FF, CC Lau (2018)

Development of the Mandarin speech banana.

International journal of speech-language pathology [Epub ahead of print].

PURPOSE: For Indo-European languages, "speech banana" is widely used to verify the benefits of hearing aids and cochlear implants. As a standardised "Mandarin speech banana" is not available, clinicians in China typically use a non-Mandarin speech banana. However, as Chinese is logographic and tonal, using a non-Mandarin speech banana is inappropriate. This paper was designed to develop the Mandarin speech banana according to the Mandarin phonetic properties.

METHOD: In the first experiment, 14 participants read aloud the standard Mandarin initials and finals. For each pronounced sound, its formants were measured. The boundary of all formants formed the formant graph (intensity versus frequency). In the second experiment, 20 participants listened to a list of pre-recorded initials and finals that had been filtered with different bandwidths. The minimum bandwidth to recognise a target sound defined its location on the formant graph.

RESULT: The Mandarin speech banana was generated with recognisable initials and finals on the formant graph. Tone affected the shape of the formant graph, especially at low frequencies.

CONCLUSION: Clinicians can use the new Mandarin speech banana to counsel patients about what sounds are inaudible to them. Speech training can be implemented based on the unheard sounds in the speech banana.

RevDate: 2018-09-05

Sfakianaki A, Nicolaidis K, Okalidou A, et al (2018)

Coarticulatory dynamics in Greek disyllables produced by young adults with and without hearing loss.

Clinical linguistics & phonetics [Epub ahead of print].

Hearing loss affects both speech perception and production with detrimental effects on various speech characteristics including coarticulatory dynamics. The aim of the present study is to explore consonant-to-vowel (C-to-V) and vowel-to-vowel (V-to-V) coarticulation in magnitude, direction and temporal extent in the speech of young adult male and female speakers of Greek with normal hearing (NH) and hearing impairment (HI). Nine intelligible speakers with profound HI, using conventional hearing aids, and five speakers with NH produced /pV1CV2/ disyllables, with the point vowels /i, a, u/ and the consonants /p, t, s/, stressed either on the first or the second syllable. Formant frequencies F1 and F2 were measured in order to examine C-to-V effects at vowel midpoint and V-to-V effects at vowel onset, midpoint and offset. The acoustic and statistical analyses revealed similarities but also significant differences regarding coarticulatory patterns of the two groups. Interestingly, prevalence of anticipatory coarticulation effects in alveolar contexts was observed for speakers with HI. Findings are interpreted on account of possible differences in articulation strategies between the two groups and with reference to current coarticulatory models.

RevDate: 2018-09-03

Kawitzky D, T McAllister (2018)

The Effect of Formant Biofeedback on the Feminization of Voice in Transgender Women.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(18)30190-5 [Epub ahead of print].

Differences in formant frequencies between men and women contribute to the perception of voices as masculine or feminine. This study investigated whether visual-acoustic biofeedback can be used to help transgender women achieve formant targets typical of cisgender women, and whether such a shift influences the perceived femininity of speech. Transgender women and a comparison group of cisgender males were trained to produce vowels in a word context while also attempting to make a visual representation of their second formant (F2) line up with a target that was shifted up relative to their baseline F2 (feminized target) or an unshifted or shifted-down target (control conditions). Despite the short-term nature of the training, both groups showed significant differences in F2 frequency in shifted-up, shifted-down, and unshifted conditions. Gender typicality ratings from blinded listeners indicated that higher F2 values were associated with an increase in the perceived femininity of speech. Consistent with previous literature, we found that fundamental frequency and F2 make a joint contribution to the perception of gender. The results suggest that biofeedback might be a useful tool in voice modification therapy for transgender women; however, larger studies and information about generalization will be essential before strong conclusions can be drawn.

RevDate: 2018-08-08

Núñez-Batalla F, Vasile G, Cartón-Corona N, et al (2018)

Vowel production in hearing impaired children: A comparison between normal-hearing, hearing-aided and cochlear-implanted children.

Acta otorrinolaringologica espanola pii:S0001-6519(18)30117-1 [Epub ahead of print].

INTRODUCTION AND OBJECTIVES: Inadequate auditory feedback in prelingually deaf children alters the articulation of consonants and vowels. The purpose of this investigation was to compare vowel production in Spanish-speaking deaf children with cochlear implants or hearing aids to that of normal-hearing children, by means of acoustic analysis of formant frequencies and vowel space.

METHODS: A total of 56 prelingually deaf children (25 with cochlear implants and 31 wearing hearing-aids) and 47 normal-hearing children participated. The first 2 formants (F1 and F2) of the five Spanish vowels were measured using Praat software. One-way analysis of variance (ANOVA) and post hoc Scheffé test were applied to analyze the differences between the 3 groups. The surface area of the vowel space was also calculated.

RESULTS: The mean value of F1 in all vowels was not significantly different between the 3 groups. For vowels /i/, /o/ and /u/, the mean value of F2 was significantly different between the 2 groups of deaf children and their normal-hearing peers.

CONCLUSION: Both prelingually hearing-impaired groups tended toward subtle deviations in the articulation of vowels that could be analyzed using an objective acoustic analysis programme.

RevDate: 2018-08-07

Bucci J, Perrier P, Gerber S, et al (2018)

Vowel Reduction in Coratino (South Italy): Phonological and Phonetic Perspectives.

Phonetica pii:000490947 [Epub ahead of print].

Vowel reduction may involve phonetic reduction processes, with targets not reached, and/or phonological processes in which a vowel target is changed for another target, possibly schwa. Coratino, a dialect of southern Italy, displays complex vowel reduction processes assumed to be phonological. We analyzed a corpus representative of vowel reduction in Coratino, based on a set of a hundred pairs of words contrasting a stressed and an unstressed version of a given vowel in a given consonant environment, produced by 10 speakers. We report vowel formants together with consonant-to-vowel formant trajectories and durations, and show that these data are rather in agreement with a change in vowel target from /i e ɛ ɔ u/ to schwa when the vowel is unstressed and not word-initial, unless the vowel shares a place-of-articulation feature with the preceding or following consonant. Interestingly, it also appears that there are two targets for phonological reduction, differing in F1 values. A "higher schwa", which could be considered as /ɨ/, corresponds to reduction for the high vowels /i u/, while a "lower schwa", which could be considered as /ə/, corresponds to reduction for the mid-high vowels.

RevDate: 2018-08-04

Adriaans F (2018)

Effects of consonantal context on the learnability of vowel categories from infant-directed speech.

The Journal of the Acoustical Society of America, 144(1):EL20.

Recent studies have shown that vowels in infant-directed speech (IDS) are characterized by highly variable formant distributions. The current study investigates whether vowel variability is partially due to consonantal context, and explores whether consonantal context could support the learning of vowel categories from IDS. A computational model is presented which selects contexts based on frequency in the input and generalizes across contextual categories. Improved categorization performance was found on a vowel contrast in American-English IDS. The findings support a view in which the infant's learning mechanism is anchored in context, in order to cope with acoustic variability in the input.
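
The kind of context-anchored distributional learning described here can be sketched as fitting separate mixture models per consonantal context rather than pooling all tokens. The toy sklearn example below uses synthetic F1/F2 data; it is not the paper's actual model, and all names and values are invented.

```python
# Toy version of context-anchored distributional learning: fit a two-category
# Gaussian mixture over (F1, F2) separately per consonantal context, instead
# of pooling all tokens. Synthetic data; not the paper's model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

def tokens(mu, n=200, sd=60.0):
    return rng.normal(mu, sd, size=(n, 2))

# Two vowel categories whose realizations shift with context (coarticulation).
data = {
    "b_": np.vstack([tokens([400, 2100]), tokens([650, 1700])]),
    "d_": np.vstack([tokens([380, 2300]), tokens([620, 1900])]),
}

pooled = GaussianMixture(n_components=2, random_state=0).fit(
    np.vstack(list(data.values())))
per_context = {c: GaussianMixture(n_components=2, random_state=0).fit(x)
               for c, x in data.items()}
```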

RevDate: 2018-08-04

Barreda S, TM Nearey (2018)

A regression approach to vowel normalization for missing and unbalanced data.

The Journal of the Acoustical Society of America, 144(1):500.

Researchers investigating the vowel systems of languages or dialects frequently employ normalization methods to minimize between-speaker variability in formant patterns while preserving between-phoneme separation and (socio-)dialectal variation. Here two methods are considered: log-mean and Lobanov normalization. Although both of these methods express formants in a speaker-dependent space, they differ in their complexity and in their implied models of human vowel perception. Typical implementations of these methods rely on balanced data across speakers, so that in missing-data situations researchers may have to reduce the data available for analysis. Here, an alternative method is proposed for the normalization of vowels using the log-mean method in a linear-regression framework. The performance of the traditional approaches to log-mean and Lobanov normalization was compared against the regression approach to the log-mean method using naturalistic, simulated vowel data. The results indicate that the Lobanov method likely removes legitimate linguistic variation from vowel data and often provides very noisy estimates of the actual vowel quality associated with individual tokens. The authors further argue that the Lobanov method is too complex to represent a plausible model of human vowel perception, and so is unlikely to provide results that reflect the true perceptual organization of linguistic data.
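
Both traditional methods have simple closed forms: Lobanov normalization z-scores each formant within speaker, while the log-mean (Nearey) method subtracts a single speaker-level mean of the log formants. A sketch follows; the paper's missing-data regression variant is not reproduced here.

```python
# Standard per-speaker normalizations discussed above. `formants` is an
# (n_tokens, n_formants) array for one speaker; values are invented.
import numpy as np

def lobanov(formants):
    """z-score each formant column within a speaker."""
    return (formants - formants.mean(axis=0)) / formants.std(axis=0)

def log_mean(formants):
    """Nearey-style: subtract the speaker's grand mean of log formants."""
    logf = np.log(formants)
    return logf - logf.mean()   # one speaker-level scale parameter

speaker = np.array([[300.0, 2300.0], [750.0, 1300.0], [350.0, 800.0]])
print(lobanov(speaker))
print(log_mean(speaker))
```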

RevDate: 2018-08-04

Brajot FX, D Lawrence (2018)

Delay-induced low-frequency modulation of the voice during sustained phonation.

The Journal of the Acoustical Society of America, 144(1):282.

An important property of negative feedback systems is the tendency to oscillate when feedback is delayed. This paper evaluated this phenomenon in a sustained phonation task, where subjects prolonged a vowel with 0-600 ms delays in auditory feedback. This resulted in a delay-dependent vocal wow: from 0.4 to 1 Hz fluctuations in fundamental frequency and intensity that increased in period and amplitude as the delay increased. A similar modulation in low-frequency oscillations was not observed in the first two formant frequencies, although some subjects did display increased variability. Results suggest that delayed auditory feedback enhances an existing periodic fluctuation in the voice, with a more complex, possibly indirect, influence on supraglottal articulation. These findings have important implications for understanding how speech may be affected by artificially applied or disease-based delays in sensory feedback.

RevDate: 2018-08-03

Souza P, Wright R, Gallun F, et al (2018)

Reliability and Repeatability of the Speech Cue Profile.

Journal of speech, language, and hearing research : JSLHR pii:2696625 [Epub ahead of print].

Purpose: Researchers have long noted speech recognition variability that is not explained by the pure-tone audiogram. Previous work (Souza, Wright, Blackburn, Tatman, & Gallun, 2015) demonstrated that a small number of listeners with sensorineural hearing loss utilized different types of acoustic cues to identify speechlike stimuli, specifically the extent to which the participant relied upon spectral (or temporal) information for identification. Consistent with recent calls for data rigor and reproducibility, the primary aims of this study were to replicate the pattern of cue use in a larger cohort and to verify stability of the cue profiles over time.

Method: Cue-use profiles were measured for adults with sensorineural hearing loss using a syllable identification task consisting of synthetic speechlike stimuli in which spectral and temporal dimensions were manipulated along continua. For the first set, a static spectral shape varied from alveolar to palatal, and a temporal envelope rise time varied from affricate to fricative. For the second set, formant transitions varied from labial to alveolar and a temporal envelope rise time varied from approximant to stop. A discriminant feature analysis was used to determine to what degree spectral and temporal information contributed to stimulus identification. A subset of participants completed a 2nd visit using the same stimuli and procedures.

Results: When spectral information was static, most participants were more influenced by spectral than by temporal information. When spectral information was dynamic, participants demonstrated a balanced distribution of cue-use patterns, with nearly equal numbers of individuals influenced by spectral or temporal cues. Individual cue profile was repeatable over a period of several months.

Conclusion: In combination with previously published data, these results indicate that listeners with sensorineural hearing loss are influenced by different cues to identify speechlike sounds and that those patterns are stable over time.

RevDate: 2018-07-28

Anikin A (2018)

Soundgen: An open-source tool for synthesizing nonverbal vocalizations.

Behavior research methods pii:10.3758/s13428-018-1095-7 [Epub ahead of print].

Voice synthesis is a useful method for investigating the communicative role of different acoustic features. Although many text-to-speech systems are available, researchers of human nonverbal vocalizations and bioacousticians may profit from a dedicated simple tool for synthesizing and manipulating natural-sounding vocalizations. Soundgen (https://CRAN.R-project.org/package=soundgen) is an open-source R package that synthesizes nonverbal vocalizations based on meaningful acoustic parameters, which can be specified from the command line or in an interactive app. This tool was validated by comparing the perceived emotion, valence, arousal, and authenticity of 60 recorded human nonverbal vocalizations (screams, moans, laughs, and so on) and their approximate synthetic reproductions. Each synthetic sound was created by manually specifying only a small number of high-level control parameters, such as syllable length and a few anchors for the intonation contour. Nevertheless, the valence and arousal ratings of synthetic sounds were similar to those of the original recordings, and the authenticity ratings were comparable, maintaining parity with the originals for less complex vocalizations. Manipulating the precise acoustic characteristics of synthetic sounds may shed light on the salient predictors of emotion in the human voice. More generally, soundgen may prove useful for any studies that require precise control over the acoustic features of nonspeech sounds, including research on animal vocalizations and auditory perception.

RevDate: 2018-07-26

Hînganu MV, Hînganu D, Cozma SR, et al (2018)

Morphofunctional evaluation of buccopharyngeal space using three-dimensional cone-beam computed tomography (3D-CBCT).

Annals of anatomy = Anatomischer Anzeiger : official organ of the Anatomische Gesellschaft pii:S0940-9602(18)30091-8 [Epub ahead of print].

The present study aims to identify the anatomical-functional changes of the buccopharyngeal space in singers with canto voice. Interest in this field is particularly important in view of the relation between artistic performance level, phoniatrics and functional anatomy, as the voice formation mechanism is not yet completely known. We conducted a morphometric study on three soprano voices that differ in type and training level. The anatomical soft structures of the superior vocal formant of each soprano were measured on images captured using the cone-beam computed tomography (CBCT) technique. The results obtained, as well as the 3D reconstructions, emphasize the particularities of the individual morphological features, especially in the case of the experienced soprano soloist, which are found to be different for each anatomical soft structure, as well as for their integrity. The experimental results are encouraging and suggest further development of this study on soprano voices and also on other types of opera voices.

RevDate: 2018-07-23

Whalen DH, Chen WR, Tiede MK, et al (2018)

Variability of articulator positions and formants across nine English vowels.

Journal of phonetics, 68:1-14.

Speech, though communicative, is quite variable both in articulation and acoustics, and it has often been claimed that articulation is more variable. Here we compared variability in articulation and acoustics for 32 speakers in the x-ray microbeam database (XRMB; Westbury, 1994). Variability in tongue, lip and jaw positions for nine English vowels (/u, ʊ, æ, ɑ, ʌ, ɔ, ε, ɪ, i/) was compared to that of the corresponding formant values. The domains were made comparable by creating three-dimensional spaces for each: the first three principal components from an analysis of a 14-dimensional space for articulation, and an F1×F2×F3 space for acoustics. More variability occurred in the articulation than the acoustics for half of the speakers, while the reverse was true for the other half. Individual tokens were further from the articulatory median than the acoustic median for 40-60% of tokens across speakers. A separate analysis of three non-low front vowels (/ε, ɪ, i/, for which the XRMB system provides the most direct articulatory evidence) did not differ from the omnibus analysis. Speakers tended to be either more or less variable consistently across vowels. Across speakers, there was a positive correlation between articulatory and acoustic variability, both for all vowels and for just the three non-low front vowels. Although the XRMB is an incomplete representation of articulation, it nonetheless provides data for direct comparisons between articulatory and acoustic variability that have not been reported previously. The results indicate that articulation is not more variable than acoustics, that speakers had relatively consistent variability across vowels, and that articulatory and acoustic variability were related for the vowels themselves.
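
The comparability trick described here (three principal components of a 14-dimensional articulatory space versus the three-formant acoustic space) is easy to mock up. The sketch below uses synthetic placeholder arrays rather than XRMB data, and a robust median-based dispersion measure as one plausible choice.

```python
# Sketch of the paper's comparability approach: project a 14-dimensional
# articulatory space onto its first 3 principal components, then compare
# token dispersion there with dispersion in the F1xF2xF3 space. Arrays are
# synthetic placeholders for pellet positions and measured formants.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
artic = rng.normal(size=(500, 14))      # stand-in articulator coordinates
acoust = rng.normal(size=(500, 3))      # stand-in F1, F2, F3 values

artic3 = PCA(n_components=3).fit_transform(artic)

def zscore(x):
    # Comparing spreads requires a common scale across domains.
    return (x - x.mean(axis=0)) / x.std(axis=0)

def dispersion(x):
    """Median distance of tokens from the median point (robust spread)."""
    med = np.median(x, axis=0)
    return float(np.median(np.linalg.norm(x - med, axis=1)))

print(dispersion(zscore(artic3)), dispersion(zscore(acoust)))
```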

RevDate: 2018-07-12

Z Barakzai S, Wells J, Parkin TDH, et al (2018)

Overground endoscopic findings and respiratory sound analysis in horses with recurrent laryngeal neuropathy after unilateral laser ventriculocordectomy.

Equine veterinary journal [Epub ahead of print].

BACKGROUND: Unilateral ventriculocordectomy (VeC) is frequently performed, yet objective studies in horses with naturally occurring recurrent laryngeal neuropathy (RLN) are few.

OBJECTIVES: To evaluate respiratory noise and exercising over-ground endoscopy in horses with grade B and C laryngeal function, before and after unilateral laser VeC.

STUDY DESIGN: Prospective study in clinically affected client-owned horses.

METHODS: Exercising endoscopy was performed and concurrent respiratory noise was recorded. A left-sided laser VeC was performed under standing sedation. Owners were asked to present the horse for re-examination 6-8 weeks post-operatively, when exercising endoscopy and sound recordings were repeated. Exercising endoscopic findings were recorded, including the degree of arytenoid stability. Quantitative measurements of the left-to-right quotient angle ratio (LRQ) and rima glottidis area ratio (RGA) were performed pre- and post-operatively. Sound analysis was performed, and measurements of the energy change in the F1, F2 and F3 formants between pre- and post-operative recordings were made and statistically analysed.

RESULTS: Three grade B and 7 grade C horses were included; 6/7 grade C horses pre-operatively had bilateral vocal fold collapse (VFC) and 5/7 had mild right-sided medial deviation of the ary-epiglottic fold (MDAF). Right VFC and MDAF were still present in these horses post-operatively; grade B horses had no other endoscopic dynamic abnormalities post-operatively. Sound analysis showed a significant reduction in energy in formant F2 (P = 0.05) after surgery.

MAIN LIMITATIONS: The study sample size was small and multiple dynamic abnormalities made sound analysis challenging.

CONCLUSIONS: RLN-affected horses have reduced sound levels in F2 after unilateral laser VeC. Continuing noise may be caused by other ongoing forms of dynamic obstruction in grade C horses. Unilateral VeC is useful for grade B horses based on endoscopic images. In grade C horses, bilateral VeC, right ary-epiglottic fold resection ± laryngoplasty might be a better option than unilateral VeC.

RevDate: 2018-07-11

Buzaneli ECP, Zenari MS, Kulcsar MAV, et al (2018)

Supracricoid Laryngectomy: The Function of the Remaining Arytenoid in Voice and Swallowing.

International archives of otorhinolaryngology, 22(3):303-312.

Introduction: Supracricoid laryngectomy still has selected indications; there are few studies in the literature, and the case series are limited, a fact that stimulates the development of new studies to further elucidate the structural and functional aspects of the procedure. Objective: To assess voice and deglutition parameters according to the number of preserved arytenoids. Methods: Eleven patients who underwent subtotal laryngectomy with cricohyoidoepiglottopexy were evaluated by laryngeal nasofibroscopy, videofluoroscopy, and auditory-perceptual, acoustic, and voice pleasantness analyses, after resuming oral feeding. Results: Functional abnormalities were detected in two out of the three patients who underwent arytenoidectomy, and in six patients from the remainder of the sample. Almost half of the sample presented silent laryngeal penetration and/or vallecular/hypopharyngeal stasis on the videofluoroscopy. The mean voice analysis scores indicated moderate vocal deviation, roughness, and breathiness; severe strain and loudness deviation; shorter maximum phonation time; the presence of noise; and high third and fourth formant values. The voices were rated as unpleasant. There was no difference in the number and functionality of the remaining arytenoids as prognostic factors for deglutition; however, in the qualitative analysis, favorable voice and deglutition outcomes were more common among patients who did not undergo arytenoidectomy and had normal functional conditions. Conclusion: The number and functionality of the preserved arytenoids were not found to be prognostic factors for favorable deglutition efficiency outcomes. However, the qualitative analysis showed that the preservation of both arytenoids and the absence of functional abnormalities were associated with more satisfactory voice and deglutition patterns.

RevDate: 2018-07-01

El Boghdady N, Başkent D, E Gaudrain (2018)

Effect of frequency mismatch and band partitioning on vocal tract length perception in vocoder simulations of cochlear implant processing.

The Journal of the Acoustical Society of America, 143(6):3505.

The vocal tract length (VTL) of a speaker is an important voice cue that aids speech intelligibility in multi-talker situations. However, cochlear implant (CI) users demonstrate poor VTL sensitivity. This may be partially caused by the mismatch between frequencies received by the implant and those corresponding to places of stimulation along the cochlea. This mismatch can distort formant spacing, where VTL cues are encoded. In this study, the effects of frequency mismatch and band partitioning on VTL sensitivity were investigated in normal hearing listeners with vocoder simulations of CI processing. The hypotheses were that VTL sensitivity may be reduced by increased frequency mismatch and insufficient spectral resolution in how the frequency range is partitioned, specifically where formants lie. Moreover, optimal band partitioning might mitigate the detrimental effects of frequency mismatch on VTL sensitivity. Results showed that VTL sensitivity decreased with increased frequency mismatch and reduced spectral resolution near the low frequencies of the band partitioning map. The effect of band partitioning was independent of mismatch, indicating that if a given partitioning is suboptimal, a better partitioning might improve VTL sensitivity regardless of the degree of mismatch. These findings suggest that customizing the frequency partitioning map may enhance VTL perception in individual CI users.
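For readers unfamiliar with vocoder simulations of CI processing, here is a minimal noise-vocoder sketch. The 8-channel logarithmic partitioning is an assumption; a frequency mismatch of the kind studied above could be simulated by re-imposing each envelope on a carrier band shifted upward relative to its analysis band:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(x, fs, edges):
    """Channel vocoder: band-split, extract envelopes, re-impose on noise bands."""
    out = np.zeros(len(x))
    noise = np.random.randn(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfilt(sos, x)))   # band envelope of the speech
        out += env * sosfilt(sos, noise)         # envelope-modulated noise band
    return out

fs = 16000
edges = np.geomspace(150, 7000, 9)   # assumed 8-channel logarithmic partitioning
x = np.random.randn(fs)              # stand-in for a speech signal
y = noise_vocode(x, fs, edges)
```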

RevDate: 2018-07-01

Vikram CM, Macha SK, Kalita S, et al (2018)

Acoustic analysis of misarticulated trills in cleft lip and palate children.

The Journal of the Acoustical Society of America, 143(6):EL474.

In this paper, acoustic analysis of misarticulated trills in cleft lip and palate speakers is carried out using excitation-source features (strength of excitation and fundamental frequency, derived from the zero-frequency filtered signal) and vocal tract system features (first formant frequency (F1) and trill frequency, derived from linear prediction analysis and an autocorrelation approach, respectively). These features are found to be statistically significant in discriminating normal from misarticulated trills. Using these acoustic features, a dynamic time warping (DTW)-based trill misarticulation detection system is demonstrated. The performance of the proposed system in terms of F1-score is 73.44%, versus 66.11% for conventional Mel-frequency cepstral coefficients.
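Dynamic time warping aligns two feature trajectories that may differ in length. A self-contained sketch of the classic dynamic-programming recursion; the feature tracks below are random stand-ins, not the paper's features:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature sequences
    (rows are frames), e.g. per-frame F1 and trill-frequency tracks."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

ref = np.random.randn(60, 2)    # stand-in reference (normal trill) track
test = np.random.randn(75, 2)   # stand-in test utterance track
print(dtw_distance(ref, test))  # large distances would flag misarticulation
```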

RevDate: 2018-07-17

Ng ML, Yan N, Chan V, et al (2018)

A Volumetric Analysis of the Vocal Tract Associated with Laryngectomees Using Acoustic Reflection Technology.

Folia phoniatrica et logopaedica : official organ of the International Association of Logopedics and Phoniatrics (IALP), 70(1):44-49.

OBJECTIVE: Previous studies of the laryngectomized vocal tract using formant frequencies reported contradictory findings. Imaging studies of the vocal tract in alaryngeal speakers are limited due to possible radiation effects as well as the cost and time associated with such studies. The present study examined the vocal tract configuration of laryngectomized individuals using acoustic reflection technology.

SUBJECTS AND METHODS: Thirty alaryngeal and 30 laryngeal male speakers of Cantonese participated in the study. A pharyngometer was used to obtain volumetric information of the vocal tract. All speakers were instructed to imitate the production of /a/ when the length and volume information of the oral cavity, pharyngeal cavity, and the entire vocal tract were obtained. The data of alaryngeal and laryngeal speakers were compared.

RESULTS: Pharyngometric measurements revealed no significant difference in the vocal tract dimensions between laryngeal and alaryngeal speakers.

CONCLUSION: Despite the removal of the larynx and a possible alteration in the pharyngeal cavity during total laryngectomy, the vocal tract configuration (length and volume) in laryngectomized individuals was not significantly different from laryngeal speakers. It is suggested that other factors might have affected formant measures in previous studies.

RevDate: 2018-06-26

Reby D, Wyman MT, Frey R, et al (2018)

Vocal tract modelling in fallow deer: are male groans nasalized?

The Journal of experimental biology pii:jeb.179416 [Epub ahead of print].

Males of several species of deer have a descended and mobile larynx, resulting in an unusually long vocal tract, which can be further extended by lowering the larynx during call production. Formant frequencies are lowered as the vocal tract is extended, as predicted when approximating the vocal tract as a uniform quarter-wavelength resonator. However, formant frequencies in polygynous deer follow uneven distribution patterns, indicating that the vocal tract configuration may in fact be rather complex. We CT-scanned the head and neck region of two adult male fallow deer specimens with artificially extended vocal tracts and measured the cross-sectional areas of the supra-laryngeal vocal tract along the oral and nasal tracts. The CT data were then used to predict the resonances produced by three possible configurations: the oral vocal tract only, the nasal vocal tract only, or both combined. We found that the area functions from the combined oral and nasal vocal tracts produced resonances more closely matching the formant pattern and scaling observed in fallow deer groans than those predicted by the area functions of the oral or nasal vocal tract alone. This indicates that the nasal and oral vocal tracts are both simultaneously involved in the production of a nonhuman mammal vocalisation, and suggests that the potential for nasalization in putative oral loud-calls should be carefully considered.
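The quarter-wavelength approximation mentioned above predicts formants at odd multiples of c/4L. A worked example; the 0.60 m extended tract length and 350 m/s sound speed are illustrative assumptions, not values from the paper:

```python
# Formants of a uniform tube closed at the glottis and open at the lips:
# F_n = (2n - 1) * c / (4 * L).
c = 350.0   # assumed speed of sound in warm, humid air (m/s)
L = 0.60    # assumed extended fallow deer vocal tract length (m)
for n in range(1, 5):
    print(f"F{n} = {(2 * n - 1) * c / (4 * L):.0f} Hz")
```

Under these assumptions the predicted formants fall at roughly 146, 438, 729, and 1021 Hz, i.e. evenly spaced; the uneven spacing observed in real groans is what motivates the combined oral-plus-nasal model.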

RevDate: 2018-06-25

Yilmaz A, Sarac ET, Aydinli FE, et al (2018)

Investigating the effect of STN-DBS stimulation and different frequency settings on the acoustic-articulatory features of vowels.

Neurological sciences : official journal of the Italian Neurological Society and of the Italian Society of Clinical Neurophysiology pii:10.1007/s10072-018-3479-y [Epub ahead of print].

INTRODUCTION: Parkinson's disease (PD) is the second most frequent progressive neurodegenerative disorder. In addition to motor symptoms, nonmotor symptoms as well as voice and speech disorders develop in up to 90% of PD patients. The aim of our study was to investigate the effects of DBS and of different DBS frequencies on the speech acoustics of vowels in PD patients.

METHODS: The study included 16 patients who underwent STN-DBS surgery for PD. Voice recordings of the vowels [a], [e], [i], and [o] were made at stimulation frequencies of 230, 130, 90, and 60 Hz and during off-stimulation. The recordings were analyzed with the Praat software, and the effects on the first (F1), second (F2), and third (F3) formant frequencies were examined.

RESULTS: A significant difference was found for the F1 value of the vowel [a] at 130 Hz compared to off-stimulation. However, no significant difference was found between the three formant frequencies with regard to the stimulation frequencies and off-stimulation. In addition, though not statistically significant, stimulation at 60 and 230 Hz led to several differences in the formant frequencies of other three vowels.

CONCLUSION: Our results indicated that STN-DBS stimulation at 130 Hz had a significant positive effect on the articulation of [a] compared to off-stimulation. Although not statistically significant, stimulation at 60 and 230 Hz may also affect the articulation of [e], [i], and [o]; this effect needs to be investigated in future studies with larger numbers of participants.
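Formant measurements of this kind are commonly scripted through Praat's Python interface. A minimal sketch using the praat-parselmouth package; the filename is hypothetical, and the analysis settings shown are generic defaults rather than the study's:

```python
# pip install praat-parselmouth
import parselmouth

snd = parselmouth.Sound("vowel_a_130Hz.wav")   # hypothetical recording
formants = snd.to_formant_burg(time_step=0.01, max_number_of_formants=5)

t_mid = snd.duration / 2                        # sample at the vowel midpoint
for n in (1, 2, 3):
    print(f"F{n} = {formants.get_value_at_time(n, t_mid):.0f} Hz")
```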

RevDate: 2018-06-17

Dietrich S, Hertrich I, Müller-Dahlhaus F, et al (2018)

Reduced Performance During a Sentence Repetition Task by Continuous Theta-Burst Magnetic Stimulation of the Pre-supplementary Motor Area.

Frontiers in neuroscience, 12:361.

The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that the left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of the pre-SMA was tested for accelerated speech comprehension by inducing a transient "virtual lesion" using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of the pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while related words decreased. Thus, the transient impairment of the pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in the case of perception, prior to encoding into a semantically/pragmatically meaningful message.

RevDate: 2018-06-17

Kent RD, HK Vorperian (2018)

Static measurements of vowel formant frequencies and bandwidths: A review.

Journal of communication disorders, 74:74-97.

PURPOSE: Data on vowel formants have been derived primarily from static measures representing an assumed steady state. This review summarizes data on formant frequencies and bandwidths for American English and also addresses (a) sources of variability (focusing on speech sample and time sampling point), and (b) methods of data reduction such as vowel area and dispersion.

METHOD: Searches were conducted with CINAHL, Google Scholar, MEDLINE/PubMed, SCOPUS, and other online sources including legacy articles and references. The primary search items were vowels, vowel space area, vowel dispersion, formants, formant frequency, and formant bandwidth.

RESULTS: Data on formant frequencies and bandwidths are available for both sexes over the lifespan, but considerable variability in results across studies affects even features of the basic vowel quadrilateral. Origins of variability likely include differences in speech sample and time sampling point. The data reveal the emergence of sex differences by 4 years of age, maturational reductions in formant bandwidth, and decreased formant frequencies with advancing age in some persons. It appears that a combination of methods of data reduction provides for optimal data interpretation.

CONCLUSION: The lifespan database on vowel formants shows considerable variability within specific age-sex groups, pointing to the need for standardized procedures.

RevDate: 2018-06-09

Horáček J, Radolf V, AM Laukkanen (2018)

Impact Stress in Water Resistance Voice Therapy: A Physical Modeling Study.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(17)30463-0 [Epub ahead of print].

OBJECTIVES: Phonation through a tube in water is used in voice therapy. This study investigates whether this exercise may increase mechanical loading on the vocal folds.

STUDY DESIGN: This is an experimental modeling study.

METHODS: A model with a three-layer silicone vocal fold replica and a plexiglass vocal tract (MK Plexi, Prague) set for the articulation of the vowel [u:] was used. Impact stress (IS) was measured in three conditions for [u:]: (1) without a tube, (2) with a silicone Lax Vox tube (35 cm in length, 1 cm in inner diameter) immersed 2 cm in water, and (3) with the tube immersed 10 cm in water. Subglottic pressure and airflow ranges were selected to correspond to those reported in normal human phonation.

RESULTS: Phonation threshold pressure was lower for phonation into water compared with [u:] without a tube. IS increased with the airflow rate. IS measured in the range of subglottic pressure, which corresponds to measurements in humans, was highest for vowel [u:] without a tube and lower with the tube in water.

CONCLUSIONS: Even though the model and humans cannot be directly compared, for instance due to differences in vocal tract wall properties, the results suggest that IS is not likely to increase harmfully in water resistance therapy. However, there may be other effects related to it, possibly causing symptoms of vocal fatigue (eg, increased activity in the adductors or high amplitudes of oral pressure variation probably capable of increasing stress in the vocal fold). These need to be studied further, especially for cases where the water bubbling frequency is close to the acoustical-mechanical resonance and at the same time the fundamental phonation frequency is near the first formant frequency of the system.

RevDate: 2018-07-17

Bauerly KR (2018)

The Effects of Emotion on Second Formant Frequency Fluctuations in Adults Who Stutter.

Folia phoniatrica et logopaedica : official organ of the International Association of Logopedics and Phoniatrics (IALP), 70(1):13-23.

OBJECTIVE: Changes in second formant frequency fluctuations (FFF2) were examined in adults who stutter (AWS) and adults who do not stutter (ANS) when producing nonwords under varying emotional conditions.

METHODS: Ten AWS and 10 ANS viewed images selected from the International Affective Picture System representing dimensions of arousal (e.g., excited versus bored) and hedonic valence (e.g., happy versus sad). Immediately following picture presentation, participants produced a consonant-vowel + final /t/ (CVt) nonword consisting of the initial sounds /p/, /b/, /s/, or /z/, followed by a vowel (/i/, /u/, /ε/) and a final /t/. CVt tokens were assessed for word duration and FFF2.

RESULTS: Significantly longer word durations were found in the AWS compared to the ANS across conditions. Although these differences appeared to increase under arousing conditions, no interaction was found. Results for FFF2 revealed a significant group-by-condition interaction. Post hoc analysis indicated that this was due to the AWS showing significantly greater FFF2 when speaking under conditions eliciting increases in arousal and unpleasantness. The ANS showed little change in FFF2 across conditions.

CONCLUSIONS: The results suggest that AWS' articulatory stability is more susceptible to breakdown under negative emotional influences.

RevDate: 2018-07-15

Fisher JM, Dick FK, Levy DF, et al (2018)

Neural representation of vowel formants in tonotopic auditory cortex.

NeuroImage, 178:574-582.

Speech sounds are encoded by distributed patterns of activity in bilateral superior temporal cortex. However, it is unclear whether speech sounds are topographically represented in cortex, or which acoustic or phonetic dimensions might be spatially mapped. Here, using functional MRI, we investigated the potential spatial representation of vowels, which are largely distinguished from one another by the frequencies of their first and second formants, i.e. peaks in their frequency spectra. This allowed us to generate clear hypotheses about the representation of specific vowels in tonotopic regions of auditory cortex. We scanned participants as they listened to multiple natural tokens of the vowels [ɑ] and [i], which we selected because their first and second formants overlap minimally. Formant-based regions of interest were defined for each vowel based on spectral analysis of the vowel stimuli and independently acquired tonotopic maps for each participant. We found that perception of [ɑ] and [i] yielded differential activation of tonotopic regions corresponding to formants of [ɑ] and [i], such that each vowel was associated with increased signal in tonotopic regions corresponding to its own formants. This pattern was observed in Heschl's gyrus and the superior temporal gyrus, in both hemispheres, and for both the first and second formants. Using linear discriminant analysis of mean signal change in formant-based regions of interest, the identity of untrained vowels was predicted with ∼73% accuracy. Our findings show that cortical encoding of vowels is scaffolded on tonotopy, a fundamental organizing principle of auditory cortex that is not language-specific.

RevDate: 2018-06-02

Dubey AK, Tripathi A, Prasanna SRM, et al (2018)

Detection of hypernasality based on vowel space area.

The Journal of the Acoustical Society of America, 143(5):EL412.

This study proposes a method for differentiating hypernasal speech from normal speech using the vowel space area (VSA). Hypernasality introduces extra formant and anti-formant pairs into the vowel spectrum, which results in a shifting of formants. This shifting affects the size of the VSA, and the results show that the VSA is reduced in hypernasal speech compared to normal speech. Combining the VSA feature with Mel-frequency cepstral coefficients in a support vector machine-based hypernasality detector leads to an accuracy of 86.89% for sustained vowels and 89.47%, 90.57%, and 91.70% for vowels in the context of the high-pressure consonants /k/, /p/, and /t/, respectively.
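The VSA itself reduces to a polygon area over corner-vowel formants in the F1-F2 plane. A sketch using the shoelace formula; the formant values are illustrative, and the paper's exact vowel set and classifier pipeline may differ:

```python
import numpy as np

def polygon_area(f1, f2):
    """Shoelace area of the vowel polygon in the F1-F2 plane (Hz^2)."""
    f1, f2 = np.asarray(f1, float), np.asarray(f2, float)
    return 0.5 * abs(np.dot(f1, np.roll(f2, 1)) - np.dot(f2, np.roll(f1, 1)))

# Hypothetical corner-vowel formants (Hz) for /i/, /a/, /u/.
f1 = [300, 750, 350]
f2 = [2300, 1300, 800]
print(polygon_area(f1, f2))  # smaller areas would be expected for hypernasal speech
```

The resulting scalar can then be stacked alongside MFCCs as one more feature dimension for the SVM classifier described above.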

RevDate: 2018-06-05

Story BH, Vorperian HK, Bunton K, et al (2018)

An age-dependent vocal tract model for males and females based on anatomic measurements.

The Journal of the Acoustical Society of America, 143(5):3079.

The purpose of this study was to take a first step toward constructing a developmental and sex-specific version of a parametric vocal tract area function model representative of male and female vocal tracts ranging in age from infancy to 12 yrs, as well as adults. Anatomic measurements collected from a large imaging database of male and female children and adults provided the dataset from which length warping and cross-dimension scaling functions were derived, and applied to the adult-based vocal tract model to project it backward along an age continuum. The resulting model was assessed qualitatively by projecting hypothetical vocal tract shapes onto midsagittal images from the cohort of children, and quantitatively by comparison of formant frequencies produced by the model to those reported in the literature. An additional validation of modeled vocal tract shapes was made possible by comparison to cross-sectional area measurements obtained for children and adults using acoustic pharyngometry. This initial attempt to generate a sex-specific developmental vocal tract model paves a path to study the relation of vocal tract dimensions to documented prepubertal acoustic differences.

RevDate: 2018-06-02

Carignan C (2018)

Using ultrasound and nasalance to separate oral and nasal contributions to formant frequencies of nasalized vowels.

The Journal of the Acoustical Society of America, 143(5):2588.

The experimental method described in this manuscript offers a possible means to address a well known issue in research on the independent effects of nasalization on vowel acoustics: given that the separate transfer functions associated with the oral and nasal cavities are merged in the acoustic signal, the task of teasing apart the respective effects of the two cavities seems to be an intractable problem. The proposed method uses ultrasound and nasalance to predict the effect of lingual configuration on formant frequencies of nasalized vowels, thus accounting for acoustic variation due to changing lingual posture and excluding its contribution to the acoustic signal. The results reveal that the independent effect of nasalization on the acoustic vowel quadrilateral resembles a counter-clockwise chain shift of nasal compared to non-nasal vowels. The results from the productions of 11 vowels by six speakers of different language backgrounds are compared to predictions presented in previous modeling studies, as well as discussed in the light of sound change of nasal vowel systems.

RevDate: 2018-05-31

Romanelli S, Menegotto A, R Smyth (2018)

Stress-Induced Acoustic Variation in L2 and L1 Spanish Vowels.

Phonetica pii:000484611 [Epub ahead of print].

AIM: We assessed the effect of lexical stress on the duration and quality of Spanish word-final vowels /a, e, o/ produced by American English late intermediate learners of L2 Spanish, as compared to those of native L1 Argentine Spanish speakers.

METHODS: Participants read 54 real words ending in /a, e, o/, with either final or penultimate lexical stress, embedded in a text and a word list. We measured vowel duration and both F1 and F2 frequencies at 3 temporal points.

RESULTS: Stressed vowels were longer than unstressed vowels in both L1 and L2 Spanish. L1 and L2 Spanish stressed /a/ and /e/ had higher F1 values than their unstressed counterparts. Only the L2 speakers showed evidence of rising offglides for /e/ and /o/. The L2 and L1 Spanish vowel space was compressed in the absence of stress.

CONCLUSION: Lexical stress affected the vowel quality of L1 and L2 Spanish vowels. We provide an up-to-date account of the formant trajectories of Argentine River Plate Spanish word-final /a, e, o/ and offer experimental support to the claim that stress affects the quality of Spanish vowels in word-final contexts.

RevDate: 2018-05-25

Peter V, Kalashnikova M, D Burnham (2018)

Weighting of Amplitude and Formant Rise Time Cues by School-Aged Children: A Mismatch Negativity Study.

Journal of speech, language, and hearing research : JSLHR, 61(5):1322-1333.

Purpose: An important skill in the development of speech perception is to apply optimal weights to acoustic cues so that phonemic information is recovered from speech with minimum effort. Here, we investigated the development of acoustic cue weighting of amplitude rise time (ART) and formant rise time (FRT) cues in children as measured by mismatch negativity (MMN).

Method: Twelve adults and 36 children aged 6-12 years listened to a /ba/-/wa/ contrast in an oddball paradigm in which the standard stimulus had the ART and FRT cues of /ba/. In different blocks, the deviant stimulus had either the ART or FRT cues of /wa/.

Results: The results revealed that children younger than 10 years were sensitive to both ART and FRT cues whereas 10- to 12-year-old children and adults were sensitive only to FRT cues. Moreover, children younger than 10 years generated a positive mismatch response, whereas older children and adults generated MMN.

Conclusion: These results suggest that preattentive adultlike weighting of ART and FRT cues is attained only by 10 years of age and accompanies the change from mismatch response to the more mature MMN response.

Supplemental Material: https://doi.org/10.23641/asha.6207608.

RevDate: 2018-06-21

Redford MA (2018)

Grammatical Word Production Across Metrical Contexts in School-Aged Children's and Adults' Speech.

Journal of speech, language, and hearing research : JSLHR, 61(6):1339-1354.

Purpose: The purpose of this study is to test whether age-related differences in grammatical word production are due to differences in how children and adults chunk speech for output or to immature articulatory timing control in children.

Method: Two groups of 12 children, 5 and 8 years old, and 1 group of 12 adults produced sentences with phrase-medial determiners. Preceding verbs were varied to create different metrical contexts for chunking the determiner with an adjacent content word. Following noun onsets were varied to assess the coherence of determiner-noun sequences. Determiner vowel duration, amplitude, and formant frequencies were measured.

Results: Children produced significantly longer and louder determiners than adults regardless of metrical context. The effect of noun onset on F1 was stronger in children's speech than in adults' speech; the effect of noun onset on F2 was stronger in adults' speech than in children's. Effects of metrical context on anticipatory formant patterns were more evident in children's speech than in adults' speech.

Conclusion: The results suggest that both immature articulatory timing control and age-related differences in how chunks are accessed or planned influence grammatical word production in school-aged children's speech. Future work will focus on the development of long-distance coarticulation to reveal the evolution of speech plan structure over time.

RevDate: 2018-05-24

Dugan SH, Silbert N, McAllister T, et al (2018)

Modelling category goodness judgments in children with residual sound errors.

Clinical linguistics & phonetics [Epub ahead of print].

This study investigates category goodness judgments of /r/ in adults and children with and without residual speech errors (RSEs) using natural speech stimuli. Thirty adults, 38 children with RSE (ages 7-16) and 35 age-matched typically developing (TD) children provided category goodness judgments on whole words, recorded from 27 child speakers, with /r/ in various phonetic environments. The salient acoustic property of /r/ - the lowered third formant (F3) - was normalized in two ways. A logistic mixed-effect model quantified the relationships between listeners' responses and the third formant frequency, vowel context and clinical group status. Goodness judgments from the adult group showed a statistically significant interaction with the F3 parameter when compared to both child groups (p < 0.001) using both normalization methods. The RSE group did not differ significantly from the TD group in judgments of /r/. All listeners were significantly more likely to judge /r/ as correct in a front-vowel context. Our results suggest that normalized /r/ F3 is a statistically significant predictor of category goodness judgments for both adults and children, but children do not appear to make adult-like judgments. Category goodness judgments do not have a clear relationship with /r/ production abilities in children with RSE. These findings may have implications for clinical activities that include category goodness judgments in natural speech, especially for recorded productions.

RevDate: 2018-06-19

Tai HC, Shen YP, Lin JH, et al (2018)

Acoustic evolution of old Italian violins from Amati to Stradivari.

Proceedings of the National Academy of Sciences of the United States of America, 115(23):5926-5931.

The shape and design of the modern violin are largely influenced by two makers from Cremona, Italy: The instrument was invented by Andrea Amati and then improved by Antonio Stradivari. Although the construction methods of Amati and Stradivari have been carefully examined, the underlying acoustic qualities which contribute to their popularity are little understood. According to Geminiani, a Baroque violinist, the ideal violin tone should "rival the most perfect human voice." To investigate whether Amati and Stradivari violins produce voice-like features, we recorded the scales of 15 antique Italian violins as well as male and female singers. The frequency response curves are similar between the Andrea Amati violin and human singers, up to ∼4.2 kHz. By linear predictive coding analyses, the first two formants of the Amati exhibit vowel-like qualities (F1/F2 = 503/1,583 Hz), mapping to the central region on the vowel diagram. Its third and fourth formants (F3/F4 = 2,602/3,731 Hz) resemble those produced by male singers. Using F1 to F4 values to estimate the corresponding vocal tract length, we observed that antique Italian violins generally resemble basses/baritones, but Stradivari violins are closer to tenors/altos. Furthermore, the vowel qualities of Stradivari violins show reduced backness and height. The unique formant properties displayed by Stradivari violins may represent the acoustic correlate of their distinctive brilliance perceived by musicians. Our data demonstrate that the pioneering designs of Cremonese violins exhibit voice-like qualities in their acoustic output.
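Estimating vocal tract length from formants is commonly done with the uniform quarter-wavelength relation. A crude sketch using the Amati F1-F4 values quoted above; the paper's exact estimator may differ, and c = 350 m/s is an assumption:

```python
# Per-formant length estimates from F_n = (2n - 1) * c / (4 * L),
# i.e. L = (2n - 1) * c / (4 * F_n).
c = 350.0  # assumed speed of sound (m/s)
formants = [503.0, 1583.0, 2602.0, 3731.0]  # Amati F1-F4 from the abstract (Hz)
estimates = [(2 * n - 1) * c / (4 * f) for n, f in enumerate(formants, start=1)]
print([round(100 * L, 1) for L in estimates])  # per-formant lengths in cm
print(round(100 * sum(estimates) / len(estimates), 1), "cm (mean)")
```

With these inputs the per-formant estimates cluster around 17 cm, a human-scale vocal tract length, consistent with the voice-like account of the Cremonese instruments given above.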

RevDate: 2018-05-21

Niemczak CE, KR Vander Werff (2018)

Informational Masking Effects on Neural Encoding of Stimulus Onset and Acoustic Change.

Ear and hearing [Epub ahead of print].

OBJECTIVE: Recent investigations using cortical auditory evoked potentials have shown masker-dependent effects on sensory cortical processing of speech information. Background noise maskers consisting of other people talking are particularly difficult for speech recognition. Behavioral studies have related this to perceptual masking, or informational masking, beyond just the overlap of the masker and target at the auditory periphery. The aim of the present study was to use cortical auditory evoked potentials to examine how maskers (i.e., continuous speech-shaped noise [SSN] and multi-talker babble) affect the cortical sensory encoding of speech information at an obligatory level of processing. Specifically, cortical responses to vowel onset and formant change were recorded under different background noise conditions presumed to represent varying amounts of energetic or informational masking. The hypothesis was that, even at this obligatory cortical level of sensory processing, we would observe larger effects on the amplitude and latency of the onset and change components as the amount of informational masking increased across background noise conditions.

DESIGN: Onset and change responses were recorded to a vowel change from /u-i/ in young adults under four conditions: quiet, continuous SSN, eight-talker (8T) babble, and two-talker (2T) babble. Repeated measures analyses by noise condition were conducted on amplitude, latency, and response area measurements to determine the differential effects of these noise conditions, designed to represent increasing and varying levels of informational and energetic masking, on cortical neural representation of a vowel onset and acoustic change response waveforms.

RESULTS: All noise conditions significantly reduced onset N1 and P2 amplitudes, onset N1-P2 peak to peak amplitudes, as well as both onset and change response area compared with quiet conditions. Further, all amplitude and area measures were significantly reduced for the two babble conditions compared with continuous SSN. However, there were no significant differences in peak amplitude or area for either onset or change responses between the two different babble conditions (eight versus two talkers). Mean latencies for all onset peaks were delayed for noise conditions compared with quiet. However, in contrast to the amplitude and area results, differences in peak latency between SSN and the babble conditions did not reach statistical significance.

CONCLUSIONS: These results support the idea that while background noise maskers generally reduce amplitude and increase latency of speech-sound evoked cortical responses, the type of masking has a significant influence. Speech babble maskers (eight talkers and two talkers) have a larger effect on the obligatory cortical response to speech sound onset and change compared with purely energetic continuous SSN maskers, which may be attributed to informational masking effects. Neither the neural responses to the onset nor the vowel change, however, were sensitive to the hypothesized increase in the amount of informational masking between speech babble maskers with two talkers compared with eight talkers.

RevDate: 2018-05-17

Sóskuthy M, Foulkes P, Hughes V, et al (2018)

Changing Words and Sounds: The Roles of Different Cognitive Units in Sound Change.

Topics in cognitive science [Epub ahead of print].

This study considers the role of different cognitive units in sound change: phonemes, contextual variants and words. We examine /u/-fronting and /j/-dropping in data from three generations of Derby English speakers. We analyze dynamic formant data and auditory judgments, using mixed effects regression methods, including generalized additive mixed models (GAMMs). /u/-fronting is reaching its end-point, showing complex conditioning by context and a frequency effect that weakens over time. /j/-dropping is declining, with low-frequency words showing more innovative variants with /j/ than high-frequency words. The two processes interact: words with variable /j/-dropping (new) exhibit more fronting than words that never have /j/ (noodle) even when the /j/ is deleted. These results support models of change that rely on phonetically detailed representations for both word- and sound-level cognitive units.
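Generalized additive models fit the smooth, nonlinear formant trajectories used in analyses like the one above. Full GAMMs with by-speaker random smooths are typically fit in R's mgcv; as a rough fixed-smooth sketch in Python, the pyGAM package can be used (all data below are synthetic stand-ins):

```python
# pip install pygam
import numpy as np
from pygam import LinearGAM, s

# Hypothetical dynamic formant data: F2 sampled along the vowel's duration.
rng = np.random.default_rng(0)
t = rng.uniform(0, 1, 500)                                     # normalized time
f2 = 1200 + 400 * np.sin(np.pi * t) + rng.normal(0, 50, 500)   # stand-in F2 track

gam = LinearGAM(s(0)).fit(t.reshape(-1, 1), f2)   # smooth of F2 over time
gam.summary()                                     # prints smoothing diagnostics
```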

RevDate: 2018-05-16

Sanfins MD, Hatzopoulos S, Donadon C, et al (2018)

An Analysis of The Parameters Used In Speech ABR Assessment Protocols.

The journal of international advanced otology, 14(1):100-105.

The aim of this study was to assess the parameters of choice, such as duration, intensity, rate, polarity, number of sweeps, window length, stimulated ear, fundamental frequency, first formant, and second formant, from previously published speech ABR studies. To identify candidate articles, five databases were assessed using the following keyword descriptors: speech ABR, ABR-speech, speech auditory brainstem response, auditory evoked potential to speech, speech-evoked brainstem response, and complex sounds. The search identified 1288 articles published between 2005 and 2015. After filtering the total number of papers according to the inclusion and exclusion criteria, 21 studies were selected. Analyzing the protocol details used in 21 studies suggested that there is no consensus to date on a speech-ABR protocol and that the parameters of analysis used are quite variable between studies. This inhibits the wider generalization and extrapolation of data across languages and studies.

RevDate: 2018-05-15

Chen Y, Wang J, Chen W, et al (2017)

[Research on spectrum feature of speech processing strategy for cochlear implant].

Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi, 34(5):760-766.

Cochlear implants (CIs) in a Chinese-language environment lose pitch information, resulting in low speech recognition. In order to study a Chinese-feature-based speech processing strategy for cochlear implants and to improve speech recognition for CI recipients, we improved the CI front-end signal acquisition platform and investigated the signal features, including the waveform, spectrogram, energy intensity, pitch, and formant parameters, for different cochlear implant speech processing strategies. Features of two kinds of speech processing strategies were analyzed and extracted to study their parameter characteristics. The aim of this paper is thus to extend research on Chinese-based CI speech processing strategies.

RevDate: 2018-05-10

Wang Q, Bai J, Xue P, et al (2018)

[An acoustic-articulatory study of the nasal finals in students with and without hearing loss].

Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi, 35(2):198-205.

The central aim of this experiment was to compare the articulatory and acoustic characteristics of students with normal hearing (NH) and school-aged students with hearing loss (HL), and to explore articulatory-acoustic relations during the production of nasal finals. Fourteen HL students and 10 NH controls were enrolled in this study; the data of 4 HL students were removed because of their high pronunciation error rate. Data were collected using an electromagnetic articulograph. The acoustic and kinematic data of the nasal finals were extracted with phonetics and data processing software, and all data were analyzed by t test and correlation analysis. Statistically significant differences (P<0.05 or P<0.01) between the HL and NH groups were found across vowels in the first two formant frequencies (F1, F2), in tongue position, and in articulatory-acoustic relations. The HL group's vertical movement-F1 relations in /en/ and /eng/ were the same as in the NH group. These findings on participants with HL can support speech rehabilitation training aimed at increasing pronunciation accuracy.

RevDate: 2018-05-07

Lee Y, Kim G, Wang S, et al (2018)

Acoustic Characteristics in Epiglottic Cyst.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(17)30529-5 [Epub ahead of print].

OBJECTIVE: The purpose of this study was to analyze the acoustic characteristics associated with deformation of the vocal tract due to a large epiglottic cyst, and to confirm the relation between this anatomical change and the resonant function of the vocal tract.

METHODS: Eight men with epiglottic cyst were enrolled in this study. The jitter, shimmer, noise-to-harmonic ratio, and first two formants were analyzed in vowels /a:/, /e:/, /i:/, /o:/, and /u:/. These values were analyzed before and after laryngeal microsurgery.

RESULTS: The F1 value of /a:/ was significantly raised after surgery. No significant differences in the formant frequencies of the other vowels, or in jitter, shimmer, or noise-to-harmonic ratio, were found.

CONCLUSION: The results of this study could be used to analyze changes in the resonance of the vocal tract due to epiglottic cysts.

RevDate: 2018-05-25

Whitfield JA, Dromey C, P Palmer (2018)

Examining Acoustic and Kinematic Measures of Articulatory Working Space: Effects of Speech Intensity.

Journal of speech, language, and hearing research : JSLHR, 61(5):1104-1117.

Purpose: The purpose of this study was to examine the effect of speech intensity on acoustic and kinematic vowel space measures and conduct a preliminary examination of the relationship between kinematic and acoustic vowel space metrics calculated from continuously sampled lingual marker and formant traces.

Method: Young adult speakers produced 3 repetitions of 2 different sentences at 3 different loudness levels. Lingual kinematic and acoustic signals were collected and analyzed. Acoustic and kinematic variants of several vowel space metrics were calculated from the formant frequencies and the position of 2 lingual markers. Traditional metrics included triangular vowel space area and the vowel articulation index. Acoustic and kinematic variants of sentence-level metrics based on the articulatory-acoustic vowel space and the vowel space hull area were also calculated.

Results: Both acoustic and kinematic variants of the sentence-level metrics significantly increased with an increase in loudness, whereas no statistically significant differences in traditional vowel-point metrics were observed for either the kinematic or acoustic variants across the 3 loudness conditions. In addition, moderate-to-strong relationships between the acoustic and kinematic variants of the sentence-level vowel space metrics were observed for the majority of participants.

Conclusions: These data suggest that both kinematic and acoustic vowel space metrics that reflect the dynamic contributions of both consonant and vowel segments are sensitive to within-speaker changes in articulation associated with manipulations of speech intensity.
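Hull-area metrics over continuously sampled formant traces can be computed directly from the planar convex hull. A sketch with scipy; the formant traces are random stand-ins, and note that for 2-D point sets ConvexHull.volume is the enclosed area (ConvexHull.area is the perimeter):

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical continuously sampled formant trace for one sentence (Hz).
rng = np.random.default_rng(1)
f1 = rng.uniform(300, 800, 400)
f2 = rng.uniform(900, 2400, 400)

hull = ConvexHull(np.column_stack([f1, f2]))
print(hull.volume)  # for 2-D points, .volume is the enclosed area (Hz^2)
```

An analogous hull over the continuously sampled positions of the lingual markers would give the kinematic variant of the same metric.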

RevDate: 2018-05-18

DiNino M, JG Arenberg (2018)

Age-Related Performance on Vowel Identification and the Spectral-temporally Modulated Ripple Test in Children With Normal Hearing and With Cochlear Implants.

Trends in hearing, 22:2331216518770959.

Children's performance on psychoacoustic tasks improves with age, but inadequate auditory input may delay this maturation. Cochlear implant (CI) users receive a degraded auditory signal with reduced frequency resolution compared with normal, acoustic hearing; thus, immature auditory abilities may contribute to the variation among pediatric CI users' speech recognition scores. This study investigated relationships between age-related variables, spectral resolution, and vowel identification scores in prelingually deafened, early-implanted children with CIs compared with normal hearing (NH) children. All participants performed vowel identification and the Spectral-temporally Modulated Ripple Test (SMRT). Vowel stimuli for NH children were vocoded to simulate the reduced spectral resolution of CI hearing. Age positively predicted NH children's vocoded vowel identification scores, but time with the CI was a stronger predictor of vowel recognition and SMRT performance of children with CIs. For both groups, SMRT thresholds were related to vowel identification performance, analogous to previous findings in adults. Sequential information analysis of vowel feature perception indicated greater transmission of duration-related information compared with formant features in both groups of children. In addition, the amount of F2 information transmitted predicted SMRT thresholds in children with NH and with CIs. Comparisons between the two CIs of bilaterally implanted children revealed disparate task performance levels and information transmission values within the same child. These findings indicate that adequate auditory experience contributes to auditory perceptual abilities of pediatric CI users. Further, factors related to individual CIs may be more relevant to psychoacoustic task performance than are the overall capabilities of the child.

RevDate: 2018-04-27

Chiaramonte R, Di Luciano C, Chiaramonte I, et al (2018)

Multi-disciplinary clinical protocol for the diagnosis of bulbar amyotrophic lateral sclerosis.

Acta otorrinolaringologica espanola pii:S0001-6519(18)30056-6 [Epub ahead of print].

INTRODUCTION AND OBJECTIVES: The objective of this study was to examine the role of different specialists in the diagnosis of amyotrophic lateral sclerosis (ALS), to understand changes in verbal expression and phonation, respiratory dynamics and swallowing that occurred rapidly over a short period of time.

MATERIALS AND METHODS: 22 patients with bulbar ALS underwent voice assessment, ENT evaluation, Multi-Dimensional Voice Program (MDVP) analysis, spectrography, electroglottography, and fiberoptic endoscopic evaluation of swallowing.

RESULTS: In the early stage of the disease, the oral tract and velopharyngeal port were involved. Three months after the initial symptoms, most of the patients presented hoarseness, breathy voice, dysarthria, pitch modulation problems, and difficulties in pronunciation of plosive, velar, and lingual consonants. Values of MDVP were altered. The spectrogram showed an additional formant, due to nasal resonance. Electroglottography showed periodic oscillation of the vocal folds only during short vocal cycles. Swallowing was characterized by weakness and incoordination of the oro-pharyngeal muscles, with penetration or aspiration.

CONCLUSIONS: A specific multidisciplinary clinical protocol was designed to report vocal parameters and swallowing disorders that changed more quickly in bulbar ALS patients. Furthermore, the patients were stratified according to involvement of pharyngeal structures, and severity index.

RevDate: 2018-04-25

Prévost F, A Lehmann (2018)

Saliency of Vowel Features in Neural Responses of Cochlear Implant Users.

Clinical EEG and neuroscience [Epub ahead of print].

Cochlear implants restore hearing in deaf individuals, but speech perception remains challenging. Poor discrimination of spectral components is thought to account for limitations of speech recognition in cochlear implant users. We investigated how combined variations of spectral components along two orthogonal dimensions can maximize neural discrimination between two vowels, as measured by mismatch negativity. Adult cochlear implant users and matched normal-hearing listeners underwent electroencephalographic event-related potential recordings in an optimum-1 oddball paradigm. A standard /a/ vowel was delivered in an acoustic free field along with stimuli having a deviant fundamental frequency (+3 and +6 semitones), a deviant first formant making it a /i/ vowel, or combined deviant fundamental frequency and first formant (+3 and +6 semitones /i/ vowels). Speech recognition was assessed with a word repetition task. Analyses of variance were performed on both the amplitude and the latency of the mismatch negativity elicited by each deviant vowel. The strength of correlations between these mismatch negativity parameters and speech recognition, as well as participants' age, was assessed. Amplitude of mismatch negativity was weaker in cochlear implant users but was maximized by variations of vowels' first formant. Latency of mismatch negativity was later in cochlear implant users and was particularly extended by variations of the fundamental frequency. Speech recognition correlated with parameters of mismatch negativity elicited by the specific variation of the first formant. This nonlinear effect of acoustic parameters on neural discrimination of vowels has implications for implant processor programming and aural rehabilitation.

RevDate: 2018-06-20

Hamdan AL, Khandakji M, AT Macari (2018)

Maxillary arch dimensions associated with acoustic parameters in prepubertal children.

The Angle orthodontist, 88(4):410-415.

OBJECTIVES: To evaluate the association between maxillary arch dimensions and fundamental frequency and formants of voice in prepubertal subjects.

MATERIALS AND METHODS: Thirty-five consecutive prepubertal patients seeking orthodontic treatment were recruited (mean age = 11.41 ± 1.46 years; range, 8 to 13.7 years). Participants with a history of respiratory infection, laryngeal manipulation, dysphonia, congenital facial malformations, or history of orthodontic treatment were excluded. Dental measurements included maxillary arch length, perimeter, depth, and width. Voice parameters comprising fundamental frequency (f0_sustained), habitual pitch (f0_count), jitter, shimmer, and formant frequencies (F1, F2, F3, and F4) were measured using acoustic analysis prior to initiation of any orthodontic treatment. Pearson's correlation coefficients were used to measure the strength of associations between the dental and voice parameters. Multiple linear regressions were computed for the prediction of the dental measurements.

RESULTS: Arch width and arch depth had moderate significant negative correlations with f0 (r = -0.52; P = .001 and r = -0.39; P = .022, respectively) and with habitual frequency (r = -0.51; P = .0014 and r = -0.34; P = .04, respectively). Arch depth and arch length were significantly correlated with formant F3 and formant F4, respectively. Predictors of arch depth included frequencies of F3 vowels, with a significant regression equation (P-value < .001; R2 = 0.49). Similarly, fundamental frequency f0 and frequencies of formant F3 vowels were predictors of arch width, with a significant regression equation (P-value < .001; R2 = 0.37).

CONCLUSIONS: There is a significant association between arch dimensions, particularly arch length and depth, and voice parameters. The formant most predictive of arch depth and width is the third formant, along with fundamental frequency of voice.
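Regressions of this form are straightforward to reproduce. A sketch with scikit-learn; all values below are synthetic stand-ins, so the printed R² will not match the abstract's real-data values (0.49 for arch depth, 0.37 for arch width):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical predictors: fundamental frequency and third-formant values (Hz).
rng = np.random.default_rng(2)
X = np.column_stack([rng.normal(230, 25, 35),     # f0, stand-in values
                     rng.normal(3200, 300, 35)])  # F3, stand-in values
arch_width = rng.normal(36, 2, 35)                # mm, stand-in outcome

model = LinearRegression().fit(X, arch_width)
print(model.score(X, arch_width))  # R^2 of the fitted regression
```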

RevDate: 2018-07-05

Elgendi M, Bobhate P, Jain S, et al (2018)

The Voice of the Heart: Vowel-Like Sound in Pulmonary Artery Hypertension.

Diseases (Basel, Switzerland), 6(2): pii:diseases6020026.

Increased blood pressure in the pulmonary artery is referred to as pulmonary hypertension and often is linked to loud pulmonic valve closures. For the purpose of this paper, it was hypothesized that pulmonary circulation vibrations will create sounds similar to sounds created by vocal cords during speech and that subjects with pulmonary artery hypertension (PAH) could have unique sound signatures across four auscultatory sites. Using a digital stethoscope, heart sounds were recorded at the cardiac apex, 2nd left intercostal space (2LICS), 2nd right intercostal space (2RICS), and 4th left intercostal space (4LICS) undergoing simultaneous cardiac catheterization. From the collected heart sounds, relative power of the frequency band, energy of the sinusoid formants, and entropy were extracted. PAH subjects were differentiated by applying linear discriminant analysis with leave-one-out cross-validation. The entropy of the first sinusoid formant decreased significantly in subjects with a mean pulmonary artery pressure (mPAP) ≥ 25 mmHg versus subjects with a mPAP < 25 mmHg, with a sensitivity of 84% and specificity of 88.57%, within a 10-s optimized window length for heart sounds recorded at the 2LICS. First sinusoid formant entropy reduction of heart sounds in PAH subjects suggests the existence of a vowel-like pattern. Pattern analysis revealed a unique sound signature, which could be used in non-invasive screening tools.
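Leave-one-out cross-validated LDA, as used above, can be sketched with scikit-learn; the features and labels below are random stand-ins for the extracted heart-sound features:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Hypothetical features per subject: band power, formant energy, entropy.
rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))       # stand-in acoustic features
y = rng.integers(0, 2, 40)         # 1 = mPAP >= 25 mmHg (stand-in labels)

scores = cross_val_score(LinearDiscriminantAnalysis(), X, y,
                         cv=LeaveOneOut())   # one held-out subject per fold
print(scores.mean())               # proportion of correctly classified subjects
```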

RevDate: 2018-04-21

Brumberg JS, Pitt KM, JD Burnison (2018)

A Noninvasive Brain-Computer Interface for Real-Time Speech Synthesis: The Importance of Multimodal Feedback.

IEEE transactions on neural systems and rehabilitation engineering : a publication of the IEEE Engineering in Medicine and Biology Society, 26(4):874-881.

We conducted a study of a motor imagery brain-computer interface (BCI) using electroencephalography to continuously control a formant frequency speech synthesizer with instantaneous auditory and visual feedback. Over a three-session training period, sixteen participants learned to control the BCI for production of three vowel sounds (/i/ [heed], /ɑ/ [hot], and /u/ [who'd]) and were split into three groups: those receiving unimodal auditory feedback of synthesized speech, those receiving unimodal visual feedback of formant frequencies, and those receiving multimodal, audio-visual (AV) feedback. Audio feedback was provided by a formant frequency artificial speech synthesizer, and visual feedback was given as a 2-D cursor on a graphical representation of the plane defined by the first two formant frequencies. We found that combined AV feedback led to the greatest performance in terms of percent accuracy, distance to target, and movement time to target compared with either unimodal feedback of auditory or visual information. These results indicate that performance is enhanced when multimodal feedback is meaningful for the BCI task goals, rather than as a generic biofeedback signal of BCI progress.
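A formant-frequency synthesizer of the kind controlled here can be approximated as a cascade of second-order digital resonators excited by an impulse train. A hedged sketch; the formant values and bandwidths are illustrative, and this is not the authors' synthesizer:

```python
import numpy as np
from scipy.signal import lfilter

def resonator(x, fs, freq, bw):
    """Second-order digital resonator (one formant) applied to source x."""
    r = np.exp(-np.pi * bw / fs)
    a = [1.0, -2.0 * r * np.cos(2.0 * np.pi * freq / fs), r * r]
    return lfilter([1.0 - r], a, x)   # crude gain normalization

# Two-formant vowel synthesis from an impulse-train glottal source.
fs, f0, dur = 16000, 120, 0.5
source = np.zeros(int(fs * dur))
source[::fs // f0] = 1.0                       # impulse train at ~f0
vowel_i = resonator(resonator(source, fs, 300, 60), fs, 2300, 90)  # /i/-like F1, F2
```

In a BCI like the one described above, the two decoded control signals would drive the freq arguments of the F1 and F2 resonators in real time.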

RevDate: 2018-04-10

Li G, Li H, Hou Q, et al (2018)

Distinct Acoustic Features and Glottal Changes Define Two Modes of Singing in Peking Opera.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(17)30355-7 [Epub ahead of print].

OBJECTIVE: We aimed to delineate the acoustic characteristics of the Laodan and Qingyi role in Peking Opera and define glottis closure states and mucosal wave changes during singing in the two roles.

METHODS: The range of singing in A4 (440 Hz) pitch in seven female Peking Opera singers was determined using two classic pieces of Peking Opera. Glottal changes during singing were examined by stroboscopic laryngoscope. The fundamental frequency of /i/ in the first 15 seconds of the two pieces and the /i/ pitch range were determined. The relative length of the glottis fissure and the relative maximum mucosal amplitude were calculated.

RESULTS: Qingyi had significantly higher mean fundamental frequency than Laodan. The long-term average spectrum showed an obvious formant cluster near 3000 Hz in Laodan versus Qingyi. No formant cluster was observed in singing in the regular mode. Strobe laryngoscopy showed complete glottal closure in Laodan and incomplete glottal closure in Qingyi in the maximal glottis closure phase. The relative length of the glottis fissure of Laodan was significantly lower than that of Qingyi in the singing mode. The relative maximum mucosal amplitude of Qingyi was significantly lower than that of Laodan.

CONCLUSION: The Laodan role and the Qingyi role in Peking Opera sing in a fundamental frequency range compatible with the respective use of da sang (big voice) and xiao sang (small voice). The morphological patterns of glottal changes also indicate that the Laodan role and the Qingyi role sing with da sang and xiao sang, respectively.

RevDate: 2018-06-04

Brajot FX, Nguyen D, DiGiovanni J, et al (2018)

The impact of perilaryngeal vibration on the self-perception of loudness and the Lombard effect.

Experimental brain research, 236(6):1713-1723.

The role of somatosensory feedback in speech and the perception of loudness was assessed in adults without speech or hearing disorders. Participants completed two tasks: loudness magnitude estimation of a short vowel and oral reading of a standard passage. Both tasks were carried out in each of three conditions: no-masking, auditory masking alone, and mixed auditory masking plus vibration of the perilaryngeal area. A Lombard effect was elicited in both masking conditions: speakers unconsciously increased vocal intensity. Perilaryngeal vibration further increased vocal intensity above what was observed for auditory masking alone. Both masking conditions affected fundamental frequency and the first formant frequency as well, but only vibration was associated with a significant change in the second formant frequency. An additional analysis of pure-tone thresholds found no difference in auditory thresholds between masking conditions. Taken together, these findings indicate that perilaryngeal vibration effectively masked somatosensory feedback, resulting in an enhanced Lombard effect (increased vocal intensity) that did not alter speakers' self-perception of loudness. This implies that the Lombard effect results from a general sensorimotor process, rather than from a specific audio-vocal mechanism, and that the conscious self-monitoring of speech intensity is not directly based on either auditory or somatosensory feedback.

RevDate: 2018-04-01

Lawson E, Stuart-Smith J, JM Scobbie (2018)

The role of gesture delay in coda /r/ weakening: An articulatory, auditory and acoustic study.

The Journal of the Acoustical Society of America, 143(3):1646.

The cross-linguistic tendency of coda consonants to weaken, vocalize, or be deleted is shown to have a phonetic basis, resulting from gesture reduction, or variation in gesture timing. This study investigates the effects of the timing of the anterior tongue gesture for coda /r/ on acoustics and perceived strength of rhoticity, making use of two sociolects of Central Scotland (working- and middle-class) where coda /r/ is weakening and strengthening, respectively. Previous articulatory analysis revealed a strong tendency for these sociolects to use different coda /r/ tongue configurations: working- and middle-class speakers tend to use tip/front raised and bunched variants, respectively; however, this finding does not explain working-class /r/ weakening. A correlational analysis in the current study showed a robust relationship between anterior lingual gesture timing, F3, and percept of rhoticity. A linear mixed effects regression analysis showed that both speaker social class and linguistic factors (word structure and the checked/unchecked status of the prerhotic vowel) had significant effects on tongue gesture timing and formant values. This study provides further evidence that gesture delay can be a phonetic mechanism for coda rhotic weakening and apparent loss, but social class emerges as the dominant factor driving lingual gesture timing variation.

RevDate: 2018-05-01

Waaramaa T, Kukkonen T, Mykkänen S, et al (2018)

Vocal Emotion Identification by Children Using Cochlear Implants, Relations to Voice Quality, and Musical Interests.

Journal of speech, language, and hearing research : JSLHR, 61(4):973-985.

Purpose: Listening tests for emotion identification were conducted with 8-17-year-old children with hearing impairment (HI; N = 25) using cochlear implants, and their 12-year-old peers with normal hearing (N = 18). The study examined the impact of musical interests and acoustics of the stimuli on correct emotion identification.

Method: The children completed a questionnaire with their background information and noting musical interests. They then listened to vocal stimuli produced by actors (N = 5) and consisting of nonsense sentences and prolonged vowels ([a:], [i:], and [u:]; N = 32) expressing excitement, anger, contentment, and fear. The children's task was to identify the emotions they heard in the sample by choosing from the provided options. Acoustics of the samples were studied using Praat software, and statistics were examined using SPSS 24 software.

Results: The children with HI identified the emotions with 57% accuracy and the children with normal hearing with 75% accuracy. Female listeners were more accurate than male listeners in both groups. Those who were implanted before the age of 3 years identified emotions more accurately than the others (p < .05). No connection between the child's audiogram and correct identification was observed. Musical interests and voice quality parameters were found to be related to correct identification.

Conclusions: Implantation age, musical interests, and voice quality tended to have an impact on correct emotion identification. Thus, in developing cochlear implants, it may be worth paying attention to the acoustic structures of vocal emotional expressions, especially the third formant frequency (F3). Supporting the musical interests of children with HI may help their emotional development and improve their social lives.

RevDate: 2018-03-23

de Andrade BMR, Valença EHO, Salvatori R, et al (2018)

Effects of Therapy With Semi-occluded Vocal Tract and Choir Training on Voice in Adult Individuals With Congenital, Isolated, Untreated Growth Hormone Deficiency.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(18)30006-7 [Epub ahead of print].

OBJECTIVES: Voice is produced by the vibration of the vocal folds, expressed by its fundamental frequency (Hz), whereas the formants (F) indicate amplification zones of the vowels in the vocal tract. We have shown that lifetime isolated growth hormone deficiency (IGHD) causes a high-pitched voice, with higher values of most formant frequencies, maintaining a prepubertal acoustic pattern. The objectives of this work were to verify the effects of therapy with a semi-occluded vocal tract (SOVTT) and choir training on voice in these subjects with IGHD. We speculated that acoustic vocal parameters can be improved by SOVTT or choir training.

STUDY DESIGN: This is a prospective longitudinal study without control group.

METHODS: Acoustic analysis of isolated vowels was performed in 17 adults with IGHD before and after SOVTT (pre-SOVTT and post-SOVTT) and after choir training (post training), in a 30-day period.

RESULTS: The first formant was higher post training than pre-SOVTT (P = 0.009). The second formant was higher post-SOVTT than pre-SOVTT (P = 0.045). There was a trend toward reduced shimmer post training compared with pre-SOVTT (P = 0.051), and a reduction post training compared with post-SOVTT (P = 0.047).

CONCLUSIONS: SOVTT was relevant to the second formant, whereas choir training improved the first formant and shimmer. Therefore, this speech therapy approach was able to improve acoustic parameters of the voice of individuals with congenital, untreated IGHD. This seems particularly important in a scenario in which few patients undergo growth hormone replacement therapy.

RevDate: 2018-07-11

Masapollo M, Polka L, Ménard L, et al (2018)

Asymmetries in unimodal visual vowel perception: The roles of oral-facial kinematics, orientation, and configuration.

Journal of experimental psychology. Human perception and performance, 44(7):1103-1118.

Masapollo, Polka, and Ménard (2017) recently reported a robust directional asymmetry in unimodal visual vowel perception: Adult perceivers discriminate a change from an English /u/ viseme to a French /u/ viseme significantly better than a change in the reverse direction. This asymmetry replicates a frequent pattern found in unimodal auditory vowel perception that points to a universal bias favoring more extreme vocalic articulations, which lead to acoustic signals with increased formant convergence. In the present article, the authors report 5 experiments designed to investigate whether this asymmetry in the visual realm reflects a speech-specific or general processing bias. They successfully replicated the directional effect using Masapollo et al.'s dynamically articulating faces but failed to replicate the effect when the faces were shown under static conditions. Asymmetries also emerged during discrimination of canonically oriented point-light stimuli that retained the kinematics and configuration of the articulating mouth. In contrast, no asymmetries emerged during discrimination of rotated point-light stimuli or Lissajou patterns that retained the kinematics, but not the canonical orientation or spatial configuration, of the labial gestures. These findings suggest that the perceptual processes underlying asymmetries in unimodal visual vowel discrimination are sensitive to speech-specific motion and configural properties and raise foundational questions concerning the role of specialized and general processes in vowel perception.

RevDate: 2018-03-16

Tamura S, Ito K, Hirose N, et al (2018)

Psychophysical Boundary for Categorization of Voiced-Voiceless Stop Consonants in Native Japanese Speakers.

Journal of speech, language, and hearing research : JSLHR, 61(3):789-796.

Purpose: The purpose of this study was to investigate the psychophysical boundary used for categorization of voiced-voiceless stop consonants in native Japanese speakers.

Method: Twelve native Japanese speakers participated in the experiment. The stimuli were synthetic stop consonant-vowel stimuli varying in voice onset time (VOT) with manipulation of the amplitude of the initial noise portion and the first formant (F1) frequency of the periodic portion. There were 3 tasks, namely, speech identification to either /d/ or /t/, detection of the noise portion, and simultaneity judgment of onsets of the noise and periodic portions.

Results: The VOT boundaries of /d/-/t/ were close to the shortest VOT values that allowed for detection of the noise portion but not to those for perceived nonsimultaneity of the noise and periodic portions. The slopes of noise detection functions along VOT were as sharp as those of voiced-voiceless identification functions. In addition, the effects of manipulating the amplitude of the noise portion and the F1 frequency of the periodic portion on the detection of the noise portion were similar to those on voiced-voiceless identification.

Conclusion: The psychophysical boundary of perception of the initial noise portion masked by the following periodic portion may be used for voiced-voiceless categorization by Japanese speakers.

RevDate: 2018-03-02

Roberts B, RJ Summers (2018)

Informational masking of speech by time-varying competitors: Effects of frequency region and number of interfering formants.

The Journal of the Acoustical Society of America, 143(2):891.

This study explored the extent to which informational masking of speech depends on the frequency region and number of extraneous formants in an interferer. Target formants-monotonized three-formant (F1+F2+F3) analogues of natural sentences-were presented monaurally, with target ear assigned randomly on each trial. Interferers were presented contralaterally. In experiment 1, single-formant interferers were created using the time-reversed F2 frequency contour and constant amplitude, root-mean-square (RMS)-matched to F2. Interferer center frequency was matched to that of F1, F2, or F3, while maintaining the extent of formant-frequency variation (depth) on a log scale. Adding an interferer lowered intelligibility; the effect of frequency region was small and broadly tuned around F2. In experiment 2, interferers comprised either one formant (F1, the most intense) or all three, created using the time-reversed frequency contours of the corresponding targets and RMS-matched constant amplitudes. Interferer formant-frequency variation was scaled to 0%, 50%, or 100% of the original depth. Increasing the depth of formant-frequency variation and number of formants in the interferer had independent and additive effects. These findings suggest that the impact on intelligibility depends primarily on the overall extent of frequency variation in each interfering formant (up to ∼100% depth) and the number of extraneous formants.
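
The RMS matching used to set interferer levels is straightforward to make concrete. In the numpy sketch below, the signals are simple stand-ins invented for illustration, not the study's formant analogues:

```python
import numpy as np

def rms(x):
    """Root-mean-square amplitude of a signal."""
    return np.sqrt(np.mean(np.square(x)))

def match_rms(interferer, target):
    """Scale the interferer so its RMS level equals the target's."""
    return interferer * (rms(target) / rms(interferer))

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
f2_analogue = 0.3 * np.sin(2 * np.pi * 1500 * t)   # stand-in for the F2 analogue
competitor = 0.9 * np.sin(2 * np.pi * 1100 * t)    # constant-amplitude competitor
competitor = match_rms(competitor, f2_analogue)
assert np.isclose(rms(competitor), rms(f2_analogue))
```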

RevDate: 2018-03-02

Barreda S, ZY Liu (2018)

Apparent-talker height is influenced by Mandarin lexical tone.

The Journal of the Acoustical Society of America, 143(2):EL61.

Apparent-talker height is determined by a talker's fundamental frequency (f0) and spectral information, typically indexed using formant frequencies (FFs). Barreda [(2017b). J. Acoust. Soc. Am. 141, 4781-4792] reports that the apparent height of a talker can be influenced by vowel-specific variation in the f0 or FFs of a sound. In this experiment, native speakers of Mandarin were presented with a series of syllables produced by talkers of different apparent heights. Results indicate that there is substantial variability in the estimated height of a single talker based on lexical tone, as well as the inherent f0 and FFs of vowel phonemes.

RevDate: 2018-03-16

Croake DJ, Andreatta RD, JC Stemple (2018)

Vocalization Subsystem Responses to a Temporarily Induced Unilateral Vocal Fold Paralysis.

Journal of speech, language, and hearing research : JSLHR, 61(3):479-495.

Purpose: The purpose of this study is to quantify the interactions of the 3 vocalization subsystems of respiration, phonation, and resonance before, during, and after a perturbation to the larynx (temporarily induced unilateral vocal fold paralysis) in 10 vocally healthy participants. Using dynamic systems theory as a guide, we hypothesized that data groupings would emerge revealing context-dependent patterns in the relationships of variables representing the 3 vocalization subsystems. We also hypothesized that group data would mask important individual variability important to understanding the relationships among the vocalization subsystems.

Method: A perturbation paradigm was used to obtain respiratory kinematic, aerodynamic, and acoustic formant measures from 10 healthy participants (8 women, 2 men) with normal voices. Group and individual data were analyzed to provide a multilevel analysis of the data. A 3-dimensional state space model was constructed to demonstrate the interactive relationships among the 3 subsystems before, during, and after perturbation.

Results: During perturbation, group data revealed that lung volume initiations and terminations were lower, with longer respiratory excursions; airflow rates increased while subglottic pressures were maintained. Acoustic formant measures indicated that the spacing between the upper formants decreased (F3-F5), whereas the spacing between F1 and F2 increased. State space modeling revealed the changing directionality and interactions among the 3 subsystems.

Conclusions: Group data alone masked important variability necessary to understand the unique relationships among the 3 subsystems. Multilevel analysis permitted a richer understanding of the individual differences in phonatory regulation and permitted subgroup analysis. Dynamic systems theory may be a useful heuristic to model the interactive relationships among vocalization subsystems.

Supplemental Material: https://doi.org/10.23641/asha.5913532.

RevDate: 2018-03-02

Compton MT, Lunden A, Cleary SD, et al (2018)

The aprosody of schizophrenia: Computationally derived acoustic phonetic underpinnings of monotone speech.

Schizophrenia research pii:S0920-9964(18)30027-6 [Epub ahead of print].

OBJECTIVE: Acoustic phonetic methods are useful in examining some symptoms of schizophrenia; we used such methods to understand the underpinnings of aprosody. We hypothesized that, compared to controls and patients without clinically rated aprosody, patients with aprosody would exhibit reduced variability in: pitch (F0), jaw/mouth opening and tongue height (formant F1), tongue front/back position and/or lip rounding (formant F2), and intensity/loudness.

METHODS: Audiorecorded speech was obtained from 98 patients (including 25 with clinically rated aprosody and 29 without) and 102 unaffected controls using five tasks: one describing a drawing, two based on spontaneous speech elicited through a question (Tasks 2 and 3), and two based on reading prose excerpts (Tasks 4 and 5). We compared groups on variation in pitch (F0), formant F1 and F2, and intensity/loudness.

RESULTS: Regarding pitch variation, patients with aprosody differed significantly from controls in Task 5 in both unadjusted tests and tests adjusted for sociodemographics. For the standard deviation (SD) of F1, no significant differences were found in adjusted tests. Regarding the SD of F2, patients with aprosody had lower values than controls in Tasks 3, 4, and 5. For variation in intensity/loudness, patients with aprosody had lower values than patients without aprosody and controls across the five tasks.

CONCLUSIONS: Findings could represent a step toward developing new methods for measuring and tracking the severity of this specific negative symptom using acoustic phonetic parameters; such work is relevant to other psychiatric and neurological disorders.

RevDate: 2018-04-08

Howson P (2018)

Rhotics and Palatalization: An Acoustic Examination of Upper and Lower Sorbian.

Phonetica, 75(2):132-150.

Two of the major problems with rhotics are: (1) rhotics, unlike most other classes, are highly resistant to secondary palatalization, and (2) acoustic cues for rhotics as a class have been elusive. This study examines the acoustics of Upper and Lower Sorbian rhotics. Dynamic measures of the F1-F3 and F2-F1 were recorded and compared using SSANOVAs. The results indicate there is a striking delay in achievement of F2 for both the palatalized rhotics, while F2, F1, and F2-F1 are similar for all the rhotics tested here. The results suggest an inherent articulatory conflict between rhotics and secondary palatalization. The delay in the F2 increase indicates a delay in the palatalization gesture. This is likely due to conflicting constraints on the tongue dorsum. There was also an overlap in the F2 and F2-F1 for both the uvular and alveolar rhotics. This suggests a strong acoustic cue to rhotic classhood is found in the F2 signal. The overall formant similarities in frequency and trajectory also suggest a strong similarity in the vocal tract shapes between uvular and alveolar rhotics.

RevDate: 2018-03-08

Cameron S, Chong-White N, Mealings K, et al (2018)

The Phoneme Identification Test for Assessment of Spectral and Temporal Discrimination Skills in Children: Development, Normative Data, and Test-Retest Reliability Studies.

Journal of the American Academy of Audiology, 29(2):135-150.

BACKGROUND: Previous research suggests that a proportion of children experiencing reading and listening difficulties may have an underlying primary deficit in the way that the central auditory nervous system analyses the perceptually important, rapidly varying, formant frequency components of speech.

PURPOSE: The Phoneme Identification Test (PIT) was developed to investigate the ability of children to use spectro-temporal cues to perceptually categorize speech sounds based on their rapidly changing formant frequencies. The PIT uses an adaptive two-alternative forced-choice procedure whereby the participant identifies a synthesized consonant-vowel (CV) (/ba/ or /da/) syllable. CV syllables differed only in the second formant (F2) frequency along an 11-step continuum (between 0% and 100%-representing an ideal /ba/ and /da/, respectively). The CV syllables were presented in either quiet (PIT Q) or noise at a 0 dB signal-to-noise ratio (PIT N).

RESEARCH DESIGN: Development of the PIT stimuli and test protocols, and collection of normative and test-retest reliability data.

STUDY SAMPLE: Twelve adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 5 mo) and 137 typically developing, primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo). There were 73 males and 76 females.

DATA COLLECTION AND ANALYSIS: Data were collected using a touchscreen computer. Psychometric functions were automatically fit to individual data by the PIT software. Performance was determined by the width of the continuum for which responses were neither clearly /ba/ nor /da/ (referred to as the uncertainty region [UR]). A shallower psychometric function slope reflected greater uncertainty. Age effects were determined based on raw scores. Z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UR exceeded a maximum allowable value, were removed. Nonparametric tests were used as the data were skewed toward negative performance.

RESULTS: Across participants, the median value of the F2 range that resulted in uncertain responses was 33% in quiet and 40% in noise. There was a significant effect of age on the width of this UR (p < 0.00001) in both quiet and noise, with performance becoming adult like by age 9 on the PIT Q and age 10 on the PIT N. A skewed distribution toward negative performance occurred in both quiet (p = 0.01) and noise (p = 0.006). Median UR scores were significantly wider in noise than in quiet (T = 2041, p < 0.0000001). Performance (z scores) across the two tests was significantly correlated (r = 0.36, p = 0.000009). Test-retest z scores were significantly correlated in both quiet and noise (r = 0.4 and 0.37, respectively, p < 0.0001).

CONCLUSIONS: The PIT normative data show that the ability to identify phonemes based on changes in formant transitions improves with age, and that some children in the general population have performance much worse than their age peers. In children, uncertainty increases when the stimuli are presented in noise. The test is suitable for use in planned studies in a clinical population.
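
The psychometric fitting behind the uncertainty region (UR) can be illustrated with a logistic fit. The logistic form, the 25%/75% criterion for "clearly /ba/" versus "clearly /da/", and the response proportions below are assumptions made for this sketch, not the PIT software's documented procedure:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Proportion of /da/ responses along the 0-100% F2 continuum."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.linspace(0, 100, 11)                  # the 11-step continuum
p_da = np.array([0.02, 0.03, 0.05, 0.10, 0.30,
                 0.55, 0.75, 0.90, 0.95, 0.97, 0.99])  # made-up responses

(x0, k), _ = curve_fit(logistic, steps, p_da, p0=[50.0, 0.1])

# Uncertainty region: where responses are neither clearly /ba/ (< 25% /da/)
# nor clearly /da/ (> 75% /da/); width expressed in continuum percent.
half = np.log(3) / k                             # logistic(x0 +/- half) = 0.75/0.25
print(f"UR: {x0 - half:.1f}% to {x0 + half:.1f}% (width {2 * half:.1f}%)")
```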

RevDate: 2018-02-16

Holt YF (2018)

Mechanisms of Vowel Variation in African American English.

Journal of speech, language, and hearing research : JSLHR, 61(2):197-209.

Purpose: This research explored mechanisms of vowel variation in African American English by comparing 2 geographically distant groups of African American and White American English speakers for participation in the African American Shift and the Southern Vowel Shift.

Method: Thirty-two male (African American: n = 16, White American controls: n = 16) lifelong residents of cities in eastern and western North Carolina produced heed, hid, heyd, head, had, hod, hawed, whod, hood, hoed, hide, howed, hoyd, and heard 3 times each in random order. Formant frequency, duration, and acoustic analyses were completed for the vowels /i, ɪ, e, ɛ, æ, ɑ, ɔ, u, ʊ, o, aɪ, aʊ, oɪ, ɝ/ produced in the listed words.

Results: African American English speakers show vowel variation. In the west, the African American English speakers are participating in the Southern Vowel Shift and hod fronting of the African American Shift. In the east, neither the African American English speakers nor their White peers are participating in the Southern Vowel Shift. The African American English speakers show limited participation in the African American Shift.

Conclusion: The results provide evidence of regional and socio-ethnic variation in African American English in North Carolina.

RevDate: 2018-02-02

Wu S, Huang X, Wang J, et al (2018)

Evaluation of speech improvement following obturator prostheses for patients with palatal defect.

The Journal of the Acoustical Society of America, 143(1):202.

Palatal defect is a common maxillofacial defect after maxillectomy that can be repaired by obturator prostheses, which can effectively improve patients' speech. However, comprehensive evaluation methods for speech recovery are still controversial and remain undefined. A prospective cohort study on 34 patients with palatal defect and 34 healthy controls was performed. Patients received obturator prostheses and their speech was recorded without and then with obturators. Participants pronounced six Chinese vowels and 100 syllables for recording. This paper evaluated the recovery of speech function of patients through a combination of subjective and objective assessment methods. Recruited listeners evaluated the speech intelligibility (SI) of the 100 syllables. Vowel formant frequency and quantified vowel nasalization were measured using analysis software. The SI of patients improved significantly after wearing obturators. F2 values of the six vowels in patients with obturators were higher than in patients without obturators and close to the corresponding values in normal controls. The differences in F2 of /i/ and /u/, and in (A1-P1) of /i/ and /u/, between patients without and with obturators were significant. Patients' ability to control the pronunciation of /i/ and /u/ improved greatly with obturators. These findings provide clinical evidence of the treatment outcomes in patients with palatal defect who received obturators.

RevDate: 2018-01-28

Takaki PB, Vieira MM, Said AV, et al (2018)

Does Body Mass Index Interfere in the Formation of Speech Formants?.

International archives of otorhinolaryngology, 22(1):45-49.

Introduction Studies in the fields of voice and speech have increasingly focused on the vocal tract and the importance of its structural integrity; changes in the anatomy and configuration of the vocal tract determine the variations in phonatory and acoustic measurements, especially in the formation of the formants (Fs). Recent studies have revealed the functional consequences of being overweight and having an accumulation of fat in the pharyngeal region, including obstructive sleep apnea syndrome (OSAS) and impacts on the voice. Objectives To assess the relationship between body mass index (BMI) and acoustic analysis of speech. Methods This study was approved by the Ethics Committee of the Universidade Federal de São Paulo (no. 288,430). The cohort consisted of 124 randomly selected individuals aged between 18 and 45 with full permanent dentition. The participants underwent a brief medical history taking, BMI assessment, and recording of the sustained vowels /a/, /ε/, /i/, and /u/ with the acoustic program Praat (v. 5.3.85, Boersma and Weenink, Amsterdam, Netherlands). Recordings were taken using a unidirectional microphone headset (model Karsect HT-9, Guangdong, China) with a condenser connected to an external sound card (USB-SA 2.0, model Andrea, PureAudio™, Pleasant Grove, UT, USA) to reduce noise. Results There was a significant correlation between BMI and the third formant (F3) of the vowel /a/; however, the correlation was weak. Conclusions We did not observe a correlation between BMI and the speech formants, but we believe there is a trend toward such a correlation, with changes in speech patterns as BMI increases.

RevDate: 2018-01-15

Deroche MLD, Nguyen DL, VL Gracco (2017)

Modulation of Speech Motor Learning with Transcranial Direct Current Stimulation of the Inferior Parietal Lobe.

Frontiers in integrative neuroscience, 11:35.

The inferior parietal lobe (IPL) is a region of the cortex believed to participate in speech motor learning. In this study, we investigated whether transcranial direct current stimulation (tDCS) of the IPL could influence the extent to which healthy adults (1) adapted to a sensory alteration of their own auditory feedback, and (2) changed their perceptual representation. Seventy subjects completed three tasks: a baseline perceptual task that located the phonetic boundary between the vowels /e/ and /a/; a sensorimotor adaptation task in which subjects produced the word "head" under conditions of altered or unaltered feedback; and a post-adaptation perceptual task identical to the first. Subjects were allocated to four groups which differed in current polarity and feedback manipulation. Subjects who received anodal tDCS to their IPL (i.e., presumably increasing cortical excitability) lowered their first formant frequency (F1) by 10% in opposition to the upward shift in F1 in their auditory feedback. Subjects who received the same stimulation with unaltered feedback did not change their production. Subjects who received cathodal tDCS to their IPL (i.e., presumably decreasing cortical excitability) showed a 5% adaptation to the F1 alteration, similar to subjects who received sham tDCS. A subset of subjects returned a few days later to repeat the same protocol but without tDCS, enabling assessment of any facilitatory effects of the previous tDCS. All subjects exhibited a 5% adaptation effect. In addition, across all subjects and for the two recording sessions, the phonetic boundary was shifted toward the repeated vowel /e/, consistent with the selective adaptation effect, but a correlation between perception and production suggested that anodal tDCS had enhanced this perceptual shift. In conclusion, we successfully demonstrated that anodal tDCS could (1) enhance the motor adaptation to a sensory alteration, and (2) potentially affect the perceptual representation of those sounds, but we failed to demonstrate the reverse effect with the cathodal configuration. Overall, tDCS of the left IPL can be used to enhance speech performance, but only under conditions in which new or adaptive learning is required.

RevDate: 2018-04-13

Kim C, Lee S, Jin I, et al (2018)

Acoustic Features and Cortical Auditory Evoked Potentials according to Emotional Statues of /u/, /a/, /i/ Vowels.

Journal of audiology & otology, 22(2):80-88.

BACKGROUND AND OBJECTIVES: Although the Ling 6 sounds are often used in the rehabilitation process, their acoustic features have not been fully analyzed or related to cortical responses. The current study aimed to analyze acoustic features according to gender and emotional state for the core vowels of the Ling 6 sounds, /u/, /a/, and /i/. Cortical auditory evoked potentials (CAEPs) to those vowels were also observed.

SUBJECTS AND METHODS: The vowel sounds /u/, /a/, and /i/ from the Ling 6 sounds, representing low, middle, and high frequencies, were recorded from 20 normal young adults. The participants watched relevant videos for 4-5 minutes to evoke the emotions of anger (A), happiness (H), and sadness (S) before producing the vowels; a neutral (N) production, without any emotional salience, was also recorded. A 500-ms segment was extracted from each recording to isolate the pure vowel portion of the production. For the CAEP analysis, the latencies and amplitudes of P1, N1, P2, N2, and N1-P2 were analyzed.

RESULTS: The intensities of /u/, /a/, and /i/ were 61.47, 63.38, and 60.55 dB. The intensities for neutral (N), H, A, and S were 60.60, 65.43, 64.21, and 55.75 dB for vowel /u/; 61.80, 68.98, 66.50, and 56.23 dB for vowel /a/; and 59.34, 64.90, 61.90, and 56.05 dB for vowel /i/. Statistically significant effects were found for vowel and emotion but not for gender. The fundamental frequencies (F0) of the vowels for N, A, H, and S were 168.04, 174.93, 182.72, and 149.76 Hz, and the first formants were 743.75, 815.59, 823.32, and 667.62 Hz. The effect on F0 was statistically significant by vowel, emotion, and gender. The latencies and amplitudes of the CAEP components showed no statistically significant differences according to vowel.

CONCLUSIONS: The Ling 6 sounds should be produced consistently in the rehabilitation process, taking into account the differences in their intensities and frequencies according to the speaker's emotion and gender. Given the similar acoustic features among the vowels, they appear to have been processed as tonal stimuli by the CAEP components measured in this study. Careful selection of materials is necessary to draw meaningful conclusions from CAEP measurements with vowel stimuli.

RevDate: 2018-02-05
CmpDate: 2018-02-05

Dawson C, Tervaniemi M, D Aalto (2018)

Behavioral and subcortical signatures of musical expertise in Mandarin Chinese speakers.

PloS one, 13(1):e0190793 pii:PONE-D-17-31505.

Both musical training and native language have been shown to have experience-based plastic effects on auditory processing. However, the combined effects within individuals are unclear. Recent research suggests that musical training and tone language speaking are not clearly additive in their effects on processing of auditory features and that there may be a disconnect between perceptual and neural signatures of auditory feature processing. The literature has only recently begun to investigate the effects of musical expertise on basic auditory processing for different linguistic groups. This work provides a profile of primary auditory feature discrimination for Mandarin-speaking musicians and nonmusicians. The musicians showed enhanced perceptual discrimination for both frequency and duration as well as enhanced duration discrimination in a multifeature discrimination task, compared to nonmusicians. However, there were no differences between the groups in duration processing of nonspeech sounds at a subcortical level or in subcortical frequency representation of a nonnative tone contour, for f0 or for the first or second formant region. The results indicate that musical expertise provides a cognitive, but not subcortical, advantage in a population of Mandarin speakers.

RevDate: 2018-02-06

Zhang J, Pan Z, Gui C, et al (2018)

Analysis on speech signal features of manic patients.

Journal of psychiatric research, 98:59-63.

Given the lack of effective biological markers for early diagnosis of bipolar mania, and the tendency for voice fluctuation during transitions between mood states, this study aimed to investigate the speech features of manic patients to identify a potential set of biomarkers for diagnosis of bipolar mania. 30 manic patients and 30 healthy controls were recruited and their corresponding speech features were collected during natural dialogue using the Automatic Voice Collecting System. The Bech-Rafaelsen Mania Rating Scale (BRMS) and the Clinical Global Impression scale (CGI) were used to assess illness. The speech features were compared between two groupings: mood group (mania vs remission) and bipolar group (manic patients vs healthy individuals). We found that the characteristic speech signals differed between the mood groups and the bipolar groups. The fourth formant (F4) and the Linear Prediction Coefficients (LPC) differed significantly (P < .05) when patients transitioned from the manic to the remitted state. The first formant (F1), the second formant (F2), and the LPC (P < .05) also played key roles in distinguishing between patients and healthy individuals. In addition, there was a significant correlation between LPC and BRMS, indicating that LPC may play an important role in the diagnosis of bipolar mania. In this study we traced speech features of bipolar mania during natural dialogue (conversation), which is an accessible approach in clinical practice. Such specific indicators may serve as promising biomarkers for the diagnosis and clinical therapeutic evaluation of bipolar mania.
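
LPC features like those used in this study are conventionally obtained from the frame autocorrelation via the Levinson-Durbin recursion. The sketch below uses a synthetic frame and an assumed model order; it illustrates the standard technique, not the authors' implementation:

```python
import numpy as np

def lpc(frame, order):
    """LPC via autocorrelation + Levinson-Durbin; returns the prediction
    polynomial a (with a[0] = 1) and the residual energy."""
    n = len(frame)
    r = np.array([frame[:n - k] @ frame[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[1:i][::-1]   # forward prediction of r[i]
        k = -acc / err                        # reflection coefficient
        prev = a.copy()
        a[1:i + 1] = prev[1:i + 1] + k * prev[i - 1::-1]
        err *= 1.0 - k * k
    return a, err

# Synthetic vowel-like frame with components near 700 and 1200 Hz.
fs = 8000
t = np.arange(0, 0.032, 1 / fs)
frame = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
frame *= np.hamming(len(frame))

a, _ = lpc(frame, order=10)
roots = np.roots(a)
roots = roots[np.imag(roots) > 0]             # one of each conjugate pair
peaks = np.sort(np.angle(roots) * fs / (2 * np.pi))
print(peaks[:4])                              # rough spectral-peak estimates
```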

RevDate: 2017-12-29

Escudero P, Mulak KE, Elvin J, et al (2017)

"Mummy, keep it steady": phonetic variation shapes word learning at 15 and 17 months.

Developmental science [Epub ahead of print].

Fifteen-month-olds have difficulty detecting differences between novel words differing in a single vowel. Previous work showed that Australian English (AusE) infants habituated to the word-object pair DEET detected an auditory switch to DIT and DOOT in Canadian English (CanE) but not in their native AusE (Escudero et al.). The authors speculated that this may be because the vowel inherent spectral change (VISC) in AusE DEET is larger than in CanE DEET. We investigated whether VISC leads to difficulty in encoding phonetic detail during early word learning, and whether this difficulty dissipates with age. In Experiment 1, we familiarized AusE-learning 15-month-olds with AusE DIT, which contains smaller VISC than AusE DEET. Unlike infants familiarized with AusE DEET (Escudero et al.), infants detected a switch to DEET and DOOT. In Experiment 2, we familiarized AusE-learning 17-month-olds with AusE DEET. This time, infants detected a switch to DOOT, and marginally detected a switch to DIT. Our acoustic analysis showed that AusE DEET and DOOT are differentiated by the second vowel formant, while DEET and DIT can only be distinguished by their changing dynamic properties throughout the vowel trajectory. Thus, by 17 months, AusE infants can encode highly dynamic acoustic properties, enabling them to learn the novel vowel minimal pairs that are difficult at 15 months. These findings suggest that the development of word learning is shaped by the phonetic properties of the specific word minimal pair.

RevDate: 2018-03-16

Maruthy S, Feng Y, L Max (2018)

Spectral Coefficient Analyses of Word-Initial Stop Consonant Productions Suggest Similar Anticipatory Coarticulation for Stuttering and Nonstuttering Adults.

Language and speech, 61(1):31-42.

A longstanding hypothesis about the sensorimotor mechanisms underlying stuttering suggests that stuttered speech dysfluencies result from a lack of coarticulation. Formant-based measures of either the stuttered or fluent speech of children and adults who stutter have generally failed to obtain compelling evidence in support of the hypothesis that these individuals differ in the timing or degree of coarticulation. Here, we used a sensitive acoustic technique-spectral coefficient analyses-that allowed us to compare stuttering and nonstuttering speakers with regard to vowel-dependent anticipatory influences as early as the onset burst of a preceding voiceless stop consonant. Eight adults who stutter and eight matched adults who do not stutter produced C1VC2 words, and the first four spectral coefficients were calculated for one analysis window centered on the burst of C1 and two subsequent windows covering the beginning of the aspiration phase. Findings confirmed that the combined use of four spectral coefficients is an effective method for detecting the anticipatory influence of a vowel on the initial burst of a preceding voiceless stop consonant. However, the observed patterns of anticipatory coarticulation showed no statistically significant differences, or trends toward such differences, between the stuttering and nonstuttering groups. Combining the present results for fluent speech in one given phonetic context with prior findings from both stuttered and fluent speech in a variety of other contexts, we conclude that there is currently no support for the hypothesis that the fluent speech of individuals who stutter is characterized by limited coarticulation.

RevDate: 2018-06-18

Xue P, Zhang X, Bai J, et al (2018)

Acoustic and kinematic analyses of Mandarin vowels in speakers with hearing impairment.

Clinical linguistics & phonetics, 32(7):622-639.

The central aim of this experiment was to compare acoustic parameters, formant frequencies and vowel space area (VSA), in adolescents with hearing impairment (HI) and their normal-hearing (NH) peers; for kinematic parameters, the movements of the vocal organs, especially the lips, jaw and tongue, during vowel production were analysed. The participants were 12 adolescents with different degrees of hearing impairment. The control group consisted of 12 age-matched NH adolescents. All participants were native Chinese speakers who were asked to produce the Mandarin vowels /a/, /i/ and /u/, with subsequent acoustic and kinematic analysis. There were significant differences between the two groups. Additionally, the HI group produced more exaggerated mouth movements and smaller tongue movements for all vowels, compared to their NH peers. Results are discussed regarding the possible relationship between acoustic data, articulatory movements and degree of hearing loss, to provide an integrative assessment of the acoustic and kinematic characteristics of individuals with hearing loss.

RevDate: 2017-12-20

Zaltz Y, Globerson E, N Amir (2017)

Auditory Perceptual Abilities Are Associated with Specific Auditory Experience.

Frontiers in psychology, 8:2080.

The extent to which auditory experience can shape general auditory perceptual abilities is still under constant debate. Some studies show that specific auditory expertise may have a general effect on auditory perceptual abilities, while others show a more limited influence, exhibited only in a relatively narrow range associated with the area of expertise. The current study addresses this issue by examining experience-dependent enhancement in perceptual abilities in the auditory domain. Three experiments were performed. In the first experiment, 12 pop and rock musicians and 15 non-musicians were tested in frequency discrimination (DLF), intensity discrimination, spectrum discrimination (DLS), and time discrimination (DLT). Results showed significant superiority of the musician group only for the DLF and DLT tasks, illuminating enhanced perceptual skills in the key features of pop music, in which minuscule changes in amplitude and spectrum are not critical to performance. The next two experiments attempted to differentiate between generalization and specificity in the influence of auditory experience, by comparing subgroups of specialists. First, seven guitar players and eight percussionists were tested on the DLF and DLT tasks on which musicians had been found to be superior. Results showed superior abilities on the DLF task for guitar players, though no difference between the groups in DLT, demonstrating some dependency of auditory learning on the specific area of expertise. Subsequently, a third experiment was conducted, testing a possible influence of vowel density in the native language on auditory perceptual abilities. Ten native speakers of German (a language characterized by a dense vowel system of 14 vowels), and 10 native speakers of Hebrew (characterized by a sparse vowel system of five vowels), were tested in a formant discrimination task. This is the linguistic equivalent of a DLS task. Results showed that German speakers had superior formant discrimination, demonstrating highly specific effects for auditory linguistic experience as well. Overall, results suggest that auditory superiority is associated with the specific auditory exposure.

RevDate: 2018-02-06

Easwar V, Banyard A, Aiken S, et al (2018)

Phase delays between tone pairs reveal interactions in scalp-recorded envelope following responses.

Neuroscience letters, 665:257-262.

Evoked potentials to envelope periodicity in sounds, such as vowels, are dependent on the stimulus spectrum. We hypothesize that phase differences between responses elicited by multiple frequencies spread tonotopically across the cochlear partition may contribute to variation in scalp-recorded amplitude. The present study evaluated this hypothesis by measuring envelope following responses (EFRs) to two concurrent tone pairs, p1 and p2, that approximated the first and second formant frequencies of a vowel, while controlling their relative envelope phase. We found that the scalp-recorded amplitude of EFRs changed significantly in phase and amplitude when the envelope phase of p2, the higher frequency tone pair, was delayed. The maximum EFR amplitude occurred at the p2 envelope phase delay of 90°, likely because the stimulus delay compensated for the average phase lead of 73.57° exhibited by p2-contributed EFRs relative to p1-contributed EFRs, owing to earlier cochlear processing of higher frequencies. Findings suggest a linear superimposition of independently generated EFRs from tonotopically separated pathways. This suggests that introducing frequency-specific delays may help to optimize EFRs to broadband stimuli like vowels.
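
EFR amplitude and phase at the envelope rate are commonly read off the corresponding FFT bin of the averaged epoch; whether this matches the authors' exact analysis is not stated in the abstract, so the sketch below (assumed 110 Hz envelope rate, synthetic 1-s epoch) is only a generic illustration:

```python
import numpy as np

def efr_amp_phase(epoch, fs, f_env):
    """Amplitude and phase at f_env from the FFT bin of an averaged epoch."""
    spec = np.fft.rfft(epoch)
    k = int(round(f_env * len(epoch) / fs))     # bin index for f_env
    return 2 * np.abs(spec[k]) / len(epoch), np.angle(spec[k])

fs, f_env = 8000, 110.0                         # assumed rates, not the study's
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)
epoch = (0.2 * np.cos(2 * np.pi * f_env * t + np.pi / 4)
         + 0.05 * rng.standard_normal(len(t)))
amp, phase = efr_amp_phase(epoch, fs, f_env)
print(f"amplitude = {amp:.3f}, phase = {np.degrees(phase):.1f} deg")
```

Phase values extracted this way for two responses (e.g., p1- and p2-dominated EFRs) can then be compared directly, which is the quantity the stimulus envelope delays manipulate.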

RevDate: 2017-12-09

Delviniotis DS, S Theodoridis (2017)

On Exploring Vocal Ornamentation in Byzantine Chant.

Journal of voice : official journal of the Voice Foundation pii:S0892-1997(17)30087-5 [Epub ahead of print].

OBJECTIVES: A special vocal ornament in Byzantine chant (BC), the single cycle ornamentation structure (SCOS), is defined and compared with vibrato with respect to its time (rate, extent) and spectral (slope [SS], relative speaker's formant [SPF] level, formant frequencies [FFi] and bandwidths [FBi], and noise-to-harmonics ratio [NHR]) characteristics.

STUDY DESIGN: This is a comparative study of the vocal ornaments SCOS and vibrato, whose time and spectral acoustic parameters were measured, statistically analyzed, and compared.

METHODS: From the same hymn recordings chanted by four chanters, the differences in SS, SPF level, FFi, FBi, and NHR between the vocal ornament and its neighboring steady note, as well as the rate and extent, were compared with those of vibrato.

RESULTS: The mean extent values for SCOS were found to be almost double the corresponding values for vibrato, and the rate of SCOS tends to differ from the rate of vibrato. The difference values of 1) the NHR, 2) the spectral slope, and 3) the SPF level between the vocal ornament and its neighboring steady note were found to be 1) higher for SCOS, 2) mainly lower for SCOS, and 3) lower for SCOS, respectively. No significant differences were detected for the FFi and FBi. The FF1 differences tended to be negative in both ornaments, indicating a formant tuning effect.

CONCLUSIONS: A new vocal ornament (SCOS) in BC is studied, whose extent, NHR, spectral slope, and SPF level differ from those of vibrato.
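
Rate and extent, the two time-domain measures compared here, can be estimated from an F0 contour as sketched below. The definitions used (dominant modulation frequency for rate, half the peak-to-peak excursion in semitones for extent) are common conventions, not necessarily the authors' exact procedure, and the contour is synthetic:

```python
import numpy as np

def vibrato_rate_extent(f0_hz, frame_rate):
    """Vibrato rate (Hz) from the dominant modulation frequency and
    extent (semitones) as half the peak-to-peak F0 excursion."""
    st = 12 * np.log2(f0_hz / np.mean(f0_hz))    # contour in semitones re mean
    spec = np.abs(np.fft.rfft(st - st.mean()))
    freqs = np.fft.rfftfreq(len(st), 1 / frame_rate)
    rate = freqs[np.argmax(spec[1:]) + 1]        # skip the DC bin
    extent = (st.max() - st.min()) / 2
    return rate, extent

# Synthetic contour: 220 Hz carrier with 5.5 Hz, +/-0.5 semitone modulation.
frame_rate = 100.0                               # assumed F0 frames per second
t = np.arange(0, 2.0, 1 / frame_rate)
f0 = 220 * 2 ** (0.5 * np.sin(2 * np.pi * 5.5 * t) / 12)
rate, extent = vibrato_rate_extent(f0, frame_rate)
print(f"rate = {rate:.1f} Hz, extent = {extent:.2f} semitones")
```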

RevDate: 2018-02-28

Lametti DR, Smith HJ, Freidin PF, et al (2018)

Cortico-cerebellar Networks Drive Sensorimotor Learning in Speech.

Journal of cognitive neuroscience, 30(4):540-551.

The motor cortex and cerebellum are thought to be critical for learning and maintaining motor behaviors. Here we use transcranial direct current stimulation (tDCS) to test the role of the motor cortex and cerebellum in sensorimotor learning in speech. During productions of "head," "bed," and "dead," the first formant of the vowel sound was altered in real time toward the first formant of the vowel sound in "had," "bad," and "dad." Compensatory changes in first and second formant production were used as a measure of motor adaptation. tDCS to either the motor cortex or the cerebellum improved sensorimotor learning in speech compared with sham stimulation (n = 20 in each group). However, in the case of cerebellar tDCS, production changes were restricted to the source of the acoustical error (i.e., the first formant). Motor cortex tDCS drove production changes that offset errors in the first formant, but unlike cerebellar tDCS, adaptive changes in the second formant also occurred. The results suggest that motor cortex and cerebellar tDCS have both shared and dissociable effects on motor adaptation. The study provides initial causal evidence in speech production that the motor cortex and the cerebellum support different aspects of sensorimotor learning. We propose that motor cortex tDCS drives sensorimotor learning toward previously learned patterns of movement, whereas cerebellar tDCS focuses sensorimotor learning on error correction.

RevDate: 2017-12-06

Elbashti ME, Sumita YI, Hattori M, et al (2017)

Digitized Speech Characteristics in Patients with Maxillectomy Defects.

Journal of prosthodontics : official journal of the American College of Prosthodontists [Epub ahead of print].

PURPOSE: Accurate evaluation of speech characteristics through formant frequency measurement is important for proper speech rehabilitation in patients after maxillectomy. This study aimed to evaluate the utility of digital acoustic analysis and vowel pentagon space for the prediction of speech ability after maxillectomy, by comparing the acoustic characteristics of vowel articulation in three classes of maxillectomy defects.

MATERIALS AND METHODS: Aramany's classifications I, II, and IV were used to group 27 male patients after maxillectomy. Digital acoustic analysis of five Japanese vowels-/a/, /e/, /i/, /o/, and /u/-was performed using a speech analysis system. First formant (F1) and second formant (F2) frequencies were calculated using an autocorrelation method. Data were plotted on an F1-F2 plane for each patient, and the F1 and F2 ranges were calculated. The vowel pentagon spaces were also determined. One-way ANOVA was applied to compare all results between the three groups.

RESULTS: Class II maxillectomy patients had a significantly higher F2 range than did Class I and Class IV patients (p = 0.002). In contrast, there was no significant difference in the F1 range between the three classes. The vowel pentagon spaces were significantly larger in Class II maxillectomy patients than in Class I and Class IV patients (p = 0.014).

CONCLUSION: The results of this study indicate that the acoustic characteristics of maxillectomy patients are affected by the defect area. This finding may provide information for obturator design based on vowel articulation and defect class.
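
A vowel polygon area on the F1-F2 plane is conveniently computed with the shoelace formula. The mean formant values below are invented placeholders, and the shoelace computation is a standard reading of "vowel pentagon space", not code from the study:

```python
import numpy as np

def polygon_area(points):
    """Shoelace formula: area of a polygon given ordered (x, y) vertices."""
    x, y = np.asarray(points, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Hypothetical mean (F1, F2) values in Hz for the five Japanese vowels,
# listed in order around the perimeter of the vowel space.
pentagon = [
    (750, 1200),  # /a/
    (450, 2000),  # /e/
    (300, 2300),  # /i/
    (320, 1300),  # /u/
    (450, 800),   # /o/
]
print(f"Vowel pentagon space: {polygon_area(pentagon):,.0f} Hz^2")
```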

RevDate: 2018-02-15

Sumita YI, Hattori M, Murase M, et al (2018)

Digitised evaluation of speech intelligibility using vowels in maxillectomy patients.

Journal of oral rehabilitation, 45(3):216-221.

Among the functional disabilities that patients face following maxillectomy, speech impairment is a major factor influencing quality of life. Proper rehabilitation of speech, which may include prosthodontic and surgical treatments and speech therapy, requires accurate evaluation of speech intelligibility (SI). A simple, less time-consuming yet accurate evaluation is desirable both for maxillectomy patients and the various clinicians providing maxillofacial treatment. This study sought to determine the utility of digital acoustic analysis of vowels for the prediction of SI in maxillectomy patients, based on a comprehensive understanding of speech production in the vocal tract of maxillectomy patients and its perception. Speech samples were collected from 33 male maxillectomy patients (mean age 57.4 years) in two conditions, without and with a maxillofacial prosthesis, and formant data for the vowels /a/, /e/, /i/, /o/, and /u/ were calculated based on linear predictive coding. The frequency range of formant 2 (F2) was determined as the difference between the minimum and maximum frequency. An SI test was also conducted to reveal the relationship between SI score and F2 range. Statistical analyses were applied. F2 range and SI score were significantly different between the two conditions without and with a prosthesis (both P < .0001). F2 range was significantly correlated with SI score in both conditions (Spearman's r = .843, P < .0001; r = .832, P < .0001, respectively). These findings indicate that calculating the F2 range from 5 vowels has clinical utility for the prediction of SI after maxillectomy.
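
The F2-range measure and its rank correlation with SI take only a few lines to reproduce in spirit. All values below are invented placeholders; scipy's Spearman test stands in for the study's statistics:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-patient F2 values (Hz) for /a/, /e/, /i/, /o/, /u/,
# all measured in the same condition (e.g., with prosthesis).
f2 = np.array([
    [1180, 1850, 2100,  900, 1250],
    [1300, 2050, 2350,  950, 1300],
    [1120, 1600, 1750,  980, 1210],
    [1250, 1900, 2200,  920, 1280],
    [1100, 1500, 1650, 1000, 1190],
    [1280, 2000, 2300,  940, 1290],
])
f2_range = f2.max(axis=1) - f2.min(axis=1)       # max minus min across vowels

si_score = np.array([62.0, 81.0, 45.0, 74.0, 38.0, 79.0])  # made-up SI (%)
rho, p = spearmanr(f2_range, si_score)
print(f"Spearman r = {rho:.2f} (p = {p:.3f})")
```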

RevDate: 2017-12-02

Hauser I (2017)

A revised metric for calculating acoustic dispersion applied to stop inventories.

The Journal of the Acoustical Society of America, 142(5):EL500.

Dispersion Theory [DT; Liljencrants and Lindblom (1972). Language 48, 839-862] claims that acoustically dispersed vowel inventories should be typologically common. Dispersion is often quantified using the area of the triangle formed by three mean vowel formant points. This approach is problematic; it ignores distributions, which affect speech perception [Clayards, Tanenhaus, Aslin, and Jacobs (2008). Cognition 108, 804-809]. This letter proposes a revised metric for calculating dispersion which incorporates covariance. As a test case, modeled vocal tract articulatory-acoustic data of stop consonants [Schwartz, Boe, Badin, and Sawallis (2012). J. Phonetics 40, 20-36] are examined. Although the revised metric does not recover DT predictions for stop inventories, it changes results, showing that dispersion results depend on metric choice, which is often overlooked. The metric can be used in any acoustic space to include information about within-category variation when calculating dispersion.
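
The contrast between the classic triangle-area metric and a covariance-aware alternative can be sketched as follows. The abstract does not spell out Hauser's exact revised metric, so the Mahalanobis-style dispersion below is only one plausible way to fold within-category covariance into the calculation, on synthetic data:

```python
import numpy as np

def triangle_area(means):
    """Classic dispersion proxy: area of the triangle spanned by three means."""
    (x1, y1), (x2, y2), (x3, y3) = means
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

def mahalanobis_dispersion(categories):
    """Covariance-aware alternative: mean pairwise Mahalanobis distance
    between category means under the pooled within-category covariance."""
    means = [c.mean(axis=0) for c in categories]
    pooled = sum((len(c) - 1) * np.cov(c.T) for c in categories)
    pooled = pooled / (sum(len(c) for c in categories) - len(categories))
    vi = np.linalg.inv(pooled)
    d = [np.sqrt((m1 - m2) @ vi @ (m1 - m2))
         for i, m1 in enumerate(means) for m2 in means[i + 1:]]
    return np.mean(d)

rng = np.random.default_rng(1)
# Three synthetic 2-D acoustic categories (e.g., F1/F2-like token clouds).
cats = [rng.multivariate_normal(m, [[900, 200], [200, 2500]], size=40)
        for m in [(300, 2300), (700, 1200), (320, 800)]]
print("triangle area:", triangle_area([c.mean(axis=0) for c in cats]))
print("covariance-aware dispersion:", mahalanobis_dispersion(cats))
```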

RevDate: 2017-12-19

Thoppil MG, Kumar CS, Kumar A, et al (2017)

Speech Signal Analysis and Pattern Recognition in Diagnosis of Dysarthria.

Annals of Indian Academy of Neurology, 20(4):352-357.

Background: Dysarthria refers to a group of disorders resulting from disturbances in muscular control over the speech mechanism due to damage to the central or peripheral nervous system. There is wide subjective variability in the assessment of dysarthria between different clinicians. In our study, we tried to identify patterns among the types of dysarthria by acoustic analysis and to prevent intersubject variability.

Objectives: (1) Pattern recognition among types of dysarthria with software tool and to compare with normal subjects. (2) To assess the severity of dysarthria with software tool.

Materials and Methods: Speech of seventy subjects was recorded: both normal subjects and dysarthric patients who attended the outpatient department of, or were admitted to, AIMS. Speech waveforms were analyzed using Praat and a MATLAB toolkit. The pitch contour, formant variation, and speech duration of the extracted recordings were analyzed.

Results: Study population included 25 normal subjects and 45 dysarthric patients. Dysarthric subjects included 24 patients with extrapyramidal dysarthria, 14 cases of spastic dysarthria, and 7 cases of ataxic dysarthria. Analysis of pitch of the study population showed a specific pattern in each type. F0 jitter was found in spastic dysarthria, pitch break with ataxic dysarthria, and pitch monotonicity with extrapyramidal dysarthria. By pattern recognition, we identified 19 cases in which one or more recognized patterns coexisted. There was a significant correlation between the severity of dysarthria and formant range.

Conclusions: Specific patterns were identified for the types of dysarthria, so this software tool can help clinicians identify the type of dysarthria more reliably and reduce intersubject variability. We also assessed the severity of dysarthria by formant range. Mixed dysarthria may be more common than clinically expected.
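
The "F0 jitter" pattern reported for spastic dysarthria is usually quantified as local jitter: the mean absolute difference between consecutive glottal periods divided by the mean period (cf. Praat's "jitter (local)"). The study's exact measure is not specified in the abstract, so the sketch below, with synthetic period sequences, is only illustrative:

```python
import numpy as np

def local_jitter(periods_ms):
    """Local jitter (%): mean absolute difference between consecutive
    periods divided by the mean period."""
    p = np.asarray(periods_ms, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

rng = np.random.default_rng(0)
steady = 8.0 + 0.02 * rng.standard_normal(50)    # ~125 Hz, stable periods (ms)
jittery = 8.0 + 0.40 * rng.standard_normal(50)   # same F0, unstable periods
print(f"steady: {local_jitter(steady):.2f}%  jittery: {local_jitter(jittery):.2f}%")
```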

RevDate: 2017-12-19

Themistocleous C (2017)

Effects of Two Linguistically Proximal Varieties on the Spectral and Coarticulatory Properties of Fricatives: Evidence from Athenian Greek and Cypriot Greek.

Frontiers in psychology, 8:1945.

Several studies have explored the acoustic structure of fricatives, yet there has been very little acoustic research on the effects of dialects on the production of fricatives. This article investigates the effects of two linguistically proximal Modern Greek dialects, Athenian Greek and Cypriot Greek on the temporal, spectral, and coarticulatory properties of fricatives and aims to determine the acoustic properties that convey information about these two dialects. Productions of voiced and voiceless labiodental, dental, alveolar, palatal, and velar fricatives were extracted from a speaking task from typically speaking female adult speakers (25 Cypriot Greek and 20 Athenian Greek speakers). Measures were made of spectral properties, using a spectral moments analysis. The formants of the following vowel were measured and second degree polynomials of the formant contours were calculated. The findings showed that Athenian Greek and Cypriot Greek fricatives differ in all spectral properties across all places of articulation. Also, the co-articulatory effects of fricatives on following vowel were different depending on the dialect. Duration, spectral moments, and the starting frequencies of F1, F2, F3, and F4 contributed the most to the classification of dialect. These findings provide a solid evidence base for the manifestation of dialectal information in the acoustic structure of fricatives.
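
A spectral moments analysis of the kind used here reduces each fricative spectrum to four numbers: centroid, standard deviation, skewness, and kurtosis of the power spectrum treated as a distribution over frequency. Below is a small numpy/scipy sketch on a synthetic fricative-like noise; all signal parameters are assumed:

```python
import numpy as np
from scipy.signal import butter, lfilter

def spectral_moments(frame, fs):
    """First four spectral moments: centroid (Hz), SD (Hz),
    skewness, and excess kurtosis of the power spectrum."""
    power = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1 / fs)
    p = power / power.sum()                  # spectrum as a distribution
    m1 = np.sum(freqs * p)
    m2 = np.sqrt(np.sum((freqs - m1) ** 2 * p))
    m3 = np.sum(((freqs - m1) / m2) ** 3 * p)
    m4 = np.sum(((freqs - m1) / m2) ** 4 * p) - 3
    return m1, m2, m3, m4

# Hypothetical fricative-like token: white noise band-passed to 3-6 kHz.
fs = 16000
rng = np.random.default_rng(0)
b, a = butter(4, [3000, 6000], btype="band", fs=fs)
frame = lfilter(b, a, rng.standard_normal(2048))
print(spectral_moments(frame, fs))
```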

RevDate: 2018-05-29

Sjerps MJ, Zhang C, G Peng (2018)

Lexical tone is perceived relative to locally surrounding context, vowel quality to preceding context.

Journal of experimental psychology. Human perception and performance, 44(6):914-924.

Important speech cues such as lexical tone and vowel quality are perceptually contrasted to the distribution of those same cues in surrounding contexts. However, it is unclear whether preceding and following contexts have similar influences, and to what extent those influences are modulated by the auditory history of previous trials. To investigate this, Cantonese participants labeled sounds from (a) a tone continuum (mid- to high-level), presented with a context that had raised or lowered fundamental frequency (F0) values and (b) a vowel quality continuum (/u/ to /o/), where the context had raised or lowered first formant (F1) values. Contexts with high or low F0/F1 were presented in separate blocks or intermixed in 1 block. Contexts were presented following (Experiment 1) or preceding the target continuum (Experiment 2). Contrastive effects were found for both tone and vowel quality (e.g., decreased F0 values in contexts lead to more high tone target judgments and vice versa). Importantly, however, lexical tone was only influenced by F0 in immediately preceding and following contexts. Vowel quality was only influenced by the F1 in preceding contexts, but this extended to contexts from preceding trials. Contextual influences on tone and vowel quality are qualitatively different, which has important implications for understanding the mechanism of context effects in speech perception.

RevDate: 2018-02-09

Daliri A, L Max (2018)

Stuttering adults' lack of pre-speech auditory modulation normalizes when speaking with delayed auditory feedback.

Cortex; a journal devoted to the study of the nervous system and behavior, 99:55-68.

Auditory modulation during speech movement planning is limited in adults who stutter (AWS), but the functional relevance of the phenomenon itself remains unknown. We investigated for AWS and adults who do not stutter (AWNS) (a) a potential relationship between pre-speech auditory modulation and auditory feedback contributions to speech motor learning and (b) the effect on pre-speech auditory modulation of real-time versus delayed auditory feedback. Experiment I used a sensorimotor adaptation paradigm to estimate auditory-motor speech learning. Using acoustic speech recordings, we quantified subjects' formant frequency adjustments across trials when continually exposed to formant-shifted auditory feedback. In Experiment II, we used electroencephalography to determine the same subjects' extent of pre-speech auditory modulation (reductions in auditory evoked potential N1 amplitude) when probe tones were delivered prior to speaking versus not speaking. To manipulate subjects' ability to monitor real-time feedback, we included speaking conditions with non-altered auditory feedback (NAF) and delayed auditory feedback (DAF). Experiment I showed that auditory-motor learning was limited for AWS versus AWNS, and the extent of learning was negatively correlated with stuttering frequency. Experiment II yielded several key findings: (a) our prior finding of limited pre-speech auditory modulation in AWS was replicated; (b) DAF caused a decrease in auditory modulation for most AWNS but an increase for most AWS; and (c) for AWS, the amount of auditory modulation when speaking with DAF was positively correlated with stuttering frequency. Lastly, AWNS showed no correlation between pre-speech auditory modulation (Experiment II) and extent of auditory-motor learning (Experiment I) whereas AWS showed a negative correlation between these measures. Thus, findings suggest that AWS show deficits in both pre-speech auditory modulation and auditory-motor learning; however, limited pre-speech modulation is not directly related to limited auditory-motor adaptation; and in AWS, DAF paradoxically tends to normalize their otherwise limited pre-speech auditory modulation.

RevDate: 2018-01-10

Kato S, Homma A, T Sakuma (2018)

Easy Screening for Mild Alzheimer's Disease and Mild Cognitive Impairment from Elderly Speech.

Current Alzheimer research, 15(2):104-110.

OBJECTIVE: This study presents a novel approach for early detection of cognitive impairment in the elderly. The approach incorporates the use of speech sound analysis, multivariate statistics, and data-mining techniques. We have developed a speech prosody-based cognitive impairment rating (SPCIR) that can distinguish between cognitively normal controls and elderly people with mild Alzheimer's disease (mAD) or mild cognitive impairment (MCI) using prosodic signals extracted from elderly speech while administering a questionnaire. Two hundred and seventy-three Japanese subjects (73 males and 200 females between the ages of 65 and 96) participated in this study. The authors collected speech sounds from segments of dialogue during a revised Hasegawa's dementia scale (HDS-R) examination and from conversation about topics related to hometown, childhood, and school. The segments correspond to speech sounds from answers to questions regarding birthdate (T1), the name of the subject's elementary school (T2), time orientation (Q2), and repetition of three-digit numbers backward (Q6). As many prosodic features as possible were extracted from each of the speech sounds, including fundamental frequency, formant, and intensity features and mel-frequency cepstral coefficients. These features were refined using principal component analysis and/or feature selection, and the authors calculated an SPCIR using multiple linear regression analysis.
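The modeling pipeline described here (feature extraction, PCA-based refinement, multiple linear regression) can be sketched compactly. The Python fragment below uses made-up data and a hypothetical 60-dimensional prosodic feature matrix; it is an illustration of the general technique, not the authors' implementation.

```python
# A minimal sketch, with made-up data, of the pipeline described above:
# prosodic features are reduced with PCA and a cognitive-impairment rating
# is fit with multiple linear regression. "spcir" is just the fitted score.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.normal(size=(273, 60))  # 273 subjects x 60 prosodic features (hypothetical)
y = rng.normal(size=273)        # hypothetical impairment ratings (e.g., HDS-R based)

model = make_pipeline(PCA(n_components=10), LinearRegression())
model.fit(X, y)
spcir = model.predict(X)        # speech prosody-based cognitive impairment rating
print("fitted SPCIR for first 5 subjects:", np.round(spcir[:5], 2))
```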

CONCLUSION: In addition, this study proposes a binary discrimination model of SPCIR using multivariate logistic regression with model selection and receiver operating characteristic (ROC) curve analysis, and reports the sensitivity and specificity of SPCIR for diagnosis (control vs. MCI/mAD). The model discriminates well, suggesting that the proposed approach might be an effective tool for screening the elderly for mAD and MCI.
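The discrimination step can likewise be sketched: logistic regression on the SPCIR-style score, evaluated with an ROC curve from which sensitivity and specificity are read off. The data below are synthetic, and the threshold choice (Youden's J) is one common convention, not necessarily the authors'.

```python
# A minimal sketch of binary discrimination with logistic regression and ROC
# analysis. Labels and scores are synthetic; in the study the classes are
# control vs. MCI/mAD.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(2)
scores = rng.normal(size=(200, 1))  # hypothetical SPCIR values
labels = (scores[:, 0] + rng.normal(0, 1, 200) > 0).astype(int)

clf = LogisticRegression().fit(scores, labels)
prob = clf.predict_proba(scores)[:, 1]

fpr, tpr, thresholds = roc_curve(labels, prob)
auc = roc_auc_score(labels, prob)
# Sensitivity = TPR; specificity = 1 - FPR at the chosen threshold.
best = np.argmax(tpr - fpr)  # Youden's J, one common threshold criterion
print(f"AUC={auc:.2f}, sensitivity={tpr[best]:.2f}, specificity={1 - fpr[best]:.2f}")
```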

RevDate: 2018-03-08

Tyan M, Espinoza-Cuadros F, Fernández Pozo R, et al (2017)

Obstructive Sleep Apnea in Women: Study of Speech and Craniofacial Characteristics.

JMIR mHealth and uHealth, 5(11):e169 pii:v5i11e169.

BACKGROUND: Obstructive sleep apnea (OSA) is a common sleep disorder characterized by frequent cessation of breathing lasting 10 seconds or longer. The diagnosis of OSA is performed through an expensive procedure, which requires an overnight stay at the hospital. This has led to several proposals based on the analysis of patients' facial images and speech recordings as an attempt to develop simpler and cheaper methods to diagnose OSA.

OBJECTIVE: The objective of this study was to analyze possible relationships between OSA and speech and facial features in a female population, to determine whether these connections are affected by the clinical characteristics of the OSA population, and, more specifically, to explore how the connection between OSA and speech and facial features is affected by gender.

METHODS: All subjects were Spanish patients suspected of suffering from OSA and referred to a sleep disorders unit. Voice recordings and photographs were collected in a supervised but not highly controlled way, to approximate a realistic clinical scenario in which OSA is assessed using an app running on a mobile device. Clinical variables usually reported as predictors of OSA, such as weight, height, age, and cervical perimeter, were also gathered. Acoustic analysis centered on sustained vowels. Facial analysis consisted of a set of local craniofacial features related to OSA, extracted from images after detecting facial landmarks with active appearance models. To study the possible connection between OSA and speech and craniofacial features, correlations among the apnea-hypopnea index (AHI), clinical variables, and acoustic and facial measurements were analyzed.
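The correlation analysis described here reduces to computing Pearson's r between AHI and each acoustic, facial, or clinical measurement. A minimal Python sketch on synthetic data follows; the variable names are hypothetical.

```python
# A minimal sketch, on synthetic data, of correlating the apnea-hypopnea
# index (AHI) with acoustic and clinical measurements. Names are hypothetical.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n = 100
ahi = rng.gamma(2.0, 10.0, n)  # hypothetical AHI values
features = {
    "F2_i_hz": 2200 + rng.normal(0, 150, n),          # second formant of /i/
    "F2_i_bandwidth_hz": 120 + rng.normal(0, 30, n),  # its bandwidth
    "cervical_perimeter_cm": 36 + rng.normal(0, 3, n),
}

for name, values in features.items():
    r, p = pearsonr(ahi, values)
    print(f"{name:24s} r={r:+.2f} p={p:.3f}")
```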

RESULTS: The results obtained for the female population indicate mainly weak correlations (r values between .20 and .39). Correlations between AHI, clinical variables, and speech features show the prevalence of formant frequencies over bandwidths, with F2/i/ being the most appropriate formant frequency for OSA prediction in women. Results obtained for the male population indicate mainly very weak correlations (r values between .01 and .19); in this case, bandwidths prevail over formant frequencies. Correlations between AHI, clinical variables, and craniofacial measurements are very weak.

CONCLUSIONS: In accordance with previous studies, some clinical variables are found to be good predictors of OSA, and correlations are found between AHI, some clinical variables, and speech and facial features. Regarding speech features, the results show the prevalence of the formant frequency F2/i/ over the other features as an OSA-predictive feature for the female population. Although the correlations reported are weak, this study aims to identify traces that could explain the possible connection between OSA and speech in women. In the case of craniofacial measurements, the results show that some features that can be used to predict OSA in male patients are not suitable for the female population.
