  • This selected overview of audiovisual (AV) speech perception examines the influence of visible articulatory information on what is heard. AV speech perception is thought to be a cross-cultural phenomenon that emerges early in typical language development; variables that influence it include properties of the visual and auditory signals, attentional demands, and individual differences. A brief review of the existing neurobiological evidence on how visual information influences heard speech indicates potential loci, timing, and facilitatory effects of AV over auditory-only speech. The current literature on AV speech in certain clinical populations (individuals with an autism spectrum disorder, developmental language disorder, or hearing loss) reveals differences in processing that may inform interventions. Finally, a new method of assessing AV speech that does not require obvious cross-category mismatch or auditory noise is presented as a novel approach for investigators.

  • Designed to familiarize anyone who reads to young children with the essentials of promoting early and emerging literacy, this resource shares activities from Irwin and Moore that can be used to foster this critical skill development, each linked to popular children's books.

  • Using eye-tracking methodology, gaze to a speaking face was compared in a group of children with autism spectrum disorders (ASD) and a group with typical development (TD). Patterns of gaze were observed under three conditions: audiovisual (AV) speech in auditory noise, visual-only speech, and an AV non-face, non-speech control. Children with ASD looked less at the face of the speaker and fixated less on the speaker's mouth than TD controls. No differences in gaze were observed for the non-face, non-speech control task. Since the mouth holds much of the articulatory information available on the face, these findings suggest that children with ASD may have reduced access to critical linguistic information. This reduced access to visible articulatory information could be a contributor to the communication and language problems exhibited by children with ASD.

  • When a speaker talks, the consequences of what they are saying can be seen on the face. Listeners are influenced by this visible speech both in a noisy listening environment and even when auditory speech can easily be heard. While visible influence on heard speech has been reported to increase from early to late childhood, little is known about the mechanism that underlies this developmental trend. One possible account of developmental differences is that looking behavior to the face of a speaker changes with age. To examine this possibility, gaze to a speaking face was examined in children from 5 to 10 years of age and in adults. Participants viewed a speaker's face in a range of conditions that elicit looking: a visual-only (speechreading) condition, a speech-in-auditory-noise condition, and an audiovisual mismatch (McGurk) condition. Results indicate an increase in gaze to the face, and specifically to the mouth, of a speaker between the ages of 5 and 10 for all conditions. This change in looking behavior may help account for previous findings in the literature showing that visual influence on heard speech increases with development.
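
Gaze measures of the kind reported above are commonly summarized as the proportion of eye-tracking samples falling inside face and mouth areas of interest (AOIs). The following is a minimal sketch of that computation, not the study's actual pipeline; the rectangular AOI coordinates and the synthetic gaze samples are illustrative assumptions.

```python
# Sketch: proportion of gaze samples inside rectangular AOIs.
# All coordinates and data are illustrative, not from the study.
import numpy as np

def prop_in_aoi(x, y, aoi):
    """Fraction of gaze samples inside an AOI given as (x0, y0, x1, y1) pixels."""
    x0, y0, x1, y1 = aoi
    inside = (x >= x0) & (x <= x1) & (y >= y0) & (y <= y1)
    return inside.mean()

rng = np.random.default_rng(0)
gaze_x = rng.uniform(0, 1280, 5000)   # synthetic horizontal gaze samples
gaze_y = rng.uniform(0, 1024, 5000)   # synthetic vertical gaze samples

FACE_AOI = (440, 200, 840, 800)       # assumed face region on a 1280x1024 screen
MOUTH_AOI = (560, 600, 720, 740)      # assumed mouth region nested in the face

print(f"face: {prop_in_aoi(gaze_x, gaze_y, FACE_AOI):.1%}, "
      f"mouth: {prop_in_aoi(gaze_x, gaze_y, MOUTH_AOI):.1%}")
```

Comparing these proportions across age groups or viewing conditions yields the kind of face- and mouth-looking differences the studies above report.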

  • Children with speech sound disorders may perceive speech differently than children with typical speech development. The nature of these speech differences is reviewed with an emphasis on assessing phoneme-specific perception for speech sounds that are produced in error. Category goodness judgment, or the ability to judge accurate and inaccurate tokens of speech sounds, plays an important role in phonological development. The software Speech Assessment and Interactive Learning System, which has been effectively used to assess preschoolers' ability to perform goodness judgments, is explored for school-aged children with residual speech errors (RSEs). However, data suggest that this particular task may not be sensitive to perceptual differences in school-aged children. The need for the development of clinical tools for assessment of speech perception in school-aged children with RSE is highlighted, and clinical suggestions are provided.

  • This study analyzed distributions of Euclidean displacements in gaze (i.e., “gaze steps”) to evaluate the degree of componential cognitive constraints on audiovisual speech perception tasks. Children performing these tasks exhibited distributions of gaze steps that were closest to power-law or lognormal distributions, suggesting a multiplicatively interactive, flexible, self-organizing cognitive system rather than a component-dominant stipulated cognitive structure. Younger children and children diagnosed with an autism spectrum disorder (ASD) exhibited distributions that were closer to power-law than lognormal, indicating a reduced degree of self-organized structure. The relative goodness of lognormal fit was also a significant predictor of ASD, suggesting that this type of analysis may point towards a promising diagnostic tool. These results lend further support to an interaction-dominant framework that casts cognitive processing and development in terms of self-organization instead of fixed components and show that these analytical methods are sensitive to important developmental and neuropsychological differences.
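
The power-law versus lognormal comparison described above can be approximated by fitting both families by maximum likelihood and comparing fits. A minimal sketch under synthetic data follows; the closed-form exponent estimate assumes a fixed lower cutoff, and a full analysis would use a dedicated package (e.g., powerlaw) with a formal likelihood-ratio test.

```python
# Sketch: compare power-law vs. lognormal fits to gaze-step sizes by
# maximum likelihood. All data here are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gaze_steps = rng.lognormal(mean=0.5, sigma=1.0, size=2000)  # synthetic steps

x_min = gaze_steps.min()  # assume the fit applies to the whole sample

# Continuous power-law MLE for a fixed lower cutoff x_min:
alpha = 1.0 + len(gaze_steps) / np.sum(np.log(gaze_steps / x_min))
ll_power = np.sum(
    np.log((alpha - 1.0) / x_min) - alpha * np.log(gaze_steps / x_min)
)

# Lognormal MLE via scipy (location fixed at zero):
shape, loc, scale = stats.lognorm.fit(gaze_steps, floc=0)
ll_lognorm = np.sum(stats.lognorm.logpdf(gaze_steps, shape, loc, scale))

# Higher log-likelihood indicates the better-fitting family; an AIC or
# Vuong-style test would additionally adjust for parameter counts.
print(f"power-law LL = {ll_power:.1f}, lognormal LL = {ll_lognorm:.1f}")
```

The "relative goodness of lognormal fit" mentioned in the abstract corresponds conceptually to the gap between these two likelihoods.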

  • Children with autism spectrum disorders have been reported to be less influenced by a speaker's face during speech perception than those with typical development. To more closely examine these reported differences, a novel visual phonemic restoration paradigm was used to assess neural signatures (event-related potentials [ERPs]) of audiovisual processing in typically developing children and in children with autism spectrum disorder (ASD). Video of a speaker saying the syllable /ba/ was paired with (1) a synthesized /ba/ or (2) a synthesized syllable derived from /ba/ in which auditory cues for the consonant were substantially weakened, thereby sounding more like /a/. The auditory stimuli are easily discriminable; however, in the context of a visual /ba/, the auditory /a/ is typically perceived as /ba/, producing a visual phonemic restoration. Only children with ASD showed a large /ba/-/a/ discrimination response in the presence of a speaker producing /ba/, suggesting reduced influence of visual speech.

  • PURPOSE: Autistic adults consistently report difficulties understanding speech in adverse listening environments, which may be related to differences in social communication and participation. Research examining masked-speech recognition in autistic adults is limited, particularly in competing speech backgrounds with high degrees of informational masking. This work characterizes speech-in-speech and speech-in-noise recognition in young adults on the autism spectrum and evaluates self-reported functional listening abilities and listening-related fatigue. METHOD: Masked-speech recognition was evaluated in both autistic (n = 20) and non-autistic (n = 20) young adults with normal hearing. Speech reception thresholds were adaptively measured in two-talker speech and speech-shaped noise using target sentences that were either semantically meaningful or anomalous. Functional listening abilities and listening-related fatigue were assessed using the Speech, Spatial, and Qualities of Hearing Scale and the Vanderbilt Fatigue Scale for Adults. Autism characteristics and social communication experiences were quantified using the Social Responsiveness Scale-Second Edition. RESULTS: Autistic adults displayed significantly poorer speech-in-speech recognition than their non-autistic peers, while speech-in-noise recognition did not differ between groups. Functional listening difficulties in daily life and listening-related fatigue were significantly higher for autistic participants. Autism characteristics strongly predicted functional listening abilities and listening-related fatigue in both groups. CONCLUSIONS: Autistic young adults experience objective speech-in-speech recognition difficulties that correspond with listening challenges in daily life. Autism characteristics and social communication experiences predict functional listening abilities reported by both autistic and non-autistic young adults with normal hearing. Speech-in-speech recognition difficulties observed here may amplify social communication challenges for adults on the autism spectrum. Future work must prioritize improved awareness of autistic listening differences.
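
Speech reception thresholds (SRTs) of the sort measured here are typically tracked adaptively. The sketch below is a generic one-down/one-up staircase converging on the 50%-correct SNR against a simulated listener; the step sizes, reversal counts, and psychometric-function parameters are assumptions, not this study's protocol.

```python
# Sketch: adaptive 1-down/1-up SRT track against a simulated listener.
# All parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

def simulated_listener(snr_db, srt_true=-6.0, spread_db=2.0):
    """Logistic psychometric function: P(correct) = 0.5 at the true SRT."""
    p_correct = 1.0 / (1.0 + np.exp(-(snr_db - srt_true) / spread_db))
    return rng.random() < p_correct

snr = 0.0                  # starting SNR in dB
step = 4.0                 # initial step size, halved down to a 2 dB floor
reversals, direction = [], None
while len(reversals) < 8:
    correct = simulated_listener(snr)
    new_direction = "down" if correct else "up"
    if direction and new_direction != direction:
        reversals.append(snr)          # track direction changes
        step = max(step / 2.0, 2.0)
    direction = new_direction
    snr += -step if correct else step  # harder after correct, easier after error

srt_estimate = np.mean(reversals[-6:])  # mean of the last six reversal SNRs
print(f"Estimated SRT = {srt_estimate:.1f} dB")
```

A one-down/one-up rule converges on the 50%-correct point of the psychometric function, which is the conventional definition of the SRT.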

  • In face-to-face conversation, when a speaker talks, the outcome of their speech can both be heard (audio) and seen (visual). We employed a novel visual phonemic restoration paradigm to assess neural signatures (event-related potentials [ERPs]) of audiovisual processing in typically developing children and in children with autism spectrum disorder (ASD). During EEG recording, two types of auditory stimuli were alternately presented with video of a speaker saying the consonant-vowel syllable /ba/: (1) a synthesized consonant-vowel syllable /ba/ or (2) a synthesized syllable derived from /ba/ in which auditory cues for the consonant are substantially weakened, such that it sounds more like /a/. The auditory stimuli are easily discriminable; however, in the context of a visual /ba/, the auditory /a/ is typically perceived as /ba/, producing a visual phonemic restoration. In an ERP context, we have shown that this restoration leads to an attenuated phoneme discrimination response in an active task in typical adults and children. To explore the hypothesis that children with ASD have atypical AV speech integration under pre-attentive processing conditions, we tested whether children with ASD would show a reduction in this restoration effect under passive listening conditions. Indeed, in this task, children with ASD showed a large /ba/-/a/ discrimination response even in the presence of a speaker producing /ba/, suggesting reduced influence of visual speech.

  • This study used eye-tracking methodology to assess audiovisual speech perception in 26 children ranging in age from 5 to 15 years, half with autism spectrum disorders (ASD) and half with typical development. Given the characteristic reduction in gaze to the faces of others in children with ASD, it was hypothesized that they would show reduced influence of visual information on heard speech. Responses were compared on a set of auditory, visual, and audiovisual speech perception tasks. Even when fixated on the face of the speaker, children with ASD were less visually influenced than typically developing controls. This indicates fundamental differences in the processing of audiovisual speech in children with ASD, which may contribute to their language and communication impairments.

  • This study examines how across-trial (average) and trial-by-trial (variability in) amplitude and latency of the N400 event-related potential (ERP) reflect temporal integration of pitch accent and beat gesture. Thirty native English speakers viewed videos of a talker producing sentences with beat gesture co-occurring with a pitch accented focus word (synchronous), beat gesture co-occurring with the onset of a subsequent non-focused word (asynchronous), or the absence of beat gesture (no beat). Across trials, greater amplitude and earlier latency were observed when beat gesture was temporally asynchronous with pitch accenting than when it was temporally synchronous with pitch accenting or absent. Moreover, temporal asynchrony of beat gesture relative to pitch accent increased trial-by-trial variability of N400 amplitude and latency and influenced the relationship between across-trial and trial-by-trial N400 latency. These results indicate that across-trial and trial-by-trial amplitude and latency of the N400 ERP reflect temporal integration of beat gesture and pitch accent during language comprehension, supporting extension of the integrated systems hypothesis of gesture-speech processing and neural noise theories to focus processing in typical adult populations.
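
Across-trial means and trial-by-trial variability of an ERP component can both be derived from single-trial mean amplitudes within the component's time window. A minimal sketch with synthetic epochs follows; the sampling rate, window, and trial count are assumptions, not the study's recording parameters.

```python
# Sketch: across-trial mean vs. trial-by-trial variability of single-trial
# N400 amplitude in a 300-500 ms window. All data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
fs = 500                                      # sampling rate in Hz (assumed)
times = np.arange(-0.2, 0.8, 1 / fs)          # epoch from -200 to 800 ms
trials = rng.normal(0, 2, (60, times.size))   # 60 synthetic single-trial epochs

win = (times >= 0.3) & (times <= 0.5)         # assumed N400 window
amp = trials[:, win].mean(axis=1)             # mean amplitude per trial

print(f"across-trial mean = {amp.mean():.2f} uV, "
      f"trial-by-trial SD = {amp.std(ddof=1):.2f} uV")
```

The across-trial mean corresponds to the conventional averaged ERP measure, while the standard deviation across single-trial amplitudes captures the trial-by-trial variability the abstract analyzes.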

  • Perception of spoken language requires attention to acoustic as well as visible phonetic information. This article reviews the known differences in audiovisual speech perception in children with autism spectrum disorders (ASD) and specifies the need for interventions that address this construct. Elements of an audiovisual training program are described. This researcher-developed program, delivered via an iPad app, presents natural speech in the context of increasing noise, but supported by a speaking face. Children are cued to attend to visible articulatory information to assist in perception of the spoken words. Data from four children with ASD, ages 8-10, are presented, showing that the children improved their performance on an untrained auditory speech-in-noise task.

  • When a speaker talks, the consequences of this can both be heard (audio) and seen (visual). A novel visual phonemic restoration task was used to assess behavioral discrimination and neural signatures (event-related potentials, or ERPs) of audiovisual processing in typically developing children with a range of social and communicative skills assessed using the Social Responsiveness Scale, a measure of traits associated with autism. An auditory oddball design presented two types of stimuli to the listener: a clear exemplar of an auditory consonant-vowel syllable /ba/ (the more frequently occurring standard stimulus), and a syllable in which the auditory cues for the consonant were substantially weakened, creating a stimulus that is more like /a/ (the infrequently presented deviant stimulus). All speech tokens were paired with a face producing /ba/ or a face with a pixelated mouth containing motion but no visual speech. In this paradigm, the visual /ba/ should cause the auditory /a/ to be perceived as /ba/, creating an attenuated oddball response; in contrast, a pixelated video (without articulatory information) should not have this effect. Behaviorally, participants showed visual phonemic restoration (reduced accuracy in detecting the deviant /a/) in the presence of a speaking face. In addition, ERPs were observed in both an early time window (N100) and a later time window (P300) that were sensitive to speech context (/ba/ or /a/) and modulated by face context (speaking face with visible articulation or with pixelated mouth). Specifically, the oddball responses for the N100 and P300 were attenuated in the presence of a face producing /ba/ relative to a pixelated face, representing a possible neural correlate of the phonemic restoration effect. Notably, those individuals with more traits associated with autism (yet still in the non-clinical range) had smaller P300 responses overall, regardless of face context, suggesting generally reduced phonemic discrimination.

  • Visual information on a talker's face can influence what a listener hears. Commonly used approaches to study this include mismatched audiovisual stimuli (e.g., McGurk-type stimuli) or visual speech in auditory noise. In this paper we discuss potential limitations of these approaches and introduce a novel visual phonemic restoration method. This method always presents the same visual stimulus (e.g., /ba/) dubbed with either a matched auditory stimulus (/ba/) or one that has weakened consonantal information and sounds more /a/-like. When this reduced auditory stimulus (the /a/) is dubbed with the visual /ba/, a visual influence will result in effectively 'restoring' the weakened auditory cues so that the stimulus is perceived as a /ba/. An oddball design was used in which participants were asked to detect the /a/ among a stream of more frequently occurring /ba/s while viewing either a speaking face or a face with no visual speech. In addition, the same paradigm was presented for a second contrast in which participants detected /pa/ among /ba/s, a contrast that should be unaltered by the presence of visual speech. Behavioral and some ERP findings reflect the expected phonemic restoration for the /ba/ vs. /a/ contrast; specifically, we observed reduced accuracy and P300 response in the presence of visual speech. Further, we report an unexpected finding of reduced accuracy and P300 response for both speech contrasts in the presence of visual speech, suggesting overall modulation of the auditory signal in the presence of visual speech. Consistent with this, we observed a mismatch negativity (MMN) effect for the /ba/ vs. /pa/ contrast only, which was larger in the absence of visual speech. We discuss the potential utility of this paradigm for listeners who cannot respond actively, such as infants and individuals with developmental disabilities.
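
The oddball structure of this paradigm (frequent /ba/ standards with occasional deviants) can be illustrated with a simple sequence generator. This is a sketch under assumed parameters; the deviant probability and the no-consecutive-deviants constraint are common oddball conventions, not the published trial structure.

```python
# Sketch: generate an oddball trial sequence of frequent standards (/ba/)
# and infrequent deviants (/a/), with no two deviants in a row.
# Proportions and constraints are illustrative assumptions.
import random

def oddball_sequence(n_trials=200, p_deviant=0.15, seed=4):
    random.seed(seed)
    seq, last_was_deviant = [], True   # force the run to begin with standards
    for _ in range(n_trials):
        if not last_was_deviant and random.random() < p_deviant:
            seq.append("deviant_/a/")
            last_was_deviant = True
        else:
            seq.append("standard_/ba/")
            last_was_deviant = False
    return seq

seq = oddball_sequence()
print(seq[:12])
print(f"deviant rate = {seq.count('deviant_/a/') / len(seq):.2f}")
```

In the actual paradigm each auditory token would be dubbed onto the fixed visual /ba/ (or the no-visual-speech face), so the same sequence serves both face contexts.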

  • Audiovisual speech perception includes the simultaneous processing of auditory and visual speech. Deficits in audiovisual speech perception are reported in autistic individuals; however, less is known regarding audiovisual speech perception within the broader autism phenotype (BAP), which includes individuals with elevated, yet subclinical, levels of autistic traits. We investigate the neural indices of audiovisual speech perception in adults exhibiting a range of autism-like traits using event-related potentials (ERPs) in a phonemic restoration paradigm. In this paradigm, we consider conditions where speech articulators (mouth and jaw) are visible (AV condition) or obscured by a pixelated mask (PX condition). These two face conditions were included in both passive (simply viewing a speaking face) and active (participants were required to press a button for a specific consonant-vowel stimulus) experiments. The results revealed an N100 ERP component which was present for all listening contexts and conditions; however, it was attenuated in the active AV condition, where participants were able to view the speaker's face, including the mouth and jaw. The P300 ERP component was present within the active experiment only and was significantly greater within the AV condition compared to the PX condition. This suggests increased neural effort for detecting deviant stimuli when visible articulation was present, and a visual influence on perception. Finally, the P300 response was negatively correlated with autism-like traits, suggesting that higher autistic traits were associated with generally smaller P300 responses in the active AV and PX conditions. The conclusions support the finding that atypical audiovisual processing may be characteristic of the BAP in adults.

  • This study examined fMRI activation when perceivers either passively observed or observed and imitated matched or mismatched audiovisual (McGurk) speech stimuli. Greater activation was observed in the inferior frontal gyrus (IFG) overall for imitation than for perception of audiovisual speech, and for imitation of the McGurk-type mismatched stimuli than matched audiovisual stimuli. This unique activation in the IFG during imitation of incongruent audiovisual speech may reflect activation associated with direct matching of incongruent auditory and visual stimuli or conflict between category responses. This study provides novel data about the underlying neurobiology of imitation and integration of AV speech.

  • The lexical decision (LD) and naming (NAM) tasks are ubiquitous paradigms that employ printed word identification. They are major tools for investigating how factors such as morphology, semantic information, and lexical neighborhood affect identification. Although use of the tasks is widespread, there has been little research into how performance in LD or NAM relates to reading ability, a deficiency that limits the translation of research with these tasks to the understanding of individual differences in reading. The present research was designed to provide a link from LD and NAM to the specific variables that characterize reading ability (e.g., decoding, sight word recognition, fluency, vocabulary, and comprehension) as well as to important reading-related abilities (phonological awareness and rapid naming). We studied 99 adults with a wide range of reading abilities. LD and NAM strongly predicted individual differences in word identification, less strongly predicted vocabulary size, and did not predict comprehension. Fluency was predicted, but with differences that depended on the way fluency was defined. Finally, although the tasks did not predict individual differences in rapid naming or phonological awareness, these failures nevertheless assisted in understanding the cognitive mechanisms behind these reading-related abilities. The results demonstrate that LD and NAM are important tools for the study of individual differences in reading.
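
Individual-differences claims of this kind are typically evaluated with multiple regression. The sketch below regresses a word-identification score on LD and NAM latencies using ordinary least squares; all variable names, effect sizes, and data are synthetic illustrative assumptions, not the study's dataset.

```python
# Sketch: OLS regression of a word-identification score on lexical decision
# (LD) and naming (NAM) reaction times. All data are synthetic.
import numpy as np

rng = np.random.default_rng(5)
n = 99                                   # sample size matching the study
ld_rt = rng.normal(650, 80, n)           # assumed LD reaction times (ms)
nam_rt = rng.normal(500, 60, n)          # assumed naming latencies (ms)
word_id = 120 - 0.05 * ld_rt - 0.04 * nam_rt + rng.normal(0, 5, n)

X = np.column_stack([np.ones(n), ld_rt, nam_rt])   # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, word_id, rcond=None)

pred = X @ beta
ss_res = np.sum((word_id - pred) ** 2)
ss_tot = np.sum((word_id - word_id.mean()) ** 2)
print(f"betas = {np.round(beta, 3)}, R^2 = {1 - ss_res / ss_tot:.2f}")
```

The R-squared from models like this is what underlies statements such as "LD and NAM strongly predicted individual differences in word identification."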

  • By 12 months, children grasp that a phonetic change to a word can change its identity (phonological distinctiveness). However, they must also grasp that some phonetic changes do not change a word's identity (phonological constancy). To test the development of phonological constancy, sixteen 15-month-olds and sixteen 19-month-olds completed an eye-tracking task that tracked their gaze to named versus unnamed images for familiar words spoken in their native (Australian) regional accent of English and in an unfamiliar non-native (Jamaican) one. Both groups looked longer at named than unnamed images for Australian pronunciations, but only the 19-month-olds did so for Jamaican pronunciations, indicating that phonological constancy emerges by 19 months. Vocabulary size predicted 15-month-olds' identifications for the Jamaican pronunciations, suggesting that vocabulary growth is a viable predictor of phonological constancy development.

  • OBJECTIVES: Listening2Faces (L2F) is a therapeutic, application-based training program designed to improve audiovisual speech perception for persons with communication disorders. The purpose of this research was to investigate the feasibility of using the L2F application with young adults with autism and complex communication needs. METHODS: Three young adults with autism and complex communication needs completed baseline assessments and participated in training sessions within the L2F application. Behavioral supports, including the use of cognitive picture rehearsal, were used to support engagement with the L2F application. Descriptive statistics were used to provide (1) an overview of the level of participation in the L2F application with the use of behavioral supports and (2) general performance on the L2F application for each participant. RESULTS: All three participants completed the initial auditory noise assessment (ANA) as well as 8 or more levels of the L2F application with varying accuracy levels. One participant completed the entire L2F program successfully. Several behavioral supports were used to facilitate participation; however, each individual demonstrated varied levels of engagement with the application. CONCLUSIONS: The L2F application may be a viable intervention tool to support audiovisual speech perception in persons with complex communication needs within a school-based setting. A review of behavioral supports and possible beneficial modifications to the L2F application for persons with complex communication needs are discussed.
