Year: 2018 | Volume: 24 | Issue: 3 | Page: 139-147
The relative contribution of visual cues and acoustic enhancement strategies in improving speech perception of individuals with auditory neuropathy spectrum disorders
Jithin Raj Balan, Sandeep Maruthy
Department of Audiology, All India Institute of Speech and Hearing, Mysore, Karnataka, India
Date of Web Publication: 11-Jan-2019
Mr. Jithin Raj Balan
Department of Audiology, All India Institute of Speech and Hearing, Mysore - 570 006, Karnataka
Source of Support: None, Conflict of Interest: None
Background and Objectives: The present study aimed to assess the relative benefits of visual cue supplementation and acoustic enhancement in improving speech perception of individuals with Auditory Neuropathy Spectrum Disorders (ANSD). Methods: The study used a repeated-measures research design. Using purposive sampling, 40 participants with ANSD were selected. They were assessed for their identification of monosyllables in auditory only (A), visual only (V), and auditory-visual (AV) modalities. In the A and AV modalities, the perception of the primary, temporally enhanced, and spectrally enhanced syllables was assessed in quiet as well as in 0 dB signal-to-noise ratio (SNR) conditions. The identification scores were compared across modalities, stimuli, and conditions to derive the relative benefits of visual cues and acoustic enhancement on speech perception of individuals with ANSD. Results: The group data showed a significant effect of modality, with the mean identification score being highest in the AV modality. This was true both in quiet and at 0 dB SNR. The mean identification scores in quiet were significantly higher than those at 0 dB SNR. However, acoustic enhancement of speech did not significantly enhance speech perception. When acoustic enhancement and visual cues were provided simultaneously, speech perception was determined only by visual cues. The individual data showed that most of the individuals benefited from the AV modality. Conclusions: The findings indicate that both the auditory and visual modalities need to be facilitated in ANSD to enhance speech perception. The acoustic enhancements in their current form have negligible influence. However, this inference should be restricted to the perception of stop consonants.
Keywords: Acoustic enhancement, auditory neuropathy spectrum disorders, speech perception, visual cues
How to cite this article:
Balan JR, Maruthy S. The relative contribution of visual cues and acoustic enhancement strategies in improving speech perception of individuals with auditory neuropathy spectrum disorders. Indian J Otol 2018;24:139-47
How to cite this URL:
Balan JR, Maruthy S. The relative contribution of visual cues and acoustic enhancement strategies in improving speech perception of individuals with auditory neuropathy spectrum disorders. Indian J Otol [serial online] 2018 [cited 2019 Dec 12];24:139-47. Available from: http://www.indianjotol.org/text.asp?2018/24/3/139/249870
Introduction
The management of speech perception deficits in individuals with Auditory Neuropathy Spectrum Disorders (ANSD) is a persistent challenge for audiologists. Individuals with ANSD have poor speech perception as their primary complaint, more so in adverse listening conditions. Earlier studies have shown disrupted temporal processing in individuals with ANSD, which is probably the underlying reason for their speech perception deficits.
Conventional amplification devices do not address their temporal processing deficits and are known to yield limited benefit in improving speech perception. FM devices provide more benefit than conventional hearing aids, but their utility is limited to only a few listening situations. Cochlear implantation is known to benefit individuals with ANSD only if the lesion is presynaptic or synaptic. In view of these limitations, attempts have been made to enhance the input speech signal in various ways to facilitate speech perception in individuals with ANSD. Companding is one such method of spectral enhancement, wherein the peak-to-valley difference in the spectrum is increased. It was assumed that spectral contrast enhancement may compensate for the poor frequency resolution in individuals with hearing impairment.
Oxenham et al. simulated cochlear implant processing and applied a companding strategy in individuals with normal hearing. They found that companding improved speech perception in these listeners. Similar findings were reported by Bhattacharya and Zeng, who reported improvement in phoneme as well as sentence perception in cochlear implant users with companding. Such improvements were not observed in normal hearing individuals. Although other acoustic enhancements such as envelope enhancement have been reported to improve speech perception in ANSD, the maximum improvement observed was 22.4% in the 10 dB signal-to-noise ratio (SNR) condition, and the benefit was greater in individuals with good speech identification scores.
Zeng et al. presented speech with a degraded temporal envelope to individuals with normal hearing and found perceptual deficits similar to those seen in ANSD. It was reported that temporal smearing affects the spectral contrast, which in turn results in poor consonant-vowel distinction in individuals with ANSD. It was inferred that the impaired speech perception seen in individuals with ANSD may be due to a failure to follow the low-frequency envelope of the signal. Therefore, enhancing the envelope was thought to compensate for the loss in temporal modulation in individuals with poor speech perception, such as those with ANSD. In view of this, Narne and Vanaja enhanced the envelope of the speech signal by a magnitude of 15 dB and found maximum improvement in speech perception in individuals with ANSD when the envelope bandwidth was enhanced from 3 to 30 Hz. In cochlear hearing loss, it was found that envelope enhancement can improve speech perception even in the presence of background noise. A contrasting result was reported among individuals with normal hearing, in whom input envelope compression led to degraded consonant and vowel recognition. Bhattacharya et al. studied the combined effect of spectral expansion and temporal enhancement spectral maxima (TESM) in cochlear implant users in quiet and noisy conditions. The results showed significant improvement in vowel and consonant recognition in noise. Improvement in sentence recognition was noted with spectral enhancement alone and also in combination with TESM.
Earlier studies have reported that in the presence of competing noise, visual cues help to compensate for impaired speech perception by supplementing the missing cues. A better understanding of speech in poor acoustic conditions is possible through the integration of auditory and visual cues. This is further supported by behavioral and neurophysiological evidence. Among individuals with hearing impairment, speech perception in the audiovisual (AV) modality is reported to be more precise than in the audio alone or visual alone condition. This improvement in the AV modality was seen in degraded acoustic conditions as well as in silent discourse.
Ramirez and Mann studied speech perception in ANSD and found that individuals with ANSD compensate for their speech perception difficulty, both in quiet and in noise, by focusing on visual cues. In their study, the participants were asked to identify the consonant-vowel (CV) syllable presented in the audio alone, video alone, and audiovisual conditions. There was no difference in speech identification scores between stimuli presented in the video alone and AV conditions, and they concluded that individuals with ANSD rely solely on visual cues.
With this background, the present study aimed to investigate the relative contribution of visual cues and acoustic enhancements in improving speech perception of individuals with ANSD. In the past, the benefits of visual cues and acoustic enhancements were documented in independent studies. However, these reports do not provide a clear picture of which of the two yields better speech perception in individuals with ANSD. Therefore, it is important to compare their relative contribution in improving the speech perception of individuals with ANSD. This will guide clinical audiologists in choosing the right strategy for the best possible management of ANSD. Furthermore, it is also important to understand whether the combination of acoustic enhancement and visual cue supplementation results in greater benefit than a single strategy. The interaction between the facilitation provided by the two strategies warrants systematic investigation. The study also aims to highlight the individual differences in the benefit derived from acoustic enhancement and visual cue supplementation, which need careful consideration when deriving the clinical benefits of these strategies.
Aim and objectives
This study aimed to assess the relative benefits of visual cue supplementation and acoustic enhancement in improving speech perception of individuals with ANSD.
Methods
The present study used a repeated-measures design to test the null hypothesis that there is no significant difference in the benefit yielded by acoustic enhancements and visual cues in the speech perception of individuals with ANSD. A total of 40 individuals diagnosed with ANSD participated in the study. They were in the age range of 16–35 years (mean age = 24.19 years, standard deviation [SD] = 7.4).
The approximate age of onset of hearing loss in these participants was 13 years, and all of them had acquired ANSD postlingually. A few of the participants (n = 3) had normal hearing, and the rest had sensorineural hearing loss of up to moderate degree, with pure tone averages ranging up to 55 dB HL. The presence of middle ear pathology was ruled out by an experienced otologist and through immittance evaluation. All the participants had normal outer hair cell function, as revealed by robust transient evoked otoacoustic emissions (amplitude of more than 6 dB sound pressure level). They had absent click-evoked auditory brainstem responses, indicative of neuronal dys-synchrony. They had undergone neurological examination, which included a computed tomography scan and/or magnetic resonance imaging, to rule out the presence of space-occupying lesions. Based on the opinion of the neurologist and the audiological profile, the diagnosis of ANSD was made by an experienced audiologist.
The participants had normal or corrected vision (6/6) on a Snellen eye chart. They were native speakers of Kannada and were literate, with a minimum educational qualification of secondary school. All of them could comfortably read the nonmeaningful CV syllables used in the present study. Informed consent was obtained from each participant before carrying out the test. The method of the study was approved by the AIISH Ethical Committee for bio-behavioral research.
Test stimuli and test condition
In the present study, participants were tested for their speech perception in the Auditory only (A), Visual only (V), and Auditory-visual (AV) modalities. Six CV syllables that were nonmeaningful in the Kannada language were used as the test stimuli. The consonants in the syllables were the unvoiced stop consonants velar /k/, retroflex /ṭ/, and bilabial /p/, and their voiced counterparts /g/, /ḍ/, and /b/. The vowel used was /a/ in all the syllables.
The auditory stimuli were acoustically enhanced using two methods: envelope enhancement and companding. Speech perception in the A and AV modalities was tested using the primary, companded, and envelope enhanced stimuli. The visual stimuli consisted of the video of an adult male speaker articulating the CVs.
Generation of primary stimuli
The test stimuli were audio recorded using a unidirectional microphone (AHUJA AUD-101 XLR) placed approximately 6 cm away from the speaker's mouth. The microphone was connected to a computer running Adobe Audition (version 3) software, where the recording was saved. The syllables were digitized at a sampling frequency of 44,100 Hz with 16-bit resolution.
The six syllables were spoken by five adult males who were native speakers of Kannada. It was ensured that all the speakers had clinically normal speech. The speakers were instructed to produce the CVs clearly at a normal conversational level, avoiding exaggeration in articulation. Each syllable was recorded three times, and out of the three samples, the best audio sample in terms of acoustical properties and perceptual quality was selected. The recorded syllables were normalized (root mean square normalization) using Adobe Audition (version 3) software to minimize differences in energy across the syllables.
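Root mean square (RMS) normalization can be illustrated with a minimal sketch. The study performed this step in Adobe Audition; the function names, the target level of 0.1, and the toy signals below are assumptions made purely for illustration and are not taken from the study.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def rms_normalize(samples, target_rms=0.1):
    """Scale samples so that their RMS equals target_rms (hypothetical target)."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # silent input: nothing to scale
    gain = target_rms / current
    return [s * gain for s in samples]

# Two toy "syllables" with very different energy (made-up sine waves, not study data)
syllable_a = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
syllable_b = [0.05 * math.sin(2 * math.pi * 9 * n / 100) for n in range(100)]

normalized = [rms_normalize(s) for s in (syllable_a, syllable_b)]
```

After this step both toy signals have the same RMS level, which is the sense in which normalization minimizes energy differences across syllables.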
The recorded syllables were played to 10 sophisticated listeners, through Sennheiser headphones (HDA 200). The listeners were instructed to rate the clarity of the recorded syllables on a 3-point scale (unclear, clear, and very clear). Based on their ratings, only the syllables which were rated “very clear” by all the listeners were shortlisted. It was found that all the six syllables spoken by one of the speakers were rated as “very clear” by all the listeners. Therefore, the audio samples of that speaker were used for speech perception testing. These six recorded syllables are operationally termed as “primary stimuli” of this study.
Generation of enhanced auditory stimuli
The primary stimuli were spectrally enhanced through companding, using the procedure described by Turicchia and Sarpeshkar. MATLAB 7 (The MathWorks, Natick, USA) was used for this purpose. During companding (based on instantaneous amplitude), the signals were enhanced by a factor ranging between 0.3 and 1. The resultant syllables are called “Companded syllables.” The primary syllables were also temporally enhanced (envelope enhancement) using a previously recommended procedure. The syllables were enhanced using a compression value of 0.3 and an expansion value of 4. The resultant syllables are called “Envelope enhanced syllables.” [Figure 1] shows the spectra of the representative vowel /a/ and the spectra of the same vowel after companding. [Figure 2] shows the waveform of the representative primary and the envelope enhanced syllable /pa/.
Figure 1: Spectra of the representative vowel /a/ and the spectra of the same vowel after companding
Figure 2: Waveform of the representative primary (a) and the envelope enhanced (b) syllable /pa/
Generation of visual stimuli
The close-up video of an adult male uttering the six target syllables served as the visual stimuli. It was the same individual whose audio samples were chosen as the primary auditory stimuli. The video was recorded by a professional videographer using high definition Sony HXR-MC2500 professional camera (Recording frame rate: × (24Mbps) 1920 × 1080/50i, 25p, 16:9, 1280 × 720/50p, 16:9). The recording was done in an audiometric room with appropriate lighting. A white screen was used as a background.
The video camera was mounted on a tripod at a distance of 3 feet from the speaker. The speaker was instructed to produce the syllables clearly without exaggerating the articulation. He was also instructed to minimize eye blinks and avoid head movements during the recording. The syllables which were articulated unclearly were recorded twice. The recording was edited to improve the picture clarity and to keep the duration of each visual stimulus at 4 s. The initial one second was a steady video (without articulation), and the articulation began at the end of the first second. After the end of articulation of the syllable, the steady video continued till the end of the 4th second. Video pad editor software (version 4.2.2) was used for editing the video, and the video of each syllable was saved separately. The picture sequence showing the production of the syllable /pa/ is shown in [Figure 3].
Generation of audiovisual stimuli
To generate AV stimuli, the auditory stimulus was dubbed on to the corresponding visual stimulus. While dubbing, the two stimuli were time aligned so that the articulation of the speaker starts and ends with the auditory stimulus. This was done using video pad editor software (version 4.2.2). The AV stimuli thus prepared were played to 5 experienced audiologists to judge the synchrony between auditory and visual components of the stimuli. All five audiologists confirmed good synchrony in all the six syllables.
For the enhanced AV stimulus conditions, the acoustically enhanced (companded and envelope enhanced) stimuli were time aligned to the visual stimuli. The experimental setup and the screenshot of the CVs displayed on the liquid crystal display (LCD) screen are represented in [Figure 4] and [Figure 5], respectively.
Each participant was individually tested for speech identification in the different stimulus and test conditions in a double-walled, acoustically treated room, where the ambient noise level was within permissible limits. No practice trial was given to the listeners with any of the stimuli used for the study. The task was closed set identification of CV syllables in the A, V, and AV modalities. In the A and AV modalities, perception of the primary, temporally enhanced, and spectrally enhanced CV syllables was assessed in quiet and 0 dB SNR conditions. In the V modality, perception of syllables was tested for the primary stimuli in quiet only. The procedure is in line with earlier studies, wherein speech perception in the V modality was tested only for primary stimuli in quiet, whereas the A and AV modalities were tested at different SNRs.
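The 0 dB SNR condition means that the speech and the noise have equal RMS levels. The following is a minimal sketch of how a noise signal could be scaled to reach a chosen SNR; the article does not describe the noise type or the mixing procedure, so the function `mix_at_snr`, the Gaussian noise, and the toy signals are assumptions for illustration only.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that 20*log10(rms(speech)/rms(scaled noise)) equals snr_db,
    then add it to the speech. Returns (mixture, scaled_noise)."""
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    scaled = [n * gain for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled

rng = random.Random(0)  # fixed seed so the sketch is repeatable
speech = [math.sin(2 * math.pi * 4 * n / 200) for n in range(200)]  # toy "speech"
noise = [rng.gauss(0, 1) for _ in range(200)]                       # assumed Gaussian noise

mixed, scaled_noise = mix_at_snr(speech, noise, snr_db=0)
achieved_snr = 20 * math.log10(rms(speech) / rms(scaled_noise))
```

At `snr_db=0` the gain simply equalizes the two RMS levels, so the achieved SNR is 0 dB up to floating-point error.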
Paradigm software (version 22.214.171.124) was used for stimulus delivery and recording of responses. The audio stimuli were presented through a loudspeaker connected to a GSI AudioStar Pro audiometer, at 45° azimuth and at the most comfortable level. The visual stimuli were presented through a 21-inch Samsung LCD monitor. Each syllable was presented 10 times in a pseudorandom sequence. The selection of the modality of presentation was random. Considering that there were six syllables presented in 13 different stimulus conditions, there were a total of 780 (6 × 10 × 13) stimulus presentations.
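The count of 13 stimulus conditions follows from the design: three stimulus types in two noise conditions for each of the A and AV modalities (12), plus the V modality with primary stimuli in quiet (1). The sketch below enumerates these conditions and builds one possible pseudorandom presentation order. The blocking of trials by condition, the seed, and the ASCII syllable labels are assumptions; the article only states that syllables were presented pseudorandomly and that the modality was selected at random.

```python
import itertools
import random

# ASCII stand-ins for the six CV syllables (retroflex diacritics omitted)
SYLLABLES = ["pa", "ta", "ka", "ba", "da", "ga"]
STIMULI = ["primary", "companded", "envelope_enhanced"]
NOISE = ["quiet", "0dB_SNR"]
REPETITIONS = 10

# A and AV: every stimulus type in both noise conditions; V: primary in quiet only
conditions = list(itertools.product(["A", "AV"], STIMULI, NOISE))
conditions.append(("V", "primary", "quiet"))

def build_trials(seed=0):
    """Return a list of (condition, syllable) trials.
    Assumption: conditions are run as shuffled blocks, with syllables shuffled within each block."""
    rng = random.Random(seed)
    block_order = conditions[:]
    rng.shuffle(block_order)
    trials = []
    for cond in block_order:
        block = SYLLABLES * REPETITIONS
        rng.shuffle(block)
        trials.extend((cond, syl) for syl in block)
    return trials

trials = build_trials()
```

With six syllables, ten repetitions, and 13 conditions, `len(trials)` matches the 780 presentations reported above.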
The task of the participants was to identify the syllable presented. Each correct response was given a score of one, and incorrect response a score of zero. The total number of correct responses of each participant in each stimulus condition was noted down as the raw score. The group data were statistically analyzed to derive the independent and combined effects of acoustic enhancements and visual cue supplementation on speech perception abilities.
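The scoring rule described above (one point per correct identification, zero otherwise) can be sketched as follows. The function name and the example responses are hypothetical and are not study data.

```python
def raw_score(presented, responses):
    """Count correct identifications: 1 for each match, 0 for each mismatch."""
    if len(presented) != len(responses):
        raise ValueError("presented and responses must have the same length")
    return sum(1 for p, r in zip(presented, responses) if p == r)

# Hypothetical mini-run of four trials (not study data)
presented = ["pa", "ba", "ka", "ga"]
responses = ["pa", "pa", "ka", "ka"]

score = raw_score(presented, responses)        # two of the four responses match
percent_correct = 100 * score / len(presented)
```

In the study each condition contained 60 presentations (six syllables, ten repetitions each), so the raw score per condition would range from 0 to 60.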
Results
The present study aimed to investigate the effect of acoustic enhancements and visual cues on speech perception in individuals with ANSD. To do this, data were tabulated and statistically analyzed using the Statistical Package for the Social Sciences (SPSS) version 21 (IBM Corporation, Armonk, New York). The mean and SD of identification scores in all test conditions are given in [Table 1]. The total identification scores of each participant in each modality, in the quiet and 0 dB SNR conditions, are shown in [Figure 6] and [Figure 7], respectively.
Table 1: Mean and standard deviation of syllable identification score obtained for the three types of stimuli (primary, companded and envelope enhanced syllables) in auditory, auditory-visual and visual modalities in quiet and 0 dB signal to noise ratio conditions
Figure 6: The total identification scores of each participant in each modality, in the quiet condition. The scores are depicted with reference to the scores in the A modality
Figure 7: Total identification scores of each participant in each modality, in the 0 dB signal to noise ratio condition (A and AV) and quiet condition for V modality. The scores are depicted with reference to the scores in the A modality
The effect of modality on speech identification scores
In the identification of primary stimuli, the mean score in quiet was highest in the AV modality, followed by the A modality, and lowest in the V modality. At 0 dB SNR, the mean score was higher in the AV than in the A modality, and the mean score in the A modality was lower than that obtained in the V modality in quiet. In both [Figure 6] and [Figure 7], the scores are presented relative to the score in the A modality. In most of the participants, the score in the AV modality was better than that in the A and V modalities. This was true in quiet as well as at 0 dB SNR. In the quiet condition, most participants obtained higher identification scores in the A than in the V modality. However, exceptional cases did exist in which A was better than AV, V was better than AV, or V was better than A. On the other hand, at 0 dB SNR, the individual scores in the A modality were poorer than those in the V modality (obtained in quiet) in many instances.
The effect of acoustic enhancement on speech identification scores
In the quiet condition, the mean syllable identification score for the envelope enhanced stimuli was comparable to that for the primary stimuli, whereas the mean identification score for the companded stimuli was lower than that for the primary stimuli. In the 0 dB SNR condition, the envelope enhanced stimuli had higher mean scores than the primary and companded stimuli. The mean scores were comparable between the primary and companded stimuli. [Figure 8] and [Figure 9] show the total identification scores of each participant for each stimulus type (primary, companded, and envelope enhanced) in the quiet and 0 dB SNR conditions, respectively.
Figure 8: Total identification scores of each participant in each stimulus type in quiet condition. The scores are depicted with reference to the scores in the A primary condition
Figure 9: Total identification scores of each participant in each stimulus type in 0 dB signal to noise ratio condition. The scores are depicted with reference to the scores in the A primary condition
The effect of visual cues on the speech identification scores of acoustically enhanced stimuli
As shown in [Table 1], the mean identification scores increased in the AV compared to the A modality for both the envelope enhanced and companded stimuli. In the quiet condition, the increase in the mean score was greater for the companded stimuli than for the envelope enhanced stimuli, whereas at 0 dB SNR the increase was comparable between the two types of enhancement. The difference in the mean syllable identification scores obtained in the two modalities (A and AV), two conditions (quiet and 0 dB SNR), and three stimuli (primary, companded, and envelope enhanced) was tested using a three-way repeated measures ANOVA (2 × 2 × 3). The results showed significant main effects of condition (F (1, 39) = 159.48, P < 0.01) and modality (F (1, 39) = 105.58, P < 0.01). There was no significant main effect of stimulus (F (2, 78) = 2.48, P > 0.05). There was no three-way interaction among modality, condition, and stimulus. However, there were significant two-way interactions between stimulus and condition (F (2, 78) = 6.24, P < 0.01) and between modality and condition (F (1, 39) = 5.01, P < 0.05). There was no two-way interaction between stimulus and modality (F (2, 78) = 0.97, P > 0.05).
Because there was an interaction between stimulus and condition, the two conditions (quiet and 0 dB SNR) were compared separately for each stimulus type using paired t-tests. The results showed that the scores in the quiet condition were significantly higher than those at 0 dB SNR for the primary (t = 11.82, df = 39, P < 0.05), companded (t = 10.80, df = 39, P < 0.05), and envelope enhanced stimuli (t = 8.02, df = 39, P < 0.05). Similarly, the effect of stimulus was tested separately in each condition using one-way repeated measures ANOVA. The results showed no significant main effect of stimulus in the quiet condition (F (2, 78) = 63.57, P > 0.05), while there was a significant main effect of stimulus at 0 dB SNR (F (2, 78) = 4.44, P < 0.05). Subsequent Bonferroni adjusted multiple comparisons showed a significant difference in identification scores between the companded and envelope enhanced stimuli (P < 0.05). There was no significant difference between the other two pairs of stimulus types (P > 0.05).
Because there was an interaction between modality and condition, the two conditions (quiet and 0 dB SNR) were compared separately in the A and AV modalities using paired t-tests. The results for both the A (t = 11.82, df = 39, P < 0.05) and AV modalities (t = 10.49, df = 39, P < 0.05) showed a significant difference between the two conditions. Similarly, the identification scores were compared across the modalities, separately in each condition (quiet and 0 dB SNR). In the quiet condition, the identification scores in the A, AV, and V modalities were compared using one-way repeated measures ANOVA. The results showed a significant main effect of modality (F (2, 78) = 63.71, P < 0.05). Bonferroni adjusted multiple comparisons showed significant differences across all three modalities. At 0 dB SNR, the identification scores in the A and AV modalities were compared using a paired t-test. The results showed a significant difference between the two modalities (t = −9.60, df = 39, P < 0.05).
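For readers unfamiliar with the paired t-test used in these follow-up comparisons, the statistic is the mean of the within-participant differences divided by its standard error, with n − 1 degrees of freedom. The sketch below computes it with the Python standard library only. The study itself used SPSS, and the scores shown are invented values for five hypothetical participants, not study data.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two equal-length score lists."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)      # sample SD of the differences
    t = mean_d / (sd_d / math.sqrt(n))  # mean difference over its standard error
    return t, n - 1

# Invented scores for five hypothetical participants (not study data)
quiet = [40, 35, 50, 45, 38]
noise = [30, 28, 41, 33, 30]

t_value, df = paired_t(quiet, noise)
```

With 40 participants the degrees of freedom would be 39, as reported above. The sign of t depends only on the order of subtraction, which is why a comparison entered as A minus AV yields a negative value when AV scores are higher.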
In an earlier study, envelope enhancement was shown to improve speech perception; however, the present study did not show such differences. Therefore, in line with that study, we divided the participants into two groups based on their speech identification scores (SIS) obtained for phonetically balanced (PB) words. The participants in the “Good SIS group” had an SIS of more than 70% (20 individuals), whereas those in the “Poor SIS group” had an SIS of <60% (19 individuals). The mean difference in their speech identification scores was tested using repeated measures ANOVA, taking stimulus (primary, companded, and envelope enhanced) as the within-subject factor and group as the between-subject factor. This was done separately for the quiet and 0 dB SNR conditions. In the quiet condition, there was no significant main effect of stimulus type (F (2, 74) = 1.1, P > 0.05) or group (F (1, 37) = 0.06, P > 0.05), and no interaction between stimulus type and group (F (2, 74) = 1.05, P > 0.05). Similarly, at 0 dB SNR, there was no significant main effect of stimulus type (F (2, 74) = 3.41, P > 0.05) or group (F (1, 37) = 0.57, P > 0.05), and no interaction between stimulus type and group (F (2, 74) = 0.66, P > 0.05).
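The grouping rule can be expressed as a short sketch using the two cut-offs stated above. How scores between 60% and 70% were handled is not stated in the article; the sketch assumes such participants fall into neither group, and the participant labels and scores are hypothetical.

```python
def assign_sis_group(sis_percent):
    """Return 'good' for SIS above 70%, 'poor' for SIS below 60%.
    Assumption: scores from 60% to 70% belong to neither group (returns None)."""
    if sis_percent > 70:
        return "good"
    if sis_percent < 60:
        return "poor"
    return None

# Hypothetical participants and scores (not study data)
scores = {"P01": 85, "P02": 40, "P03": 65, "P04": 72}
groups = {pid: assign_sis_group(s) for pid, s in scores.items()}
```

Under this assumption the two groups need not add up to the full sample, which would be consistent with the 39 participants implied by the reported degrees of freedom (1, 37).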
The effect of combination of the two strategies (visual cue supplementation and acoustic enhancement) on speech perception was tested by comparing the identification scores in the three AV stimulus types (AV primary, AV companded, and AV envelope enhanced). The results of one-way repeated measures ANOVA showed that there was no significant main effect of stimulus type (F (2, 78) = 1.77, P > 0.05) on speech identification scores. [Figure 10] shows the identification scores of each participant in AV modality for the three stimulus types.
Figure 10: Identification scores of each participant in AV modality for the three stimuli types. The scores are depicted with reference to the scores in the AV primary condition
Discussion
The objective of the study was to document the relative contribution of visual cues and acoustic enhancement in improving speech perception in individuals with ANSD. Both these strategies have been investigated in separate studies, and have been found to be beneficial in improving speech perception in ANSD. However, their relative contribution to speech perception in ANSD cannot be derived from the literature.
The results of the present study showed that visual cues enhance speech perception significantly in individuals with ANSD. This is in partial agreement with a previous study, in which the scores in the auditory-visual modality did not differ from those in the visual alone modality. Based on this, it was inferred that individuals with ANSD primarily depend on visual cues, with an insignificant role for auditory cues. However, in the present study, we found that the scores in the AV modality were significantly higher than those in the V modality. This suggests that individuals with ANSD make use of auditory as well as visual cues for their speech perception. The individual scores showed that speech identification in the AV modality was higher than both the auditory and the visual modality scores in most of the individuals. This indicates that audiologists can recommend visual cue supplementation as a strategy to facilitate speech perception in individuals with ANSD. It also indicates that both the auditory and visual modalities need to be facilitated for a comprehensive management of ANSD.
The current study used nonmeaningful monosyllables with stop consonants to assess speech identification. Monosyllables have the least redundancy, and stop consonants, in particular, are the most challenging for individuals with ANSD to perceive. Therefore, one can expect greater benefits from visual cues when words or sentences are used. This, however, needs to be investigated. It is also important to note that none of the participants of the current study were systematically trained in speech reading. If trained, they may be able to derive greater benefits from the AV modality for speech perception.
Speech identification was poorer in the presence of noise than in quiet, both in the A and AV modalities. This reduction is primarily due to the participants' inability to extract the envelope and fine structure cues from speech in the auditory modality. The reduction in speech perception was seen in all the participants.
Despite the reduction in speech perception in the presence of noise, the benefit derived from the visual cues was retained. In fact, the mean difference showed that the benefit derived from visual cues was greater in the presence of noise. This is in agreement with earlier studies in individuals with hearing impairment. Participants' speech perception in the AV modality was significantly better than that in the visual modality even in the presence of noise. This indicates that individuals with ANSD use auditory cues even in degraded listening environments. This contradicts the report of Ramirez and Mann, although the exact reason for the difference between the findings of the two studies is not known. The finding of the present study is derived from the data of 40 individuals with ANSD, while Ramirez and Mann reported their finding from four individuals with ANSD. The difference in the range of speech identification scores across subjects and the difference in the scoring pattern may have contributed to the difference in the results of the two studies.
It was also observed that the identification scores of most of the participants in the A modality at 0 dB SNR were poorer than those in the V modality in quiet. In the present study, speech identification in the V modality was tested only in quiet, in line with earlier studies. These studies had shown similar speech identification across different SNRs in the visual alone modality. In view of this, it can be inferred that visual processing would be more useful in degraded listening environments. Furthermore, speech identification in the AV modality must be contributed primarily by the visual cues.
Effect of acoustic enhancements
In the present study, two types of acoustic enhancements were used: companding and envelope enhancement. Both these enhancements have been shown to benefit individuals with ANSD in their speech perception. While companding compensates for their poor spectral resolution, envelope enhancement is meant to address their deficit in temporal processing. Contrary to the previous studies, neither of the two acoustic enhancements showed significant benefits in speech perception in the present study. Narne and Vanaja had shown benefits of envelope enhancement only in individuals with good speech identification scores at 0 dB SNR. However, the present study showed that the benefits were negligible in both the good and poor speech identification groups. The procedures used in the present study for companding as well as envelope enhancement were exactly the same as those of the earlier studies. Narne et al. had found benefits of companding only in quiet and not at 0 dB SNR in individuals with ANSD. However, in the present study, benefits of companding were absent both in quiet and at 0 dB SNR. The absence of benefits of acoustic enhancement may be primarily due to the test stimuli used in the present study. The present study used only stop consonants, whereas the previous studies had included other classes of consonants as well. Considering that individuals with ANSD have more difficulty with the perception of transient sounds, the perception of stop consonants would be a challenge, and this may have led to the absence of appreciable benefits with acoustic enhancements. However, this is only speculation and needs to be systematically investigated. Furthermore, in the study by Narne and Vanaja, words were used as test stimuli, which possess greater redundancy. The use of monosyllables in the present study may have hindered the benefits derived from acoustic enhancements.
It is proposed that future studies be undertaken to investigate the effect of different types of stimuli on the benefit derived from acoustic enhancement in ANSD.
The role of acoustic enhancements in the perception of consonants should not be totally ruled out based on the present findings. The participants were not exposed to companded and envelope-enhanced speech before the testing in this study. Listening to the acoustically enhanced speech was a new experience for them. Therefore, it is suggested that future studies train individuals with ANSD to listen to acoustically enhanced speech and only then draw conclusions about the benefits derived from it.
In the present study, we were also interested in investigating the combined effect of visual cue supplementation and acoustic enhancement on speech perception in individuals with ANSD. It was found that there was no benefit of combining the two strategies and that the benefit derived from the combined input was due only to visual cues. That is, there was no integration benefit when both strategies were delivered in unison to the individuals with ANSD.
Overall, the results suggest that individuals with ANSD benefit from the AV modality of speech perception and that the benefit from acoustic enhancement is negligible. Therefore, audiologists should attempt to facilitate the visual modality along with the auditory modality in these individuals. Possible approaches for facilitating the visual mode are guiding them about anticipatory compensatory strategies and training them in speech reading. There was a significant deterioration in speech perception in the presence of noise in both the auditory and auditory-visual modalities. This suggests that the auditory cues are also of importance to these individuals. Therefore, maintaining a good SNR of the input speech is necessary, and steps should be taken to ensure this during the management of ANSD. The findings of the present study on the lack of benefit of acoustic enhancements should be restricted to the perception of stop consonants.
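For readers less familiar with the SNR convention used above, the following is a minimal sketch of what a 0 dB SNR condition means, assuming the common RMS-based definition SNR = 20 log10(RMS of speech / RMS of noise). The function names and the synthetic sine-wave "speech" and "noise" are purely illustrative assumptions and do not reproduce the stimulus preparation of the present study.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(speech, noise):
    """SNR in dB under the RMS-based definition (an assumption of this sketch)."""
    return 20 * math.log10(rms(speech) / rms(noise))


def scale_noise_for_snr(speech, noise, target_snr_db):
    """Return a scaled copy of `noise` so that snr_db(speech, result) equals the target."""
    target_noise_rms = rms(speech) / (10 ** (target_snr_db / 20))
    gain = target_noise_rms / rms(noise)
    return [gain * x for x in noise]


# Illustrative stand-ins for a speech token and a masker (not real stimuli).
speech = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.1 * math.sin(2 * math.pi * 13 * n / 100 + 1.0) for n in range(100)]

# At 0 dB SNR the masker is scaled to the same RMS level as the speech.
scaled = scale_noise_for_snr(speech, noise, 0.0)
```

Under this definition, a 0 dB SNR condition presents speech and noise at equal RMS levels, which helps explain why identification scores dropped relative to quiet in both the A and AV modalities.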
Conclusions
Of visual cues and acoustic enhancements, individuals with ANSD were found to benefit primarily from visual cues. The benefits of visual cues are present in both quiet and noisy situations. However, individuals with ANSD are not solely dependent on visual cues. The findings suggest that audiologists should have an action plan to facilitate speech reading during the management of individuals with ANSD. Although the current study showed negligible benefits from acoustic enhancements, this inference should be restricted to the perception of stop consonants in individuals with ANSD.
We wish to thank our Director, All India Institute of Speech and Hearing, for allowing us to conduct the study. We extend our sincere thanks to Dr. Vijaya Kumar Narne, for his timely help.
Declaration of patient consent
The authors certify that they have obtained all appropriate patient consent forms. In the form the patient(s) has/have given his/her/their consent for his/her/their images and other clinical information to be reported in the journal. The patients understand that their names and initials will not be published and due efforts will be made to conceal their identity, but anonymity cannot be guaranteed.
Financial support and sponsorship
None.
Conflicts of interest
There are no conflicts of interest.
References
Starr A, Picton TW, Sininger Y, Hood LJ, Berlin CI. Auditory neuropathy. Brain 1996;119 (Pt 3):741-53.
Zeng FG, Kong YY, Michalewski HJ, Starr A. Perceptual consequences of disrupted auditory nerve activity. J Neurophysiol 2005;93:3050-63.
Shallop JK. The diagnosis and mis-diagnosis of auditory neuropathy/dys-synchrony in pediatric patients. SIG 9 Perspect Hear Hear Disord Child 2002;12:27-30.
Narne VK, Vanaja CS. Perception of speech with envelope enhancement in individuals with auditory neuropathy and simulated loss of temporal modulation processing. Int J Audiol 2009;48:700-7.
Kumar AU, Jayaram M. Auditory processing in individuals with auditory neuropathy. Behav Brain Funct 2005;1:21.
Rance G, Corben LA, Du Bourg E, King A, Delatycki MB. Successful treatment of auditory perceptual disorder in individuals with Friedreich ataxia. Neuroscience 2010;171:552-5.
Berlin CI, Hood LJ, Morlet T, Wilensky D, Li L, Mattingly KR, et al. Multi-site diagnosis and management of 260 patients with auditory neuropathy/dys-synchrony (auditory neuropathy spectrum disorder). Int J Audiol 2010;49:30-43.
Sininger YS. Changing Considerations for Cochlear Implant Candidacy: Age, Hearing Level and Auditory Neuropathy. In: A Sound Foundation Through Early Amplification 2001: Proceedings of an International Conference. Stäfa, Switzerland: Phonak AG; 2002.
Bhattacharya A, Zeng FG. Companding to improve cochlear-implant speech recognition in speech-shaped noise. J Acoust Soc Am 2007;122:1079-89.
Tyler RS, Fernandes M, Wood EJ. Masking, temporal integration and speech intelligibility in individuals with noise-induced hearing loss. In: Disorders of Auditory Function. Manchester: Elsevier; 1980. p. 211-36.
Oxenham AJ, Simonson AM, Turicchia L, Sarpeshkar R. Evaluation of companding-based spectral enhancement using simulated cochlear-implant processing. J Acoust Soc Am 2007;121:1709-16.
Narne VK, Barman A, Deepthi M. Effect of companding on speech recognition in quiet and noise for listeners with ANSD. Int J Audiol 2014;53:94-100.
Narne VK, Vanaja CS. Perception of envelope-enhanced speech in the presence of noise by individuals with auditory neuropathy. Ear Hear 2009;30:136-42.
Zeng FG, Oba S, Garde S, Sininger Y, Starr A. Temporal and speech processing deficits in auditory neuropathy. Neuroreport 1999;10:3429-35.
Drullman R, Festen JM, Plomp R. Effect of reducing slow temporal modulations on speech reception. J Acoust Soc Am 1994;95:2670-80.
Narne VK, Vanaja CS. Effect of envelope enhancement on speech perception in individuals with auditory neuropathy. Ear Hear 2008;29:45-53.
Baer T, Moore BC, Gatehouse S. Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: Effects on intelligibility, quality, and response times. J Rehabil Res Dev 1993;30:49-72.
Loizou PC, Dorman M, Fitzke J. The effect of reduced dynamic range on speech understanding: Implications for patients with cochlear implants. Ear Hear 2000;21:25-31.
Bhattacharya A, Vandali A, Zeng FG. Combined spectral and temporal enhancement to improve cochlear-implant speech perception. J Acoust Soc Am 2011;130:2951-60.
Tye-Murray N, Sommers MS, Spehar B. Audiovisual integration and lipreading abilities of older adults with normal and impaired hearing. Ear Hear 2007;28:656-68.
Munhall KG, Kroos C, Jozan G, Vatikiotis-Bateson E. Spatial frequency requirements for audiovisual speech perception. Percept Psychophys 2004;66:574-83.
MacLeod A, Summerfield Q. Quantifying the contribution of vision to speech perception in noise. Br J Audiol 1987;21:131-41.
Anderson E. Audiovisual Speech Perception with Degraded Auditory Cues (Doctoral Dissertation). The Ohio State University; 2006.
Ross LA, Saint-Amour D, Leavitt VM, Javitt DC, Foxe JJ. Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments. Cereb Cortex 2007;17:1147-53.
Grant KW, Walden BE, Seitz PF. Auditory-visual speech recognition by hearing-impaired subjects: Consonant recognition, sentence recognition, and auditory-visual integration. J Acoust Soc Am 1998;103:2677-90.
Ramirez J, Mann V. Using auditory-visual speech to probe the basis of noise-impaired consonant-vowel perception in dyslexia and auditory neuropathy. J Acoust Soc Am 2005;118:1122-33.
Venkatesan VS. Ethical Guidelines for Bio-Behavioral Research Involving Human Subjects. Mysuru: All India Institute of Speech and Hearing; 2009. p. 1-23.
Turicchia L, Sarpeshkar R. A bio-inspired companding strategy for spectral enhancement. IEEE Trans Speech Audio Process 2005;13:243-53.
Apoux F, Tribut N, Debruille X, Lorenzi C. Identification of envelope-expanded sentences in normal-hearing and hearing-impaired listeners. Hear Res 2004;189:13-24.
American National Standards Institute. Maximum Permissible Ambient Noise Levels for Audiometric Test Rooms. ANSI S3.1-1991. New York: American National Standards Institute; 1991. p. 399-407.
Hassan DM. Perception of temporally modified speech in auditory neuropathy. Int J Audiol 2011;50:41-9.
Narne VK. Temporal processing and speech perception in noise by listeners with auditory neuropathy. PLoS One 2013;8:e55995.
Buss E, Hall JW 3rd, Grose JH. Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss. Ear Hear 2004;25:242-50.
Bernstein LE, Auer ET, Takayanagi S. Auditory speech detection in noise enhanced by lipreading. Speech Commun 2004;44:5-18.
Erber NP. Interaction of audition and vision in the recognition of oral speech stimuli. J Speech Hear Res 1969;12:423-5.
Grant KW, Seitz PF. The use of visible speech cues for improving auditory detection of spoken sentences. J Acoust Soc Am 2000;108:1197-208.
Sumby WH, Pollack I. Visual contribution to speech intelligibility in noise. J Acoust Soc Am 1954;26:212-5.