Poster A14, Thursday, August 16, 10:15 am – 12:00 pm, Room 2000AB

Articulatory suppression enhances visual discrimination of speech

Matthew Masapollo1, Frank Guenther1,2; 1Department of Speech, Language, & Hearing Sciences, Boston University, 2Department of Biomedical Engineering, Boston University

It is well established that somatosensory signals from the vocal tract play an important role in speech production and motor control (see Perkell, 2012; Guenther, 2016, for reviews). Recent studies have provided evidence that orofacial somatosensory inputs also influence the perception of speech sounds in both adult speakers (Ito et al., 2009; Ito & Ostry, 2012) and pre-babbling infants (Yeung & Werker, 2013; Bruderer et al., 2015). Whereas these and other psychophysical experiments have demonstrated complex somatosensory-auditory interactions during speech processing at a behavioral level, neuroimaging studies indicate that visual speech cues in talking faces influence activity in somatosensory cortex above and beyond its response to auditory speech cues alone (e.g., Skipper, Nusbaum & Small, 2005; Matchin, Groulx & Hickok, 2014). Thus, understanding the contribution of potential somatosensory-visual interactions during speech processing may yield additional key insights into perception-action linkages for speech. Toward this end, we examined whether articulatory suppression affects visual speech perception by measuring viseme discriminability while participants held a bite block or lip tube in their mouths. Visual stimuli were natural productions of vowels from within the same phonemic category (/u/) or from two different phonemic categories (/ɛ-æ/). We chose vowels whose corresponding visemes are optically distinct. For the within-category contrast, the variants of /u/ were produced with different degrees of visible lip compression and protrusion. For the between-category contrast, /æ/ was produced with a lower mandibular position than /ɛ/. Multiple tokens of each vowel, produced by a female model speaker, were video recorded. To create the visual-only discrimination pairings, the audio track was removed from the AV recordings of the model speaker’s productions.
Thirty-eight monolingual English speakers were tested using a traditional same-different (AX) discrimination task. For the baseline group, the AX task was conducted with no oral-motor manipulation. For the experimental group, the AX task was conducted with either a tube between the lips or a bite block between the upper and lower teeth. Subjects were assigned randomly to one of the two conditions. On each trial, subjects watched silent video sequences of the model speaker articulating the vocalic gestures, and then judged whether they were the same or different. We employed a signal detection theory analysis to assess perceptual sensitivity; the dependent measure was A-prime (A′). Results showed that when proprioceptive information from subjects’ own vocal tracts was constrained, their overall discrimination performance was better than baseline, a difference that approached conventional significance (p = .058). There was also a trend such that discrimination of the within-category (/u/) pairings was highest for the subjects with a tube inserted between their lips, whereas discrimination of the between-category pairings (/ɛ-æ/) was highest for the subjects with the bite block between their teeth. The study lacked sufficient statistical power to compare performance across the two perturbation types, so further research is needed to assess whether this pattern interacts with vowel contrast. Nevertheless, these findings raise the intriguing possibility that suppression of an articulator (lips or jaw) may heighten attention to visible movements of that articulator during concurrent speech perception.
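
For readers unfamiliar with the sensitivity measure, the following is a minimal sketch of how A′ could be computed for one subject in an AX task. The abstract does not state which formula was used; this sketch assumes the common nonparametric form (Pollack & Norman, 1964; Grier, 1971). The function names and the trial counts in the usage note are hypothetical and are not data from the study. A "hit" is taken to be a "different" response on a different-pair trial, and a "false alarm" a "different" response on a same-pair trial.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Nonparametric sensitivity index A' (assumed Grier, 1971 form).

    Returns 0.5 at chance and 1.0 for perfect discrimination.
    """
    for rate in (hit_rate, fa_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie in [0, 1]")

    h, f = hit_rate, fa_rate
    if h == f:
        # Chance performance; also avoids 0/0 when both rates are 0 or 1.
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1.0 + h - f)) / (4.0 * h * (1.0 - f))
    # Below-chance case (more false alarms than hits): mirrored formula.
    return 0.5 - ((f - h) * (1.0 + f - h)) / (4.0 * f * (1.0 - h))


def a_prime_from_counts(hits: int, n_different: int,
                        false_alarms: int, n_same: int) -> float:
    """Convenience wrapper: convert raw AX trial counts to rates first."""
    if n_different <= 0 or n_same <= 0:
        raise ValueError("trial counts must be positive")
    return a_prime(hits / n_different, false_alarms / n_same)
```

As a usage illustration with made-up counts, `a_prime_from_counts(16, 20, 4, 20)` corresponds to a hit rate of 0.8 and a false-alarm rate of 0.2. Unlike d′, A′ stays defined when a subject produces no false alarms or no misses, which is one reason it is often chosen for small numbers of trials.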

Topic Area: Perception: Speech Perception and Audiovisual Integration