A speech BCI decodes the neural signals associated with attempted or imagined speech (the brain's commands to the tongue, lips, jaw, larynx, and respiratory muscles) and translates them into text or synthesized audio. Speech BCIs represent the frontier of BCI communication, achieving speeds of 62-78 WPM in recent demonstrations, which begin to approach natural conversational rates (roughly 150-160 WPM) and far exceed what cursor-based typing BCIs can achieve.
Neural Basis
Speech production involves coordinated activity across multiple cortical areas:
- Ventral premotor cortex / Broca's area: Speech planning and sequencing
- Primary motor cortex (ventral/lateral): Commands to articulatory muscles (tongue, lips, jaw, larynx)
- Supplementary motor area: Speech initiation and sequencing
- Somatosensory cortex: Sensory feedback from articulatory organs
Speech BCIs typically record from the ventral portion of precentral gyrus (motor cortex face/mouth area) using either intracortical electrodes or high-density ECoG grids.
Key Demonstrations
- Moses et al. (2021): UCSF/Chang lab decoded attempted speech from ECoG in a patient with anarthria (inability to speak due to brainstem stroke). Achieved about 15 WPM with a 50-word vocabulary; the first demonstration of real-time decoding of full words and sentences from the speech motor cortex of a person unable to speak.
- Willett et al. (2023): Stanford/BrainGate decoded attempted speech from intracortical recordings in motor cortex at 62 WPM, with a 9.1% word error rate (WER; see the sketch after this list) on a 50-word vocabulary and 23.8% on a 125,000-word vocabulary, using a language model to convert decoded phonemes to words. The fastest intracortical speech BCI reported at the time.
- Metzger et al. (2023): UCSF decoded attempted speech from ECoG at 78 WPM using neural decoding coupled with language-model postprocessing. Also decoded articulatory movements to animate a digital avatar, including facial expressions for emotional communication.
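Word error rate (WER), the metric quoted above, is the word-level edit distance (substitutions + deletions + insertions) between the decoded and reference sentences, divided by the number of reference words. A minimal, self-contained computation; the example sentence is illustrative, not from any of the studies:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[-1][-1] / len(ref)

print(wer("i want a glass of water", "i want glass of water please"))  # 2/6 ≈ 0.33
```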
Decoding Pipeline
A typical speech BCI pipeline (minimal code sketches for the feature-extraction, phoneme-decoding, and language-model steps follow the list):
- Neural recording: Capture activity from speech motor cortex during attempted speech
- Feature extraction: Extract relevant neural features (spike rates, high-gamma power)
- Phoneme/articulatory decoding: RNN or transformer maps neural features to phonemes or articulatory gestures
- Language model: A language model (n-gram, RNN, or LLM) corrects errors and produces fluent text
- Output: Display text and/or drive a speech synthesizer for audio output
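To make the feature-extraction step concrete, here is a sketch of one widely used ECoG feature: the high-gamma analytic amplitude, binned into decoding windows. The band edges (70-150 Hz), sampling rate, and bin width are illustrative assumptions, not parameters from any specific study:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

FS = 1000      # Hz, assumed ECoG sampling rate
BIN_MS = 20    # assumed feature bin width

def high_gamma_features(ecog: np.ndarray) -> np.ndarray:
    """ecog: (n_channels, n_samples) -> (n_bins, n_channels) binned features."""
    # Bandpass to the high-gamma range, then take the Hilbert envelope.
    b, a = butter(4, [70, 150], btype="bandpass", fs=FS)
    envelope = np.abs(hilbert(filtfilt(b, a, ecog, axis=1), axis=1))
    # Average the envelope within fixed-width bins; transpose to time-major.
    samples_per_bin = FS * BIN_MS // 1000
    n_bins = envelope.shape[1] // samples_per_bin
    trimmed = envelope[:, : n_bins * samples_per_bin]
    return trimmed.reshape(envelope.shape[0], n_bins, samples_per_bin).mean(axis=2).T
```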
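For the phoneme-decoding step, a common pattern is a recurrent network emitting per-bin phoneme probabilities, trained with CTC loss so that neural frames need not be pre-aligned to phoneme labels. Layer sizes, the 40-phoneme inventory, and the random tensors below are assumptions for illustration, not a specific published architecture:

```python
import torch
import torch.nn as nn

N_FEATURES = 256   # e.g., channels x feature types per bin (assumed)
N_PHONEMES = 40    # assumed phoneme inventory; CTC blank added below

class PhonemeDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(N_FEATURES, 512, num_layers=3, batch_first=True)
        self.out = nn.Linear(512, N_PHONEMES + 1)  # +1 for the CTC blank

    def forward(self, x):                    # x: (batch, time, features)
        h, _ = self.gru(x)
        return self.out(h).log_softmax(-1)   # (batch, time, phonemes+1)

model = PhonemeDecoder()
ctc = nn.CTCLoss(blank=N_PHONEMES)
feats = torch.randn(8, 200, N_FEATURES)          # 8 trials, 200 time bins
targets = torch.randint(0, N_PHONEMES, (8, 30))  # phoneme label sequences
loss = ctc(model(feats).transpose(0, 1),         # CTC expects (time, batch, classes)
           targets,
           input_lengths=torch.full((8,), 200),
           target_lengths=torch.full((8,), 30))
loss.backward()
```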
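And for the language-model step, the simplest version is reranking n-best candidate sentences by a weighted sum of decoder and LM scores; production systems typically integrate the LM into beam search instead. The weight alpha and the toy vocabulary-based LM here are hypothetical stand-ins:

```python
def rescore(hypotheses, decoder_scores, lm_log_prob, alpha=0.5):
    """Pick the hypothesis maximizing decoder score + alpha * LM score.

    hypotheses     : list of candidate sentences (str)
    decoder_scores : decoder log-likelihood for each candidate
    lm_log_prob    : callable str -> language-model log-probability
    """
    scored = [(d + alpha * lm_log_prob(h), h)
              for h, d in zip(hypotheses, decoder_scores)]
    return max(scored)[1]

# Toy usage with a stand-in LM that penalizes out-of-vocabulary words.
vocab = {"i", "want", "a", "glass", "of", "water"}
toy_lm = lambda s: -float(sum(w not in vocab for w in s.split()))
best = rescore(["i want a glass of water", "eye wand a glass of water"],
               [-4.1, -3.9], toy_lm)
print(best)  # "i want a glass of water"
```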
Significance
Speech BCIs have the potential to restore real-time conversational communication for people with ALS, locked-in syndrome, and brainstem stroke: conditions that destroy the ability to speak while leaving cognitive function intact. The rapid progress from 15 WPM (2021) to 78 WPM (2023) suggests that natural-rate conversational BCIs are within reach.