Speech is a particularly relevant input and output modality in the automotive context, since it is "hands-free, eyes-free". Numerous studies show that speech input (and, on the output side, the broader class of audio) is the least distracting of the common modalities with respect to the driving task, and it should therefore be considered the default in many situations.
DFKI has longstanding experience in designing both command-based and natural-language dialogue systems. The latter have recently gained broad public awareness through smartphones, leading to requests for similar interfaces in the vehicle. Through our collaboration with Nuance, IBM, and other speech technology partners across different projects, we can offer leading speech engines (speech recognition and speech synthesis) combined with advanced dialogue system technology.
Aside from the actual dialogue, we have often successfully employed paralinguistic information to personalize dialogue applications. Speech contains cues about gender, age, emotion, stress, etc. that can be approximately estimated using pattern recognition methods.
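As an illustrative sketch of such pattern recognition, the toy classifier below estimates one paralinguistic attribute (speaker gender) from two hypothetical acoustic features via nearest-centroid matching. The feature values and labels are invented for illustration; real systems use much richer feature sets (e.g. MFCCs, prosodic contours) and trained statistical models.

```python
# Toy nearest-centroid classifier for a paralinguistic attribute.
# Features and training values are hypothetical placeholders.
from math import dist

# Hypothetical labelled samples: (mean pitch in Hz, normalized energy).
samples = {
    "male":   [(110.0, 0.62), (125.0, 0.58), (118.0, 0.60)],
    "female": [(210.0, 0.55), (225.0, 0.53), (198.0, 0.57)],
}

def centroid(points):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

centroids = {label: centroid(pts) for label, pts in samples.items()}

def classify(features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))

print(classify((120.0, 0.61)))  # near the "male" centroid
```

The same nearest-centroid idea extends to other attributes (age group, emotion) by swapping in the appropriate labelled feature data.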
Areas of further challenge include:
- Use of statistical language models for modeling natural-language interaction
- Situation-adaptive speech output, e.g. with respect to choice of words and voice parameters
- Cross-modal correction of speech recognition errors
- Context-sensitive treatment of speech recognition errors depending on the type of content
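To make the first challenge above concrete, the sketch below builds a minimal statistical (bigram) language model and uses it to score candidate phrases, as one might when reranking speech recognition hypotheses. The corpus and phrases are invented for illustration; production systems use far larger corpora and more sophisticated smoothing.

```python
# Minimal add-alpha smoothed bigram language model over a toy command corpus.
from collections import Counter

corpus = [
    "navigate to the nearest gas station",
    "navigate to the office",
    "call the office",
    "play the next song",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def score(sentence, alpha=1.0):
    """Smoothed bigram probability of a sentence under the toy model."""
    tokens = ["<s>"] + sentence.split()
    vocab = len(unigrams)
    p = 1.0
    for a, b in zip(tokens, tokens[1:]):
        p *= (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab)
    return p

# A word order seen in the corpus outscores an implausible permutation,
# which is the basis for reranking competing recognition hypotheses.
print(score("navigate to the office") > score("office the to navigate"))
```

In a dialogue system, such scores would typically be combined with the recognizer's acoustic scores to pick the most plausible hypothesis.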