Engineering and Music
"Human Supervision and Control in Engineering and Music"


Dr.-Ing. Thomas Jürgensohn

Is Gestural Control in Automotive Engineering Comparable to Gestural Control of Music?


Novel driver assistance systems such as navigation systems increasingly make use of multimedia display techniques. Using several modalities, e.g. speech and gestures, promises more efficient operation because independent resources are addressed. For example, hand gestures can be used to control the menu of an in-vehicle information system. In contrast to the operation of a musical instrument, such gestural commands are discrete by nature and restricted to a small number of commands (iconic gestures). However, if gestural control is understood in a broader sense, as in Wanderley's (2001) definition of gestural control in music, computer-aided steering (steer by wire) can be interpreted as a kind of gestural control. Interestingly, designers of steer-by-wire systems encounter problems quite similar to those in gestural control of music.

What is Gestural Control?

There is no single universally accepted definition of what a gesture actually is. McNeill (1995) defines gestures as “movements of the arms and hands which are closely synchronized with the flow of speech”. This explicitly excludes movements of the rest of the body and gestures made without speech. In colloquial language, gestures are understood as movements of any part of the body that are closely connected to an emotional state or to information to be expressed. In general, these movements are free in space and not bound by mechanical constraints (as they are, for instance, when playing a musical instrument). The observer receives the information expressed by gestures by looking at the gesticulating person. In this sense, gestural control is the control of something by means of movements of the body. By this definition, the conductor of an orchestra is a gestural controller. One of the most important properties of such human-to-human gestural control is the non-discrete character of the transferred information. The most famous conductors excel in their ability to translate their inspiration and interpretation into gestures and, through these, into the behaviour of the musicians. This is much more than just setting the tempo.

The non-discrete character of the information flow is also one of the most important points in gestural control of music. In contrast to the above definitions of gestures, here every action of an instrumentalist is regarded as a gesture. The essential difference from “normal” music performance is the separation of the gesture controller unit from the sound generation unit (Wanderley, 2001). Gestural control can thus be defined as the control of something by means of any possible body movement that is interpreted by a non-mechanical transformation unit. The characteristic feature of gestural control in music is the non-mechanical transformation of mechanically expressed information, with auditory feedback.

Gesture Control in Cars

The main task of a car user is driving. This demands visual attention that is as continuous as possible and requires hands and feet for the control of the car. At the same time, secondary tasks such as operating the radio or the climate control have to be carried out. Usually, these are operated by hand and with short control glances. This can lead to conflicts, as the capacity of each resource is limited. With information and driver assistance systems becoming ever more complex, non-mechanical control of certain vehicle functions using speech or gestures has therefore been under consideration for years (Westphal and Waibel, 1999; Akyol et al., 2000). Fig. 1 shows an example of such a system. An infrared-sensitive CCD camera integrated in the ceiling trim of the vehicle records the gesture space. The image data are analysed and classified by means of digital image processing.


Figure 1: Gesture recognition for control of an information system in a car (from Akyol et al., 2001)

Gestures could also be used to control the playback of videos or animations or to navigate through the menu hierarchy of the information system. For this purpose, the gestures and corresponding functions shown in Fig. 2 can be used.



Figure 2: Gestures to control a multimodal demonstrator (from Akyol et al., 2001)

As in the example of the conductor above, gestural control here means postures of the hands in space which are received visually. In contrast to the emotionally driven gestures of the conductor, however, the information here is discrete: every hand posture corresponds to a specific command.
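The discrete character of this kind of control can be illustrated with a minimal Python sketch. It assumes that the image classifier delivers a posture label together with a confidence value. The posture labels, the command set, and the threshold are hypothetical and are not taken from Fig. 2 or from Akyol et al.

```python
from enum import Enum
from typing import Optional


class Command(Enum):
    """Hypothetical menu commands (the actual set in Fig. 2 is not reproduced here)."""
    PLAY = "play"
    STOP = "stop"
    NEXT_ITEM = "next_item"
    PREVIOUS_ITEM = "previous_item"


# Assumed posture labels as they might be produced by the image classifier.
POSTURE_TO_COMMAND = {
    "fist": Command.PLAY,
    "open_hand": Command.STOP,
    "point_right": Command.NEXT_ITEM,
    "point_left": Command.PREVIOUS_ITEM,
}


def interpret_posture(label: str, confidence: float,
                      threshold: float = 0.8) -> Optional[Command]:
    """Return the command for a classified posture, or None if it is ignored.

    A posture is ignored when the classifier is not confident enough or
    when the label does not belong to the known command vocabulary.
    """
    if confidence < threshold:
        return None
    return POSTURE_TO_COMMAND.get(label)
```

Every recognised posture yields exactly one command or nothing at all; there is no intermediate value, which is what distinguishes this control from the continuous gestures of a conductor.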

Steering by Wire: A Kind of Gestural Control?

If we apply the definition of gestural control in music to the driving task, steering with modern steering systems in which the steering wheel is mechanically decoupled from the wheels (steer by wire) can be regarded as gestural control of the lateral movement of the car. As in playing a digital musical instrument, there is a physical interaction of the “player” with a “gestural interface” and, separately, a “movement generation unit”. Likewise, some variables of the gestural input are mapped multi-dimensionally and continuously to some variables of the output (i.e. movements of the car's steering system). In playing a digital musical instrument the controlled medium is sound; in handling a steer-by-wire system it is the movement of the car on the road. In both cases there is an infinite number of ways to define the mapping, and the designers are spoilt for choice because of the virtually unlimited potential of computer-based mapping functions.
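To make the idea of a freely definable mapping concrete, the following Python sketch shows one conceivable mapping from the steering wheel (the “gestural interface”) to the commanded road-wheel angle (the input of the “movement generation unit”). The speed-dependent steering ratio, the class name, and all parameter values are assumptions chosen for illustration; they do not describe any particular steer-by-wire system.

```python
from dataclasses import dataclass


@dataclass
class SteeringMapping:
    """Hypothetical speed-dependent mapping from steering wheel to road wheels."""
    ratio_low_speed: float = 10.0   # assumed steering ratio at standstill
    ratio_high_speed: float = 20.0  # assumed steering ratio at and above v_ref
    v_ref: float = 40.0             # assumed reference speed in m/s

    def ratio(self, speed: float) -> float:
        """Interpolate the steering ratio linearly between the two settings."""
        blend = min(max(speed / self.v_ref, 0.0), 1.0)
        return self.ratio_low_speed + blend * (self.ratio_high_speed - self.ratio_low_speed)

    def road_wheel_angle(self, steering_wheel_angle: float, speed: float) -> float:
        """Map the driver's input (rad) to a commanded road-wheel angle (rad)."""
        return steering_wheel_angle / self.ratio(speed)


# Usage: the same hand movement produces a smaller wheel angle at higher speed.
mapping = SteeringMapping()
angle_slow = mapping.road_wheel_angle(1.0, speed=0.0)
angle_fast = mapping.road_wheel_angle(1.0, speed=40.0)
```

Because the mapping is pure software, replacing the linear interpolation by any other function changes the “feel” of the vehicle without touching the mechanics, just as a different mapping changes the response of a digital instrument.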

Wanderley (2001) describes how the variables breath, lip pressure, and fingering in playing an electronic woodwind instrument can be mapped with different strategies to variables of the sound (dynamics, loudness, vibrato, and fundamental frequency). The problem of mapping the variables steering wheel angle, steering force, and steering velocity to variables of the vehicle dynamics (e.g. lateral acceleration, side slip angle, yaw rate) is very similar. So far, it is largely unknown what the “optimal” mapping strategy is. It could be that every driver needs his or her own mapping function, since drivers are known to differ enormously. From a technical point of view, such individual mapping strategies are possible; the problem is to define the optimisation criteria. Transferring the methods for designing steer-by-wire systems to the design of digital musical instruments, and vice versa, could be an interesting new research area.
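What “different mapping strategies” can mean is sketched below in Python for two of the woodwind inputs. The two strategies, the function names, and the weights are generic illustrations and are not the specific strategies described by Wanderley (2001). All inputs are assumed to be normalised to the range 0 to 1.

```python
def one_to_one(breath: float, lip_pressure: float) -> dict:
    """Each input variable drives exactly one sound variable."""
    return {"loudness": breath, "vibrato": lip_pressure}


def convergent(breath: float, lip_pressure: float) -> dict:
    """Loudness depends on both inputs; the weights are arbitrary example values."""
    loudness = 0.7 * breath + 0.3 * lip_pressure
    return {"loudness": loudness, "vibrato": lip_pressure}


# Usage: the same gesture yields a different sound under each strategy.
gesture = {"breath": 1.0, "lip_pressure": 0.0}
sound_a = one_to_one(**gesture)
sound_b = convergent(**gesture)
```

The same structure carries over to steering: the inputs would be steering wheel angle, steering force, and steering velocity, and the weights would become the driver-specific parameters for which optimisation criteria are still missing.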


References

Akyol, S.; Libuda, L.; Kraiss, K.-F. (2001). Multimodale Benutzung adaptiver Kfz-Bordsysteme. In Jürgensohn, T.; Timpe, K.-P. (Eds.), Kraftfahrzeugführung (pp. 137–154). Berlin: Springer Verlag.

Akyol, S.; Canzler, U.; Bengler, K.; Hahn, W. (2000). Gesture Control for use in Automobiles. Proceedings of the IAPR MVA 2000 Workshop on Machine Vision Applications, pp. 349–352.

McNeill, D. (1995). Hand and Mind: What Gestures Reveal About Thought (2nd edition). Chicago: University of Chicago Press.

Wanderley, M. M. (2001). Gestural Control of Music. In: Human Supervision and Control in Engineering and Music, Kassel, September 2001.

Westphal, M.; Waibel, A. (1999). Towards Spontaneous Speech Recognition for on-board Car Navigation and Information Systems. Proceedings of EUROSPEECH 1999, Vol. 5, pp. 1955–1958.