Engineering and Music
"Human Supervision and Control in Engineering and Music"




Giovanni De Poli
with Sergio Canazza
Carlo Drioli
Antonio Rodà
Alvise Vidolin
Patrick Zanon

Analysis and modeling of expressive intentions in music performance

Abstract
A performer can convey different expressive intentions when playing a piece of music. Perceptual and acoustic analyses of expressive music performances try to uncover the musicians' strategies. Models for rendering different expressive intentions have also been developed, both for analysis purposes and for richer interaction with multimedia products.
 
Musical performance and expressive intentions
Music is an important means of communication in which three actors participate: the composer, the performer and the listener. The composer instills his own emotions, feelings and sensations into his works, and the performer communicates them to the listeners. The performer draws on his own musical experience and culture to derive from the score a performance that conveys the composer's intention.
Different musicians, even when reading the same score, can produce very different performances. The score carries information such as the rhythmic and melodic structure of a piece, but there is not yet a notation able to describe precisely the temporal and timbral characteristics of the sound. The conventional score is quite inadequate to describe the complexity of a musical performance in enough detail for a computer to reproduce it. Whenever only the information of a score (essentially note pitch and duration) is stored in a computer, the performance sounds mechanical and not very pleasant. The performer, in fact, introduces micro-deviations in timing, dynamics and timbre, following a procedure that is correlated with his own experience and common in instrumental practice. Measuring these deviations makes it possible to deduce general performance features and principles.
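The micro-deviations mentioned above can be measured by comparing a performance with the nominal score. A minimal sketch (the note times and velocities below are invented for illustration):

```python
# Sketch: measuring timing and dynamics micro-deviations of a performance
# against a nominal score. Times are in seconds, velocities in MIDI units
# (1-127); all values here are illustrative, not measured data.

def timing_deviations(score_onsets, performed_onsets):
    """Per-note onset deviation: positive means the note was played late."""
    return [p - s for s, p in zip(score_onsets, performed_onsets)]

def dynamics_deviations(score_velocity, performed_velocities):
    """Per-note deviation from a constant nominal dynamic level."""
    return [v - score_velocity for v in performed_velocities]

# A mechanical rendering places notes exactly on the metric grid ...
score_onsets = [0.0, 0.5, 1.0, 1.5]
# ... while a human performance introduces small timing deviations.
performed_onsets = [0.02, 0.48, 1.05, 1.49]

print(timing_deviations(score_onsets, performed_onsets))
print(dynamics_deviations(64, [70, 60, 64]))
```

Averaging such per-note deviations over many performances is what allows general principles to be deduced from the surface data.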
However, no musician plays the same piece in the same way on every occasion. Each performance depends on the performer's emotional state at that particular moment, on his or her implicit dialogue with other musicians, and on subjective artistic choices. Moreover, the same piece can be performed so as to convey different interpretations of the score and different emotions, according to different "expressive intentions", which may even contrast with the usual performance practice for that piece. A textual or musical document, in fact, can assume different meanings and nuances depending on how it is delivered.
A major problem of the analysis-by-measurement method is that a specific deviation on one note could originate from several different principles, so the "true" origin may be impossible to trace. It is difficult to identify a multidimensional structure underlying a surface level merely by analyzing that surface level.
Some musical performance studies have tried to understand how expressiveness is conveyed using the analysis-by-synthesis approach, for which models or rule systems for generating automatic performances were developed. First, an assumption is hypothesized; then it is realized as a synthetic performance; finally it is evaluated by listening. If needed, the hypothesized principle is modified and the process repeated, until a new rule has been formulated. In other words, the method teaches the computer to play more musically. Its success depends entirely on the formulation of hypotheses and on competent listeners.

Given the great variety among performances of a piece, it is difficult to determine a general system of rules for execution. An important step in this direction was made by Sundberg and co-workers at KTH (http://www.speech.kth.se/music/performance). They determined a set of rules which, applied to a generic score, can lead to a musically correct performance [Friberg et al. 1991]. Furthermore, the performer operates on the microstructure of the piece not only to convey the structure of the text written by the composer, but also to communicate his own feeling or expressive intention. Many studies have investigated how well the performer's intentions are perceived by the listener, that is, how far the two share a common code. Gabrielsson (http://www.psyk.uu.se/hemsidor/alf.gabrielsson), in particular, studied the importance of emotions in the musical message. An increasing number of studies is concerned with how the musician's intentions affect the performance. In this case the experimental material is obtained by asking the performer either to provide different performances of his own choice and describe the intentions behind them, or to play with a certain intention in mind (see [Gabrielsson 1999] for a review).
In this context, we began research into how an expressive intention can be communicated to the listener, and we developed a model that explains how the performance of a musical piece can be modified so as to convey a given expressive intention. In the following sections the main aspects of analyzing and modeling expressive intentions are presented with reference to our results.
 

Analysis of expressive intentions
Fig. 1 summarizes the analysis-by-synthesis methodology.

Fig. 1: Analysis-by-synthesis methodology
 

Perceptual analysis
The aim of the perceptual analysis is to see whether the performer's intentions are grasped by the listeners and to determine the judgment categories the listeners use.
We selected a set of scores from Western Classical and Afro-American music. For each score, different performances (correlated with different expressive intentions) were played by professional musicians.
Two different factor analyses were carried out. Factor analysis on the adjectives (see fig. 2) allowed us to determine a semantic space defined by the adjectives proposed to the listeners. By means of factor scores, it was possible to place the performances in this space. Comparing the performance positions with the evaluation adjectives showed that the subjects recognized the performers' intentions well.

Fig. 2: Factor analysis on adjectives. Evaluation adjectives: black, oppressive, serious, dismal, massive, rigid, mellow, tender, sweet, limpid, airy, gentle, effervescent, vaporous, fresh, abrupt, sharp. Performances: Neutral, Light, Bright, Hard, Dark, Heavy, Soft. Note the good recognition of the performers' intentions by the subjects.

The second factor analysis used the performances as variables (see fig. 3). It showed that the subjects had placed the performances along only two axes. The two-dimensional space so obtained (Perceptual Parametric Space, PPS) represents how the subjects arranged the pieces in their own minds. The first factor (bright vs. dark expressive intention) seems closely correlated with the acoustic parameters concerning the kinetics of the music (for instance tempo). The second factor (soft vs. hard expressive intention) is connected with the parameters concerning the energy of the sound (intensity, attack time). Other acoustic parameters (e.g. legato, brightness) are also related to the PPS axes and were deduced from acoustic analyses [Canazza et al. 1998].
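A simplified version of this analysis can be sketched by projecting listener ratings onto their first two principal components (the rating matrix below is invented for illustration; the study used proper factor analysis, approximated here by PCA via the SVD):

```python
import numpy as np

# Sketch: placing performances in a two-dimensional perceptual space.
# Rows = performances, columns = listeners' ratings on adjective scales.
# The matrix is invented for illustration, not measured data.
ratings = np.array([
    [7.0, 6.5, 2.0, 1.5],   # Bright
    [1.5, 2.0, 6.5, 7.0],   # Dark
    [6.0, 2.0, 3.0, 6.0],   # Hard
    [2.0, 6.0, 6.0, 2.5],   # Soft
    [4.0, 4.0, 4.0, 4.0],   # Neutral
])

# Center the data and keep the first two principal directions.
centered = ratings - ratings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pps = centered @ vt[:2].T   # coordinates on the first two factors

for name, (f1, f2) in zip(["Bright", "Dark", "Hard", "Soft", "Neutral"], pps):
    print(f"{name:8s} factor1={f1:+.2f} factor2={f2:+.2f}")
```

With ratings like these, opposed intentions such as Bright and Dark end up on opposite sides of the first axis, mirroring how the PPS separates the performances.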

Fig. 3: Factor analysis using performances as variables. The first factor is correlated with the kinetics of the music; the second with the energy of the sound.
 

Acoustic analysis
Acoustic analysis aims to identify which physical parameters, and how many of them, are modified when the expressive intention of the performer changes.
Every musical instrument has its own expressive resources (vibrato in the strings, tonguing in wind instruments, etc.), which the musician uses to communicate his expressive intention. Inevitably, therefore, the results of any acoustic measurement depend not only on the score, but also on the characteristics of the instrument and on the strategy adopted by the musician. It is consequently necessary to compare data from different scores, musicians and instruments in order to identify which expressive rules can be considered generally valid and which hold only in specific cases.
Several acoustic analyses have been carried out on various musical pieces with different instruments and performers, recorded either in MIDI or in audio format. Typical relations between expressive intentions and the variation of acoustic parameters, relative to a neutral performance (i.e. one without any particular expressive intention), are shown in Tab. 1.
 

                    Hard       Soft       Heavy      Light      Bright     Dark
Tempo               +          -          -          ++         ++         +
Legato                         +          +          -          --
Attack duration     -          +          +          -          -          -
Dynamics            +          -                     -          +
Up-beat/Down-beat   -                                +
Envelope centroid   beginning  center     beginning                        center
Brightness          ++         --         +          -          ++         --
Vibrato             +          ++                    +          +

Tab. 1: Relations between expressive intentions and acoustic parameter variations, relative to a neutral performance, in the Violin Sonata Op. V by Arcangelo Corelli.
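Qualitative relations of this kind can be encoded as data for later use by a rendering model. A minimal sketch, in which the intention/parameter entries are illustrative placeholders rather than a transcription of the measured table:

```python
# Sketch: encoding qualitative deviations from a neutral performance
# ("++" = strong increase, "--" = strong decrease, "" = no change).
# The entries below are illustrative, not the measured values of Tab. 1.
LEVELS = {"--": -2, "-": -1, "": 0, "+": 1, "++": 2}

TAB = {
    "Bright": {"Tempo": "++", "Legato": "--", "Dynamics": "+"},
    "Soft":   {"Tempo": "-",  "Legato": "+",  "Dynamics": "-"},
}

def level(intention, parameter):
    """Signed deviation level of a parameter for an intention (0 = neutral)."""
    return LEVELS[TAB.get(intention, {}).get(parameter, "")]

print(level("Bright", "Tempo"), level("Soft", "Dynamics"))
```

Unlisted combinations default to the neutral level, so partial tables like Tab. 1 (where some cells are empty) are handled naturally.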
 

Modeling expressive intentions
Following the analysis-by-synthesis method, and using the results of the analysis together with experts' experience, models for producing performances with different expressive intentions have been developed. Combinations of the KTH performance rules and of their parameters were used to synthesize interpretations that differ in emotional quality (http://www.speech.kth.se/music/performance/performance_emotion.html).
We developed models that compute the expressive deviations needed by the rendering step to synthesize an expressive performance starting from a neutral one. The rendering step can be done in MIDI or by post-processing a recorded human performance in real time. The system was used to generate performances from different musical repertoires (sound examples at http://www.dei.unipd.it/~musica/Expressive). Although the models were developed mainly for Western classical music, their architecture proved generally valid, even if some tuning of the parameters is needed. Expressive syntheses of pieces belonging to different musical genres (European classical, European ethnic, Afro-American) verified the generality of the rules used in the models [Canazza et al. 1998].
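A minimal sketch of such a rendering step on note-level (MIDI-like) data, assuming simple multiplicative deviations for tempo, articulation and dynamics (the factor values are invented, not the model's actual parameters):

```python
# Sketch: rendering an expressive performance from a neutral one by applying
# note-level deviations. A note is (onset_s, duration_s, velocity); the
# deviation factors below are invented for illustration.

def render(notes, tempo=1.0, legato=1.0, dynamics=1.0):
    """Scale onsets by 1/tempo, note lengths by legato/tempo, and MIDI
    velocities by dynamics (clipped to the valid 1..127 range)."""
    out = []
    for onset, dur, vel in notes:
        out.append((onset / tempo,
                    dur * legato / tempo,
                    max(1, min(127, round(vel * dynamics)))))
    return out

neutral = [(0.0, 0.45, 64), (0.5, 0.45, 64), (1.0, 0.9, 64)]
# A "bright"-like rendering: faster, more detached, slightly louder.
bright = render(neutral, tempo=1.15, legato=0.8, dynamics=1.1)
print(bright)
```

Real-time post-processing of recorded audio follows the same idea, but the deviations are applied by time-stretching and filtering the signal instead of rewriting note events.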
Beyond validating the analysis, these models allow more general practical applications. In multimedia products, textual information is enriched by graphical and audio objects, and a correct combination of these elements is extremely effective for communication between author and user. Usually attention is paid to the visual rather than the audio component, which is merely used as a realistic complement to images or as musical commentary on text and graphics. As interactivity has increased, the visual part has evolved accordingly, while the paradigm for audio has not changed adequately: the user chooses among different audio objects rather than transforming them continuously. Performance research has demonstrated that it is possible to control expressive content at an abstract level, and thus to change the interpretation of a musical piece. A more intensive use of expressive-intention control in multimedia systems will make it possible to adapt music interactively to different situations. Our models permit a gradual transition (morphing) between different expressive intentions, leading to a richer experience of the multimedia product [Canazza et al. 2000].
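The morphing between expressive intentions can be sketched as linear interpolation between two control-parameter profiles (the parameter names and values below are illustrative assumptions, not the published model):

```python
# Sketch: gradual transition (morphing) between two expressive intentions by
# linearly interpolating their control parameters. Values are illustrative.

def morph(profile_a, profile_b, t):
    """Interpolated profile: t=0 gives profile A, t=1 gives profile B."""
    return {k: (1 - t) * profile_a[k] + t * profile_b[k] for k in profile_a}

bright = {"tempo": 1.15, "legato": 0.80, "dynamics": 1.10}
dark   = {"tempo": 0.90, "legato": 1.15, "dynamics": 0.95}

for t in (0.0, 0.5, 1.0):
    print(t, morph(bright, dark, t))
```

Sweeping t continuously while re-rendering the performance yields the smooth bright-to-dark transition, instead of an abrupt switch between two pre-recorded objects.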
 
References
Canazza, S., De Poli, G., Di Sanzo, G., Vidolin, A. (1998). "A model to add expressiveness to automatic musical performance". In Proc. of the 1998 International Computer Music Conference, Ann Arbor, pp. 163-169.

Canazza, S., De Poli, G., Drioli, C., Rodà, A., Vidolin, A. (2000). "Audio morphing different expressive intentions for Multimedia Systems". IEEE Multimedia, July-September, 7(3), pp. 79-83.

De Poli, G., Rodà, A., Vidolin, A. (1998). "Note by note analysis of the influence of expressive intentions and musical structure in violin performance". Journal of New Music Research, 27(3), pp. 293-321.

Friberg, A., Frydén, L., Bodin, L.-G., Sundberg, J. (1991). "Performance Rules for Computer-Controlled Contemporary Keyboard Music". Computer Music Journal, 15(2), pp. 49-55.

Gabrielsson, A. (1999). The performance of music. In D. Deutsch (Ed.), The psychology of music (2nd ed., pp. 501-602). San Diego: Academic Press.