Models for sound in human-computer and human-environment interaction

National project funded by MIUR (Cofin2000)


 

Table of contents

Project consortium
Project overview
Project phases
Results and activities
Publications

 

Project consortium

Department of Information Engineering, University of Padova (Giovanni De Poli, national coordinator): algorithms for sound synthesis and processing.
DIST - Lab. of Musical Informatics, University of Genova (Antonio Camurri): gesture-based control models.
Department of Philosophy and Social Science, University of Udine (Bruno Vicario): psychoacoustic experiments.

 

Up to Table of Contents

 

Project overview

In human-environment interaction, sounds are semantically rich: they can be associated directly with well-defined physical phenomena, and they convey many information flows that can be distributed in space and processed in parallel over time. For these and other reasons, sound will play a central role in future multimedia systems as a vehicle of complex information, often directly related to visual information so that the two channels integrate their communicational properties. A potential obstacle to the use of non-verbal sound in human-machine communication is the absence of a consolidated phenomenology of sound phenomena from which sound generation and control models could be developed. Such a phenomenology is scarce even for sounds produced by elementary physical events, such as collisions or friction between objects.
Having outlined these needs, the project focuses on the physical parameters of sound production mechanisms and on their perceptual relevance. The first task is to construct timbre spaces based not on properties of the audio signal (e.g., brightness, nasality) but on the physical properties of the sounding objects (e.g., force, dimensions, shape, type of interaction). Psychophysical scales must be constructed through which these physical quantities are represented. Together with the perceptual studies, the project develops models for sound generation characterised by compactness (few parameters), ease of parameterisation (the parameters have to be controllable in a direct and intuitive way), versatility (in order to minimise the number of models), and consistency of behaviour under different conditions. A third level of investigation concerns the construction of gesture-based models for the control of sound synthesis.
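As an illustration of these requirements, the following minimal sketch (in Python; all names and parameter values are hypothetical and not taken from the project's actual models) shows how even a single damped mode can expose compact, physically intuitive controls:

    import numpy as np

    def impact_sound(freq_hz=800.0, decay_s=0.3, strike_velocity=1.0,
                     sr=44100, dur_s=1.0):
        """One damped mode of a struck resonator (illustrative toy model).

        freq_hz         ~ object size/shape (smaller object -> higher pitch)
        decay_s         ~ material damping (metal rings long, wood decays fast)
        strike_velocity ~ impact strength (scales amplitude)
        """
        t = np.arange(int(sr * dur_s)) / sr
        return strike_velocity * np.exp(-t / decay_s) * np.sin(2 * np.pi * freq_hz * t)

    # A "wooden" vs. a "metallic" cartoon impact at the same pitch:
    wood  = impact_sound(freq_hz=800, decay_s=0.05)
    metal = impact_sound(freq_hz=800, decay_s=1.5)

Here each parameter maps one-to-one onto a property of the sounding object, which is the kind of direct, intuitive parameterisation the project aims at.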

 

Up to Table of Contents

 

Project phases

  1. Co-ordination between the Units. Planning of the experiments for the study of sound phenomenology and perception, for the analysis of gesture expressiveness and dynamics, for the analysis and synthesis of the interaction and control models. Definition of the experimental settings, definition of the sound phenomena to be investigated. Definition of the architectures for the sound generation models and the computational structures used in their parametric control.
  2. Phenomenological and perceptual aspects of sound: mechanisms and physical parameters of sound production. Analysis of simple physical systems, i.e. systems characterised by elementary geometrical shapes and involved in elementary interactions (collisions, friction, and so on). Perceptual experiments, investigating how physical attributes of objects are perceived through the sounds they produce (e.g., the collision of an object with a wall produces a sound which can be short or long, soft or loud, and so on; from this sound the human observer extracts various pieces of information: the wall material, the mass of the object, etc.). Analysis by means of multidimensional scaling techniques, in order to extract the physical dimensions that are perceptually most significant in the discrimination of audio events (a minimal scaling sketch is given after this list). Construction of psychophysical scales for the dimensions derived from the analysis.
  3. Models for sound generation and control. Physically-based algorithms for sound generation, which can be adapted to different typologies of objects. Results of phase 2 exploited for model optimisation. Integration of the sound generation algorithms within systems for human-computer interaction and for gesture acquisition and analysis. Definition of models for gesture-based control, with results from phase 2 exploited to obtain perceptually motivated control. Continuation of the perceptual experiments and phenomenological analysis, including aspects related to the simultaneous perception of visual and acoustic events, in order to understand and isolate the parameters that are critical for maintaining the identity of audio-visual stimuli and for the recognition of causes and effects.
  4. Perceptual validation of the generation and control models. Analysis of the models' effectiveness in conveying correct information on gesture typology and dynamics and on the physical parameters involved in the experiments (shape, weight, material, etc.). Optimisation of the sound generation algorithms with regard to control and computational efficiency. Sound-gesture integration, and development of applications for the interaction systems. Prototype systems will be planned for installations in museums and theatres, in which the channel of expressive non-verbal communication is emphasised. Development of experimental (usability) tests to be used on such installations.
  5. Experimental analysis of the improvements deriving from the integration of the models in multimodal applications. Evaluation and usability tests of the installations developed in phase 4 in real scenarios (museums, theatres, science centres). Organisation of an international workshop with researchers and representatives from the digital musical instrument and multimedia communication systems industries.
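As mentioned in phase 2, multidimensional scaling recovers a low-dimensional perceptual space from pairwise dissimilarity judgements. A minimal sketch of such an analysis (assuming scikit-learn; the rating data are purely illustrative):

    import numpy as np
    from sklearn.manifold import MDS

    # Hypothetical pairwise dissimilarity ratings for four impact sounds
    # (symmetric, zero diagonal), averaged over listeners.
    dissim = np.array([[0.0, 0.3, 0.8, 0.9],
                       [0.3, 0.0, 0.7, 0.8],
                       [0.8, 0.7, 0.0, 0.2],
                       [0.9, 0.8, 0.2, 0.0]])

    mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
    coords = mds.fit_transform(dissim)

    # Each row of `coords` places one stimulus in a 2-D perceptual space;
    # the axes are then interpreted against the physical parameters of the
    # stimuli (e.g., mass, material) to find the salient dimensions.
    print(coords)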

 

Up to Table of Contents

 

Results and activities

Perception and phenomenology

Listening tests have been conducted with experimental subjects, using both recorded and synthesised stimuli. Statistical and acoustical analyses have been conducted in order to identify the acoustic cues that convey the perception of a given acoustic event.
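One standard cue of this kind is the spectral centroid, commonly associated with perceived brightness. A minimal sketch of how such a cue can be computed (illustrative only; the cues actually analysed in the project are not detailed here):

    import numpy as np

    def spectral_centroid(signal, sr):
        """Brightness cue: amplitude-weighted mean frequency of the spectrum."""
        mag = np.abs(np.fft.rfft(signal))
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
        return np.sum(freqs * mag) / np.sum(mag)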

Physical models for sound generation

All the models developed so far provide cartoon sounds, i.e. sounds that emphasise the salient features of the acoustic events they represent.
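For example, the collision model of [7] drives a resonator with a non-linear contact force. A sketch of a force law of the Hunt-Crossley family used there (parameter values are illustrative):

    def contact_force(x, v, k=1e4, lam=10.0, alpha=1.5):
        """Non-linear impact force f(x, v) = k*x^alpha + lam*x^alpha*v.

        x     : compression of the contact (m), positive while objects touch
        v     : compression velocity (m/s)
        k     : contact stiffness -> perceived hardness
        alpha : exponent -> local geometry of the contacting surfaces
        lam   : dissipation during contact
        """
        if x <= 0:
            return 0.0  # objects not in contact
        return k * x**alpha + lam * (x**alpha) * v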

Analysis of movement and gesture

A general conceptual model for expressive gesture analysis, based on a multi-layer approach, has been defined. A set of mapping strategies has been developed, relating the actions performed by the user to the acoustic feedback.
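A minimal sketch of what one such mapping strategy can look like (the cue names, ranges, and mapping itself are purely illustrative, not the project's actual strategies):

    def map_gesture_to_sound(quantity_of_motion, smoothness):
        """Hypothetical mapping from expressive movement cues (both in [0, 1])
        to synthesis parameters."""
        loudness   = min(1.0, quantity_of_motion)    # more motion -> louder
        brightness = 0.2 + 0.8 * (1.0 - smoothness)  # jerkier motion -> brighter
        return {"loudness": loudness, "brightness": brightness}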

Participation in international conferences/workshops

Public events

 

Up to Table of Contents

 

Publications

[1] D. Rocchesso, "Acoustic cues for 3-D shape information", Proceedings of the 2001 International Conference on Auditory Display, Espoo, Finland, 2001, pp. 175-180.

[2] F. Fontana, D. Rocchesso, and E. Apollonio, "Acoustic cues from shapes between spheres and cubes", Proceedings of the International Computer Music Conference 2001, September 17-22 2001, La Habana, Cuba.

[3] F. Avanzini and D. Rocchesso, "Controlling material properties in physical models of sounding objects", Proceedings of the International Computer Music Conference 2001, September 17-22 2001, La Habana, Cuba.

[4] D. Rocchesso, "Simple resonators with shape control", Proceedings of the International Computer Music Conference 2001, September 17-22 2001, La Habana, Cuba.

[5] D. Rocchesso and L. Ottaviani, "Can one hear the volume of a shape?", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2001, October 21-24 2001, New Paltz, New York, pp. 115-118.

[6] L. Ottaviani, F. Fontana, D. Rocchesso, and M. Rath, "Sounds from shape morphing of 3-D resonators", Workshop on Current Research Directions in Computer Music, Barcelona, Spain, November 15-17, 2001, pp. 233-238.

[7] F. Avanzini and D. Rocchesso, "Modeling Collision Sounds: Non-Linear Contact Force", COST G-6 Conference on Digital Audio Effects (DAFx01), Limerick, Ireland, December 6-8, 2001, pp. 61-66.

[8] F. Fontana, L. Ottaviani, M. Rath, and D. Rocchesso "Recognition of ellipsoids from acoustic cues", COST G-6 Conference on Digital Audio Effects (DAFx01), Limerick, Ireland, December 6-8, 2001, pp. 160-164.

[9] D. Rocchesso, L. Ottaviani, F. Avanzini, F. Fontana, and M. Rath, "Sonic Rendering of Shape and Material", IEEE Comp. Graphics and Applications, submitted 2001.

[10] A. Camurri, "Multisensory Expressive Gesture Applications", MediaFuture Intl. Conf., Firenze, May 2001.

[11] A. Camurri and M. Leman, "Temporal Aspects of Multisensory Expressive Gesture Processing", Intl. Conference on Systematic and Cognitive Musicology, Jyväskylä, August 2001.

[12] A. Camurri, B. Mazzarino, R. Trocca, and G. Volpe, "Real-Time Analysis of Expressive Cues in Human Movement", Int. Conf. CAST 01, Bonn, September 2001.

[13] A. Camurri, G. De Poli, and M. Leman, "MEGASE – A Multisensory Expressive Gesture Applications System Environment for Artistic Performances", Int. Conf. CAST 01, Bonn, September 2001.

[14] A. Camurri, G. De Poli, M. Leman, and G. Volpe, "A Multi-Layered conceptual framework for expressive gesture applications", Workshop on Current Research Directions in Computer Music, Barcelona, November 2001.

[15] A. Camurri, B. Mazzarino, R. Trocca, and G. Volpe, "Modelli computazionali di analisi dell'espressività nel movimento per interfacce multimodali" (Computational models for the analysis of expressiveness in movement for multimodal interfaces), VII Congresso Nazionale SIE, Firenze, September 2001.

[16] A. Camurri, M. Peri, and G. Volpe, "Strategie di mapping per interfacce uomo-computer espressive" (Mapping strategies for expressive human-computer interfaces), VII Congresso Nazionale SIE, Firenze, September 2001.

 

Up to Table of Contents

 

Last modified: 05-31-2002