Abstract
This article follows the historical and technological evolution of multichannel audio in electroacoustic music, from its origins in the 1950s with pioneers like Pierre Schaeffer to its present role as a central element in music composition and performance. It examines key advancements in spatialization techniques, including surround sound, ambisonics, and site-specific diffusion systems, highlighting their impact on spatial perception, listener envelopment, and auditory aesthetics. The discussion integrates insights into the creative and technical challenges of multichannel composition, emphasizing its capacity to enhance spatial imagery, evoke emotional responses, and redefine performance practices. By exploring emerging technologies such as augmented reality, machine learning, and networked audio systems, the article offers a comprehensive view of multichannel audio's transformative role in shaping contemporary electroacoustic music and its future possibilities.
Keywords
Multichannel audio, electroacoustic music, listener experience, music technology.
DOI https://doi.org/10.23277/emille.2024.22..002
초록
이 글에서는 1950년대 피에르 셰퍼Pierre Schaeffer와 같은 선구자와 함께 시작되어 현재 음악 작곡과 공연의 필수요소로서의 역할에 이르기까지, 전자음향음악에서 다채널 오디오의 역사적, 기술적 발전과정을 따라가본다. 서라운드 사운드, 앰비소닉, 특정장소에서의 확산 시스템을 아우르는 공간화 기술에서 중요한 발전사항을 짚어보고, 공간적 인식과 청취자 포위, 청각적 미학에 미친 영향에 초점을 둔다. 이 논의를 통해 다채널 작곡 시 창의적이고 기술적인 과제를 통찰하는 사항들을 정리하면서, 공간적 이미지를 향상시키고 감정적 반응을 불러일으키며 공연 관행을 재정의할 수 있는 능력에 대해 역설한다. 증강 현실, 머신 러닝, 네트워크 오디오 시스템과 같은 참신한 기술을 탐색하며, 현시점의 전자음향음악과 이것의 미래 가능성에서 다채널 오디오의 혁신적 역할에 대한 포괄적 관점을 제공한다.
주제어
다채널 오디오, 전자음향 음악, 청취 경험, 음악 기술.
Since the earliest performances of electroacoustic music, composers have utilized multichannel loudspeaker diffusion. In the early 1950s, Pierre Schaeffer and Pierre Henry implemented a four-channel setup, which included an elevated loudspeaker, to spatialize their compositions (Lynch / Sazdov 2017). Musique Concrète, a term coined by Pierre Schaeffer, describes music created using recorded sounds as the basis of composition. In English-speaking regions, it contrasts with Elektronische Musik, which uses electronically generated sounds. In French-speaking areas, however, Musique Concrète is understood to involve a hands-on approach to working with sound, akin to sculpting or painting, in which the composer manipulates sound directly. Schaeffer’s method reversed the traditional composition process, starting with concrete sounds and moving towards abstract structures, unlike conventional instrumental music, which begins with abstract concepts. Despite its basis in fixed media, Schaeffer’s first public concert in 1950 included live, variable elements, and by 1951, his team utilized magnetic tape and innovative playback devices for multitrack sound distribution (Harrison 1998).
The evolution of multichannel audio from a specialized technology to a foundational aspect of modern music production and performance mirrors broader trends at the intersection of art and technology. Despite significant advancements in both technical capability and creative application of multichannel audio (Leider 2007), several challenges persist. Acousmatic music has explored the incorporation of space as a musical element, alongside traditional parameters such as pitch, rhythm, and duration (Normandeau 2009). Examining the diverse dimensions of multichannel audio in electroacoustic music reveals its role in enhancing spatial and immersive musical qualities, while also opening new avenues for artistic expression and engaging listeners on deeper levels.
The landscape of musical composition and auditory experiences has been profoundly transformed by the advent and evolution of multichannel audio setups. Multichannel audio is increasingly important for composers, particularly those who treat the studio as a musical tool for creating, processing, and mixing music (Leider 2007). This article focuses on the multifaceted impact of multichannel audio, exploring its aesthetic advantages, the creative choices it enables, the technical challenges it presents, and its influence on both composers and listeners.
The roots of multichannel audio can be traced back to the early experiments with stereophonic sound in the late 19th and early 20th centuries. However, it was not until the latter half of the 20th century that multichannel audio began to take shape as we know it today (Ouzounian 2020). The advent of quadraphonic sound in the 1970s marked a significant step forward, although its commercial success was limited (Postrel 1990).
The true revolution in multichannel audio came with the development of surround sound systems for cinema in the 1980s and 1990s. These systems, which typically used five or more channels, provided a more immersive auditory experience that soon found its way into home entertainment systems. As digital technology advanced, so did the capabilities of multichannel audio systems, leading to formats like 5.1, 7.1, and even more complex configurations (Kerins 2010).
In the realm of music production and performance, the adoption of multichannel audio was driven by composers and sound engineers seeking to push the boundaries of spatial audio. Early pioneers in electronic and electroacoustic music recognized the potential of multichannel systems to create new sonic landscapes and immersive experiences. This led to the development of specialized venues and systems for multichannel music, such as the Birmingham ElectroAcoustic Sound Theatre (BEAST), which has played a crucial role in advancing multichannel composition techniques (Wilson / Harrison 2010).
The landscape of multichannel concert presentations for electroacoustic music has significantly evolved in recent years. A key factor in this transformation is the new accessibility of relatively affordable commercially produced multichannel hardware. This development has greatly expanded the potential and implications of working with multichannel formats, regardless of the style or design of the chosen presentation system (Wilson / Harrison 2010).
Enhanced Spatial Perception and Depth
Spatial audio deals with how sound is reproduced to create a sense of direction and space around the listener. Most research on spatial audio assumes a single listener ideally placed in the optimal position among the loudspeakers, where the spatial effects are most convincing. This approach, however, does not translate well to large or public spaces, or to listeners positioned away from the center. Many current solutions are suited only to cinematic special effects or require substantial amounts of hardware (Etlinger 2009).
One of the primary benefits of multichannel systems is their ability to enhance spatial perception and add depth to music, particularly in electroacoustic compositions. The spatial arrangement of sound sources contributes to creating intricate sound environments, adding layers of complexity to the listening experience (Batchelor 2015; Stefani / Lauke 2010). This enhanced spatiality not only makes the music more engaging but also adds significant emotional impact and meaning to the composition.
Electroacoustic composition offers a unique opportunity for spatial exploration because it provides the tools necessary to manipulate spatial settings, distribute sounds, and create movements both during the composition process (by precisely controlling spectral space and embedding spatial characteristics into sound files) and during live performances (using sound diffusion techniques) (Barreiro 2010).
Multichannel setups allow for precise localization of sound sources, enhancing the spatial impression and making the listening experience more dynamic and engaging (Howie et al. 2016; Leider 2007). The most exhilarating and captivating idea related to space in acousmatic music is that it not only involves moving sounds through space but also transports the listener to different auditory environments, creating parallel listening universes. This is largely due to the acousmatic medium's ability to evoke spatial and locational references in sound (Barreiro 2010).
Improved Listener Envelopment
Listener envelopment (LEV) describes the feeling of being surrounded by sound. This perceptual quality has been studied in concert hall acoustics, spatial sound reproduction, and electroacoustic music (Lynch / Sazdov 2017; Riedel / Zotter 2023; Soulodre et al. 2003). By this definition, an ideal diffuse sound field is maximally enveloping, since it consists of infinitely many incoherent plane waves impinging from all directions with equal variance (Jacobsen / Roisin 2000). Spatial granular synthesis has also been used to precisely control the temporal and directional density of sound events in perceptual studies of envelopment (Riedel et al. 2023).
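As a point of reference for this idea (a standard result from acoustics rather than a claim of the studies cited above), the closeness of a reproduced field to the ideal diffuse case can be characterized by the normalized spatial correlation of the sound pressure at two points separated by a distance $d$, which for an ideal diffuse field is

\[
\rho(d, f) \;=\; \frac{\sin(kd)}{kd}, \qquad k = \frac{2\pi f}{c},
\]

where $c$ is the speed of sound. The correlation falls toward zero once the spacing exceeds a fraction of a wavelength, which is one way of expressing the "incoherent from all directions" condition associated with envelopment (Jacobsen / Roisin 2000).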
Multichannel systems significantly improve listener envelopment (LEV), providing a more realistic and engaging auditory scene (Soulodre et al. 2003). This enhanced envelopment closely approximates the acoustics of a concert hall, creating a more immersive experience for the listener. The total size of the sound scene is significantly larger in multichannel compositions, providing a more expansive auditory experience that surrounds the listener from all directions.
Realistic and Hyper-realistic Sound Reproduction
Soundscape composition, though it treats sound as inherently tied to its context, is associated with the idea of decontextualization. In a way, it can be seen as a specific type of acousmatic music. The space itself is a thematic focus, and the techniques used to spatialize its diffusion serve as tools for representation, striving to preserve the coherence of the subject as much as possible (Martusciello 2022).
The phenomenon of reproduction has influenced our perception of quality listening. We believe we are experiencing high quality sound, although this is not always the primary goal of audio reproduction methods. Ironically, in cinema, the realism between sound and image is achieved by removing real recordings, which are often deemed disappointing and uninteresting, and by creating a fabricated illusion of reality (Puronas 2014). This feeds the schizophonic phenomenon of separating the sound from its source. Nevertheless, the public accepts this illusion as more real than reality itself. In fact, the more a sound is processed or highlighted, the more convincingly real it seems (Martusciello 2022).
The technology accessible to electroacoustic composers allows us not only to record sound but also to fully immerse listeners in our auditory creations. Techniques such as binaural sound, ambisonics, and loudspeaker domes enable composers to craft intricate soundscapes and project them in a manner that renders them hyper-realistic (Rossiter 2020). Additionally, multi-channel soundscape compositions can evoke mythical places and events, transporting the listener to an imaginary realm. The magical qualities of this space are conveyed through transformed versions of hyper-realistic soundscape elements (Truax 2012a).
Enhanced Expressive and Theatrical Elements
Multichannel spatialization enables acousmatic composers to intricately map sonic events to specific locations within a listening space, in ways that were previously unattainable with stereo sources (Stefani / Lauke 2010). Multichannel setups can significantly enhance the expressive and theatrical elements of musical performances. By leveraging advanced spatialization techniques, composers can create more engaging and immersive performances. Site-specific approaches to spatialization can further enhance these aspects by tailoring the sound diffusion to the unique acoustics of the performance space (Knight-Hill 2015; Stefani / Lauke 2010).
In music, ambiophony can be crafted with meticulous detail by the composer to highlight archetypal and global environmental perception. It creates a space where attention is drawn to the overall soundscape rather than isolated sounds. Consequently, the listener does not pinpoint the sources of sounds but is enveloped in a diffuse ambience, with this diffuse quality defining its archetypal nature (Lotis 2003; Stefani / Lauke 2010).
Loudspeakers play a significant visual role in the listening space during acousmatic performances, often serving as the audience's only visual connection to the sound source. They can be hidden to create a more ambiguous perception of sound localization or prominently displayed to enhance the theatrical aspect of the performance. By using various loudspeaker types and sizes, performances can become visually and sonically engaging. This approach, seen in setups like BEAST's Tweeter Trees and the GRM Acousmonium, emphasizes the importance of the visual element in sound diffusion. Switching the focus between different loudspeakers can create dramatic effects and a sense of dialogue, adding cultural, comic, or melodramatic dimensions to the work (Stefani / Lauke 2010).
Exploring New Sonic Territories
The discovery of new sounds through tape manipulation or the creation of custom devices for generating unique timbres became as crucial to the success of a piece as its temporal context. The tools available in the electroacoustic music studio offered complete control over every aspect of the compositional process, down to the finest details of the acoustic signals. It became possible to construct entirely new soundscapes using just a tape machine, amplifier, and loudspeakers for playback (Emmerson 1986).
Programs exist at various levels, from assembler code (very low level) to high-level scripting languages that often feature more human-readable structures, resembling spoken languages or graphical representations of familiar objects. Domain-specific languages maintain general programmability while offering additional abstractions suited to particular domains, such as sound synthesis (Collins / d'Escrivan 2017). Examples of such languages and environments include Csound, SuperCollider, ChucK, and Max/MSP.
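To illustrate the kind of abstraction these environments provide, here is a minimal sketch in plain Python (deliberately not in any of the languages named above; the sample rate, pitches, and output file name are arbitrary choices): a single oscillator "unit generator" plus a file writer is already enough to render a simple three-note gesture.

```python
import numpy as np
import wave

SR = 44100  # sample rate in Hz (assumed)

def sine_osc(freq, dur, amp=0.3):
    """A minimal 'unit generator': render a sine tone as a float array."""
    t = np.arange(int(SR * dur)) / SR
    return amp * np.sin(2 * np.pi * freq * t)

def write_wav(path, signal):
    """Write a mono 16-bit WAV file."""
    pcm = (np.clip(signal, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)      # 16-bit samples
        f.setframerate(SR)
        f.writeframes(pcm.tobytes())

# A three-note gesture built from the oscillator abstraction
write_wav("gesture.wav", np.concatenate([sine_osc(f, 0.5) for f in (220, 330, 440)]))
```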
Composers utilize multichannel audio to craft immersive aesthetic experiences by harnessing spatialization techniques, advanced recording methods, and innovative playback formats. These approaches enhance the listener's perception of space, depth, and realism, leading to a more engaging and meaningful auditory experience. The ability to manipulate sound in three-dimensional space opens new creative possibilities, allowing composers to explore sonic territories that were previously inaccessible.
Enhancing Spatial Imagery and Meaning
The aim of spatial audio in electroacoustic music is to evoke experiences in listeners that hold artistic significance, particularly through the spatial characteristics of perceived sound. Consequently, a multichannel audio system should strive to deliver acoustic signals that evoke these artistic spatial experiences and understandings. A deeper understanding of the intricate relationship between spatial sound systems and the listener's perception allows for more effective utilization of these systems for artistic purposes (Kendall 2010).
There exists a unique and complex connection between meaning and space. Spatial meaning arises from the embodied nature of spatial perception, reflecting how we think about and experience space. It also emerges from the distinct characteristics of multichannel reproduction and the inherent qualities of the artistic medium. Through the fusion of technical and artistic exploration, new and undiscovered avenues for meaning can be uncovered through spatial audio (Kendall 2010).
Space functions as a multifaceted musical element that can seamlessly integrate into a composition's structure, sometimes becoming the primary conveyer of meaning within the piece. This process of creation and interpretation of meaning is influenced by cultural norms of interpersonal communication, such as concepts of personal space and territoriality. Additionally, it acknowledges the intimate relationship between electroacoustic music composition and technology, highlighting how available technologies influence aesthetic decisions and strategies for spatial composition (Henriksen 2002). Fully grasping the spatial implications of a composed environment is challenging, if not impossible. However, recognizing the presence of these subconscious elements is essential for effectively incorporating space into musical compositions and interpreting spatial aspects during music appreciation (Henriksen 2002).
The foundational model of acoustic communication posits that information and meaning emerge through attentive listening, drawing from both the internal structure and patterns of the sound itself, as well as the listener's contextual understanding. Ultimately, both internal and external complexities contribute to our comprehension of sound (Truax 2012b). Techniques like image dispersion and the precedence effect are crucial for crafting meaningful spatial experiences. Through manipulation of sound's spatial attributes, composers can create intricate auditory landscapes that convey narratives, emotions, and abstract concepts in ways that traditional stereo compositions cannot achieve.
Creating New Aesthetic Paradigms
Composers aim to create new aesthetic paradigms by leveraging technology to broaden rather than narrow aesthetic boundaries, providing listeners with tools to perceive complex relationships of time and space (Newcomb 1998). The development of environmentally interactive computer music systems is driven by the desire to offer profound personal benefits and limitless variation with infinite control. This push towards new aesthetic frontiers reflects a broader trend in contemporary music towards more immersive and interactive experiences.
Advancements in music technology open significant new opportunities in multichannel composition and system design. These emerging possibilities necessitate new strategies and aesthetic considerations and have implications for presentation, performance, and reception (Wilson / Harrison 2010). While it is easy to be excited about the opportunities these developments present, they also carry potential drawbacks, such as the loss of straightforward exchange of artistic works, which may in turn lead to a decline in the sharing of the aesthetic and technical knowledge those works contain (Wilson / Harrison 2010).
Integrating Performance and Composition
The use of multichannel systems has led to a blurring of lines between composition and performance practices. Traditionally, composition and performance were seen as distinct stages in the creation and presentation of music. Composers would write music, often with minimal consideration for the performance space, while performers would interpret these compositions, adapting them to the performance environment. However, modern advancements in technology, particularly the use of multichannel audio systems, have led to a significant shift in this dynamic, resulting in a more holistic approach where composition and performance practices are intertwined.
The integration of performance and composition has seen significant transformation with the advent and increased use of multichannel systems in the realm of music and sound art. Composers now often consider the spatial layout of speakers and the performance space as integral components of the composition process (Stefani / Lauke 2010; Wilson / Harrison 2010). This integration allows for more dynamic and responsive performances, where the spatial aspects of the music can be adjusted in real-time to suit the specific acoustics of the venue or to react to the audience's response.
Exploring Psychoacoustic Effects
Psychoacoustics has significantly contributed to understanding the fundamental connections between tones, their combinations, and sequences, and the resulting sensations of loudness, pitch, timbre, consonance, dissonance, and rhythm (Houtsma 1999). Multichannel systems allow composers to explore and exploit various psychoacoustic effects. The ability to precisely control the direction and movement of sound enables composers to create illusions of space and movement, significantly enhancing the emotional impact of their work. This exploration of psychoacoustics opens new avenues for musical expression and listener engagement.
Furthermore, electroacoustic music holds the same potential as modern visual art, making it a valuable tool for therapeutic or educational applications. Investigations show how electroacoustic environments, created using various semi-automatic signal processing techniques, can impact listeners' perceptions and foster psychological experiences in areas such as creativity, emotion, self-perception, and mental association (Parada-Cabaleiro et al. 2017).
Understanding Listener Perception
Electroacoustic music lacks a clear vocabulary for discussing its spatial aspects: there are few established terms for describing the spatial characteristics of individual sound sources, and little shared language for explaining how these characteristics contribute to artistic expression. To date, there have been few perceptual studies focused on multi-channel electroacoustic music, and as a result there is no standardized method for obtaining perceptual responses from listeners (Kendall / Ardila 2007). Several studies have drawn from the fields of psychoacoustics, concert hall acoustics, and audio reproduction research to design perceptual experiments aimed at understanding how multi-channel electroacoustic music influences the perception of spatial characteristics (Lynch 2014).
These studies examine how various frequency ranges, levels of sonic complexity, and loudspeaker placements affect the perception of spatial attributes such as spatial clarity, envelopment, and engulfment. Envelopment is a well-established term used to describe the sensation of being surrounded by sound (Lynch 2014; Rumsey 1998).
Understanding listener perception is vital in multichannel audio compositions. Composers consider how listeners perceive spatiality and meaning, focusing on mental processes related to space and meaning, such as image dispersion and the precedence effect (Kendall 2010). This attention to psychoacoustics ensures that the spatial elements of the composition enhance rather than detract from the overall musical experience.
Site-Specific Considerations
The performance spaces for electroacoustic music have evolved significantly, incorporating various spatial and technological innovations to enhance the listening experience. These innovations are crucial for composers and listeners alike, offering new ways to interact with and perceive sound. The concept of a Sound House, for instance, proposes a multi-space center for electroacoustic music and sonic art, emphasizing social interaction and practical application beyond traditional concert halls (Jones 2001). Site-specific approaches to multichannel spatialization also enhance the theatricality and expressive functions of sound diffusion in electroacoustic music, making the listening experience more immersive and engaging (Stefani / Lauke 2010).
Technological integration plays a pivotal role in these innovative performance spaces. Electroacoustic feedback experiments by pioneers like Alvin Lucier and Max Neuhaus have led to the emergence of sound installations that highlight the spatial dimension of sound propagation (Matthieu 2017). New auditoriums designed with electroacoustic technology include advanced sound reinforcement systems, movable loudspeakers, and artificial reverberation systems to improve sound quality and control (Yamaguchi 1978). These advancements allow for precise manipulation of sound in space, creating a more dynamic and interactive auditory experience.
Contemporary trends show an increased use of surround 5.1, four-channel, and eight-channel systems, with a decline in stereo usage, as composers become more familiar with various spatialization systems. This familiarity enables them to focus less on performance and interpretation issues and more on the artistic potential of spatial sound (Otondo 2008). Handling space in multichannel electroacoustic works involves enhancing the spatial dimension of sounds through specific processing tools, such as Max/MSP patches (Barreiro 2010), Csound, and SuperCollider. Research in this area highlights a shift towards more innovative and technologically integrated performance spaces, designed to enhance the spatial and theatrical aspects of sound. These developments are transforming traditional concert halls into more interactive and immersive environments, thereby shaping the contemporary electroacoustic music experience.
Innovations in Recording Techniques
The evolution of multichannel audio has been significantly influenced by advancements in recording techniques. These techniques aim to enhance the spatial and immersive qualities of audio, providing listeners with a more engaging and realistic auditory experience. Exploration in acousmatic music is advanced by the opportunities provided by recording technology, studio composition methods, and sound diffusion techniques (Barreiro 2010).
Perceptually motivated techniques focus on reproducing only the aspects of the sound scene that are relevant to human perception, thus demanding less computational power and fewer equipment resources. In contrast, physically motivated techniques strive to achieve a physically accurate sound field reproduction, which requires more computational and equipment load (Hacihabiboglu et al. 2017).
The evolution of multichannel audio has also seen the development of complex microphone arrays and sophisticated recording methods to capture the full spatial characteristics of sound sources. Techniques such as vector-base amplitude panning and perceptual sound field reconstruction have provided new ways to capture and reproduce three-dimensional sound fields, offering composers and sound engineers unprecedented control over the spatial aspects of audio (Hacihabiboglu et al. 2017).
Digital Signal Processing and Spatial Audio Rendering
Digital signal processing techniques, such as time stretching, granulation, filtering, and transposition, alter the source material. These manipulations generate abstract sounds that are not readily identifiable as originating from real-world sources (Lynch 2014). The purpose of creating a variety of material is to utilize different sound sources to express and represent the diverse and evolving themes throughout the narrative of the piece (Lynch 2014).
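To make one of these manipulations concrete, the following is a minimal Python sketch of asynchronous granulation (an illustration under assumed parameter values, not code from the cited works): short Hann-windowed grains are read from random positions in a source buffer and overlap-added at a chosen density, which quickly detaches the result from its recognizable source.

```python
import numpy as np

def granulate(source, sr, out_dur=8.0, grain_dur=0.08, density=80, rng=None):
    """Asynchronous granulation: overlap-add short Hann-windowed grains taken
    from random positions in `source` (a mono float array longer than one grain)."""
    if rng is None:
        rng = np.random.default_rng(0)
    grain_len = int(grain_dur * sr)
    window = np.hanning(grain_len)
    out = np.zeros(int(out_dur * sr) + grain_len)
    for _ in range(int(out_dur * density)):              # `density` grains per second
        src_pos = rng.integers(0, len(source) - grain_len)
        out_pos = rng.integers(0, len(out) - grain_len)
        out[out_pos:out_pos + grain_len] += window * source[src_pos:src_pos + grain_len]
    return out / np.max(np.abs(out))                      # normalize to avoid clipping
```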
Spatial audio is a field dedicated to exploring techniques for replicating the spatial characteristics of sound (such as direction, distance, width of sound sources, and room envelopment) for listeners. These attributes cannot be accurately reproduced with a single loudspeaker, which led to the introduction of two-channel stereophony and its subsequent extension to various surround sound formats using five to eight loudspeakers. Even more precise reproduction of spatial attributes can be achieved with loudspeaker configurations typically found in theaters and some public venues, where many loudspeakers are positioned around, and sometimes above or below, the listeners (Hacihabiboglu et al. 2017).
A fundamental question in spatial audio is how to place a sound source in a specific direction within the virtual auditory space. A well-known method, called amplitude panning, involves applying a sound signal with varying amplitudes to different loudspeakers. Traditionally, amplitude panning has been limited to two-dimensional (2D) loudspeaker setups, but it has been extended to three-dimensional (3D) multichannel loudspeaker configurations (Hacihabiboglu et al. 2017).
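To make the principle concrete, here is a minimal Python sketch of two-dimensional pairwise amplitude panning in the spirit of vector-base amplitude panning (an illustrative sketch, not any particular system's implementation): the unit vector pointing toward the desired source direction is expressed as a linear combination of the two loudspeaker direction vectors, and the resulting gains are normalized for constant power.

```python
import numpy as np

def pan_2d(source_az_deg, speaker_az_deg):
    """Gains for a loudspeaker pair so that the panned source is perceived
    near source_az_deg (all angles in degrees, in the horizontal plane)."""
    p = np.array([np.cos(np.radians(source_az_deg)),
                  np.sin(np.radians(source_az_deg))])
    # Columns of L are the unit vectors pointing at the two loudspeakers.
    L = np.array([[np.cos(np.radians(a)), np.sin(np.radians(a))]
                  for a in speaker_az_deg]).T
    g = np.linalg.solve(L, p)          # solve L @ g = p
    return g / np.linalg.norm(g)       # constant-power normalization

# A source at 15 degrees panned between loudspeakers at +45 and -45 degrees:
print(pan_2d(15.0, (45.0, -45.0)))     # the +45-degree speaker receives the larger gain
```

In a full three-dimensional implementation the same idea is applied to loudspeaker triplets, with three gains obtained from a 3 × 3 system of equations.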
The spatial dimension in multichannel systems enables unique artistic expressions and amplifies the theatricality of sound diffusion, making performances more engaging and expressive (Stefani / Lauke 2010; Timmermans 2015). Advances in digital signal processing have played a crucial role in the development of multichannel audio. Sophisticated algorithms for spatial audio rendering allow for the creation of virtual sound sources and the manipulation of sound fields in ways that were previously impossible. These technologies enable composers to create highly detailed and dynamic spatial audio experiences, even with limited physical speaker configurations.
Virtual and Augmented Reality Integration
Audio augmented reality (AAR) involves technology that integrates computer-generated auditory content into the user's real-world acoustic environment. An AAR system has unique requirements distinct from typical human-computer interfaces: an audio playback system to enable the simultaneous perception of real and virtual sounds; motion tracking to facilitate interactivity and location-awareness; the design and implementation of an auditory display to present AAR content; and spatial rendering to convey spatialized AAR content (Gamper 2014).
Sound field reproduction systems, such as wave field synthesis (WFS), operate on the principle of natural sound wave propagation. This approach allows them to create a true sound field uniformly over an extended listening area. WFS virtual sources are localized with much greater accuracy compared to stereophonic phantom sources (Ranjan 2016).
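The sketch below illustrates only the core of that principle under heavy simplification (the array geometry and virtual source position are arbitrary examples): each loudspeaker of a linear array is fed a copy of the virtual source signal, delayed by the propagation time from the virtual source to that loudspeaker and attenuated with distance. A complete WFS driving function additionally includes a spectral pre-filter, a directivity factor, and tapering of the array edges.

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def wfs_delays_gains(speaker_xy, virtual_source_xy):
    """Per-loudspeaker delays (seconds) and simple 1/sqrt(r) amplitude weights
    for a virtual point source behind a loudspeaker array (principle only)."""
    speakers = np.asarray(speaker_xy, dtype=float)
    r = np.linalg.norm(speakers - np.asarray(virtual_source_xy, dtype=float), axis=1)
    return r / C, 1.0 / np.sqrt(np.maximum(r, 1e-3))

# Example: 16 loudspeakers spaced 0.2 m along the x-axis, virtual source 2 m behind them
speakers = [(0.2 * i, 0.0) for i in range(16)]
delays, gains = wfs_delays_gains(speakers, (1.5, -2.0))
```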
Spatial rendering is often achieved by convolving virtual sounds with head-related transfer functions (HRTFs). Gamper's framework, for example, uses Delaunay triangulation to organize measured HRTFs into subsets appropriate for interpolation and employs barycentric coordinates as the interpolation weights (Gamper 2014).
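A small Python sketch of that interpolation step (illustrative only, not Gamper's implementation; it assumes HRTFs measured on an (azimuth, elevation) grid in degrees and ignores the wrap-around of azimuth at ±180°): SciPy's Delaunay triangulation finds the triangle of measured directions containing the target direction, and the barycentric coordinates of the target within that triangle serve as the interpolation weights.

```python
import numpy as np
from scipy.spatial import Delaunay

def barycentric_weights(measured_dirs, target_dir):
    """Return the indices of the three enclosing HRTF measurement directions
    and their barycentric interpolation weights (which sum to 1)."""
    tri = Delaunay(np.asarray(measured_dirs, dtype=float))
    simplex = int(tri.find_simplex(np.asarray(target_dir, dtype=float)))
    if simplex < 0:
        raise ValueError("target direction lies outside the measured grid")
    T = tri.transform[simplex]                    # affine map to barycentric coordinates
    b = T[:2] @ (np.asarray(target_dir, dtype=float) - T[2])
    weights = np.append(b, 1.0 - b.sum())
    return tri.simplices[simplex], weights

# The interpolated HRTF is then the weighted sum of the three measured responses:
# idx, w = barycentric_weights(grid_az_el, (32.0, 10.0))
# hrtf_interp = sum(wi * hrtf_set[i] for i, wi in zip(idx, w))
```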
The rise of virtual and augmented reality technologies has opened new frontiers for multichannel audio. These immersive technologies require sophisticated spatial audio techniques to create convincing and engaging experiences. As a result, there has been significant research and development in HRTFs and binaural audio rendering, which enable the creation of three-dimensional sound environments through headphones.
Network Audio and Distributed Performance
The evolution of music from traditional concert performances to multimedia and interactive art forms reflects a rich history of creativity spurred by technological advancements. Over the past few decades, experimental scores incorporating visual elements have expanded the scope of musical expression beyond classical boundaries, with pioneers like Mauricio Kagel breaking new ground. Randall Packer and Ken Jordan’s book, Multimedia: From Wagner to Virtual Reality (2001), illustrates how the concept of the total artwork evolved into diverse multimedia performances. Modern communication technologies, such as mobile phones and the Internet, have transformed how artists and scientists collaborate, enabling rapid, global exchanges and the creation of innovative projects. These advancements have shifted the scale and pace of artistic collaboration, allowing new media to serve as dynamic platforms for interactive and evolving art forms. As music continues to evolve from analog to digital and from live performances to virtual experiences, it offers new opportunities for exploration and expression in contemporary culture (Dal Farra 2005).
One notable technological advancement in this context is network audio, which refers to the transmission of audio signals over computer networks. This technology enables the distribution of high-quality audio across various devices and locations by leveraging standard networking protocols and infrastructures. Moustakas, Floros and Kapralos demonstrated the Augmented Reality Audio Network (ARAN) concept in the context of a live electroacoustic music concert. Their subjective evaluation indicated that the ARAN framework could represent a significant alternative to traditional augmented reality audio (ARA) approaches in the artistic and creative domain (Moustakas et al. 2016).
Network audio technologies have thus enabled new forms of distributed performance and composition. Multichannel audio can now be transmitted over networks with low latency, allowing for real-time collaboration between musicians and composers in different locations. This has opened up new possibilities for remote performances and interactive installations that span multiple physical spaces.
Machine Learning and AI in Spatial Audio
The integration of machine learning and artificial intelligence (AI) in audio production is beginning to influence the field of multichannel audio as well. AI algorithms are being developed to assist in the spatial mixing and mastering of multichannel audio, automating complex tasks and offering new creative tools for composers and sound designers. As noted by Whalley, within this broader context, integrating machine learning, mapping, and automation into electroacoustic music presents several challenges and speculative opportunities. A notable gap in the literature is the lack of comprehensive frameworks that link electroacoustic music to emotional responses and techniques for mapping micro-gestures to enhance subtlety in performances. Machine learning offers a way to manage this complexity by analyzing vast data sets and real-time data streams, using tools like WEKA to create a database of emotional responses associated with sound gestures. This involves developing a common language for mapping micro-gestures and micro-sounds and automating these processes to handle increasing data complexity. The ultimate aim is to establish a networked, interactive environment that bridges human and machine interactions in electroacoustic music, which could lead to new ways of understanding and experiencing networked life through sonic exploration (Whalley 2015).
The development of multichannel music has led to some of the most creative works in contemporary music, broadening the possibilities of spatial and auditory experiences. A key example is Karlheinz Stockhausen’s Gesang der Jünglinge 1 (1956), considered one of the first significant works of multichannel electronic music. Stockhausen used five channels to create an immersive soundscape, blending electronic sounds with recorded voices (Stone 1963). His precise control over the positioning and movement of sounds in space shifted the understanding of music from a linear progression to an engaging, surround experience, paving the way for future innovations in spatial composition.
Sound 1.
Excerpt of Gesang der Jünglinge (1956) by Karlheinz Stockhausen
© 1956 Stockhausen-Stiftung für Musik, Kürten. Reprinted by permission.
Iannis Xenakis’s Persepolis 2 (1971) is a significant 8-channel electroacoustic composition (Yardumian 2024). In this work, Xenakis treated space as an active musical element, using detailed textures and powerful sounds to create an intense sensory impact. The spatial arrangement was a key part of the piece, with sounds spread across large outdoor areas to surround the audience in a constantly changing acoustic environment. Persepolis demonstrates how multichannel techniques can provoke not only auditory but also physical reactions, pushing the boundaries of traditional music performance and listening experiences.
Sound 2.
Excerpt of Persepolis (1971) by Iannis Xenakis
© 1971 Editions Salabert, part of Universal Music Publishing Classics & Screen. International Copyright Secured. All Rights Reserved. Reprinted by Permission of Hal Leonard Europe BV.
John Chowning's 1972 composition Turenas is an important work in electronic music, using quadraphonic sound spatialization to create the illusion of moving sounds. Chowning achieved this effect by combining Doppler shift and changes in amplitude, controlled through Lissajous figures, which are mathematical curves describing smooth, looping paths. This method enabled Chowning to simulate sound movement in a realistic and convincing way. By 1968, he had developed a system using four speakers and a computer program to control the perceived direction and distance of sounds. However, this system had some challenges, such as difficulty with perceiving multiple trajectories at once and sudden changes in sound direction. The use of Lissajous figures solved these problems by producing smooth and natural sound movements. Additionally, FM synthesis in Turenas enhanced the spatial effects, enabling Chowning to create timbral changes that aligned with the spatial motion of the sounds (Chowning 2011). Both the stereo and 4-channel versions are available online as individual files.
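As a rough illustration of this kind of trajectory control (not Chowning's actual implementation; the quadraphonic layout, path parameters, and the simple equal-power gain law are assumptions), the sketch below generates a Lissajous path in the horizontal plane, derives per-speaker gains from the instantaneous source azimuth, and estimates a Doppler pitch ratio from the source's radial velocity.

```python
import numpy as np

def lissajous_path(duration, sr, a=3.0, b=2.0, phase=np.pi / 2, radius=5.0):
    """A smooth, looping source trajectory in the horizontal plane (metres)."""
    t = np.arange(int(duration * sr)) / sr
    return radius * np.sin(a * t + phase), radius * np.sin(b * t)

def quad_gains(x, y):
    """Equal-power gains toward four corner loudspeakers at +-45 and +-135 degrees."""
    az = np.arctan2(y, x)
    speakers = np.radians([45.0, 135.0, -135.0, -45.0])
    g = np.maximum(np.cos(az[None, :] - speakers[:, None]), 0.0)   # one gain lobe per speaker
    return g / np.sqrt(np.sum(g ** 2, axis=0, keepdims=True))      # constant total power

def doppler_ratio(x, y, sr, c=343.0):
    """Instantaneous pitch ratio implied by the source's radial velocity."""
    radial_velocity = np.gradient(np.hypot(x, y)) * sr   # d(distance)/dt in m/s
    return c / (c + radial_velocity)                      # >1 approaching, <1 receding

x, y = lissajous_path(duration=20.0, sr=1000, a=3.0, b=2.0)   # control-rate trajectory
gains, ratio = quad_gains(x, y), doppler_ratio(x, y, sr=1000)
```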
Sound 3.
Excerpt of Turenas (1972) by John Chowning
© 1972 John Chowning. All rights reserved. Reprinted by permission.
Provenance – émergence is a 24-channel electroacoustic composition by Hans Tutschku 3, created in 2022 (Tutschku 2024a). The piece premiered on October 29, 2022, at the GRM in Maison de la Radio, Paris, and was produced in studios at Harvard University and GRM Paris (Tutschku 2024b). Lasting 18 minutes and 45 seconds, the composition takes listeners on an introspective journey, where “fragments of dreams and memories converge in a vast space full of dynamic movements” (Tutschku 2024a). The soundscape explores an unfamiliar medium, "between air and liquid," with distinct voices emerging through three slow, large waves that unify the piece (Tutschku 2024a). Both 8-channel and 24-channel versions are available online as individual mono files.
Sound 4.
Excerpt of Provenance – émergence (2022) by Hans Tutschku
© 2022 Hans Tutschku. All rights reserved. Reprinted by permission.
Curtis Roads’ 4 composition Then (2010–2016) was created using a method called multiscale planning. This approach involves designing flexible, temporary systems on multiple levels, from small sound fragments to the overall structure of the piece (Roads 2016). Multiscale planning was essential for organizing and mixing the piece, which involved over 2,100 stereo tracks. Spatial processing was a key element, using techniques like tape echo feedback to create a ping-pong effect and various reverberators to balance dry and reverberant sounds. The piece has been performed in different multichannel setups, such as an 8-speaker system and the 47-speaker ZKM Klangdom. To adapt the spatialization to different setups, Roads developed a program called Spatial Chords, which enables real-time generative upmixing. This ensures that the spatial design of Then matches its musical structure during live performances, regardless of the sound system (Roads 2016, 2019). A stereo version of Then is accessible online.
Sound 5.
Excerpt of Then (2010-2016) by Curtis Roads
© 2016 Curtis Roads. All rights reserved. Reprinted by permission.
For Reza (2022) is a musical piece composed by Joachim Heintz 5 to mark what would have been the 50th birthday of Reza Korourian (1971–2015). It was premiered at the Tehran International Electronic Music Festival (TIEMF) in 2022. The composition was inspired by Heintz's experience with Korourian's music, particularly the opening of Violet Room. Heintz describes the music as "bells, bells, loud, persistent, thousand years without any change," suggesting it expresses something deeply difficult in Korourian's life. The piece was created at the request of the friends of the Yarava Music Group, of which Korourian was a member (Heintz 2022). Both the stereo and 4-channel versions are available online as individual files.
Sound 6.
Excerpt of For Reza (2022) by Joachim Heintz
© 2022 Joachim Heintz. All rights reserved. Reprinted by permission.
AI Phantasy, an electroacoustic composition by Panayiotis Kokoras 6 , explores the line between reality and imagination through a unique mix of sounds. Created in 2020, the piece combines studio recordings, found objects, and machine learning software. It includes various sound sources, such as manipulated vacuum cleaner sounds, pan flute-like instruments, and premade sound effects. Kokoras used a technique he developed called fabrication synthesis to modulate sounds from the vacuum cleaner and other objects (Lautour 2021). He also used commercial software, like Sononym and Orchidea, to analyze and categorize over 500,000 sound files, which helped him find similar sounds and create orchestrations based on specific targets. The title AI Phantasy reflects both the fantasies of a child, blending internal and external realities, and the daydreams adults create. This idea is expressed in the composition’s sound world, which combines studio recordings, instrumental sounds, and synthesized noises to form an engaging listening experience. Both stereo and 8-channel versions are available online as individual mono files.
Sound 7.
Excerpt of AI Phantasy (2020) by Panayiotis Kokoras
© 2020 Panayiotis Kokoras. All rights reserved. Reprinted by permission.
Untitled No. 2, composed by Ali Balighi in 2023, uses Max/MSP for granular synthesis and multichannel algorithmic composition (Balighi 2023). Both the stereo and 4-channel versions are available online as individual mono files.
Sound 8.
Excerpt of Untitled No. 2 (2023) by Ali Balighi
© 2023 Ali Balighi. All rights reserved.
Multichannel audio has significantly transformed the landscape of musical composition and listener experience. By enhancing spatial perception, enabling more immersive soundscapes, and providing composers with new tools for artistic expression, these systems have opened new possibilities in music creation and appreciation. The technology has not only changed how we create and perceive music but has also influenced broader areas of entertainment, communication, and artistic expression.
As we look to the future, the potential for even more sophisticated and engaging multichannel audio experiences remains an exciting prospect for both composers and listeners alike. The ongoing integration of multichannel audio with other emerging technologies promises to continue pushing the boundaries of what is possible in sound and music.
However, as with any technological advancement, the evolution of multichannel audio brings both opportunities and challenges. Balancing artistic vision with technical constraints, ensuring accessibility and standardization, and addressing ethical considerations will be crucial in shaping the future of this technology.
Ultimately, the true value of multichannel audio lies not just in its technical capabilities, but in its ability to enhance human experiences, evoke emotions, and create meaningful connections between creators and audiences. As the technology continues to evolve, it will undoubtedly play a pivotal role in shaping the future of music and sound, offering new ways to explore, create, and experience the rich world of auditory art.
Acknowledgements. I would like to sincerely thank five outstanding composers: John Chowning, Hans Tutschku, Curtis Roads, Joachim Heintz, and Panayiotis Kokoras. Their exceptional multichannel music greatly enhanced my work. Their compositions added significant meaning and insight to my articles, and their creativity continues to inspire me. I am grateful for their remarkable music and its valuable contribution to my research.
References
Balighi, A. (2023). Untitled No.2. On Ghazale Vay. Noise A Noise. www.alibalighi.com
Barreiro, D. L. (2010). Considerations on the handling of space in multichannel electroacoustic works. Organised Sound, 15(3), 290-296. https://doi.org/10.1017/S1355771810000294
Batchelor, P. (2015). Acousmatic Approaches to the Construction of Image and Space in Sound Art. Organised Sound, 20(2), 148-159. https://doi.org/10.1017/S1355771815000035
Chowning, J. (2011). Turenas: the realization of a dream. Journées d'Informatique Musicale.
Collins, N./ d'Escrivan, J. (2017). The Cambridge Companion to Electronic Music. https://doi.org/10.1017/9781316459874
Dal Farra, R. (2005). Re-thinking the gap. Electroacoustic music in the age of virtual networking. Proceedings of the Electroacoustic Music Studies Network–EMS09 international conference.
Emmerson, S. (1986). The Language of Electroacoustic Music. New York: Harwood Academic Publishers.
Etlinger, D. (2009). A Musically Motivated Approach to Spatial Audio for Large Venues. Northwestern University. https://www.proquest.com/dissertations-theses/musically-motivated-approach-spatial-audio-large/docview/304970961/se-2
Gamper, H. (2014). Enabling technologies for audio augmented reality systems.
Hacihabiboglu, H./ De Sena, E./ Cvetkovic, Z./ Johnston, J./ Smith III, J. O. (2017). Perceptual spatial audio recording, simulation, and rendering: An overview of spatial-audio techniques based on psychoacoustics. IEEE Signal Processing Magazine, 34(3), 36-54.
Harrison, J. (1998). Sound, space, sculpture: some thoughts on the ‘what’, ‘how’ and ‘why’ of sound diffusion. Organised Sound, 3(2), 117-127. https://doi.org/10.1017/S1355771898002040
Heintz, J. (2022). For Reza. https://joachimheintz.net/for-reza.html
Henriksen, F. E. (2002). Space in electroacoustic music: composition, performance and perception of musical space [Doctoral thesis, City University London].
Houtsma, A. J. M. (1999). On the tones that make music. The Journal of the Acoustical Society of America, 105(2, Suppl.), 1237. https://doi.org/10.1121/1.425950
Howie, W./ King, R. L./ Martin, D. (2016). A Three-Dimensional Orchestral Music Recording Technique, Optimized for 22.2 Multichannel Sound. Journal of the Audio Engineering Society. https://aes2.org/publications/elibrary-page/?id=18416
Jacobsen, F./ Roisin, T. (2000). The coherence of reverberant sound fields. The Journal of the Acoustical Society of America, 108(1), 204-210. https://doi.org/10.1121/1.429457
Jones, S. (2001). The Legacy of the 'Stupendious' Nicola Matteis. Early Music, 29(4), 553-568. http://www.jstor.org/stable/3519116
Kendall, G. S. (2010). Spatial perception and cognition in multichannel audio for electroacoustic music. Organised Sound, 15(3), 228-238. https://doi.org/10.1017/S1355771810000336
Kendall, G. S./ Ardila, M. (2007). The artistic play of spatial organization: Spatial attributes, scene analysis and auditory spatial schemata. International Symposium on Computer Music Modeling and Retrieval.
Kerins, M. (2010). Beyond Dolby (Stereo): Cinema in the Digital Sound Age. Indiana University Press.
Knight-Hill, A. (2015). Theatres of Sounds: the role of context in the presentation of electroacoustic music. https://doi.org/10.1386/scene_00016_1
Kokoras, P. (2020). AI Phantasy. On AI Phantasy. https://panayiotiskokoras.com/_ai_phantasy/
Lautour, R. d. (2021). A Listening Art. Electroacoustic Music Studies Network (EMS).
Leider, C. (2007). Multichannel Audio in Electroacoustic Music: An Aesthetic and Technical Research Agenda. 2007 IEEE International Conference on Multimedia and Expo, 1890-1893. https://doi.org/10.1109/ICME.2007.4285044
Lotis, T. (2003). The creation and projection of ambiophonic and geometrical sonic spaces with reference to Denis Smalley's Base Metals. Organised Sound, 8(3), 257-267.
Lynch, H./ Sazdov, R. (2017). A perceptual investigation into spatialization techniques used in multichannel electroacoustic music for envelopment and engulfment. Computer music journal, 41(1), 13-33.
Lynch, H. A. (2014). Space in multi-channel electroacoustic music: developing sound spatialisation techniques for composing multi-channel electroacoustic music with emphasis on spatial attribute perception. University of Limerick.
Martusciello, F. (2022). The reality of the reproduction. Aesthetics of a "conscious" approach to sound design in the soundscape composition: a case study. Proceedings of the 17th International Audio Mostly Conference.
Matthieu, S. (2017). Electroacoustic Feedback and the Emergence of Sound Installation: Remarks on a line of flight in the live electronic music by Alvin Lucier and Max Neuhaus. Organised Sound, 22, 268-275. https://doi.org/10.1017/S1355771817000176
Moustakas, N./ Floros, A./ Kapralos, B. (2016). An Augmented Reality Audio Live Network for Live Electroacoustic Music Concerts. Audio Engineering Society Conference: 2016 AES International Conference on Audio for Virtual and Augmented Reality.
Newcomb, R. S. (1998). Music In The Air: a theoretical model and software system for music analysis and composition. Organised Sound, 3, 3-16. https://doi.org/10.1017/S1355771898009121
Normandeau, R. (2009). Timbre Spatialisation: The medium is the space. Organised Sound, 14(3), 277-285. https://doi.org/10.1017/S1355771809990094
Otondo, F. (2008). Contemporary trends in the use of space in electroacoustic music. Organised Sound, 13, 77-81. https://doi.org/10.1017/S1355771808000095
Ouzounian, G. (2020). Stereophonica: Sound and Space in Science, Technology, and the Arts. The MIT Press. https://lccn.loc.gov/2020003270
Parada-Cabaleiro, E./ Baird, A./ Cummins, N./ Schuller, B. W. (2017). Stimulation of psychological listener experiences by semi-automatically composed electroacoustic environments. 2017 IEEE International Conference on Multimedia and Expo (ICME).
Postrel, S. R. (1990). Competing networks and proprietary standards: The case of quadraphonic sound. The Journal of Industrial Economics, 169-185. https://doi.org/10.2307/2098492
Puronas, V. (2014). Sonic hyperrealism: illusions of a non-existent aural reality. The New Soundtrack, 4(2), 181-194.
Ranjan, R. (2016). 3D audio reproduction: natural augmented reality headset and next generation entertainment system using wave field synthesis.
Riedel, S./ Frank, M./ Zotter, F. (2023). Perceptual evaluation of listener envelopment using spatial granular synthesis. arXiv preprint arXiv:2301.10210. https://doi.org/10.48550/arXiv.2301.10210
Riedel, S./ Zotter, F. (2023). The Effect of Temporal and Directional Density on Listener Envelopment. Journal of the Audio Engineering Society, 71(7/8), 455-467.
Roads, C. (2016). Story of Then (2010-2016).
Roads, C. (2019). Then (2010 - 2016). On Flicker tone pulse - electronic music 2001-2016. https://www.schott-music.com/en/flicker-tone-pulse-no417796.html
Rossiter, M. L. (2020). Music–Bodies–Machines. Airea: Arts and Interdisciplinary Research, (2), 5-22. https://doi.org/10.2218/airea.5041
Rumsey, F. (1998). Subjective assessment of the spatial attributes of reproduced sound. Audio Engineering Society Conference: 15th International Conference: Audio, Acoustics & Small Spaces.
Soulodre, G. A./ Lavoie, M. C./ Norcross, S. G. (2003). Objective measures of listener envelopment in multichannel surround systems. Journal of the Audio Engineering Society, 51(9), 826-840. https://aes2.org/publications/elibrary-page/?id=12205
Stefani, E./ Lauke, K. (2010). Music, Space and Theatre: Site-specific approaches to multichannel spatialisation. Organised Sound, 15, 251-259. https://doi.org/10.1017/S1355771810000270
Stockhausen, K. (1955–56). Gesang der Jünglinge.
Stone, K. (1963). Review of Karlheinz Stockhausen: Gesang der Jünglinge (1955/56), by K. Stockhausen [Karlheinz Stockhausen: Gesang der Jünglinge (1955/56)]. The Musical Quarterly, 49(4), 551-554. http://www.jstor.org/stable/740590
Timmermans, H. (2015). Sound spatialisation from a composer's perspective. 2015 IEEE 2nd VR Workshop on Sonic Interactions for Virtual Environments (SIVE), 1-5. https://doi.org/10.1109/SIVE.2015.7361286
Truax, B. (2012a). Music, soundscape and acoustic sustainability. Moebius Journal, 1(1), 1-16.
Truax, B. (2012b). Sound, listening and place: The aesthetic dilemma. Organised Sound, 17(3), 193-201.
Tutschku, H. (2024a). Hans Tutschku: Special Lecture. Talk and concert at the University of the Arts Tokyo. Retrieved December 2, 2024 from https://gotolabedu.geidai.ac.jp/en/hans_tutschku_en/
Tutschku, H. (2024b). Provenance – émergence. Retrieved December 2, 2024 from https://tutschku.com/provenance-emergence/
Whalley, I. (2015). Developing Telematic Electroacoustic Music: Complex networks, machine intelligence and affective data stream sonification. Organised Sound, 20(1), 90-98.
Wilson, S./ Harrison, J. (2010). Rethinking the BEAST: Recent developments in multichannel composition at Birmingham ElectroAcoustic Sound Theatre. Organised Sound, 15, 239-250. https://doi.org/ 10.1017/S1355771810000312
Xenakis, I. (1971). Persepolis. On Persepolis.
Yamaguchi, K. (1978). Design of a new auditorium using electroacoustic technology. Journal of the Acoustical Society of America, 64. https://doi.org/10.1121/1.2003649
Yardumian, A. (2024). The Iranian Context of Iannis Xenakis’s Persepolis. Meta-Xenakis.
1. https://www.youtube.com/watch?v=LcCs6Muljmk
2. https://www.youtube.com/watch?v=bUT5hONK7Bw
3. https://tutschku.com
4. https://www.curtisroads.net
5. https://joachimheintz.net
6. https://www.panayiotiskokoras.com
Received: September 20, 2024
Reviewed: October 26 and November 14, 2024
Accepted: November 20, 2024