On the 25th of November this year, I was invited to give a short talk and demonstration at the Digital Arts Symposium in Edinburgh about the latest developments of my PhD research. I presented my idea of dynamical infrastructures and multi-adaptivity - probably one of the most important aspects of my research - which I will briefly introduce in this blog post.

Generally speaking, the term “adaptive” refers to interacting agents that, individually or collectively, can change their state in response to variations in the environment or in other interconnected agents. These changes can take place in the short-term, where the state of the agents is temporarily affected, or in the long-term, with permanent or long-lasting variations in their states. [Mitchell 2016] In the field of complex adaptive systems, the term refers to a more specific behaviour, namely that of systems which are capable of changing their state in response to the environment or context in order to maintain a particular condition (to survive, for example) or to improve themselves (to reach a goal or target, for example).

Here, I will use the term in a more general sense, referring to systems which are able to change their state based on the specific context that they experience at any given time. I prefer the term “context” over “environment” so as to include both systems which are structurally coupled with an environment and closed systems which are coupled with themselves, without an external environment. In both cases, I am referring to recursive systems, that is, systems which provide the context that, circularly, affects their states.

First of all, it is necessary to make a distinction between time-invariant and time-variant systems. In simple terms, a time-invariant system is one which performs the same operation at all times. [Smith 2007] A time-variant system, conversely, is one whose operation changes over time. Another important distinction, strictly related to the one above, is that between dynamical and adaptive systems. The output of a dynamical system changes over time, but the internal state of its agents may remain unaffected. An adaptive system, on the other hand, is one in which both the output and the internal state change over time. As a practical example, we can consider an analogue mixer in a feedback configuration. Certain parameter set-ups may result in an output that, to some extent, changes over time, although the parameters of the mixer themselves remain static. By contrast, a simple example of an analogue adaptive system could be a voltage-controlled filter in a feedback configuration: the output of the system (the context, in this case) changes the state of the filter, which, in turn, affects the output.
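To make the distinction more concrete, here is a minimal Python sketch, purely illustrative and not derived from the analogue examples above: in the first loop, the feedback path has a fixed filter coefficient, so only the output evolves over time; in the second, the output itself drives the coefficient, so the internal state of the process changes with its own context.

```python
# Illustrative sketch only: a feedback loop with a static parameter (dynamical)
# versus one where the output modulates a parameter of the process (adaptive).
import numpy as np

sr = 48000
n = sr
excitation = np.zeros(n)
excitation[0] = 1.0  # single impulse to set the loop in motion

def one_pole_lp(x, state, coeff):
    """One-pole low-pass; smaller coeff means a darker, slower response."""
    return state + coeff * (x - state)

# Dynamical system: feedback loop, but the filter coefficient is fixed.
y_dyn = np.zeros(n)
state, fb = 0.0, 0.0
for i in range(n):
    state = one_pole_lp(excitation[i] + 0.99 * fb, state, 0.1)
    fb = np.tanh(state)          # saturation keeps the loop bounded
    y_dyn[i] = fb

# Adaptive system: the output (the context) drives the filter coefficient,
# which in turn shapes the output, closing the circular dependency.
y_ada = np.zeros(n)
state, fb = 0.0, 0.0
for i in range(n):
    coeff = 0.01 + 0.2 * abs(fb)  # internal state now depends on the output
    state = one_pole_lp(excitation[i] + 0.99 * fb, state, coeff)
    fb = np.tanh(state)
    y_ada[i] = fb
```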

While some interesting results can be achieved with dynamical systems, adaptive systems are more likely to generate behaviours which exhibit greater long-term variety and complexity. Digital signal processing and audio programming provide very versatile tools for the implementation of time-variant adaptive systems in the domain of sound: ideally, provided that stability is taken into account, all variables in a DSP unit can be driven by audio signals and can thus vary at audio rates. That way, the generated sounds and the states of the components can affect each other, making the system adaptive and time-variant. Practitioners like Di Scipio, myself and others make extensive use of this approach for the implementation of such systems. A typical procedure is to perform several kinds of analysis on the input, such as RMS and brightness estimation, to obtain infrasonic signals. These signals are mapped to specific ranges, often chosen according to their perceptual characteristics and to the domains of the variables in the processing units, and are then used to control the state of the components in a large network. (See Di Scipio’s seminal text from 2003 for a detailed discussion of this method.) Using infrasonic signals to pilot these variables is highly desirable if not necessary, for high-rate, sudden changes in the DSP parameters would produce an output with a continuously broad spectrum, and it would not be possible to perceive the state variations in the long-term.
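As a rough illustration of this procedure (a sketch only, not the actual real-time implementation; the analysis window, input and parameter ranges are arbitrary), an infrasonic envelope is extracted from the input and linearly mapped onto the domain of a DSP variable:

```python
# Sketch of the general mapping procedure: analysis -> infrasonic signal ->
# linear mapping onto the range of a DSP variable. Values are illustrative.
import numpy as np

sr = 48000

def rms_follower(x, window=4800):
    """Running RMS over a sliding window: a slowly varying, infrasonic signal."""
    squared = np.convolve(x**2, np.ones(window) / window, mode="same")
    return np.sqrt(squared)

def map_range(sig, in_lo, in_hi, out_lo, out_hi):
    """Linear mapping of a control signal onto the domain of a DSP variable."""
    norm = np.clip((sig - in_lo) / (in_hi - in_lo), 0.0, 1.0)
    return out_lo + norm * (out_hi - out_lo)

x = np.random.uniform(-1, 1, sr) * np.linspace(0, 1, sr)  # stand-in input
envelope = rms_follower(x)                                 # infrasonic control signal
cutoff = map_range(envelope, 0.0, 0.7, 100.0, 4000.0)      # e.g. a filter cutoff in Hz
```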

The sound analysis algorithms implemented, the specific connections between control signals and variables, the linear and nonlinear mapping strategies used: all these elements determine the infrastructure of a system. In a large network, these elements can already provide a high number of configurations and an even larger number of possible states that the system can reach. Theoretically, this could be considered a guarantee of good variety and complexity in the long-term behaviour of a system, although in practice things tend to be rather different from the ideal scenario. In my experience, the realisation of an autonomous music system which exhibits convincing variety and complexity over a relatively long time span has proved difficult to achieve, even when implementing large and articulated networks. Variety and complexity are convincing when, in the long-term, there is a non-trivial interplay between order and disorder, redundancy and entropy, sound and silence, repetition and surprise, as well as homogeneity and heterogeneity in the sonic characteristics of the output. Ultimately, these all contribute to creating a behaviour which is expressive, musical and organic. And could an adaptive system with these features be considered alive and intelligent? This is an important question which I will discuss in my thesis and, perhaps, in another blog post too.

Partly inspired by the interface that I implemented to perform my LIES (sysmap) 1, I thought that the autonomy of a system could be improved by making its infrastructure dynamical, that is, time-variant, resulting in different adaptive modalities. Based on the elements that characterise the infrastructure of a system, I decided to build a prototype where the ranges in the mapping functions between control signals and DSP variables change over time according to the input sound of each node. The prototype itself is based on the network of my LIES (sysmap) 1 project, although I had to reduce the number of nodes in the network to make up for the extra CPU load introduced by the generation of the new control signals. I called the prototype SD/OS (impulse/dynadapt), and it is a work for machine solo performance implementing an impulse-triggered, self-oscillating, closed system.
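To give an idea of what a dynamical infrastructure means here, the following sketch (with assumed names and time constants, not the actual SD/OS code) derives the endpoints of a mapping range from a much slower analysis of the node's input, so that the same control signal lands on a range that drifts over time:

```python
# Illustrative sketch of a dynamical infrastructure in the sense used here:
# the endpoints of the mapping range are themselves derived from a (much
# slower) analysis of the node's input. All values are assumptions.
import numpy as np

def smooth(x, coeff):
    """One-pole smoother; smaller coeff = slower, more infrasonic signal."""
    out = np.zeros_like(x)
    state = 0.0
    for i, v in enumerate(x):
        state += coeff * (v - state)
        out[i] = state
    return out

sr = 48000
x = np.abs(np.random.uniform(-1, 1, sr * 2))   # stand-in node input (rectified)

control = smooth(x, 1e-3)        # control signal driving a DSP variable
slow    = smooth(x, 1e-5)        # much slower signal shaping the range itself

# Time-variant range endpoints derived from the slower signal.
range_lo = 100.0 + 400.0 * slow          # e.g. lower bound of a cutoff, in Hz
range_hi = 1000.0 + 6000.0 * slow        # upper bound

# The same control signal now lands on a range that drifts with the input.
cutoff = range_lo + control * (range_hi - range_lo)
```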

The network has six nodes, each containing two cascaded units, carrying out the following processes: granulation, comb filtering, variable high-pass/low-pass filtering (basically raw IIR filters), pulse-width modulation, sampling and reverberation. We have a quasi-full network topology where the output of each node is routed to the input of all other nodes, and stability is obtained by using look-ahead limiters. It is beyond the scope of this post to discuss the technical details of the DSP units, so suffice it to say that the units have several structural counterbalancing mechanisms and that all time-varying variables depend on the control signals extracted from the input. Furthermore, as mentioned earlier, the ranges which determine how the variables change in relation to the control signals are themselves affected by the incoming sound.
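The sketch below only illustrates the topology, with placeholder node processing and a hard clip standing in for the look-ahead limiters; the actual DSP units are not reproduced:

```python
# Rough sketch of the quasi-full topology: each node's output feeds the input
# of every other node (no self-connection). Node internals are placeholders.
import numpy as np

n_nodes = 6
block = 512

def node_process(x, idx):
    """Placeholder per-node processing; the real nodes contain two cascaded DSP units."""
    return np.tanh(x * (1.0 + 0.1 * idx))

# Routing matrix: 1 where node j feeds node i, 0 on the diagonal.
routing = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)

outputs = np.zeros((n_nodes, block))
outputs[0, 0] = 1.0                           # impulse-trigger the closed network
for _ in range(100):                          # iterate the feedback network block by block
    inputs = routing @ outputs / (n_nodes - 1)
    outputs = np.array([node_process(inputs[i], i) for i in range(n_nodes)])
    outputs = np.clip(outputs, -1.0, 1.0)     # crude stand-in for look-ahead limiting
```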

For this system, the method used to generate the control signals is different from the one described above. Rather than calculating the RMS or brightness, I am simply low-passing the input signal to slow it down, with a cutoff as low as ~0.1 Hz. I then normalise it to keep it roughly in the [-1;1] range, and finally use it to pilot the frequency of a phasor, an oscillator which linearly cycles through values in the [0;1] range. Moreover, the low-passed signal is raised to a relatively large power to force it towards 0 and limit the amount of variation. This way, instead of one and only one value being generated for a particular input, the output of this algorithm also depends on time, namely on how long a certain input is sustained. Of course, this introduces some degree of opacity in the way the system functions. One reason is what I have just described, the fact that both input and time affect the output; the other is that low-passing simply averages a signal, which does not have a precise perceptual correlate such as brightness or loudness. This algorithm, indeed, could be considered a black box, but the system is still entirely deterministic and dependent on the context. In fact, triggering the system with the same initial conditions would produce the same stream of samples each time, while slightly different inputs would result in entirely new formal developments.
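The following sketch outlines this control-signal chain; the details not specified above (the exact power, the frequency scaling of the phasor, the normalisation method) are assumptions made only for illustration:

```python
# Sketch of the control-signal generation: slow low-pass (~0.1 Hz), rough
# normalisation, power shaping, then a phasor piloted by the result.
import numpy as np

sr = 48000

def one_pole_lp(x, cutoff, sr):
    """Very slow one-pole low-pass used to average the input signal."""
    coeff = 1.0 - np.exp(-2.0 * np.pi * cutoff / sr)
    out = np.zeros_like(x)
    state = 0.0
    for i, v in enumerate(x):
        state += coeff * (v - state)
        out[i] = state
    return out

def phasor(freq, sr):
    """Oscillator cycling linearly through [0, 1) at a (possibly varying) frequency."""
    return np.cumsum(freq / sr) % 1.0

x = np.random.uniform(-1, 1, sr * 2)            # stand-in input signal
slow = one_pole_lp(x, 0.1, sr)                  # ~0.1 Hz low-pass
norm = slow / (np.max(np.abs(slow)) + 1e-9)     # rough normalisation to [-1; 1]
shaped = norm ** 7                              # odd power keeps the sign, squashes towards 0
ctl = phasor(shaped * 2.0, sr)                  # shaped signal pilots the phasor frequency (scaling assumed)
```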

Other implementations which I am planning to explore will make the infrastructures dynamical by reconfiguring the connections between variables and control signals when using perceptually-related analysis algorithms, or by interpolating among different analysis algorithms while keeping the same connections. I will also explore the idea of meta-control signal processing, where control signals affect the variables of the algorithms generating other control signals, thus making them time-variant. In general, the time-variant systems approach could be pushed even further, extending the idea of dynamical infrastructures to that of dynamical nodes and dynamical topologies. This means that each node would morph through several processing techniques (reverberation, granulation, filtering, and so forth), and that the way the nodes are interconnected would vary. In this situation, all the characterising elements of a system would change over time, realising the system of systems paradigm even more profoundly.

Currently, I am also working on a high-level audio analysis algorithm which provides a complexity estimation for extended sound events. It is in some respects inspired by the recurrence quantification analysis technique, and it processes and combines the outputs of four low-level analysis units: RMS, brightness, noisiness and amount of transients. I will discuss this algorithm in the next blog post; it will be integrated with the aforementioned autonomous systems to establish a perceptual correlation between control signals and time-variant agents.

At the following link, it is possible to listen to an audio example of the SD/OS (impulse/dynadapt) prototype described in this post: https://soundcloud.com/dario-sanfilippo/sdos-dynadapt-iir-1