The research conducted in MuTant is devoted both to leveraging the capabilities of musical interaction between humans and computers, and to developing tools that foster the authoring of interaction and time in computer music. Our research program departs from the interactive music systems for computer music composition and performance introduced in the mid-1980s at Ircam. Within this paradigm, the computer is brought into the cycle of musical creation as an intelligent performer, equipped with a listening machine capable of analyzing, coordinating and anticipating its own and other musicians' actions within a musically coherent and synchronous context. Figure illustrates this paradigm. The use of interactive music systems has become widespread ever since, and their practice has not ceased to nourish multidisciplinary research. From a research perspective, an interactive music system deals with two problems: realtime machine listening, or music information retrieval from musicians on stage, and music programming paradigms, reactive to the realtime recognition and extraction. Whereas each field has generated its own literature, few attempts have been made to address the global problem by putting the two domains in direct interaction.
In modern practices, the computer's role goes beyond rendering pre-recorded accompaniments: these are replaced by concurrent, synchronous and realtime programs defined during the compositional phase by artists and programmers. This context is commonly referred to as Machine Musicianship, where the computer does not blindly follow the human but instead has a high degree of musical autonomy and competence. In this project, we aim at developing computer systems and languages to support real-time intelligent behavior for such interactions.
MuTant's research program lies at the intersection and union of two themes, often considered as disjoint but inseparable within a musical context:
Realtime music information retrieval and processing
Synchronous and realtime programming for computer music
When human listeners are confronted with musical sounds, they rapidly and automatically find their way in the music. Even musically untrained listeners have an exceptional ability to make rapid judgments about music from short examples, such as determining music style, performer, and beat, as well as specific events such as instruments or pitches. Endowing computer systems with similar capabilities requires advances both in music cognition and in analysis and retrieval systems employing signal processing and machine learning.
In a panel session at the 13th National Conference on Artificial Intelligence in 1996, Rodney Brooks (a noted figure in robotics) remarked that while automatic speech recognition was a highly researched domain, little work had attempted to build machines able to understand “non-speech sound”. He went on to name this one of the biggest challenges faced by Artificial Intelligence . More than 15 years have passed. Systems now exist that are able to analyze the contents of music and audio signals, and communities such as the International Symposium on Music Information Retrieval (MIR) and Sound and Music Computing (SMC) have formed. But we still lack reliable real-time machine listening systems.
The first thorough study of machine listening appeared in Eric Scheirer's PhD thesis at the MIT Media Lab in 2001, with a focus on low-level listening such as pitch and musical tempo, paving the way for a decade of research. Since Scheirer's work, the literature has focused on task-dependent methods for machine listening such as pitch estimation, beat detection, structure discovery and more. Unfortunately, the majority of existing approaches are designed as off-line methods or for information retrieval on large databases. Whereas the very act of listening is real-time, very little literature exists on supporting real-time machine listening. This becomes clearer when looking at the yearly Music Information Retrieval Evaluation eXchange (MIREX), with its many retrieval tasks and systems submitted by international institutions, where almost no emphasis is put on real-time machine listening. Most MIR contributions focus on off-line approaches to information retrieval (where the system has access to future data), with less focus on on-line and realtime approaches to information decoding.
On another front, most MIR algorithms suffer from poor modeling of the temporal structures and temporal dynamics specific to music: most have roots in speech or biological sequence processing, without correct adaptation to temporal streams such as music. Despite tremendous progress using modern signal processing and statistical learning, much remains to be done to achieve, on music data, the level of abstract understanding reached for example in text and image analysis. On the other hand, it is important to notice that even untrained listeners easily capture many aspects of formal and symbolic structure from an audio stream in realtime. Realtime machine listening is thus still a major challenge for the sciences of the artificial, one that should be addressed on both applicative and theoretical fronts.
In the MuTant project, we focus on realtime and online methods for retrieving music information from audio signals. One of the primary goals of such systems is to fill the gap between the signal representation and the symbolic information (such as pitch, tempo, expressivity, etc.) contained in music signals. MuTant's current activities focus on two main applications: score following, or realtime audio-to-score alignment, and realtime transcription of music signals, with impact both on signal processing using machine learning techniques and on their application in real-world scenarios.
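To make the on-line constraint concrete, the sketch below reduces a score follower to its bare bones: a left-to-right HMM over score positions, updated one pitch observation at a time, so the position estimate never depends on future data. All names and parameter values are illustrative assumptions; Antescofo's actual listening machine is considerably more sophisticated.

```python
import math

def make_follower(score_pitches, p_stay=0.6, sigma=1.0):
    """Minimal online score follower: a left-to-right HMM over score
    positions, updated one audio-derived pitch observation at a time.
    (Illustrative sketch only, not Antescofo's listening machine.)"""
    n = len(score_pitches)
    log_alpha = [0.0] + [-math.inf] * (n - 1)  # start on the first event

    def emit(pos, observed_pitch):
        # Gaussian log-likelihood of the observation given the score pitch
        d = observed_pitch - score_pitches[pos]
        return -0.5 * (d / sigma) ** 2

    def step(observed_pitch):
        nonlocal log_alpha
        new = [-math.inf] * n
        for i in range(n):
            stay = log_alpha[i] + math.log(p_stay)
            adv = log_alpha[i - 1] + math.log(1 - p_stay) if i > 0 else -math.inf
            new[i] = max(stay, adv) + emit(i, observed_pitch)
        log_alpha = new
        return max(range(n), key=lambda i: log_alpha[i])  # current position

    return step

follow = make_follower([60, 62, 64, 65])            # MIDI pitches of the score
positions = [follow(p) for p in (60, 60, 62, 64, 64, 65)]
# positions -> [0, 0, 1, 2, 2, 3]
```

Each call to `follow` consumes one observation and immediately returns the decoded score position, mirroring the causal nature of real-time decoding.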
The second aspect of an interactive music system is to react to extracted high-level and low-level music information based on pre-defined actions. The simplest scenario is automatic accompaniment, delegating the interpretation of one or several musical voices to a computer, in interaction with one or more live musicians. The most popular form of such systems is the automatic accompaniment of an orchestral recording to a soloist in the classical music repertoire (concertos, for example). In the larger context of interactive music systems, the “notes” or musical elements of the accompaniment are replaced by “programs” that are written during the compositional phase and evaluated in realtime in reaction and relative to the musicians' performance. The programs in question can range from sound playback to realtime sound synthesis by simulation of physical models and realtime transformation of the musician's audio and gesture.
Such musical practice is commonly referred to as the realtime school in computer music. It developed naturally with the invention of the first score following systems, and led to the first prototypes of realtime digital signal processors and their successors , and to the realtime graphical programming environment Max for their control at Ircam. With the advent and availability of DSPs in personal computers, the integrated realtime event and signal processing graphical language MaxMSP was developed at Ircam; it is today the worldwide standard platform for realtime interactive arts programming. This approach to music making was first formalized by composers such as Philippe Manoury and Pierre Boulez, in collaboration with researchers at Ircam, and soon became a standard in musical composition with computers.
Beyond realtime performance and implementation issues, little work has addressed the formal aspects of such practices in realtime music programming, in line with the long and rich tradition of musical notation. Recent progress has convinced both the research and artistic communities that this programming paradigm is close to synchronous reactive programming languages, with concrete analogies between the two: parallel synchrony and concurrency correspond to musical polyphony, periodic sampling to rhythmic patterns, and hierarchical structures to micro-polyphonies, along with demands for novel hybrid models of time, among others. Antescofo is an early response to such demands that calls for further exploration and study.
Within the MuTant project, we propose to tackle this aspect of the research within two consecutive lines:
Development of a Timed and Synchronous DSL for Real-Time Musician-Computer Interaction: The design of relevant time models and dedicated temporal interaction mechanisms is integrated into the ongoing development of the Antescofo language. The new tools are validated in the production of new musical pieces and other musical applications. This work is performed in strong coupling with composers and performers. The PhD work of José Echeveste (computer science) and Julia Blondeau (composition) takes place in this context.
Formal Methods: Failure during an artistic performance must be avoided. This naturally leads to the use of formal methods, such as static analysis, verification or test generation, to ensure formally that Antescofo programs will behave as expected on stage. The checked properties may also assist the composer, especially in the context of “non-deterministic scores” in an interactive framework. The PhD of Clément Poncelet is devoted to these problems.
An operating system shields the computer hardware from all other software; it provides a comfortable environment for program execution and prevents offensive use of the hardware by providing various services related to essential tasks. However, integrating discrete and continuous multimedia data demands additional services, especially for real-time processing of continuous media such as audio and video. To this end, interactive systems are sometimes referred to as off-the-shelf operating systems for real-time audio. The difficulty of providing correct real-time services has much to do with human perception: correctness for real-time audio is more stringent than for video, because the human ear is more sensitive to audio gaps and glitches than the human eye is to video jitter . Here we expose the foundations of existing sound and music operating systems and focus on their major drawbacks with regard to today's practices.
An important aspect of any real-time operating system is fault-tolerance with regard to short-time failures of continuous-media computation, delivery delays or missed deadlines. Existing multimedia operating systems are soft real-time, where missing a deadline does not necessarily lead to system failure, and have their roots in the pioneering work in . Soft real-time is acceptable in simple applications such as video-on-demand delivery, where an initial delay will not directly lead to critical consequences and can be compensated (the general scheme used for audio-video synchronization), but it has considerable consequences for interactive systems: timing failures heavily affect the inter-operability of models of computation, where incorrect ordering can lead to unpredictable and unreliable results. Moreover, interaction between computing and listening machines (both dynamic with respect to internal computation and to the physical environment) requires tighter and explicit temporal semantics, since interaction between the physical environment and the system can be continuous and not demand-driven.
Fulfilling the timing requirements of continuous media demands explicit use of scheduling techniques. As shown earlier, existing interactive music systems rely on combined event/signal processing. In real-time, scheduling glues the two engines together with the aim of timely delivery of computations between agents and components, from the physical environment, as well as to hardware components. The first observation when studying existing systems is that they all employ static scheduling, whereas interactive computing increasingly demands time-aware and context-aware dynamic methods. The scheduling mechanisms are aware of neither time nor the nature and semantics of the computations at stake. Computational elements are considered in a purely functional manner, and reaction and execution requirements are simply ignored. For example, Max's scheduling mechanism can delay message delivery when many time-critical tasks are requested within one cycle . SuperCollider uses an Earliest-Deadline-First (EDF) algorithm, and cycles can simply be missed . This situation leads to non-deterministic behavior built from deterministic components and poses great difficulties for the preservation of the underlying techniques, art pieces, and algorithms. The situation has become worse with the demand for nomad physical computing, where individual programs and modules are available but no action coordination or orchestration is proposed for designing integrated systems. System designers are penalized in the expressivity, predictability and reliability of their designs despite potentially reliable components.
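The EDF policy mentioned above can be sketched in a few lines. The simulation below is purely illustrative (it is not SuperCollider's actual scheduler): it shows how, under EDF dispatch, a task whose deadline cannot be met is simply dropped, i.e. its cycle is missed.

```python
import heapq

def edf_run(tasks, now=0.0):
    """Earliest-Deadline-First dispatch, illustrative sketch.
    Each task is (release, deadline, duration, name); a task that cannot
    finish before its deadline is dropped -- its cycle is missed."""
    pending = sorted(tasks)                  # ordered by release time
    ready, done, missed = [], [], []
    i = 0
    while i < len(pending) or ready:
        # move all released tasks into the ready heap, keyed by deadline
        while i < len(pending) and pending[i][0] <= now:
            rel, dl, dur, name = pending[i]
            heapq.heappush(ready, (dl, dur, name))
            i += 1
        if not ready:                        # idle until the next release
            now = pending[i][0]
            continue
        dl, dur, name = heapq.heappop(ready)
        if now + dur > dl:                   # deadline cannot be met
            missed.append(name)
        else:
            now += dur                       # run to completion
            done.append(name)
    return done, missed

done, missed = edf_run([
    (0.0, 5.0, 2.0, "audio"),   # release, deadline, duration, name
    (0.0, 3.0, 1.0, "midi"),
    (1.0, 2.5, 2.0, "gui"),     # too long to meet its deadline
])
# done -> ["midi", "audio"], missed -> ["gui"]
```

The scheduler here knows only deadlines, not the semantics of the computations, which is exactly the limitation criticized above.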
Existing systems have been successful at programming and executing small systems comprising a few programs. However, severe problems arise when scaling from program to system level for moderate or complex programs, leading to unpredictable behavior. Computational elements are considered as functions, and reaction and execution requirements are simply ignored. System designers have uniformly chosen to hide timing properties from higher abstractions, and despite its utmost importance in multimedia computing, timing becomes an accident of implementation. This situation, confusing for both artists and system designers, is quite similar to the one described in Edward Lee's seminal paper “Computing needs time”: “general-purpose computers are increasingly asked to interact with physical processes through integrated media such as audio. [...] and they don't always do it well. The technological basis that engineers have chosen for general-purpose computing [...] does not support these applications well. Changes that ensure this support could improve them and enable many others” .
Despite all these shortcomings, one of the main advantages of environments such as Max and PureData over other available systems, and probably the key to their success, is their ability to handle synchronous processes (such as audio or video delivery and processing) within an asynchronous environment (user and environmental interactions). Beyond this, multimedia service scheduling at large tends to move more and more towards computing, beyond mere on-time delivery. This raises the important question of hybrid scheduling of heterogeneous time and computing models in such environments, a subject that has received very little study in multimedia processing but has been studied in areas such as simulation. We hope to address this issue scientifically, first by an explicit study of the current challenges in the domain, and second by proposing appropriate methods for such systems. This research takes place in the three-year ANR project INEDIT, coordinated by the team leader (started in September 2012).
The combination of realtime machine listening systems and reactive programming paradigms has enabled both the authoring of interactive music systems and their realtime performance within a coherent synchronous framework called Antescofo. The module, developed since 2008 by team members, has gained increasing attention within the user community worldwide, with more than 40 prestigious public performances yearly. The outcomes of the team's research will enhance the interactive and reactive aspects of this emerging paradigm as well as create novel authoring tools for such purposes.
The AscoGraph authoring environment, started in 2013 and shown in Figure , is the first step towards such authoring environments. The outcomes of the ANR project INEDIT (with LABRI and GRAME, coordinated by the team leader) will further extend the use-cases of Antescofo to interactive multimedia pieces with more complex temporal structures and computational paradigms.
Realtime music information retrieval serves as a front-end for various applications requiring sonic interaction between software/hardware and the physical world. MuTant has focused on realtime machine listening since its inception and holds state-of-the-art algorithms for realtime alignment of audio to symbolic scores, realtime tempo detection, and realtime multiple-pitch extraction. Recent results have pushed our applications towards more generalized listening schemes beyond music signals, as reported in .
Technologies developed by MuTant can find their way to the general public (besides professional musicians) and into the entertainment industry. Recent trends in the music industry show a tendency towards more intelligent and interactive interfaces for music applications. Among them are reactive and adaptive automatic accompaniment and performance assessment, as commercialized by companies such as MakeMusic. Technologies developed around Antescofo can enhance the interaction between user and computer for such large-public applications.
Highlights in 2014 include a collaboration with the Orchestre de Paris archives that resulted in a prototype demonstrated to the public in June 2014.
We will pursue this by licensing our technologies to third-party companies.
Antescofo is a modular polyphonic score following system as well as a synchronous programming language for musical composition. The module allows automatic recognition of the position in the music score and of the tempo from a realtime audio stream coming from performer(s), making it possible to synchronize an instrumental performance with computer-realized elements. The synchronous language within Antescofo allows flexible writing of time and interaction in computer music.
A completely new version of Antescofo was released in November 2014 on Ircam Forumnet. This version includes major improvements over the previous one as a result of MuTant's research and development. Its development has benefited from intensive interactions with composers, especially Julia Blondeau, José-Miguel Fernandez, and Marco Stroppa.
One major and notable improvement is a complete overhaul of Antescofo's realtime machine listening as a result of , which allows robust recognition in highly polyphonic and noisy environments with the help of the novel notion of Probabilistic Time Coherency. This improvement allowed team members to participate in collaborative work with the Orchestre de Paris, among others.
The 2014 release of Antescofo also includes new anticipative synchronization strategies. In the context of the PhD of José Echeveste, they are systematically studied with the help of the Orchestre de Paris and the composer Marco Stroppa.
The new internal architecture unifies the handling of external (musical) events and the handling of internal (logical) events in a framework able to manage multiple time frames (relative, absolute or computed). The notion of synchronization has been extended to be able to synchronize on the update of a variable in addition to the update of the listening machine. This mechanism offers new opportunities, especially in the context of open scores and improvised music .
The 2014 version targets the Max and PureData (Pd) environments on Mac, as well as Linux and Windows (Pd version). A standalone version is used to simulate a performance in AscoGraph and offers batch processing capabilities as well as a testing framework.
The Antescofo programming language can be extended to visual programming to better integrate existing scores and to allow users to construct complex, embedded temporal structures that are not easily expressed as text. This project has been running since October 2012 thanks to Inria ADT support.
AscoGraph, the Antescofo graphical score editor released in 2013, provides an autonomous Integrated Development Environment (IDE) for the authoring of Antescofo scores. The Antescofo listening machine, as it moves forward in the score during recognition, uses the message-passing paradigm to perform tasks such as automatic accompaniment, spatialization, etc. The Antescofo score is a text file containing the events (chords, notes, trills, ...) to follow, synchronization strategies describing how to trigger actions, and electronic actions (the reactive language). The editor shares its score parsing routines with the Antescofo core, so the validity of the score is checked on saving while editing in AscoGraph, with proper handling of parsing errors. Graphically, the application is divided into two parts (see Figure ). The left side shows a graphical representation of the score, using a timeline with a track view. The right side displays a text editor with syntax coloring of the score. Both views can be edited and are synchronized on saving. Special objects such as "curves" are graphically editable: they provide high-level variable automation facilities such as breakpoint functions (BPF), with more than 30 possible interpolation types between points.
An important feature of AscoGraph is score import from MusicXML or MIDI files, which makes the complete workflow of composing a musical piece much easier than before.
AscoGraph is strongly connected with the Antescofo core object (using OSC over UDP): when a score is edited and modified, it is automatically reloaded in Antescofo; conversely, when Antescofo follows a score (during a concert or rehearsal), both the graphical and textual views of the score scroll to show Antescofo's current position.
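The wire format of such an OSC-over-UDP link can be illustrated with a minimal message encoder (type tags, 4-byte padding, big-endian payload). The address, argument and port below are hypothetical examples, not Antescofo's actual OSC namespace.

```python
import socket, struct

def osc_message(address, *args):
    """Encode a minimal OSC message supporting strings, ints and floats.
    Illustrative sketch of the OSC wire format, not AscoGraph's code."""
    def s(x):
        # OSC string: null-terminated, padded to a multiple of 4 bytes
        b = x.encode()
        return b + b"\x00" * (4 - len(b) % 4)
    tags, payload = ",", b""
    for a in args:
        if isinstance(a, int):
            tags += "i"; payload += struct.pack(">i", a)
        elif isinstance(a, float):
            tags += "f"; payload += struct.pack(">f", a)
        else:
            tags += "s"; payload += s(a)
    return s(address) + s(tags) + payload

# Hypothetical address and score name, for illustration only.
msg = osc_message("/antescofo/loadscore", "piece.asco.txt")
# To actually send it over UDP (port is a made-up example):
# socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(msg, ("127.0.0.1", 5678))
```

Every field is padded to 4 bytes, so a well-formed message always has a length divisible by 4.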
AscoGraph is released under the open-source MIT license and was released publicly along with the new Antescofo architecture during the IRCAM Forum 2013. Recent developments were published in .
The frequent use of Antescofo in live public performances with human musicians implies strong requirements of temporal reliability and robustness to unforeseen errors in the input. To address these requirements and help both the development of the system and the authoring of pieces by users, we are developing a platform for automated testing of the behavior of Antescofo on a given score, with a focus on timed behavior. It makes it possible to automate the following main tasks:
(1) generation of relevant input data for testing, with the aim of exhaustiveness,
(2) computation of the corresponding expected output, according to a formal specification of the expected behavior of the system on a given mixed score,
(3) black-box execution of the input test data,
(4) comparison of expected and real output and production of a test verdict.
The input and output data are timed traces (sequences of events together with inter-event durations).
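A simplified version of the final comparison step, Task (4), can be sketched as follows. The trace format and the tolerance value are illustrative assumptions, not the platform's actual conventions.

```python
def conformance(expected, actual, tol=0.05):
    """Compare an expected timed trace with the observed one.
    A trace is a list of (event, delay-since-previous-event) pairs; the
    verdict is 'pass' iff events match in order and each inter-event
    duration is within a tolerance (hypothetical value here)."""
    if len(expected) != len(actual):
        return "fail: length mismatch"
    for (e_ev, e_dt), (a_ev, a_dt) in zip(expected, actual):
        if e_ev != a_ev:
            return f"fail: expected {e_ev}, got {a_ev}"
        if abs(e_dt - a_dt) > tol:
            return f"fail: timing of {e_ev} off by {abs(e_dt - a_dt):.3f}s"
    return "pass"

verdict = conformance(
    [("NOTE c4", 0.0), ("action1", 0.5), ("action2", 1.0)],
    [("NOTE c4", 0.0), ("action1", 0.52), ("action2", 1.21)],
)
# verdict -> "fail: timing of action2 off by 0.210s"
```

A real verdict procedure must of course account for tempo-relative delays, which is precisely what makes the IMS case original (see below in this report).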
Our platform uses state-of-the-art techniques and tools for model-based testing of embedded systems . The environment of the system (the musicians) and the expected behavior of the system are both represented by formal models. We have developed a compiler for producing such models automatically, in an intermediate representation language (IR), from mixed scores. The IR models are in turn converted into timed automata and passed to the model-checker Uppaal.
Uppaal, with its extension Cover, is used for the generation Task (1) above. Following coverage criteria, this tool makes a systematic exploration of the state space of the model. We also propose an alternative approach for the generation of input traces, by fuzzing an ideal trace obtained from the score (a trace representing a perfectly timed performance of the score).
Task (2) is also performed by Uppaal, by simulation, using the model of the system and the generated test input.
Moreover, we have implemented several tools for Tasks (3) and (4), corresponding to different boundaries for the implementation under test (black box): e.g. the interpreter of Antescofo's synchronous language alone, or with tempo detection, or the whole system.
Acoustical Society of America Best Paper Award for .
International Computer Music Conference (ICMC) Best Presentation Award for .
MuTant TEDx Talk in October 2014 on Human-Computer Musicianship, which attracted more than 12,000 podcast views according to the organizers.
MuTant in CNRS's 2nd edition of “Les Fondamentales” Science and Society event in Grenoble, in a session dedicated to Science and Music on the same Score.
MuTant participated in the 2014 edition of the Futur en Seine festival and showcased its collaboration with the Orchestre de Paris in a public event.
In the context of Philippe Cuvillier's PhD project, we aim at increasing the robustness of machine listening in situations where observations from the external environment are extremely noisy or incoherent.
Recent results propose new insight into the problem of duration modeling for information retrieval problems in which a discrete sequence of events is estimated from a time signal using Bayesian models. Since the duration of each event is unknown, a major issue is setting the right Bayesian prior on each of them. Hidden semi-Markov models (HSMM) allow choosing explicitly any probability distribution for the durations, but learning these statistically is a non-parametric problem. In the absence of huge training data sets, most algorithms rely on regularization techniques such as choosing parametric classes of distributions, but the justifications of such techniques are often heuristic.
Among the numerous application domains of HMM-like paradigms, music-to-audio alignment brings two interesting properties. Firstly, a music score specifies the ordering among events. Secondly, it assigns to each event a nominal duration. For alignment tasks, Markov models conveniently capture the ordering with transient chains, but the modeling of these nominal durations is a crucial and understudied problem. This work investigates the relationship of this prior information on durations with the Bayesian priors of an HSMM. Theoretical insights are obtained through the study of the prior state probability of transient semi-Markov chains. Whereas ergodic chains and their convergence to an equilibrium probability are well studied, transient chains constitute an understudied case of prime importance for real-time inference on HSMMs.
On the one hand, we prove that the non-asymptotic evolution of the state probability exhibits particular behaviors if the Bayesian priors fulfill several precise conditions derived from statistical properties such as the hazard rate and the tail decay. We then say that a model is time-coherent if the evolution of the state probability respects the information on ordering and nominal lengths. This leads to several prescriptions on the design of HSMM Bayesian priors. On the other hand, we obtain further prescriptions by comparing the Bayesian priors associated with different nominal lengths. This real-valued parameter comes with a natural ordering; we explain why this ordering among parameters is coherently modeled by specific stochastic orderings among distributions that are standard in statistics.
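The central quantity here, the prior state probability of a transient left-to-right semi-Markov chain, can be computed numerically in a short sketch (discrete time, explicit duration pmfs; the numbers are a made-up toy example):

```python
def state_probability(durations, horizon):
    """Occupancy probabilities of a transient left-to-right semi-Markov
    chain with explicit discrete duration distributions (pmf[d] is the
    probability that the state lasts d time steps). Returns
    occ[k][t] = P(chain is in state k at time t). Illustrative sketch."""
    n = len(durations)
    # arrival[k][t] = P(chain enters state k exactly at time t)
    arrival = [[0.0] * (horizon + 1) for _ in range(n + 1)]
    arrival[0][0] = 1.0
    for k, pmf in enumerate(durations):
        for t in range(horizon + 1):
            a = arrival[k][t]
            if a == 0.0:
                continue
            for d, p in enumerate(pmf):
                if t + d <= horizon:
                    arrival[k + 1][t + d] += a * p
    occ = [[0.0] * (horizon + 1) for _ in range(n)]
    for k, pmf in enumerate(durations):
        for t0 in range(horizon + 1):
            a = arrival[k][t0]
            if a == 0.0:
                continue
            for t in range(t0, horizon + 1):
                # still occupied at t iff the duration exceeds the age t - t0
                occ[k][t] += a * sum(pmf[t - t0 + 1:])
    return occ

# Event 1 lasts 1 or 2 steps with equal probability; event 2 lasts 1 step.
occ = state_probability([[0.0, 0.5, 0.5], [0.0, 1.0]], horizon=3)
```

Varying the duration pmfs in this toy computation shows directly how the shape of the priors (hazard rate, tail) governs the evolution of the state probability, the object of the prescriptions above.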
Intermediate results have been reported in , . This work allowed the development of Antescofo version 0.6, released in November 2014.
Audio segmentation, an essential step in many audio signal processing tasks, consists of dividing an audio signal into homogeneous chunks. Rather than separately finding change points and computing similarities between segments, we focus on joint segmentation and clustering, using the framework of hidden Markov and semi-Markov models. We introduced a new incremental EM algorithm for hidden Markov models (HMMs) and showed that it compares favorably to existing online EM algorithms for HMMs. Early experimental results on musical note segmentation and environmental sound clustering are promising and will be pursued further in 2015.
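To illustrate what joint segmentation and clustering means in the HMM framework, the toy sketch below fixes the model parameters (rather than learning them by EM as in our algorithm) and runs Viterbi decoding on a 1-D signal: each sample receives a cluster label, and the change points are exactly where the decoded state switches.

```python
import math

def viterbi_segment(signal, means, p_stay=0.9, sigma=0.5):
    """Joint segmentation and clustering with a fully connected HMM with
    Gaussian emissions: Viterbi decoding assigns each sample to a state.
    Toy illustration; parameters are given, not learned."""
    k = len(means)
    logp = lambda x, m: -0.5 * ((x - m) / sigma) ** 2
    trans = [[math.log(p_stay) if i == j else math.log((1 - p_stay) / (k - 1))
              for j in range(k)] for i in range(k)]
    delta = [logp(signal[0], m) for m in means]
    back = []
    for x in signal[1:]:
        prev = delta
        # best predecessor state for each current state j
        best = [max(range(k), key=lambda i: prev[i] + trans[i][j])
                for j in range(k)]
        back.append(best)
        delta = [prev[best[j]] + trans[best[j]][j] + logp(x, means[j])
                 for j in range(k)]
    path = [max(range(k), key=lambda j: delta[j])]
    for b in reversed(back):          # backtrack the optimal state sequence
        path.append(b[path[-1]])
    path.reverse()
    return path

labels = viterbi_segment([0.1, -0.2, 0.0, 2.1, 1.9, 2.2], means=[0.0, 2.0])
# labels -> [0, 0, 0, 1, 1, 1]: one change point, between samples 3 and 4
```

The sticky self-transition (`p_stay`) is what makes the decoding prefer long homogeneous segments over sample-by-sample cluster assignment.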
This project was carried out in the context of Alberto Bietti's MS project, co-supervised by Arshia Cont (MuTant) and Francis Bach (SIERRA).
In the context of the PhD of Clément Poncelet, and in relation to the developments presented in Section , we have been studying the application of model-based timed testing techniques to interactive music systems such as Antescofo.
Several formal methods have been developed for the automatic conformance testing of critical embedded software. The principle is to execute a real implementation under test (IUT) in a testing framework, as a black box, by sending it carefully selected inputs and then observing and analyzing its outputs. In conformance model-based testing (MBT), the inputs and corresponding expected outputs are generated according to formal models of the IUT and the environment. Models of timed automata with inputs and outputs, and tools such as the Uppaal suite, have been developed to extend such techniques to realtime systems , . Several procedures have been designed to address the task described in Section .
The case of IMS presents important originalities compared to other MBT applications to realtime systems. On the one hand, the time model supports several time units, including wall-clock time, measured in seconds, and the time of music scores, measured in numbers of beats relative to a tempo. This situation raises several new problems for the generation of test suites and their execution. On the other hand, the formal specification of the IUT's behavior on a given score is produced automatically by a score compiler, using an intermediate representation. We rely on the realistic hypothesis that a mixed score completely specifies the expected timed behavior of the IMS. Hence, our test method is fully automatic, in contrast with other approaches, which generally require experts to write the specification manually. This fits well in a music authoring workflow where scores in preparation are constantly evolving. We have been applying our tools to a small benchmark of characteristic scores, as well as to real mixed scores used in concerts, and some bugs in Antescofo have been identified. These results have been presented at the ICMC 2014 conference and will be presented at the 30th ACM/SIGAPP Symposium On Applied Computing, track Software Verification and Testing .
An important enhancement has been the introduction of an expressive temporal pattern language in Antescofo. Temporal patterns are used to define complex events that correspond to combinations of perceived events in the musical environment as well as arbitrary logical and metrical temporal conditions. The real-time recognition of such events is used to trigger arbitrary actions in the style of event-condition-action rules.
The semantics of temporal pattern matching is defined to parallel the well-known notions of regular expressions and Brzozowski derivatives, extended to handle an infinite alphabet, arbitrary predicates, elapsing time and inhibitory conditions.
Temporal patterns are implemented by translation into a core subset of the Antescofo domain specific language. This compilation has proven efficient enough to avoid the extension of the real-time runtime of the language and has been validated with composers in actual pieces.
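The flavor of derivative-style matching over timed events can be conveyed with a drastically simplified sketch: sequential patterns only, where each incoming timed event either advances the pattern (one "derivation" step), is ignored, or kills the match when a timing deadline has elapsed. The predicates and deadline values are made up; Antescofo's actual pattern engine is far richer.

```python
def seq(*steps):
    """A temporal pattern: an ordered list of (predicate, deadline) steps.
    The deadline bounds the time allowed since the previously matched
    event (an inhibitory timing condition). Illustrative sketch only."""
    return list(steps)

def match(pattern, events):
    """events: list of (value, timestamp) pairs. Returns the time at
    which the full pattern matched, or None."""
    remaining, last_t = list(pattern), None
    for value, t in events:
        if not remaining:
            break
        pred, deadline = remaining[0]
        if last_t is not None and t - last_t > deadline:
            return None                    # deadline elapsed: match killed
        if pred(value):                    # derivative: consume one step
            remaining.pop(0)
            last_t = t
    return last_t if not remaining else None

# Pattern: "a C4, then a G4 within 1 beat of it".
p = seq((lambda v: v == "C4", float("inf")),
        (lambda v: v == "G4", 1.0))
m1 = match(p, [("E4", 0.0), ("C4", 0.5), ("G4", 1.2)])   # -> 1.2 (matched)
m2 = match(p, [("C4", 0.0), ("G4", 2.0)])                # -> None (too late)
```

Unlike classical regular-expression derivatives over a finite alphabet, the steps here carry arbitrary predicates and the derivation is conditioned on elapsing time, which is the essence of the extension described above.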
In collaboration with Jean Bresson, we have extended the evaluation model of OpenMusic to integrate reactive capabilities . OpenMusic (OM) is a domain-specific visual programming language, based on Common Lisp, designed for computer-aided music composition. It allows composers to develop functional processes that generate or transform musical data. To extend OM towards reactive applications, we have proposed to integrate its demand-driven evaluation mechanism with reactive data-driven evaluation in a single, consistent visual programming framework. To this end, we have developed the first denotational semantics of the visual language, which accounts for its demand-driven evaluation mechanism and the incremental construction of programs. We have then extended this semantics to enable reactive computations in the functional graphs. The resulting language merges data-driven executions with the existing demand-driven mechanism. This integration allows for the propagation of changes in programs, and the evaluation of graphically designed functional expressions in response to external events, a first step towards bridging the gap between computer-assisted composition environments and real-time musical systems.
Rhythmic data are commonly represented by tree structures (rhythm trees) in computer-assisted composition environments such as OpenMusic, due to the theoretical proximity of such structures with traditional music notation. We are studying the application in this context of techniques and tools for processing tree structures, originally developed for other areas such as natural language processing, automatic deduction, and Web data. We are particularly interested in two well-established formalisms with solid theoretical foundations: term rewriting and tree automata.
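To make the representation concrete, here is a minimal sketch of an OpenMusic-style rhythm tree: a node divides its duration among its children in proportion to integer weights, and the leaves are the notated note durations. The `(weight, children)` encoding and the function name are illustrative assumptions, not OpenMusic's actual syntax.

```python
from fractions import Fraction

def durations(tree, total=Fraction(1)):
    """Flatten a rhythm tree into the list of its leaf durations.
    A tree is a (weight, children) pair; a leaf has no children."""
    weight, children = tree
    if not children:                       # leaf: a single note
        return [total]
    s = sum(w for (w, _) in children)      # total child weight
    out = []
    for child in children:
        w, _ = child
        out.extend(durations(child, total * Fraction(w, s)))
    return out

# One beat split into a triplet: each leaf lasts 1/3 of the beat.
triplet = (1, [(1, []), (1, []), (1, [])])
```

Nesting expresses compound figures: `(1, [(1, []), (1, [(1, []), (1, [])])])` notates a half followed by two quarters of the beat.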
The problem of rhythm transcription, or quantization, is to generate, from a timed sequence of notes (e.g. a file in MIDI format), a score in traditional music notation. The input events can come from a performance on a MIDI keyboard or be the result of a computation in OpenMusic. This problem admits no single correct answer: the system must be calibrated to the musical context, balancing the precision of the transcription against the simplicity and readability of the generated scores. For this purpose, we are developing, in collaboration with Slawek Staworko (LINKS, currently on leave at the University of Edinburgh), algorithms for finding optima in large sets of weighted trees (tree series) representing the possible solutions to a quantization problem. A prototype has been developed and is under evaluation on real case studies. For the construction of appropriate tree series, we are turning to semi-supervised systems, in which the composer's interactions guide the process. This work was presented in an invited talk at the workshop of the IFIP working group on term rewriting.
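The precision/readability trade-off at the heart of quantization can be illustrated with a toy example, far simpler than the weighted tree series described above: for a single beat, pick the grid subdivision that minimizes a weighted sum of alignment error and notation complexity. The cost weights and candidate grids are assumptions for illustration only.

```python
def quantize_beat(onsets, max_div=8, alpha=0.05):
    """onsets: note onsets within one beat, as floats in [0, 1).
    Returns (n, grid): the chosen subdivision and snapped onsets."""
    best = None
    for n in range(1, max_div + 1):
        # snap each onset to the nearest point of the n-division grid
        grid = [round(t * n) / n for t in onsets]
        error = sum(abs(g - t) for g, t in zip(grid, onsets))
        cost = error + alpha * n          # penalize fine subdivisions
        if best is None or cost < best[0]:
            best = (cost, n, grid)
    return best[1], best[2]

# A slightly uneven triplet snaps to a division by 3:
# quantize_beat([0.0, 0.32, 0.68])
```

The real problem optimizes over whole weighted tree series rather than a single flat grid, but the same tension between error and complexity drives the calibration.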
With Prof. Masahiko Sakai (Nagoya University, a specialist in term rewriting), we conduct complementary work on the representation of rhythmic notation. The goal is to define a structural theory of rhythm trees, expressed as equations over trees. This approach can be used, for example, to generate by transformation the different possible notations of a same rhythm, with the ability to select among them according to certain constraints.
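As one illustrative equation of this kind (a toy example, not the actual theory): dividing all sibling weights in a rhythm tree by their greatest common divisor leaves the denoted durations unchanged, so the two trees are two notations of the same rhythm. Trees are encoded here as `(weight, children)` pairs, an assumed encoding.

```python
from math import gcd
from functools import reduce

def normalize(tree):
    """Rewrite a rhythm tree by dividing sibling weights by their gcd.
    The result denotes the same durations with simpler figures."""
    w, children = tree
    if not children:
        return (w, [])
    kids = [normalize(c) for c in children]
    g = reduce(gcd, (cw for cw, _ in kids))
    return (w, [(cw // g, cc) for cw, cc in kids])

# (1 (2 4)) and (1 (1 2)) notate the same rhythm:
# normalize((1, [(2, []), (4, [])])) yields (1, [(1, []), (2, [])])
```

An oriented version of such equations gives a rewriting system whose normal forms can serve as canonical notations, which is the kind of selection-by-constraint mentioned above.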
Title: Interactivity in the Authoring of Time and Interactions
Project acronym: INEDIT
Type: ANR Contenu et Interaction 2012 (CONTINT)
Instrument: ANR Grant
Duration: September 2012 - September 2015
Coordinator: IRCAM (France)
Other partners: Grame (Lyon, France), LaBRI (Bordeaux, France).
Abstract: The INEDIT project aims to provide a scientific view of the interoperability between common tools for music and audio production, in order to open new creative dimensions coupling the authoring of time and the authoring of interaction. This coupling allows the development of novel dimensions in interacting with new media. Our approach lies within a formal language paradigm: an interactive piece can be seen as a virtual interpreter articulating locally synchronous temporal flows (audio signals) within globally asynchronous event sequences (discrete timed actions in interactive composition). Process evaluation then consists in responding reactively to signals and events from an environment, with heterogeneous actions coordinated in time and space by the interpreter. This coordination is specified by the composer, who should be able to express and visualize time constraints and complex interactive scenarios between media. To achieve this, the project focuses on the development of novel technologies: dedicated multimedia schedulers, runtime compilation, and innovative visualization and tangible interfaces based on augmented paper, allowing the specification and realtime control of authored processes. Among the scientific challenges posed by the INEDIT project is the formalization of temporal relations within a musical context, and in particular the development of a GALS (Globally Asynchronous, Locally Synchronous) approach to computing that would bridge the gap between synchronous and asynchronous constraints with multiple time scales, a challenge common to existing multimedia frameworks.
Jean-Louis Giavitto participates in the SynBioTIC ANR Blanc project (with IBISC, University of Evry, LAC University of Paris-Est, ISC - Ecole Polytechnique).
The team is also an active member of the ANR network CHRONOS (investigator Gérard Berry, Collège de France).
MuTant has started a cooperation with the team of Christoph Kirsch at the University of Salzburg, Austria, around the application of the Logical Execution Time realtime programming paradigm to computer music systems supporting advanced temporal structures in music and advanced dynamics in interactivity. We have set up a joint project, LETITBE, accepted in the PHC Amadeus program and starting in January 2015.
Shlomo Dubnov (UCSD)
Edward Lee (UC Berkeley)
Miller Puckette (UCSD)
Masahiko Sakai (U. Nagoya)
Slawek Staworko (U. Edinburgh)
David Wessel (UC Berkeley)
Masahiko Sakai (Professor at the University of Nagoya) visited MuTant for two weeks in April and October 2014, for collaboration on term rewriting techniques applied to the representation of rhythm in music notation.
Slawek Staworko (LINKS, on leave at the University of Edinburgh) visited MuTant for two weeks in June and July 2014, for collaboration on the problem of automatic rhythm transcription.
MuTant team members Arshia Cont, Jean-Louis Giavitto and José Echeveste made a formal visit to M.I.T. MediaLab in May 2014 to showcase MuTant work and discuss further collaborations with several New Media teams at MIT.
José Echeveste spent six weeks visiting several universities in the United States, enabling collaborations with the following teams and centers:
Center for Hybrid and Embedded Software Systems (UC Berkeley)
The Center for New Music and Audio Technologies (UC Berkeley)
Center for Computer Research in Music and Acoustics (Stanford)
Roger Dannenberg's team (Carnegie Mellon University)
Computer Music Center (Columbia University)
This trip allowed him to share research experience with many people from different areas of expertise and to disseminate the MuTant team's work broadly in the main computer music centers and other major computer research centers of the United States.
José Echeveste (MuTant PhD student) undertook a research stay in UC Berkeley's EECS department, at the Center for Hybrid and Embedded Software Systems (CHESS), for two months in April and May 2014. His visit was highlighted by several master classes and workshops on MuTant research at diverse institutions such as UC Berkeley, Columbia University and MIT.
Jean-Louis Giavitto was co-chair of SCW 2014, the Spatial Computing Workshop, co-located with AAMAS, May 2014, Paris.
Jean-Louis Giavitto has participated in the program committees of the following workshops and conferences: the BIPC special track at the 8th International Conference on Bio-inspired Information and Communications Technologies (BICT 2014), December 2014, Boston, MA, USA; MeCBIC 2014, the 7th Workshop on Membrane Computing and Biologically Inspired Process Calculi, September 2014, Bucharest, Romania; SASO, the Eighth IEEE International Conference on Self-Adaptive and Self-Organizing Systems, London, September 2014; JIM 2014, Journées d'Informatique Musicale, Bourges, May 2014; and the Journées nationales du GDR GPL 2014, June 2014, CNAM Paris.
Florent Jacquemard has participated in the program committees of the 6th International Symposium on Symbolic Computation in Software Science (SCSS 2014) and the conference Journées d'Informatique Musicale (JIM 2014).
The members of the team regularly participate as reviewers for IEEE ICASSP, ACM Multimedia Conferences, Sound and Music Computing, International Computer Music Conference (ICMC), Digital Audio Effects Conference (DAFx), European Joint Conferences on Theory and Practice of Software (ETAPS), Federated Logic Conference (FLoC, federating CADE, LICS, RTA), and more.
Jean-Louis Giavitto is the editor-in-chief of TSI (Technique et Science Informatiques), published by Lavoisier.
The members of the team are regular reviewers for IEEE Transactions on topics related to machine learning in signal processing (IEEE TASLP, TPAMI, and Multimedia journals), theoretical computer science journals (TCS, MFCS, LMCS, JLAP, IPL), and computer music journals (CMJ, JNMR).
Licence : Clément Poncelet, Environnement de développement, 40h, L3, UPMC, France.
Licence : Clément Poncelet, Ateliers de Recherche Encadrée - ARE, 20h, L1, UPMC, France.
Licence : José Echeveste, Structures discrètes, 30h, L3, UPMC, France.
Licence : Arshia Cont, Sound Engineering and New Technologies - Paris Superior Conservatory, 2h/week.
PhD in progress: José Echeveste, Accorder le temps de la machine et celui du musicien, started in October 2011; supervisors: Arshia Cont and Jean-Louis Giavitto.
PhD in progress: Clément Poncelet, Formal methods for analyzing human-machine interaction in complex timed scenarios, started in October 2013; supervisor: Florent Jacquemard.
PhD in progress: Philippe Cuvillier, Probabilistic Decoding of strongly-timed events in realtime, supervisor: Arshia Cont.
Jean-Louis Giavitto was reviewer of the Habilitation of Arnaud Banos (Sorbonne – Paris I, Pour des pratiques de modélisation et de simulation libérées en géographie et SHS) and examiner for two PhDs: Mariem Miladi (SupMeca, Modélisation géométrique et mécanique pour les systèmes mécatroniques) and Adrien Basso-Blandin (Université d'Evry, Conception d'un langage dédié à la conception de fonctions biologiques de synthèse par compilation de spécifications comportementales).
Arshia Cont acted as reporter for two Habilitation juries in 2014: for Cédric Févotte (University of Nice) on Non-Negative Matrix Factorization and Its Applications, and for Bruno Bossis (University of Rennes 2) on Musicological Tools for the Analysis of Live Electronic Music.
Jean-Louis Giavitto is in the management team of the GDR GPL (Génie de la Programmation et du Logiciel), responsible, with Etienne Moreau, for the “Languages and Verification” pole of the GDR. He is also an expert for the ANR DEFI program and a reviewer for FET projects for the European Commission.
Florent Jacquemard is a member of the IFIP Working Group 1.6 on Term Rewriting.
Arshia Cont is the director of Research/Creativity Interfaces at Ircam, in charge of coordinating scientific and artistic activities of the institution and dissemination of its software and community through Ircam Forum.
Arshia Cont is board member of ICMA (International Computer Music Association) since 2014.
Arshia Cont was invited to give a TEDx Talk in October 2014 on Human-Computer Musicianship, which attracted more than 12,000 podcast views according to the organisers.
Arshia Cont was invited to participate in CNRS's 2nd edition of “Les Fondamentales” Science and Society event in Grenoble, in a session dedicated to Science and Music on the same Score.
Arshia Cont was an invited guest on the BFM Business program on the Future of Sound.
Arshia Cont was featured in the June Edition of Usbek et Rica magazine.
The MuTant team participated in the 2014 edition of the Futur en Seine festival and showcased its collaboration with the Orchestre de Paris in a public event.
An article on Antescofo by Arshia Cont appeared in the December 2014 - January 2015 edition of the popular science magazine “Dossier de La Recherche”.
José Echeveste, Arshia Cont and Jean-Louis Giavitto participated in the colloquium “La musique en temps réel” at the Musica festival, Strasbourg, September 2014.
Jean-Louis Giavitto participated in the colloquium “Le calcul et le temps, colloque de philosophie de l'informatique”, Université Jean Moulin (Lyon 3), November 2014. He was invited to several seminars outside computer science: EPFL - ArchiZoom, “computational morphogenesis”, in the context of the architecture exhibition “Animal ?” (April 2014); the e|m|a|fructidor art school in Chalons, “space and the formalization of musical processes” (April 2014); CNSMD Lyon and the Ecole Normale Lyon, “Du temps écrit au temps produit”, with Julia Blondeau (April 2014); and CISEC (club Inter-associations – AAAF, SEE, SIA – des Systèmes Embarqués Critiques), “Temps Réel en musique : Antescofo” (Toulouse, June 2014).
Florent Jacquemard participated in the Inria seminar “1/2 hour of science” in July 2014, with a talk on testing and verification of interactive music systems.