In recent years, neural network-based black-box modeling of nonlinear audio effects has improved considerably. Current convolutional and recurrent models can capture audio effects with long-term dynamics, but they require many parameters, which increases processing time. In this paper, we propose KLANN, a Koopman-Linearised Audio Neural Network structure that uses a nonlinear mapping to lift a one-dimensional signal (mono audio) into a high-dimensional, approximately linear state-space representation, and then uses differentiable biquad filters to predict linearly within the lifted state space. Results show that the proposed models match the high performance of state-of-the-art neural models while having a more compact architecture, reducing the number of parameters tenfold, and having interpretable components.
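The signal flow described in the abstract (nonlinear lifting, then linear filtering in the lifted space) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed `tanh` lifting stands in for the learned nonlinear mapping, the names `lift`, `Biquad`, and `process`, the parallel filter arrangement, the weighted-sum readout, and all coefficient values are assumptions chosen for illustration, and the plain-Python filters shown here are not differentiable.

```python
import math


def lift(x, gains):
    """Hypothetical lifting: map one input sample to several nonlinear observables.

    A fixed bank of tanh curves is an assumption; the paper uses a learned mapping.
    """
    return [math.tanh(g * x) for g in gains]


class Biquad:
    """Direct-form I biquad:
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    """

    def __init__(self, b0, b1, b2, a1, a2):
        self.b = (b0, b1, b2)
        self.a = (a1, a2)
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def step(self, x):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        # Shift the delay lines for the next sample.
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def process(signal, gains, filters, weights):
    """Lift each sample, filter every lifted channel linearly, then mix to mono."""
    out = []
    for x in signal:
        z = lift(x, gains)                                   # nonlinear lifting
        filtered = [f.step(zi) for f, zi in zip(filters, z)]  # linear prediction
        out.append(sum(w * v for w, v in zip(weights, filtered)))
    return out


# Illustrative values only: three lifted channels, each followed by a
# one-pole low-pass expressed as a biquad (b0 = 1 - p, a1 = -p).
gains = [0.5, 2.0, 8.0]
filters = [Biquad(1.0 - p, 0.0, 0.0, -p, 0.0) for p in (0.0, 0.5, 0.9)]
weights = [0.5, 0.3, 0.2]
signal = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(64)]
output = process(signal, gains, filters, weights)
```

In this sketch all of the nonlinearity is confined to `lift`; everything after it is a linear, time-invariant operation on the lifted channels, which is the property that makes the filter stage interpretable in terms of ordinary frequency responses.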
| | Face Bender | MCompressor | LA-2A |
|---|---|---|---|
| Input | | | |
| Target | | | |
| small parallel KLANN | | | |
| large parallel KLANN | | | |
| small parallel-series KLANN | | | |
| large parallel-series KLANN | | | |
| GCNTF-3 | | | |
| GCNTF-2500 | | | |
Audio samples of a small parallel-series model trained on the Face Bender. Stage I is the output after the first stage of training and stage II is the final output of the model.
| | Face Bender |
|---|---|
| Input | |
| Target | |
| Stage I | |
| Stage II | |