05-19-2020, 08:31 PM
(05-19-2020, 12:48 PM)Mattias Westlund Wrote: I didn't think neural networks were "there" yet; I've seen the trippy AI-generated images of course, as well as the pictures of people who don't actually exist. But I assumed doing the same thing for audio was still a ways off. I believe bigcat posted some stuff here on the forum a year or two ago with orchestral samples generated by some Google AI (IIRC), and that wasn't exactly hi fidelity.
Still, using neural networks would still require the use of samples, no? I mean, the AI can't possibly generate lifelike instruments in realtime on a regular PC, can it?
The stuff Bigcat posted were samples (of unknown provenance) used to train a neural network, not the output of it. The output can be as high quality as the training data allows; it's just that training on 32 kHz 8-bit mono samples or whatever is faster.
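To put a rough number on why low-spec samples are cheaper to train on, here is a back-of-the-envelope sketch of raw PCM data rates. The 44.1 kHz 16-bit stereo figure is my own assumption for comparison (it isn't from the post above), and the function name is just illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * bit_depth // 8 * channels

# The low-spec training format mentioned above: 32 kHz, 8-bit, mono.
low_spec = pcm_bytes_per_second(32_000, 8, 1)

# Assumed comparison point: CD-quality audio (44.1 kHz, 16-bit, stereo).
cd_quality = pcm_bytes_per_second(44_100, 16, 2)

ratio = cd_quality / low_spec
print(f"low-spec:   {low_spec} bytes/s")
print(f"CD quality: {cd_quality} bytes/s")
print(f"ratio:      {ratio:.2f}x")
```

So the same recording at CD quality is a bit over five times as much raw data to push through training, before you even touch the network size.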
Once the neural network is trained, the training data (e.g. a 30-minute clip of someone performing a few songs on said instrument, with corresponding MIDI reference files) is no longer needed and can be left behind. For example, I use a neural-network-based image upscaler for work; the dataset used to train it was gigabytes upon gigabytes, but the entire package is now only a few MB at most.
The big thing is, every year papers come out offering doublings or even order-of-magnitude improvements in processing time. Consider NVIDIA's new neural network upscaling, which allows a game to render at a lower resolution (e.g. 720p) and upscale that image to a higher resolution (1080p, 1440p, 4K, etc.) in real time, saving tons of resources. By comparison, the neural network upscaler I use takes several minutes to process a single photo on my GTX 1060.
They're already doing this stuff with VIDEO, which is several dimensions larger and orders of magnitude more intensive than audio is. It's just a matter of people with the right expertise deciding to refine the technology in that direction, and given how saturated and expensive current sample library development is, I think any big company in this field worth its salt is going to be looking for that lifeline real soon. We can't just keep sampling ad infinitum like this; it is becoming uneconomical. Eventually someone will put out yet another giant orchestral library that no one wants, needs, or can justify, and the market will probably crash.
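For a sense of scale on the video vs. audio point, here is a rough sketch comparing raw, uncompressed data rates. All the specific formats are my assumptions for illustration (1080p at 30 fps with 24-bit color, against 44.1 kHz 16-bit stereo audio); real networks work on compressed or downsampled data, so treat this as a ballpark only.

```python
def video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Raw (uncompressed) video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

def audio_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw (uncompressed) PCM audio data rate in bytes per second."""
    return sample_rate_hz * bit_depth // 8 * channels

# Assumed formats, not taken from any specific product or paper.
video_rate = video_bytes_per_second(1920, 1080, 3, 30)   # 1080p, 24-bit RGB, 30 fps
audio_rate = audio_bytes_per_second(44_100, 16, 2)       # CD-quality stereo

ratio = video_rate / audio_rate
print(f"video: {video_rate:,} bytes/s")
print(f"audio: {audio_rate:,} bytes/s")
print(f"video is roughly {ratio:.0f}x the raw data rate of audio")
```

Under those assumptions, raw video comes out at about a thousand times the data rate of raw audio, i.e. around three orders of magnitude.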
Sample library developer, composer, and amateur organologist at Versilian Studios.