I’ve been wondering—why don’t we have an AI model that can take any piece of music, compress it into a super small “musical script” with parameters, and then generate it back so it sounds almost identical to the original? Kind of like MIDI or sheet music but way more detailed, capturing all the nuances. With modern AI, it seems like this should be possible. Is it a technical limitation, or are we just not thinking about it?
It depends on why you wanna do it.
Because smaller files are easier to handle and send? Sure, but that means lossy compression. Fundamentally, autoencoders kind of do what you're describing, but it turns out compression on its own hasn't been that useful recently, so they mostly get used in other ways, for example encoding to and decoding from a latent space.
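To make the autoencoder point concrete, here's a minimal sketch (assuming PyTorch; `TinyAudioAutoencoder` and all the layer sizes are made-up illustration values, not a real music codec): the encoder squashes a chunk of raw audio into a latent with roughly 16x fewer numbers, and the decoder tries to rebuild the waveform from it. The gap between the original and the reconstruction is exactly where the "lossy" part comes in.

```python
# Toy sketch only: a 1-D convolutional autoencoder over raw mono audio.
# Layer sizes are arbitrary illustration values, not a real neural codec.
import torch
import torch.nn as nn

class TinyAudioAutoencoder(nn.Module):
    def __init__(self, latent_channels=16):
        super().__init__()
        # Encoder: four stride-4 convolutions downsample 16384 samples -> 64 steps.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(32, latent_channels, kernel_size=9, stride=4, padding=4),
        )
        # Decoder: transposed convolutions upsample the latent back to a waveform.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent_channels, 32, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(32, 32, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=8, stride=4, padding=2),
        )

    def forward(self, waveform):
        latent = self.encoder(waveform)          # the small, lossy "musical script"
        return self.decoder(latent), latent

model = TinyAudioAutoencoder()
audio = torch.randn(1, 1, 16384)                 # 1 batch, 1 channel, ~1 s of fake audio
reconstruction, latent = model(audio)
# latent holds 16 * 64 = 1024 values vs 16384 input samples -> ~16x compression.
loss = nn.functional.mse_loss(reconstruction, audio)   # train by minimizing reconstruction error
print(latent.shape, reconstruction.shape, loss.item())
```

You'd train it by minimizing that reconstruction loss, and the tighter you squeeze the latent, the more nuance you lose, which is why a super small "script" that plays back almost identical to the original isn't free.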
Or do you wanna do it so AI can make awesome music if you give it just some melody? There are currently other ways to do that. Essentially, this kind of functionality can be built directly into music generation models, which is what some models like those from Suno AI are already doing (afaik).
TL;DR: every solution needs a problem, and what you're describing hasn't been a big enough issue / priority to implement.