I assume phonemes would be the tokens. We can already computer-generate the audio of spoken language, so it seems like the tough part here is figuring out what the dolphin sounds actually mean, especially when we don’t have native speakers available to correct the machine's outputs as the model is trained.