This feature is planned. I am currently expanding the word audio with one of our male voice actors, after which I would like to have him go back and record the words that already have female voices. At a later point, I would like to do this with the sentences as well.
The purpose of this discussion is to talk about the visual (UI) aspect of this feature.
1. (Decided) In settings, you will be able to prioritize male or female voices (or random). If we ever somehow get more than 2 voices per word/sentence, you'll be able to choose specific speakers. (How amazing would that be!)
2. Visually, there is currently a single audio icon next to content that has audio. I am open to ideas as to how to cleanly offer a way for users to hear multiple voices, etc. This icon appears SO much in renshuu that I am extremely hesitant to just add a second icon, for example.
Maybe you could have a radial menu on long press (on the audio icon) that lets you change the audio source? Or just handle everything from the settings without changing the UI?