I'm really enjoying this concept! To address your issues, I think there should be a moving line that tells the player which vowel to press and when (kind of like FNF). While this does mean the player no longer has to work out for themselves where each vowel falls in a word, it would allow each level to be charted (again, FNF style) so that problem vowels (e.g. s(y)ringe) could be skipped altogether. Or what if the player got a bonus for locating and playing the problem vowels themselves? A lot to think about with this one, but that doesn't stop it from being an incredibly creative concept!