On Friday, April 30, 2021 3:34:00 PM CEST, W P wrote:
Following Stefan's off-topic remark: do you know what Kdenlive uses for speech recognition? Have you tried it? From what I know, speech recognition is very tricky to perform accurately.
After I found the post I researched it. It is Kaldi. <https://kaldi-asr.org/>
I know Vosk (https://github.com/alphacep/vosk-api) is quite good, but I have no idea whether it can be used in Cinelerra. The trained models can be quite heavy, so it is probably better if the user downloads them separately, but then I am not sure how user-friendly that is.
I think having an actual (even partially correct) annotation of what is happening on the timeline could give a great non-visual hint about what to edit. This kind of thing should be in Audacity too. From my phonetics background I know that presenting it is tricky too, because you want to have 'grouped tracks'. praat.org has some visuals showing how they have implemented this for academic research.
-- Stefan
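
P.S. To make the 'grouped tracks' idea a bit more concrete, here is a rough Python sketch, not anything that exists in Cinelerra or Kdenlive. It assumes the recognizer hands back word timings as JSON with a "result" list of word/start/end entries, which is my understanding of what Vosk emits when word timings are switched on. The names Interval, word_tier, phrase_tier and max_gap are made up for illustration. It builds two tiers over the same time axis: one with single words and one where words separated by only a short pause are merged into phrases.

```python
import json
from dataclasses import dataclass


@dataclass
class Interval:
    """One labelled span on a timeline tier (times in seconds)."""
    start: float
    end: float
    label: str


def word_tier(result_json):
    """Build a word-level tier from a recognizer result.

    Assumed input shape (hypothetical here, modelled on Vosk output):
    {"result": [{"word": ..., "start": ..., "end": ...}, ...]}
    """
    data = json.loads(result_json)
    return [Interval(w["start"], w["end"], w["word"])
            for w in data.get("result", [])]


def phrase_tier(words, max_gap=0.5):
    """Merge consecutive words into one phrase while the pause
    between them is at most max_gap seconds."""
    phrases = []
    for w in words:
        if phrases and w.start - phrases[-1].end <= max_gap:
            last = phrases[-1]
            phrases[-1] = Interval(last.start, w.end,
                                   last.label + " " + w.label)
        else:
            phrases.append(Interval(w.start, w.end, w.label))
    return phrases


# Made-up sample data, only to show the shape of the two tiers.
sample = json.dumps({"result": [
    {"word": "cut", "start": 0.30, "end": 0.60},
    {"word": "here", "start": 0.65, "end": 0.95},
    {"word": "next", "start": 2.10, "end": 2.40},
    {"word": "scene", "start": 2.45, "end": 2.90},
]})

words = word_tier(sample)
phrases = phrase_tier(words)

for tier_name, tier in (("words", words), ("phrases", phrases)):
    for iv in tier:
        print(f"{tier_name:8s} {iv.start:6.2f} {iv.end:6.2f}  {iv.label}")
```

Both tiers share one time axis, so an editor could draw them stacked under the audio track, roughly the way Praat stacks its annotation tiers. The 0.5 second pause threshold is arbitrary and would need tuning on real speech.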