Autophon (beta) is a free prototype web app that uses
forced alignment to convert an audio file and its corresponding
transcript into a time-aligned phonetic annotation.
Forced alignment is a technology that uses neural networks to determine, for each phonetic segment of the transcript, the time interval in the audio file that contains that segment as spoken. The backend is built on the Montreal Forced Aligner, and the language-specific models were trained predominantly on naturally occurring spontaneous speech.
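To make the idea of forced alignment concrete, here is a minimal toy sketch in Python. It is not how Autophon or the Montreal Forced Aligner is implemented. It assumes that some acoustic model has already produced a score for every (audio frame, transcript segment) pair, and it uses a simple dynamic-programming search to cut the audio into one contiguous interval per segment, in transcript order. The function name `force_align`, the `scores` layout, and the 10 ms default frame length are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def force_align(
    scores: List[List[float]],
    phones: List[str],
    frame_s: float = 0.01,  # assumed frame length in seconds (hypothetical)
) -> List[Tuple[str, float, float]]:
    """Toy forced alignment.

    scores[t][i] is a log-score for "frame t belongs to phones[i]".
    Returns one (phone, start_time, end_time) interval per phone, in order.
    """
    n_frames, n_phones = len(scores), len(phones)
    if n_phones == 0 or n_frames < n_phones:
        raise ValueError("need at least one frame per phone")

    neg_inf = float("-inf")
    # best[t][i]: best total score of any path that puts frame t in phone i.
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    # stayed[t][i]: True if that best path had frame t-1 in the same phone i.
    stayed = [[True] * n_phones for _ in range(n_frames)]

    best[0][0] = scores[0][0]  # the first frame must belong to the first phone
    for t in range(1, n_frames):
        for i in range(min(t + 1, n_phones)):
            stay = best[t - 1][i]
            advance = best[t - 1][i - 1] if i > 0 else neg_inf
            if advance > stay:
                best[t][i] = advance + scores[t][i]
                stayed[t][i] = False
            else:
                best[t][i] = stay + scores[t][i]

    # Backtrace from (last frame, last phone) to recover the boundaries.
    intervals: List[Tuple[str, float, float]] = []
    i, end = n_phones - 1, n_frames
    for t in range(n_frames - 1, -1, -1):
        if t == 0 or not stayed[t][i]:
            intervals.append((phones[i], t * frame_s, end * frame_s))
            end = t
            i -= 1
    intervals.reverse()
    return intervals
```

Calling `force_align(scores, ["k", "ae", "t"])` with a six-frame score table returns three consecutive intervals that together cover the whole recording. A production aligner differs in many ways (pronunciation dictionaries, silence modelling, real acoustic models), so treat this only as a picture of the input and output shapes.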
A language-specific write-up with metrics can be accessed by clicking on the relevant language in the list.