Forced alignment (FA) is the automatic process by which speech recordings are phonetically time-stamped with the help of hidden Markov models or deep neural networks. Autophon uses the latter by means of the Montreal Forced Aligner. The software outputs a time-stamped phonetic annotation that is readable in Praat, based on an optimization of two user inputs: (1) the speech recording and (2) a corresponding orthographic transcription. For an FA tool to work for a particular language, an acoustic model must be trained and an accompanying pronunciation lexicon must be built that covers every word in the language. FA is important because it automates a task that is resource-intensive when done manually: a typical phonetic annotation can take between 250 and 400 minutes of work per recorded minute. In a place like Scandinavia, where labor costs are high, this cost has presented a barrier for linguists.
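To put the 250 to 400 minutes figure in perspective, the short Python sketch below estimates the manual workload for a recording of a given length. The function name and the 30-minute example are illustrative assumptions, not part of Autophon.

```python
def manual_annotation_hours(recording_minutes, low_ratio=250, high_ratio=400):
    """Estimate hours of manual phonetic annotation for a recording.

    The default ratios are the 250 and 400 minutes of work per recorded
    minute quoted in the text; the function itself is a hypothetical helper.
    """
    low = recording_minutes * low_ratio / 60   # convert minutes of work to hours
    high = recording_minutes * high_ratio / 60
    return low, high


# Example: a 30-minute interview (an assumed recording length)
low, high = manual_annotation_hours(30)
print(f"{low:.0f} to {high:.0f} hours of manual annotation")
```

Under these assumptions, a single half-hour interview corresponds to roughly 125 to 200 hours of manual work.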
Everything you upload is encrypted and sent to a server in Frankfurt, Germany, that is run by DigitalOcean (server FRA1). The app is free of charge to researchers. To increase security and reduce the chance of a data breach, sound files are deleted immediately after alignment. Finished TextGrids, on the other hand, are stored in your account for as long as you like. Once you delete them, however, they are also permanently removed from our server.
Autophon is in beta, which means it is still working through a number of bugs. Contact me if you encounter any issues or have any feedback.
Accuracy metrics for the four supported languages – Danish, Norwegian, Swedish and English – can be accessed in the user guides below.
Danish user guide
Norwegian user guide
Swedish user guide
UK English user guide
Autophon is owned and managed by Nate Young. Initially funded by private means, it has since grown in scope with a grant from the Swedish Academy and a grant from the Department of Linguistics and Scandinavian Studies at the University of Oslo.