Forced alignment technology

Forced alignment (FA) is the automatic process by which speech recordings are phonetically time-stamped with the help of Hidden Markov Models or Deep Neural Networks. Autophon uses the latter by means of the Montreal Forced Aligner. The software outputs a time-stamped phonetic annotation that is readable in Praat, based on an optimization of two user inputs: (1) the speech recording and (2) a corresponding orthographic transcription. For an FA tool to work for a particular language, an acoustic model must be trained and an accompanying pronunciation lexicon must be built that covers every word in the language. FA is important because it automates work that is resource-intensive when done manually: a typical phonetic annotation can take between 250 and 400 minutes of work per recorded minute. In a place like Scandinavia, where labor costs are high, this has presented a barrier for linguists.
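To put the manual-annotation figure in perspective, here is a minimal Python sketch of the arithmetic. The function name and its parameters are illustrative only and are not part of Autophon or the Montreal Forced Aligner. The 250 and 400 defaults are the per-minute ratios quoted above.

```python
def manual_annotation_hours(recording_minutes, low_ratio=250, high_ratio=400):
    """Estimate the hours of manual phonetic annotation for a recording.

    low_ratio and high_ratio are minutes of work per recorded minute,
    taken from the 250-400 range quoted in the text (hypothetical helper).
    """
    if recording_minutes < 0:
        raise ValueError("recording_minutes must be non-negative")
    low_hours = recording_minutes * low_ratio / 60
    high_hours = recording_minutes * high_ratio / 60
    return low_hours, high_hours


# Example: a 30-minute sociolinguistic interview.
low, high = manual_annotation_hours(30)
print(f"{low:.0f} to {high:.0f} hours of manual work")
```

Under these assumptions, a single 30-minute recording corresponds to roughly 125 to 200 hours of manual annotation.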

Data security

Everything you upload is encrypted and sent to a server in Frankfurt, Germany, that is run by DigitalOcean (server FRA1). The app is free of charge to researchers. To increase security and reduce the chance of a data breach, sound files are deleted immediately after alignment. Finished TextGrids, on the other hand, are stored in your account for as long as you like. Once you delete them, however, they are also removed from our server permanently.

Platform stability

Autophon is in beta, which means some bugs are still being worked out. Contact me if you encounter any issues or have any feedback.

Sign up
Video demo


Accuracy metrics for the four supported languages (Danish, Norwegian, Swedish and English) can be accessed in the user guides below.

 Danish (DanFA 3.0) user guide
 Danish (DanFA 4.0 ⚠️ developer mode) user guide
 UK English user guide
 Norwegian Bokmål user guide
 Swedish user guide


Autophon was founded and is managed by Nate Young, and is run by the essential stakeholders listed on our Team page. Initially started with private means, it has since grown in scope with a grant from the Swedish Academy and a grant from the Department of Linguistics and Scandinavian Studies at the University of Oslo.