mms-300m-1130-forced-aligner

MMS-300M-1130-forced-aligner is a 300M-parameter wav2vec2-based model from Meta's Massively Multilingual Speech (MMS) project, fine-tuned for forced alignment across 1,130 languages. It takes audio and a matching text transcript as input and outputs word- or phoneme-level timestamps, which supports subtitle synchronization and linguistic documentation at scale. The CC-BY-NC-4.0 license prohibits commercial deployment.
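
To make the idea of forced alignment concrete, here is a minimal sketch of a Viterbi search over CTC-style frame-level log-probabilities, the kind of output wav2vec2 CTC models produce. The function name `forced_align`, the toy vocabulary, and the emission matrix are hypothetical illustrations, not the model's own code; a real pipeline would feed in the model's emissions and the tokenized transcript.

```python
import numpy as np

def forced_align(log_probs, tokens, blank=0):
    """Viterbi-align `tokens` to the frames of a (T, V) log-probability matrix.

    Returns a list of (token_id, first_frame, last_frame) tuples.
    """
    T = log_probs.shape[0]
    # CTC state sequence: blank, tok1, blank, tok2, ..., blank
    states = [blank]
    for tok in tokens:
        states += [tok, blank]
    S = len(states)

    score = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    score[0, 0] = log_probs[0, states[0]]
    if S > 1:
        score[0, 1] = log_probs[0, states[1]]

    for t in range(1, T):
        for s in range(S):
            # Allowed moves: stay, advance by one state, or skip the blank
            # that sits between two different tokens.
            cands = [(score[t - 1, s], s)]
            if s >= 1:
                cands.append((score[t - 1, s - 1], s - 1))
            if s >= 2 and states[s] != blank and states[s] != states[s - 2]:
                cands.append((score[t - 1, s - 2], s - 2))
            best, prev = max(cands)
            score[t, s] = best + log_probs[t, states[s]]
            back[t, s] = prev

    # A valid path ends in the final blank or the final token.
    s = S - 1 if score[T - 1, S - 1] >= score[T - 1, S - 2] else S - 2
    if not np.isfinite(score[T - 1, s]):
        raise ValueError("not enough frames to fit the transcript")

    # Follow the back-pointers from the last frame to the first.
    path = [s]
    for t in range(T - 1, 0, -1):
        s = int(back[t, s])
        path.append(s)
    path.reverse()

    # Collapse the frame-level path into one span per token.
    spans = {}
    for frame, s in enumerate(path):
        if states[s] != blank:
            idx = (s - 1) // 2  # position of this token in `tokens`
            first, _ = spans.get(idx, (frame, frame))
            spans[idx] = (first, frame)
    return [(tokens[i], *spans[i]) for i in sorted(spans)]

# Toy example: 6 frames, vocabulary {0: blank, 1: "a", 2: "b"}.
likely = [0, 1, 1, 0, 2, 0]  # most probable symbol in each frame
probs = np.full((6, 3), 0.05)
probs[np.arange(6), likely] = 0.9
alignment = forced_align(np.log(probs), tokens=[1, 2])
print(alignment)
```

Each returned tuple gives a token and the first and last frame it occupies; multiplying frame indices by the model's frame duration turns them into timestamps.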

Use cases

  • Automated subtitle timestamp generation from existing transcripts
  • Phoneme-level alignment for low-resource language documentation
  • Speech data annotation for multilingual TTS training corpus creation
  • Linguistic research on timing patterns across diverse language families
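
As an illustration of the subtitle use case, the sketch below groups word-level timestamps into SRT cues. The input shape, a list of `(word, start_seconds, end_seconds)` tuples, and the helper names `format_timestamp` and `words_to_srt` are assumptions made for this example; the aligner's actual output format may differ and would need adapting.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def words_to_srt(words, max_words=7):
    """Group (word, start, end) tuples into numbered SRT cues."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]  # cue spans first to last word
        text = " ".join(word for word, _, _ in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)

# Hypothetical aligner output for a two-word transcript.
words = [("hello", 0.0, 0.4), ("world", 0.5, 0.9)]
print(words_to_srt(words, max_words=1))
```

Splitting on a fixed word count keeps the example short; a production subtitle tool would more likely break cues on pauses or punctuation.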

Pros

  • Supports 1,130 languages, far more than most other forced-alignment tools
  • Produces fine-grained word and phoneme-level timestamps
  • wav2vec2 backbone integrates directly with Hugging Face ecosystem tooling

Cons

  • CC-BY-NC-4.0 license prohibits commercial deployment
  • Requires a pre-existing text transcript as input; it is not a standalone ASR model
  • Accuracy drops significantly on noisy or heavily accented audio recordings

FAQ

What is mms-300m-1130-forced-aligner used for?

It is used to generate subtitle timestamps from existing transcripts, to align speech at the phoneme level for low-resource language documentation, to annotate speech data for multilingual TTS training corpora, and to study timing patterns across diverse language families.

Is mms-300m-1130-forced-aligner free to use?

The model weights are published on Hugging Face and can be downloaded at no cost, but they are released under CC-BY-NC-4.0, which permits non-commercial use only. Check the model card for the full license terms before any commercial deployment.

How do I run mms-300m-1130-forced-aligner locally?

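Because the model uses a wav2vec2 backbone with PyTorch and safetensors weights, it can be loaded with the Hugging Face transformers library. Alignment needs both the audio and its transcript as input. See the model card for framework-specific instructions and hardware requirements.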
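
A minimal sketch of local loading, under several assumptions: the repository ID `MahmoudAshraf/mms-300m-1130-forced-aligner` is a guess at where the checkpoint is hosted and should be verified against the model card, the input is a 16 kHz mono waveform tensor of shape `(batch, samples)`, and the frame duration assumes wav2vec2's usual 320-sample stride. The helper names are hypothetical.

```python
def frame_to_seconds(frame_index, stride_samples=320, sample_rate=16_000):
    """Convert an output frame index to seconds.

    Assumes the usual wav2vec2 stride of 320 samples at 16 kHz (20 ms per frame).
    """
    return frame_index * stride_samples / sample_rate

def compute_log_probs(waveform, model_id="MahmoudAshraf/mms-300m-1130-forced-aligner"):
    """Return frame-level log-probabilities for a (batch, samples) float tensor.

    `model_id` is an assumed repository name; confirm it on the model card.
    """
    # Imported lazily so frame_to_seconds works without these packages installed.
    import torch
    from transformers import AutoModelForCTC

    model = AutoModelForCTC.from_pretrained(model_id)
    model.eval()
    with torch.inference_mode():
        logits = model(waveform).logits  # shape: (batch, frames, vocab)
    return torch.log_softmax(logits, dim=-1)

# Frame 50 starts one second into the audio under the assumed stride.
print(frame_to_seconds(50))
```

The log-probabilities still have to be aligned against the tokenized transcript, and any input normalization the checkpoint expects is not shown here; the model card is the authority on both steps.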

Tags

transformers, pytorch, safetensors, wav2vec2, automatic-speech-recognition, mms, audio, voice, speech, forced-alignment, ab, af, ak, am, ar, as, av, ay, az, ba