
multilingual-e5-large

Multilingual-E5-Large is a 560-million-parameter multilingual embedding model from Microsoft Research, supporting 100+ languages via an XLM-RoBERTa backbone. Trained with E5's instruction-following approach (prepending 'query:' or 'passage:' prefixes), it achieves strong MTEB multilingual retrieval scores. MIT licensed with ONNX and OpenVINO export.
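The prefix convention can be sketched as a small helper; a minimal example assuming `sentence-transformers` is installed and using the Hugging Face hub id `intfloat/multilingual-e5-large` (first use downloads the weights):

```python
# Sketch of the E5 instruction-prefix convention: queries and passages are
# embedded with different prefixes so the model can handle asymmetric retrieval.

def with_e5_prefix(text: str, kind: str) -> str:
    """Prepend the E5 prefix: 'query: ' for search queries,
    'passage: ' for documents being indexed."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return f"{kind}: {text}"

if __name__ == "__main__":
    from sentence_transformers import SentenceTransformer  # ~2 GB download on first use
    model = SentenceTransformer("intfloat/multilingual-e5-large")
    passages = [with_e5_prefix("Berlin ist die Hauptstadt Deutschlands.", "passage")]
    query = with_e5_prefix("capital of Germany", "query")
    doc_emb = model.encode(passages, normalize_embeddings=True)
    q_emb = model.encode([query], normalize_embeddings=True)
    # Embeddings are normalized, so the dot product is the cosine similarity.
    print(q_emb @ doc_emb.T)
```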

Use cases

  • Multilingual semantic search across 100-language corpora
  • Cross-lingual retrieval where query and documents are in different languages
  • Multilingual RAG pipeline embedding for international content
  • Dense retrieval for low-resource language content with cross-lingual transfer
  • Multilingual text clustering and classification via embeddings
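The cross-lingual retrieval use case reduces to ranking documents by cosine similarity against a query embedding. A minimal sketch, assuming the vectors come pre-normalized from the model (so the dot product equals the cosine):

```python
# Rank documents by cosine similarity to a query vector.
# Works with any embedding model that outputs L2-normalized vectors.
import numpy as np

def rank_by_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray) -> list[int]:
    """Return document row indices sorted by descending cosine similarity.
    Assumes query_vec has shape (d,) and doc_vecs has shape (n, d),
    with all vectors L2-normalized."""
    scores = doc_vecs @ query_vec          # (n,) cosine scores
    return np.argsort(-scores).tolist()    # best match first
```

Because query and passage embeddings share one vector space, the query can be in English while the ranked documents are in any of the supported languages.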

Pros

  • MIT license for commercial use
  • 100+ language coverage with strong multilingual retrieval performance
  • Instruction prefix support ('query:'/'passage:') for asymmetric retrieval
  • ONNX and OpenVINO export; text-embeddings-inference compatible

Cons

  • 560M parameters make it significantly heavier than compact multilingual models such as multilingual-e5-small or paraphrase-multilingual-MiniLM
  • Larger model size requires more VRAM for batch inference than those compact alternatives
  • Quality varies for low-resource languages despite 100+ coverage
  • Instruction prefix is required for best performance; inputs embedded without the 'query:'/'passage:' prefix yield degraded embeddings
  • Less adopted than BGE-M3 in the multilingual embedding community

FAQ

What is multilingual-e5-large used for?

multilingual-e5-large is used for multilingual semantic search across 100-language corpora, cross-lingual retrieval where query and documents are in different languages, embedding for multilingual RAG pipelines serving international content, dense retrieval for low-resource languages via cross-lingual transfer, and multilingual text clustering and classification via embeddings.

Is multilingual-e5-large free to use?

Yes. multilingual-e5-large is an open-source model published on HuggingFace under the MIT license, which permits commercial use. Confirm the current license terms on the model card before deploying.

How do I run multilingual-e5-large locally?

Most HuggingFace models can be loaded with transformers or the appropriate framework library. See the model card for framework-specific instructions and hardware requirements.
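A sketch of loading the model with plain `transformers`, following the common E5 recipe (mean-pool the last hidden states over the attention mask, then L2-normalize); requires `torch` and `transformers`, and the first call downloads the weights:

```python
# Embed texts with multilingual-e5-large via transformers + average pooling.
import torch
from transformers import AutoModel, AutoTokenizer

def average_pool(last_hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean-pool token embeddings, ignoring padding positions marked 0 in mask."""
    last_hidden = last_hidden.masked_fill(~mask[..., None].bool(), 0.0)
    return last_hidden.sum(dim=1) / mask.sum(dim=1)[..., None]

def embed(texts: list[str], model_name: str = "intfloat/multilingual-e5-large") -> torch.Tensor:
    """Return L2-normalized embeddings; texts should already carry the
    'query: ' or 'passage: ' prefix."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    emb = average_pool(out.last_hidden_state, batch["attention_mask"])
    return torch.nn.functional.normalize(emb, p=2, dim=1)
```

For GPU inference, move the model and batch to the device with `.to("cuda")` before the forward pass; CPU inference works but is slow at this parameter count.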

Tags

sentence-transformers, pytorch, onnx, safetensors, openvino, xlm-roberta, mteb, Sentence Transformers, sentence-similarity, feature-extraction, multilingual, af, am, ar, as, az, be, bg, bn, br