nomic-embed-text-v1.5

Nomic Embed Text v1.5 is a matryoshka-capable English embedding model from Nomic AI, built on a custom nomic-BERT architecture trained with contrastive learning on large-scale text pairs. Matryoshka Representation Learning lets you truncate the full 768-dimensional embeddings to a shorter prefix (e.g. 64, 128, or 256 dimensions) without retraining, trading some accuracy for lower storage and compute cost. The model is compatible with transformers.js for browser-side inference.
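As a rough illustration of the truncation idea, the sketch below keeps the first few components of an embedding and rescales the result to unit length so cosine similarity still behaves as expected. The function names and the toy 8-dimensional vectors are hypothetical stand-ins for real model output, and the model card may call for extra steps (such as layer normalization before truncation) that are not shown here.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 8-dimensional vectors standing in for real 768-dimensional embeddings.
doc = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
query = [0.8, 0.2, 0.25, 0.1, 0.01, 0.04, 0.02, 0.01]

doc_4 = truncate_embedding(doc, 4)
query_4 = truncate_embedding(query, 4)
print(len(doc_4), round(cosine(doc_4, query_4), 3))
```

Documents and queries must be truncated to the same dimension before they are compared.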

Use cases

  • RAG pipeline text embedding with flexible dimension budget
  • Semantic search where embedding size can be tuned to vector store cost
  • Browser-side embedding inference via transformers.js without a server
  • MTEB benchmark comparison against other embedding models
  • Building efficient embedding pipelines where 768 dims is over-budget
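To make the dimension budget concrete, here is a back-of-the-envelope sketch of raw vector storage at several dimensions. It assumes 4-byte float32 values and a hypothetical corpus of one million vectors; real vector stores add index and metadata overhead, and quantized storage would change the per-value size.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for `num_vectors` embeddings of `dim` values each."""
    return num_vectors * dim * bytes_per_value

NUM_VECTORS = 1_000_000  # hypothetical corpus size

for dim in (768, 256, 128, 64):
    mib = index_size_bytes(NUM_VECTORS, dim) / (1024 ** 2)
    print(f"{dim:>4} dims: {mib:,.0f} MiB")
```

Raw storage scales linearly with dimension, so truncating from 768 to 256 dimensions cuts it to one third.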

Pros

  • Matryoshka dimensions allow truncating to smaller sizes without significant accuracy loss
  • Transformers.js compatibility enables client-side or edge inference
  • Apache 2.0 license; ONNX and safetensors supported
  • MTEB retrieval scores competitive with larger models
  • Custom nomic-BERT architecture trained specifically for retrieval

Cons

  • English-only; no cross-lingual capability
  • Custom nomic_bert architecture requires the custom_code flag, making it less standard than BERT-based models
  • Smaller adoption footprint than sentence-transformers standard models
  • Performance at smallest dimensions (64d) degrades on hard retrieval tasks
  • Requires trusting third-party custom model code on load

FAQ

What is nomic-embed-text-v1.5 used for?

It is used mainly to embed text for RAG pipelines and semantic search, where the embedding dimension can be tuned to the vector store budget (for example, when 768 dimensions is too costly). It also supports browser-side inference via transformers.js without a server, and it can serve as a point of comparison against other embedding models on the MTEB benchmark.

Is nomic-embed-text-v1.5 free to use?

Yes. nomic-embed-text-v1.5 is an open-source model published on HuggingFace under the Apache 2.0 license. Check the model card for the full license terms.

How do I run nomic-embed-text-v1.5 locally?

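The model can be loaded with sentence-transformers or transformers. Because it uses the custom nomic_bert architecture, loading requires enabling custom model code (the custom_code flag), which means trusting the code shipped with the model. ONNX weights are also available, and transformers.js covers browser-side inference. See the model card for framework-specific instructions and hardware requirements.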
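A minimal sketch of local loading with sentence-transformers follows. The repo id `nomic-ai/nomic-embed-text-v1.5` and the `search_document` / `search_query` task prefixes are not stated on this page and are assumptions taken from the upstream model card, so verify them there before relying on this. The helper names `with_task_prefix` and `embed` are hypothetical.

```python
MODEL_ID = "nomic-ai/nomic-embed-text-v1.5"  # assumed Hugging Face repo id

def with_task_prefix(texts, task):
    """Prepend a task prefix such as 'search_document' or 'search_query'."""
    return [f"{task}: {t}" for t in texts]

def embed(texts, task="search_document"):
    """Load the model and return L2-normalized embeddings for `texts`."""
    # Imported lazily so the helper above works without the dependency.
    from sentence_transformers import SentenceTransformer

    # trust_remote_code=True allows the custom nomic_bert code to run.
    model = SentenceTransformer(MODEL_ID, trust_remote_code=True)
    return model.encode(with_task_prefix(texts, task), normalize_embeddings=True)
```

Calling `embed(["What is a matryoshka embedding?"], task="search_query")` would download the weights on first use. In a real pipeline the model would be loaded once and reused rather than reloaded on every call.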

Tags

sentence-transformers, onnx, safetensors, nomic_bert, feature-extraction, sentence-similarity, mteb, transformers, transformers.js, custom_code, en, arxiv:2402.01613, arxiv:2205.13147, license:apache-2.0, model-index, eval-results, text-embeddings-inference, endpoints_compatible, region:us