Use cases
- RAG pipeline text embedding with flexible dimension budget
- Semantic search where embedding size can be tuned to vector store cost
- Browser-side embedding inference via transformers.js without a server
- MTEB benchmark comparison against other embedding models
- Building efficient embedding pipelines where the full 768 dimensions are over budget
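To make the vector store cost trade-off concrete, the sketch below estimates raw storage at several embedding sizes. It assumes float32 values (4 bytes each) and ignores index overhead and metadata. The function name `index_size_bytes`, the corpus size of one million vectors, and the intermediate sizes between 768 and 64 are illustrative assumptions, not figures from the model card.

```python
def index_size_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors, ignoring index overhead and metadata."""
    return num_vectors * dims * bytes_per_value


# Illustrative corpus size and dimension choices (assumptions, not model-card data).
NUM_VECTORS = 1_000_000
for dims in (768, 512, 256, 128, 64):
    size_gb = index_size_bytes(NUM_VECTORS, dims) / 1e9
    print(f"{dims:>4} dims: {size_gb:.3f} GB for {NUM_VECTORS:,} float32 vectors")
```

Storage scales linearly with dimension, so halving the embedding size halves the raw vector footprint.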
Pros
- Matryoshka dimensions allow truncating to smaller sizes without significant accuracy loss
- Transformers.js compatibility enables client-side or edge inference
- Apache 2.0 license; ONNX and safetensors supported
- MTEB retrieval scores competitive with larger models
- Custom nomic-BERT architecture trained specifically for retrieval
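Matryoshka truncation generally means keeping the leading components of an embedding and renormalizing. The following is a minimal sketch of that generic idea in plain Python, not the model's official procedure: the model card may prescribe additional steps (for example a normalization before truncation), so treat `truncate_embedding` as a hypothetical helper.

```python
import math


def truncate_embedding(vector: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components and rescale the result to unit L2 norm."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and the full embedding width")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


# Toy stand-in for a real embedding (assumption: real vectors come from the model).
full = [3.0, 4.0, 12.0]
small = truncate_embedding(full, 2)
print(small)
```

Renormalizing after truncation keeps cosine similarity and dot product equivalent on the shortened vectors. For batches, the same logic is usually written with an array library.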
Cons
- English-only; no cross-lingual capability
- Custom nomic_bert architecture requires enabling custom code on load (trust_remote_code=True in transformers), which is less standard than stock BERT-based models
- Smaller adoption footprint than the standard sentence-transformers models
- Performance at the smallest dimension (64d) degrades on hard retrieval tasks
- Requires trusting third-party custom model code on load
FAQ
What is nomic-embed-text-v1.5 used for?
It is used to embed text for RAG pipelines with a flexible dimension budget, for semantic search where embedding size is tuned to vector store cost, and for browser-side inference via transformers.js without a server. It also serves as a comparison point in MTEB benchmarks and fits embedding pipelines where 768 dimensions is over budget.
Is nomic-embed-text-v1.5 free to use?
Yes. nomic-embed-text-v1.5 is an open-source model published on HuggingFace under the Apache 2.0 license. Check the model card for the full license text.
How do I run nomic-embed-text-v1.5 locally?
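The model can be loaded with transformers or another compatible framework library. Because it uses the custom nomic_bert architecture, loading requires enabling custom code, which means trusting the code shipped with the model. ONNX and safetensors weights are available, and transformers.js can run the model client-side. See the model card for framework-specific instructions and hardware requirements.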
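As a rough illustration of local loading, here is a minimal sketch using the sentence-transformers library. Several details are assumptions not stated above and should be checked against the model card: the repository id `nomic-ai/nomic-embed-text-v1.5`, and the convention of prepending a task prefix such as `search_document` or `search_query` to each input. The helper names `add_task_prefix` and `embed` are hypothetical.

```python
def add_task_prefix(texts: list[str], task: str) -> list[str]:
    """Prepend a task prefix (assumed convention, e.g. 'search_document') to each text."""
    return [f"{task}: {text}" for text in texts]


def embed(texts: list[str], task: str = "search_document",
          model_id: str = "nomic-ai/nomic-embed-text-v1.5"):
    """Load the model and return L2-normalized embeddings for `texts`."""
    # Imported lazily so the prefix helper works without the dependency installed.
    from sentence_transformers import SentenceTransformer

    # trust_remote_code=True runs Python code shipped in the model repository,
    # so only enable it for sources you trust.
    model = SentenceTransformer(model_id, trust_remote_code=True)
    return model.encode(add_task_prefix(texts, task), normalize_embeddings=True)


# Example call (downloads the model weights on first use):
# vectors = embed(["What is a Matryoshka embedding?"], task="search_query")
```

In a real pipeline the model would be loaded once and reused across calls rather than reloaded inside `embed`.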