Use cases
- Default English semantic search where bge-small is insufficient
- RAG pipeline embedding with reasonable compute budget
- Sentence-level clustering for content analysis
- Ranking-style retrieval where 768-dim precision is adequate
- Embedding generation for knowledge bases with moderate latency requirements
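As a rough illustration of the clustering use case, the sketch below groups precomputed embeddings by cosine similarity. It assumes the vectors have already been produced by the model. The greedy seed-based grouping, the `greedy_cluster` name, and the 0.8 threshold are illustrative choices, not something the model or its authors prescribe.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_cluster(vectors: List[Sequence[float]], threshold: float = 0.8) -> List[List[int]]:
    """Put each vector in the first cluster whose seed is similar enough.

    Each cluster is a list of indices into `vectors`; the first index is the seed.
    The threshold is a hypothetical value and would need tuning on real data.
    """
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing seed was close enough, so this vector starts a new cluster.
            clusters.append([i])
    return clusters
```

In practice the inputs would be 768-dim vectors from the model. A library clustering algorithm (k-means, agglomerative clustering) is likely a better fit for large corpora, since this greedy pass depends on input order.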
Pros
- MIT license for commercial use
- 768-dim output balances quality and cost, sitting between bge-small (384-dim) and bge-large (1024-dim)
- ONNX and text-embeddings-inference compatible for production
- Part of well-benchmarked BAAI BGE family
Cons
- English-only; no cross-lingual capability
- Outperformed by instruction-following embedding models on asymmetric retrieval
- 768-dim adds storage cost vs. smaller variants without proportional accuracy gain on easy tasks
- Supports only a fixed, optional query prefix for retrieval; does not accept task-specific instructions the way newer instruction-tuned embedding models do
- MTEB benchmarks do not reflect all real-world retrieval difficulty levels
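The storage trade-off in the list above can be estimated with simple arithmetic. The sketch below assumes a flat, uncompressed index storing one float32 (4 bytes) per dimension, and ignores index overhead, metadata, and quantization. The function names are illustrative.

```python
from typing import Dict

# Output dimensions of the English BGE v1.5 variants.
DIMS: Dict[str, int] = {
    "bge-small-en-v1.5": 384,
    "bge-base-en-v1.5": 768,
    "bge-large-en-v1.5": 1024,
}


def index_size_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of a flat index: num_vectors * dim values, float32 by default."""
    return num_vectors * dim * bytes_per_value


def compare_variants(num_vectors: int) -> Dict[str, int]:
    """Raw index size in bytes for each variant at the given corpus size."""
    return {name: index_size_bytes(num_vectors, dim) for name, dim in DIMS.items()}
```

Under these assumptions, one million vectors take roughly 1.5 GB at 384 dimensions, 3.1 GB at 768, and 4.1 GB at 1024, so moving from bge-small to bge-base doubles raw vector storage.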
FAQ
What is bge-base-en-v1.5 used for?
It is used as a default English embedding model for semantic search where bge-small is insufficient, for RAG pipeline embedding with a reasonable compute budget, for sentence-level clustering in content analysis, for ranking-style retrieval where 768-dim precision is adequate, and for embedding knowledge bases with moderate latency requirements.
Is bge-base-en-v1.5 free to use?
Yes. bge-base-en-v1.5 is an open-source model published on HuggingFace under the MIT license, which permits commercial use. Check the model card to confirm the current license terms.
How do I run bge-base-en-v1.5 locally?
The model can be loaded from HuggingFace under the ID BAAI/bge-base-en-v1.5, using sentence-transformers or transformers. See the model card for framework-specific instructions and hardware requirements.
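A minimal sketch of local inference, assuming the `sentence-transformers` package is installed and the weights can be downloaded from HuggingFace on first use. The `top_k` and `search` helpers are illustrative names, not part of any library, and reloading the model on every call is only acceptable in a demo.

```python
from typing import List, Sequence, Tuple

MODEL_ID = "BAAI/bge-base-en-v1.5"


def top_k(passages: Sequence[str], scores: Sequence[float], k: int = 3) -> List[Tuple[str, float]]:
    """Pair passages with their scores and return the k highest-scoring pairs."""
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def search(query: str, passages: List[str], k: int = 3) -> List[Tuple[str, float]]:
    """Embed the query and passages locally, then rank passages by cosine similarity."""
    # Imported lazily so top_k stays usable without the dependency installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)  # downloads weights on first use
    # normalize_embeddings=True returns unit vectors, so dot product equals cosine.
    q_vec = model.encode([query], normalize_embeddings=True)
    p_vecs = model.encode(passages, normalize_embeddings=True)
    scores = (q_vec @ p_vecs.T)[0].tolist()
    return top_k(passages, scores, k)
```

Calling `search("how do I reset my password?", docs)` with a list of passage strings returns up to three `(passage, score)` pairs, best match first. For repeated queries, load the model once and cache the passage embeddings instead of recomputing them.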