Google's original BERT base model in uncased form, pre-trained on BookCorpus and English Wikipedia via masked language modeling. Input text is lowercased during tokenization, so the model is insensitive to capitalization. It remains a standard fine-tuning base for classification, NER, and extractive QA, though newer encoders outperform it on most benchmarks.
59,598,776 ↓ · 2,641 ♡
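The masked-language-modeling objective mentioned above can be sketched in a few lines: 15% of positions are chosen for prediction, and of those, 80% are replaced with [MASK], 10% with a random vocabulary token, and 10% left unchanged. A minimal illustration in plain Python (function and variable names are ours, not from any library):

```python
import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", mlm_prob=0.15, seed=0):
    """BERT-style masking: select ~15% of positions, then apply 80/10/10."""
    rng = random.Random(seed)
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mlm_prob:
            labels[i] = tok                     # model must predict the original token
            r = rng.random()
            if r < 0.8:
                inputs[i] = mask_token          # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.choice(vocab)   # 10%: replace with a random token
            # else 10%: keep the original token unchanged
    return inputs, labels

vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
inputs, labels = mask_tokens(["the", "cat", "sat", "on", "the", "mat"], vocab)
```

The loss is computed only at positions where `labels` is not `None`; all other positions are ignored.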
RoBERTa base from Facebook AI: the BERT base architecture trained with longer schedules, larger batch sizes, and dynamic masking, on substantially more data than the original BERT (BookCorpus, Wikipedia, CC-News, OpenWebText, and Stories). MIT licensed with multi-framework support.
18,684,651 ↓ · 595 ♡
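The dynamic masking RoBERTa introduced differs from original BERT's static masking only in when the mask pattern is drawn: once at preprocessing time versus freshly on every pass over a sequence. A toy sketch of the difference (helper names and sizes are ours):

```python
import random

def sample_mask(n_tokens, rng, prob=0.15):
    """Return the positions selected for masking on one pass."""
    return [i for i in range(n_tokens) if rng.random() < prob]

# Static masking (original BERT): one pattern fixed at preprocessing
# time and reused on every epoch.
static = sample_mask(50, random.Random(0))

# Dynamic masking (RoBERTa): a fresh pattern each time the sequence is
# seen, so the model eventually predicts many different token subsets.
rng = random.Random(0)
epochs = [sample_mask(50, rng) for _ in range(3)]
```

With the same seed, the first dynamic pass reproduces the static pattern; later passes draw new ones.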
RoBERTa large, the 355M-parameter version of Facebook AI's robustly optimized BERT variant, with 24 layers (vs. 12), a 1024-dimensional hidden size (vs. 768), and 16 attention heads (vs. 12) relative to RoBERTa base. It provides stronger NLU accuracy at roughly 4x the inference compute cost of the base variant. Used where task accuracy on complex English language understanding outweighs latency constraints.
18,627,609 ↓ · 283 ♡
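The "roughly 4x" compute figure can be sanity-checked from the architecture: per-token encoder cost scales with layers times hidden size squared, and large doubles the depth (24 vs. 12 layers) while widening the hidden size (1024 vs. 768). A back-of-envelope estimate, ignoring biases, LayerNorm, and attention's sequence-length term:

```python
def encoder_cost(layers, hidden):
    # Per-token weight/FLOP proxy: each layer holds ~4h^2 attention
    # projection weights plus ~8h^2 feed-forward weights => ~12h^2.
    return layers * 12 * hidden ** 2

base = encoder_cost(12, 768)    # RoBERTa base
large = encoder_cost(24, 1024)  # RoBERTa large
ratio = large / base            # 2 * (1024/768)^2 = 32/9
```

The ratio comes out near 3.6, which rounds to the "roughly 4x" quoted above.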
XLM-RoBERTa base from Facebook AI, pre-trained on 2.5TB of filtered CommonCrawl text across 100 languages using the RoBERTa training procedure. Enables cross-lingual transfer: models fine-tuned on labeled English data can be applied to other languages without parallel annotations. The standard starting point for multilingual classification and token-level tasks.
18,605,818 ↓ · 822 ♡
DistilBERT-base-uncased is a distilled version of BERT-base-uncased, 40% smaller and 60% faster while retaining approximately 97% of BERT's language understanding performance on the GLUE benchmark. Trained via knowledge distillation from BERT using BookCorpus and Wikipedia. Commonly used when BERT's performance is needed but inference speed or resource constraints are limiting factors.
13,940,511 ↓ · 872 ♡
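Knowledge distillation of the kind used for DistilBERT trains the student against the teacher's temperature-softened output distribution. A schematic of the soft-target loss in plain Python (illustrative logits; DistilBERT's full objective also combines the hard-label MLM loss and a cosine embedding loss):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Raising T exposes the teacher's relative confidence in wrong classes
    (the 'dark knowledge'); the T*T factor keeps gradient scale stable.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q)) * T * T

teacher = [4.0, 1.0, 0.2]   # hypothetical teacher logits for 3 classes
student = [3.5, 1.2, 0.1]   # hypothetical student logits
loss = distill_loss(teacher, student)
```

The loss is minimized when the student's softened distribution matches the teacher's exactly.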
XLM-RoBERTa Large, the 560-million-parameter multilingual encoder from Facebook AI, trained on 2.5TB of CommonCrawl data across 100 languages. It offers stronger multilingual language understanding than the base variant across classification, NER, and cross-lingual tasks, at roughly 4x the compute cost. MIT licensed with multi-framework support.
6,758,270 ↓ · 511 ♡
Google's BERT base model in cased form, pre-trained on BookCorpus and English Wikipedia with original case preserved. Unlike bert-base-uncased, this model maintains distinctions between 'bert' and 'BERT' — essential for tasks where capitalization carries semantic information, such as named entity recognition. Same architecture as bert-base-uncased but with case-sensitive tokenization.
4,439,406 ↓ · 357 ♡
BERT-base-multilingual-cased is Google's multilingual BERT trained on 104-language Wikipedia data with case preserved, making it better suited than the uncased variant for named entity recognition and tasks where capitalization carries semantic meaning. It shares the same 12-layer Transformer architecture and 768-dimensional embedding space as BERT-base-uncased. Despite its age, it remains a common transfer learning starting point for multilingual tasks.
4,221,839 ↓ · 587 ♡
BERT-base-multilingual-uncased is Google's multilingual BERT trained on Wikipedia text from 104 languages with all text lowercased before tokenization. Lowercasing simplifies processing but removes capitalization signals that help named entity recognition. It produces 768-dimensional embeddings shared across all supported languages.
3,878,218 ↓ · 156 ♡
Microsoft's DeBERTaV3 base, which replaces masked language modeling with ELECTRA-style replaced-token detection and adds gradient-disentangled embedding sharing on top of DeBERTa's disentangled attention. A strong default English encoder for classification and extractive QA, typically outperforming RoBERTa base at comparable size.
2,522,455 ↓ · 418 ♡
BERTimbau Large from NeuralMind, a 24-layer cased Portuguese BERT pre-trained on the brWaC web corpus. The large counterpart to bert-base-portuguese-cased, used for Portuguese NER, classification, and QA where accuracy outweighs inference cost.
2,271,959 ↓ · 72 ♡
Bio_ClinicalBERT, initialized from BioBERT and further pre-trained on clinical notes from the MIMIC-III database. A common base model for clinical NLP tasks such as concept extraction, de-identification, and note classification.
2,236,514 ↓ · 427 ♡
ESM-2 650M from Meta AI, a 33-layer protein language model trained with masked language modeling on UniRef50 sequence clusters (the name encodes layer count, parameter count, and training set). A popular middle size in the ESM-2 family, widely used for protein embeddings and property prediction.
1,840,998 ↓ · 78 ♡
ModernBERT-base from Answer.AI and LightOn, a modernized encoder with rotary position embeddings, alternating local/global attention, and a native 8,192-token context, trained on roughly 2 trillion tokens. Positioned as a drop-in BERT replacement with better speed and downstream accuracy.
1,499,008 ↓ · 1,035 ♡
mDeBERTa-v3 base, the multilingual variant of Microsoft's DeBERTaV3, pre-trained on the CC100 corpus covering roughly 100 languages with the same replaced-token-detection objective. A strong alternative to XLM-RoBERTa base for cross-lingual classification and token-level tasks.
1,474,787 ↓ · 220 ♡
Microsoft's BiomedBERT base (formerly PubMedBERT), pre-trained from scratch on PubMed abstracts with a domain-specific vocabulary rather than continuing from general-domain BERT. The from-scratch recipe gives strong results across biomedical NLP benchmarks.
1,433,984 ↓ · 91 ♡
A distilled version of bert-base-multilingual-cased covering the same 104 languages with 6 Transformer layers instead of 12. Retains most of mBERT's multilingual capability at a substantially smaller size and faster inference.
1,311,322 ↓ · 239 ♡
Google's BERT large in uncased form: 24 layers, a 1024-dimensional hidden size, 16 attention heads, and roughly 340M parameters, pre-trained on the same BookCorpus and English Wikipedia data as the base model. Chosen over bert-base-uncased when accuracy justifies roughly triple the parameter count.
1,159,848 ↓ · 147 ♡
Google's Chinese BERT base, pre-trained on Chinese Wikipedia with character-level tokenization. A long-standing default base model for Chinese classification, NER, and QA.
1,149,002 ↓ · 1,417 ♡
DistilRoBERTa base, a 6-layer, 82M-parameter distillation of roberta-base trained on OpenWebText. Roughly twice as fast as its teacher while retaining most of its accuracy; a frequent base for sentence-embedding and classification models.
1,099,593 ↓ · 177 ♡
The large variant of Microsoft's DeBERTaV3, with 24 layers and a 1024-dimensional hidden size. Among the strongest open encoders of its generation on GLUE-style benchmarks, used where accuracy matters more than inference cost.
1,004,903 ↓ · 277 ♡
The smallest ESM-2 checkpoint: 6 layers and 8M parameters, trained on UniRef50. Useful for fast prototyping of protein-sequence pipelines and as a lightweight embedding model where larger ESM-2 variants are too expensive.
859,624 ↓ · 34 ♡
CamemBERT base, a French RoBERTa-style encoder pre-trained on the French portion of the OSCAR corpus. The standard starting point for French classification, NER, and QA fine-tuning.
838,611 ↓ · 100 ♡
RoBERTa base, Facebook AI's 125M-parameter English encoder; this appears to be a second registry listing of the same model described earlier in this list.
799,186 ↓ · 48 ♡
The small DeBERTaV3 variant from Microsoft, a 6-layer encoder that retains much of the v3 recipe's accuracy at a fraction of the compute. Suited to latency-sensitive English classification.
737,237 ↓ · 77 ♡
ALBERT base v2 from Google, which shrinks BERT base's parameter count to roughly 12M via cross-layer parameter sharing and a factorized embedding (128-dimensional token embeddings projected up to a 768-dimensional hidden size). Memory-efficient, though inference is not proportionally faster since all 12 layers still execute.
723,642 ↓ · 141 ♡
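ALBERT's roughly-12M parameter count can be sanity-checked with a rough weight tally: the factorized embedding stores V·E + E·H weights instead of V·H, and one shared set of layer weights stands in for twelve. A back-of-envelope comparison with BERT base (weight matrices only; biases, LayerNorm, and pooler ignored):

```python
def layer_params(h):
    # One Transformer layer holds ~4h^2 attention-projection weights
    # plus ~8h^2 feed-forward weights => ~12h^2 in total.
    return 12 * h ** 2

H = 768                                           # hidden size shared by both models
bert = 30522 * H + 12 * layer_params(H)           # full V*H embedding + 12 distinct layers
albert = 30000 * 128 + 128 * H + layer_params(H)  # V*E + E*H factorization + 1 shared layer
```

This lands near 108M vs. 11M weights, matching the order-of-magnitude reduction ALBERT reports.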
The 36-layer, 3-billion-parameter ESM-2 model trained on UniRef50, and the backbone used by ESMFold for structure prediction. The strongest commonly used ESM-2 checkpoint short of the 15B model, at correspondingly high compute cost.
614,701 ↓ · 28 ♡
AraBERT v0.2 base from the AUB MIND Lab, an Arabic BERT pre-trained on large Arabic news and web corpora. A standard base model for Arabic classification, NER, and QA.
608,455 ↓ · 44 ♡
dummy-unknown, a tiny randomly initialized model that appears to be a CI/testing fixture rather than a usable encoder; the high download count alongside a single like suggests automated test traffic. Not intended for fine-tuning or inference.
599,280 ↓ · 1 ♡
BETO uncased, the Spanish BERT from the Universidad de Chile (dccuchile), pre-trained with whole-word masking on a large Spanish corpus. A long-standing default base model for Spanish NLP.
504,984 ↓ · 74 ♡
Clinical-Longformer, a Longformer further pre-trained on MIMIC-III clinical notes, supporting sequences up to 4,096 tokens. Suited to long clinical documents where standard 512-token encoders would truncate most of the text.
493,280 ↓ · 69 ♡
German BERT base (cased) from deepset, pre-trained on German text including Wikipedia, legal documents, and news. A standard base model for German classification and NER.
486,192 ↓ · 82 ♡
Bio_Discharge_Summary_BERT, a sibling of Bio_ClinicalBERT from the same authors, initialized from BioBERT but further pre-trained only on MIMIC-III discharge summaries. Intended for tasks centered on discharge documentation.
460,633 ↓ · 38 ♡
JuriBERT base, a BERT-style encoder pre-trained on French legal text for legal-domain NLP. Public details beyond the French legal focus are sparse.
417,574 ↓ · 0 ♡
ModernBERT-large, the roughly 400M-parameter sibling of ModernBERT-base, sharing its 8,192-token context and modernized architecture with higher downstream accuracy at greater compute cost.
403,262 ↓ · 467 ♡
KcBERT base, a Korean BERT pre-trained on user comments from Korean news portals, making it robust to the informal, noisy text of Korean social media. Commonly fine-tuned for Korean sentiment and toxicity classification.
390,214 ↓ · 30 ♡
ProtBert from Rostlab, a BERT-style protein language model pre-trained on UniRef100 with each amino acid treated as a token. Its embeddings are used for protein classification and residue-level prediction tasks.
352,050 ↓ · 132 ♡
GraphCodeBERT from Microsoft, a code encoder whose pre-training incorporates data-flow graphs so the model sees where variable values come from, not just token order. Used for code search, clone detection, and code refinement.
333,574 ↓ · 87 ♡
Japanese BERT base from Tohoku University, pre-trained on Japanese Wikipedia with MeCab morphological tokenization followed by WordPiece, using whole-word masking. A standard Japanese fine-tuning base.
321,796 ↓ · 76 ♡
mmBERT base, a recent massively multilingual encoder in the ModernBERT lineage, trained on text covering over 1,800 languages. Positioned as a modern multilingual alternative to XLM-RoBERTa.
318,854 ↓ · 205 ♡
BERTimbau Base from NeuralMind, a 12-layer cased Portuguese BERT pre-trained on the brWaC corpus. The standard starting point for Brazilian Portuguese classification, NER, and QA.
300,876 ↓ · 229 ♡
Chinese BERT with whole-word masking (extended data) from HFL, trained on corpora beyond Chinese Wikipedia. Whole-word masking masks every subword of a Chinese word together, improving downstream accuracy over bert-base-chinese.
298,910 ↓ · 193 ♡
The 12-layer, 35M-parameter ESM-2 checkpoint trained on UniRef50, sitting between the 8M and 150M variants. Used when protein-embedding throughput matters more than accuracy.
292,604 ↓ · 21 ♡