Model type: Transformer-based masked language model

Training data: None; no additional pretraining is performed, as the model is built by merging two existing pretrained models

Languages: 100+ languages

Architecture:

  • Base architectures:
      • XLM-RoBERTa base (multilingual)
      • BERT base cased (multilingual)

A custom merging technique combines the weights of both base models into a single unified model, as sketched below.
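The card does not detail the merging method. As a minimal sketch, assuming a simple parameter-averaging merge over parameters whose names and shapes match in both checkpoints (embedding tables differ, since the two tokenizers use different vocabularies, so those fall back to the XLM-RoBERTa weights), the merge could look like this with 🤗 Transformers; the output directory name `merged-multilingual-mlm` is hypothetical:

```python
import torch
from transformers import AutoModel

# Load the two base checkpoints from the Hugging Face Hub
xlmr = AutoModel.from_pretrained("xlm-roberta-base")
mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")

xlmr_sd = xlmr.state_dict()
mbert_sd = mbert.state_dict()

# Average parameters that exist in both models with identical shapes;
# keep the XLM-RoBERTa weights everywhere else (e.g. the embedding
# tables, whose vocabulary sizes differ between the two models).
merged_sd = {}
for name, param in xlmr_sd.items():
    other = mbert_sd.get(name)
    if other is not None and other.shape == param.shape:
        merged_sd[name] = (param + other) / 2.0
    else:
        merged_sd[name] = param

# Load the merged weights back and save a single unified model
xlmr.load_state_dict(merged_sd)
xlmr.save_pretrained("merged-multilingual-mlm")
```

This averaging strategy is only one plausible interpretation of a weight merge; the actual technique used by the model may differ.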
