XLM-RoBERTa OVOS intent classifier (base-sized model)
XLM-RoBERTa model pre-trained on 2.5 TB of filtered CommonCrawl data covering 100 languages. It was introduced in the paper *Unsupervised Cross-lingual Representation Learning at Scale* by Conneau et al. and first released in this repository.
This model was fine-tuned for intent classification on the Jarbas/ovos_intents_train dataset.
Intended uses & limitations
You can use the raw model for intent classification in the context of the Open Voice OS (OVOS) project.
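For quick, single-utterance classification, the high-level `pipeline` API is usually enough. A minimal sketch (the example utterance is made up for illustration):

```python
from transformers import pipeline

# The text-classification pipeline handles tokenization and label mapping.
classifier = pipeline("text-classification", model="fdemelo/xlm-roberta-ovos-intent-classifier")
print(classifier("what time is it"))  # e.g. [{"label": "...", "score": 0.99}]
```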
Usage
```python
from datasets import load_dataset
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer, Trainer

model = AutoModelForSequenceClassification.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
tokenizer = AutoTokenizer.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
config = AutoConfig.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")

# Load the fine-tuning dataset (the split name is an assumption).
dataset = load_dataset("Jarbas/ovos_intents_train", split="train")

# Preprocess: map string labels to class ids and tokenize the sentences.
def tokenize_function(examples):
    examples["label"] = [config.label2id[label] for label in examples["label"]]
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

# PyTorch models have no .predict() method; wrap the model in a Trainer
# to run batch inference over the tokenized dataset.
prediction = Trainer(model=model).predict(tokenized_dataset)
```
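The `predict` call returns raw logits, so a short decoding step is needed to recover intent names. A minimal sketch, assuming `prediction` is the output of the `Trainer.predict` call above:

```python
import numpy as np

# Take the argmax over the logits for each example and translate the
# class ids back to intent names through the model config.
pred_ids = np.argmax(prediction.predictions, axis=-1)
intents = [config.id2label[int(i)] for i in pred_ids]
```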