
XLM-RoBERTa OVOS intent classifier (base-sized model)

XLM-RoBERTa is a model pre-trained on 2.5TB of filtered CommonCrawl data covering 100 languages. It was introduced in the paper Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al. and first released in the fairseq repository.

This model was fine-tuned for intent classification on the Jarbas/ovos_intents_train dataset.

Intended uses & limitations

You can use the raw model for intent classification in the context of the Open Voice OS (OVOS) project.
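For quick predictions, the Transformers pipeline API wraps the same model. A minimal sketch (the example utterance is illustrative):

from transformers import pipeline

classifier = pipeline("text-classification", model="fdemelo/xlm-roberta-ovos-intent-classifier")
print(classifier("turn on the kitchen lights"))  # predicted intent label and score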

Usage

from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig, Trainer

model = AutoModelForSequenceClassification.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
tokenizer = AutoTokenizer.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
config = AutoConfig.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")

# load the intent dataset ("train" split assumed here; adjust as needed)
dataset = load_dataset("Jarbas/ovos_intents_train", split="train")

# preprocess dataset: map string intent labels to ids and tokenize the text
def tokenize_function(examples):
    examples["label"] = [config.label2id[x] for x in examples["label"]]
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

# the model itself has no .predict() method; wrap it in a Trainer for batched inference
trainer = Trainer(model=model)
prediction = trainer.predict(tokenized_dataset)
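
For a single utterance you can skip the dataset pipeline and call the model directly. A minimal sketch reusing the model, tokenizer, and config loaded above (the example sentence is illustrative):

import torch

inputs = tokenizer("what time is it", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# map the highest-scoring class id back to its intent name
print(config.id2label[logits.argmax(dim=-1).item()])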
Model size: 278M parameters (F32 tensors, Safetensors format)