arxiv:2407.16382

TookaBERT: A Step Forward for Persian NLU

Published on Jul 23, 2024

Authors:

Abstract

Two new BERT models trained on Persian data outperform existing models across multiple NLU tasks, showcasing their effectiveness in Persian language processing.

AI-generated summary

The field of natural language processing (NLP) has seen remarkable advancements, thanks to the power of deep learning and foundation models. Language models, and specifically BERT, have been key players in this progress. In this study, we trained and introduced two new BERT models using Persian data. We put our models to the test, comparing them to seven existing models across 14 diverse Persian natural language understanding (NLU) tasks. The results speak for themselves: our larger model outperforms the competition, showing an average improvement of at least +2.8 points. This highlights the effectiveness and potential of our new BERT models for Persian NLU tasks.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 2

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2407.16382 in a dataset README.md to link it from this page.

Spaces citing this paper 1

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.