|
<!doctype html> |
|
<html lang="en"> |
|
<head> |
|
<meta charset="utf-8" /> |
|
<meta name="viewport" content="width=device-width" /> |
|
<title>Iqra’Eval Shared Task</title> |
|
<style> |
|
|
|
:root { |
|
--navy-blue: #001f4d; |
|
--coral: #ff6f61; |
|
--light-gray: #f5f7fa; |
|
--text-dark: #222; |
|
} |
|
|
|
body { |
|
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; |
|
background-color: var(--light-gray); |
|
color: var(--text-dark); |
|
margin: 20px; |
|
line-height: 1.6; |
|
} |
|
|
|
h1, h2, h3 { |
|
color: var(--navy-blue); |
|
font-weight: 700; |
|
margin-top: 1.2em; |
|
} |
|
|
|
h1 { |
|
text-align: center; |
|
font-size: 2.8rem; |
|
margin-bottom: 0.3em; |
|
} |
|
|
|
h2 { |
|
border-bottom: 3px solid var(--coral); |
|
padding-bottom: 0.3em; |
|
} |
|
|
|
h3 { |
|
color: var(--coral); |
|
margin-top: 1em; |
|
} |
|
|
|
p { |
|
max-width: 900px; |
|
margin: 0.8em auto; |
|
} |
|
|
|
strong { |
|
color: var(--navy-blue); |
|
} |
|
|
|
ul { |
|
max-width: 900px; |
|
margin: 0.5em auto 1.5em auto; |
|
padding-left: 1.2em; |
|
} |
|
|
|
ul li { |
|
margin: 0.4em 0; |
|
} |
|
|
|
code { |
|
background-color: #eef4f8; |
|
color: var(--navy-blue); |
|
padding: 2px 6px; |
|
border-radius: 4px; |
|
font-family: Consolas, monospace; |
|
font-size: 0.9em; |
|
} |
|
|
|
pre { |
|
max-width: 900px; |
|
background-color: #eef4f8; |
|
color: var(--navy-blue); |
|
padding: 1em; |
|
border-radius: 8px; |
|
overflow-x: auto; |
|
font-family: Consolas, monospace; |
|
font-size: 0.95em; |
|
margin: 0.8em auto; |
|
} |
|
|
|
a { |
|
color: var(--coral); |
|
text-decoration: none; |
|
} |
|
|
|
a:hover { |
|
text-decoration: underline; |
|
} |
|
|
|
.card { |
|
max-width: 1200px; |
|
background: white; |
|
margin: 0 auto 40px auto; |
|
padding: 2em 2.5em; |
|
box-shadow: 0 4px 14px rgba(0,0,0,0.1); |
|
border-radius: 12px; |
|
} |
|
|
|
|
|
div img { |
|
display: block; |
|
margin: 20px auto; |
|
max-width: 100%; |
|
height: auto; |
|
border-radius: 8px; |
|
box-shadow: 0 4px 8px rgba(0,31,77,0.15); |
|
} |
|
|
|
.centered p { |
|
text-align: center; |
|
font-style: italic; |
|
color: var(--navy-blue); |
|
margin-top: 0.4em; |
|
} |
|
|
|
.highlight { |
|
color: var(--coral); |
|
font-weight: 700; |
|
} |
|
|
|
|
|
p > ul { |
|
margin-top: 0.3em; |
|
} |
|
|
|
</style> |
|
</head> |
|
<body> |
|
<div class="card"> |
|
<h1>Iqra’Eval Shared Task</h1> |
|
|
|
<div> |
|
<img src="IqraEval.png" alt="IqraEval Logo" /> |
|
</div> |
|
|
|
|
|
<h2>Overview</h2> |
|
<p> |
|
<strong>Iqra’Eval</strong> is a shared task aimed at advancing <strong>automatic assessment of Qur’anic recitation pronunciation</strong> by leveraging computational methods to detect and diagnose pronunciation errors. The focus on Qur’anic recitation provides a standardized and well-defined context for evaluating Modern Standard Arabic (MSA) pronunciation.
|
</p> |
|
<p> |
|
Participants will develop systems capable of detecting mispronunciations (e.g., substitution, deletion, or insertion of phonemes). |
|
</p> |
|
|
|
|
|
<h2>Timeline</h2> |
|
<ul> |
|
<li><strong>June 1, 2025</strong>: Official announcement of the shared task</li> |
|
<li><strong>June 10, 2025</strong>: Release of training data, development set (QuranMB), phonetizer script, and baseline systems</li> |
|
<li><strong>July 24, 2025</strong>: Registration deadline and release of test data</li> |
|
<li><strong>July 27, 2025</strong>: End of evaluation cycle (test set submission closes)</li> |
|
<li><strong>July 30, 2025</strong>: Final results released</li> |
|
<li><strong>August 15, 2025</strong>: System description paper submissions due</li> |
|
<li><strong>August 22, 2025</strong>: Notification of acceptance</li> |
|
<li><strong>September 5, 2025</strong>: Camera-ready versions due</li> |
|
</ul> |
|
|
|
|
|
<h2>Task Description: Qur’anic Mispronunciation Detection System</h2>
|
<p> |
|
The aim is to design a model that detects and provides detailed feedback on mispronunciations in Qur’anic recitation.
Users read vowelized Qur’anic verses aloud; the model predicts the phoneme sequence the speaker actually uttered, which may contain mispronunciations.
Models are evaluated on the <strong>QuranMB.v2</strong> dataset, which contains human-annotated mispronunciations.
|
</p> |
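<p>
To make the interface concrete, the sketch below runs a CTC-based phoneme recognizer over a single recording. This is a minimal sketch only: the checkpoint name is a placeholder rather than an official baseline, and any model that maps 16 kHz audio to a phoneme string fills the same role.
</p>
<pre>
# Minimal sketch: audio in, predicted phoneme sequence out.
# Assumes a CTC acoustic model fine-tuned for phoneme recognition;
# "your-org/phoneme-ctc" is a placeholder, not an official baseline.
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("your-org/phoneme-ctc")
model = Wav2Vec2ForCTC.from_pretrained("your-org/phoneme-ctc")

speech, sr = sf.read("recitation.wav")  # expected: 16 kHz mono
inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids)[0])   # e.g. "&lt; i n n a SS A f aa ..."
</pre>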
|
|
|
<div class="centered"> |
|
<img src="task.png" alt="System Overview" /> |
|
<p>Figure: Overview of the Mispronunciation Detection Workflow</p> |
|
</div> |
|
|
|
<h3>1. Read the Verse</h3> |
|
<p> |
|
The user is shown a <strong>Reference Verse</strong> (What should have been said) in Arabic script along with its corresponding <strong>Reference Phoneme Sequence</strong>. |
|
</p> |
|
<p><strong>Example:</strong></p> |
|
<ul> |
|
<li><strong>Arabic:</strong> إِنَّ الصَّفَا وَالْمَرْوَةَ مِنْ شَعَائِرِ اللَّهِ</li> |
|
<li> |
|
<strong>Phoneme:</strong> |
|
<code>&lt; i n n a SS A f aa w a l m a r w a t a m i n $ a E a a &lt; i r i l l a h i</code>
|
</li> |
|
</ul> |
|
|
|
<h3>2. Save Recording</h3> |
|
<p> |
|
The user recites the verse aloud; the system captures and stores the audio waveform for subsequent analysis. |
|
</p> |
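<p>
One way to capture and store the waveform, sketched with the <code>sounddevice</code> and <code>soundfile</code> packages; both the libraries and the 16 kHz mono format are assumptions, not task requirements:
</p>
<pre>
# Record a fixed-length utterance and store it for later analysis.
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 16_000   # common rate for speech models (assumption)
DURATION = 10          # seconds of audio to capture

audio = sd.rec(int(DURATION * SAMPLE_RATE),
               samplerate=SAMPLE_RATE, channels=1)
sd.wait()                                  # block until recording ends
sf.write("recitation.wav", audio, SAMPLE_RATE)
</pre>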
|
|
|
<h3>3. Mispronunciation Detection</h3> |
|
<p> |
|
The stored audio is fed into a <strong>Mispronunciation Detection Model</strong>. |
|
This model predicts the phoneme sequence uttered by the speaker, which may contain mispronunciations. |
|
</p> |
|
<p><strong>Example of Mispronunciation:</strong></p> |
|
<ul> |
|
<li><strong>Reference Phoneme Sequence (What should have been said):</strong> <code>&lt; i n n a SS A f aa w a l m a r w a t a m i n $ a E a a &lt; i r i l l a h i</code></li>
|
<li><strong>Model Phoneme Prediction (What is predicted):</strong> <code>&lt; i n n a SS A f aa w a l m a r w a t a m i n s a E a a &lt; i r u l l a h i</code></li>
|
<li> |
|
<strong>Annotated Phoneme Sequence (What is said):</strong> |
|
<code>&lt; i n n a SS A f aa w a l m a r w a m i n <span class="highlight">s</span> a E a a &lt; i r <span class="highlight">u</span> l l a h i</code>
|
</li> |
|
</ul> |
|
<p> |
|
In this case, the phoneme <code>$</code> was mispronounced as <code>s</code>, and <code>i</code> was mispronounced as <code>u</code>. |
|
</p> |
|
<p> |
|
The annotated phoneme sequence also shows that the phonemes <code>t a</code> were omitted (a deletion error); the model failed to detect this omission.
|
</p> |
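<p>
The substitution and deletion labels above can be recovered mechanically by aligning the reference sequence against the annotated one. A minimal sketch using Python's standard library; this is plain edit-distance alignment for illustration, not the official scoring script:
</p>
<pre>
from difflib import SequenceMatcher

ref = "&lt; i n n a SS A f aa w a l m a r w a t a m i n $ a E a a &lt; i r i l l a h i".split()
ann = "&lt; i n n a SS A f aa w a l m a r w a m i n s a E a a &lt; i r u l l a h i".split()

# Walk the alignment and report each error type.
for op, i1, i2, j1, j2 in SequenceMatcher(None, ref, ann).get_opcodes():
    if op == "replace":
        print("substitution:", ref[i1:i2], "->", ann[j1:j2])
    elif op == "delete":
        print("deletion:", ref[i1:i2], "omitted")
    elif op == "insert":
        print("insertion:", ann[j1:j2], "added")

# Output:
# deletion: ['t', 'a'] omitted
# substitution: ['$'] -> ['s']
# substitution: ['i'] -> ['u']
</pre>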
|
|
|
<h2>Training Dataset: Description</h2> |
|
<p> |
|
All data are hosted on Hugging Face. Two main splits are provided: |
|
</p> |
|
<ul> |
|
<li> |
|
<strong>Training set:</strong> 79 hours of Modern Standard Arabic (MSA) Qur’anic recitations (5,167 audio files)
|
</li> |
|
<li> |
|
<strong>Evaluation set:</strong> QuranMB.v2 dataset with phoneme-level mispronunciation annotations, which includes: |
|
<ul> |
|
<li>QuranMB-Train: 9 hours (1,218 files) for development</li> |
|
<li>QuranMB-Test: 8 hours (1,018 files) for evaluation</li> |
|
</ul> |
|
</li> |
|
</ul> |
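<p>
The splits above can be pulled directly with the Hugging Face <code>datasets</code> library. A minimal sketch; the repository IDs and field names below are placeholders, so substitute the identifiers published on the official task page:
</p>
<pre>
from datasets import load_dataset

# Repository IDs are placeholders, not official identifiers.
train = load_dataset("IqraEval/train", split="train")
dev = load_dataset("IqraEval/QuranMB_v2", split="train")

sample = train[0]
print(sample["audio"]["array"].shape)  # waveform samples (field name assumed)
print(sample["phonemes"])              # reference phonemes (field name assumed)
</pre>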
|
|
|
<h2>Submission Guidelines</h2> |
|
<p> |
|
Participants should submit their predicted phoneme sequences on the test set by the deadline (July 27, 2025). Submissions will be automatically evaluated using the official scoring scripts. |
|
</p> |
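<p>
A submission is, in essence, one predicted phoneme sequence per test utterance. The sketch below writes such a file as CSV; the file name and column layout are illustrative assumptions, so follow the official instructions released with the test data:
</p>
<pre>
import csv

# Map each test utterance ID to its predicted phoneme string.
# IDs and column names here are assumptions for illustration only.
predictions = {
    "utt_0001": "&lt; i n n a SS A f aa ...",
    "utt_0002": "...",
}

with open("submission.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["ID", "Labels"])
    for utt_id, phonemes in predictions.items():
        writer.writerow([utt_id, phonemes])
</pre>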
|
|
|
<h2>Evaluation Metrics</h2> |
|
<p> |
|
Systems will be evaluated by phoneme error rate (PER), computed over the test set, which measures how accurately mispronunciations are detected and localized.
|
</p> |
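<p>
PER is standardly defined as the Levenshtein (edit) distance between the predicted and annotated phoneme sequences, normalized by the length of the annotated (ground-truth) sequence. A minimal sketch of the computation; the official scoring scripts may differ in detail:
</p>
<pre>
# Phoneme error rate: edit distance over reference length.
def per(reference, hypothesis):
    m, n = len(reference), len(hypothesis)
    # d[i][j] = edit distance between reference[:i] and hypothesis[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / m

ref = "&lt; i n n a SS A f aa".split()
hyp = "&lt; i n n a s A f aa".split()
print(per(ref, hyp))  # one substitution over 9 phonemes = 0.111...
</pre>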
|
|
|
<h2>Contact and Support</h2> |
|
<p> |
|
For inquiries and support, reach out to the task coordinators at |
|
<a href="mailto:support@iqraeval.org">support@iqraeval.org</a>. |
|
</p> |
|
|
|
</div> |
|
</body> |
|
</html> |
|
|
|
|