BAREC Shared Task 2025
Arabic Readability Assessment
The Third Arabic Natural Language Processing Conference (ArabicNLP 2025) @ EMNLP 2025
Suzhou, China
Shared Task Description
The BAREC Shared Task 2025 focuses on fine-grained readability classification across 19 levels using the Balanced Arabic Readability Evaluation Corpus (BAREC), a dataset of over 1 million words. Participants will build models for both sentence- and document-level classification.
Task 1: Sentence-level Readability Assessment
Given an Arabic sentence, predict its readability level on a scale from 1 to 19, indicating the degree of reading difficulty, where higher levels correspond to harder sentences.
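As an illustration only (not an official baseline), the following minimal sketch treats Task 1 as 19-way sentence classification with the Hugging Face pipeline API. The model name is a hypothetical placeholder for a classifier fine-tuned on the BAREC training data, and the label-to-level mapping is an assumption, not a prescribed format.

# Minimal Task 1 sketch: 19-way sentence-level readability classification.
from transformers import pipeline

# Hypothetical placeholder: plug in any 19-class model fine-tuned on the
# BAREC training split.
clf = pipeline("text-classification", model="your-org/barec-sentence-classifier")

def predict_sentence_level(sentence: str) -> int:
    """Return a readability level in 1..19 for one Arabic sentence."""
    label = clf(sentence)[0]["label"]     # e.g. "LABEL_11" (assumed 0-indexed)
    return int(label.split("_")[-1]) + 1  # map 0..18 -> 1..19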
Task 2: Document-level Readability Assessment
Given a document consisting of multiple sentences, predict its readability level on a scale from 1 to 19, where the hardest (i.e., highest-level) sentence in the document determines the overall document readability level.
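To make the document-level rule concrete, here is a minimal sketch, assuming per-sentence predictions (e.g., from a Task 1 system like the one sketched above) are already available: the document label is simply the maximum, i.e., hardest, predicted sentence level.

def predict_document_level(sentence_levels: list[int]) -> int:
    """Document readability = the hardest (highest) predicted sentence level."""
    return max(sentence_levels)

# Example: a document whose sentences are predicted at levels 4, 9, and 13
# is assigned document level 13.
print(predict_document_level([4, 9, 13]))  # -> 13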
Data:
The BAREC Corpus: The BAREC Corpus (Elmadani et al., 2025) consists of 1,922 documents and 69,441 sentences classified into 19 readability levels.
The SAMER Corpus: The SAMER Corpus (Alhafni et al., 2024) consists of 4,289 documents and 20,358 fragments classified into three readability levels.
The SAMER Lexicon: The SAMER Lexicon (Al Khalil et al., 2020) is a leveled readability lexicon of 40K lemma and part-of-speech pairs, each annotated with one of five readability levels.
The download links are provided above for each dataset.
Shared Task Tracks:
Participants can compete in one or more of the following tracks, each imposing different resource constraints:
Strict Track: Models must be trained exclusively on the BAREC Corpus.
Sentence-level Readability Assessment: CodaBench Link
Document-level Readability Assessment: CodaBench Link
Constrained Track: Participants may use the BAREC Corpus, SAMER Corpus (including document, fragment, and word-level annotations), and the SAMER Lexicon.
Sentence-level Readability Assessment: CodaBench Link
Document-level Readability Assessment: CodaBench Link
Open Track: No restrictions on external resources, allowing the use of any publicly available data.
Sentence-level Readability Assessment: CodaBench Link
Document-level Readability Assessment: CodaBench Link
Refer to our GitHub repository if you'd like to set up the evaluation locally.
Shared Task Phases:
Development Phase: This phase will run until July 25, 2025. Participants will build their models and submit predictions on the BAREC Test set, which is publicly available (i.e., the Open Test). Participants must submit their predictions through the respective CodaBench competition for each track. Submitting predictions in this phase does not require registration for the shared task; however, doing so does not make you an official participant. To be officially considered, you must register and submit your predictions during the Testing Phase.
Testing Phase: This phase will run from July 20, 2025 to July 25, 2025. Participants will upload their predictions on the Official Blind Test set (henceforth Blind Test). The Blind Test set will only be available to participants who registered to participate in the shared task. Participants must submit their predictions using the respective CodaBench competition for each track.
By registering to participate in the shared task and receiving access to the Official Blind Test set, you commit to submitting a description paper. Participants who register but fail to submit a paper may be disqualified from future shared tasks.
Metrics:
We define the Readability Assessment task as an ordinal classification task. We use the following metrics for evaluation:
Accuracy (Acc): The percentage of cases where the reference and predicted classes match on the 19-level scale (Acc19). We also report three variants, Acc7, Acc5, and Acc3, which collapse the 19 levels into 7, 5, and 3 levels, respectively.
Adjacent Accuracy (±1 Acc19): Also known as off-by-one accuracy, this metric allows some tolerance for predictions that are close to the true labels: it measures the proportion of predictions that are either exactly correct or off by at most one level.
Average Distance (Dist): Also known as Mean Absolute Error (MAE), it measures the average absolute difference between predicted and true labels.
Quadratic Weighted Kappa (QWK): An extension of Cohen's Kappa that measures the agreement between predicted and true labels but applies a quadratic penalty to larger misclassifications, so that predictions farther from the true label are penalized more heavily.
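For local sanity checks only, the sketch below computes Acc19, ±1 Acc19, Dist, and QWK from lists of gold and predicted levels (integers from 1 to 19). It omits Acc7, Acc5, and Acc3 because they depend on the official level-collapsing mapping in the released evaluation scripts, which remain the authoritative implementation.

import numpy as np
from sklearn.metrics import cohen_kappa_score

def evaluate(gold, pred):
    """Compute Acc19, adjacent accuracy, average distance (MAE), and QWK."""
    gold, pred = np.asarray(gold), np.asarray(pred)
    diff = np.abs(gold - pred)
    return {
        "Acc19": float(np.mean(diff == 0)),     # exact match on the 19-level scale
        "±1 Acc19": float(np.mean(diff <= 1)),  # off by at most one level
        "Dist": float(np.mean(diff)),           # mean absolute error
        "QWK": cohen_kappa_score(gold, pred, weights="quadratic",
                                 labels=list(range(1, 20))),
    }

# Example with four sentences:
print(evaluate([3, 7, 15, 19], [3, 8, 12, 19]))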
Awards:
Top-performing Systems:
We will recognize the top-performing system in each task and track combination (2 tasks × 3 tracks = 6 combinations), with a $100 prize per winning team.
Best System Description Papers:
We will award one or two prizes for Best System Description Papers. These will recognize clarity, reproducibility, and insight, regardless of leaderboard ranking.
Best Paper: $250
Runner-up or Honorable Mention: $150
Important Dates
June 10, 2025: Release of training, development, and open test data, along with evaluation scripts.
July 20, 2025: Registration deadline and release of test data.
July 25, 2025: End of evaluation cycle (test set submission closes).
July 30, 2025: Final results released.
August 15, 2025: System description paper submissions due.
August 25, 2025: Notification of acceptance.
September 5, 2025: Camera-ready versions due.
Shared Task Paper Submission
Please check the paper submission guidelines.
Organizers
Khalid N. Elmadani: New York University Abu Dhabi
Bashar Alhafni: New York University Abu Dhabi and Mohamed bin Zayed University of Artificial Intelligence
Hanada Taha-Thomure: Zayed University
Nizar Habash: New York University Abu Dhabi
Contact
For any questions related to this task, check out the FAQs. Feel free to post your questions on our Slack workspace. You are also welcome to contact the organizers directly at this email address: barec25.organizers@camel-lab.com.