AUTHORS: Sania Aftar, Luca Gagliardelli, Amina El Ganadi, Federico Ruozzi, Sonia Bergamaschi
WORK PACKAGE: WP 5 – DIGITAL MAKTABA
URL
https://ceur-ws.org/Vol-3643/paper12.pdf
Keywords: Topic Modeling, Hadith, Neural Topic Model
Abstract
In this paper, we present our preliminary work on developing a novel neural-based approach named
RoBERT2VecTM, aimed at identifying topics within the “Matn” of “Hadith”. This approach focuses on
semantic analysis, showing potential to outperform current state-of-the-art models. Despite the
availability of various models for topic identification, many struggle with multilingual datasets. Furthermore,
some models are limited in discerning deep semantic meaning because they are not trained for languages such as
Arabic. Considering the sensitive nature of Hadith texts, where topics are often complexly interleaved,
careful handling is imperative. We anticipate that RoBERT2VecTM will offer substantial improvements
in understanding contextual relationships within texts, a crucial aspect for accurately identifying topics
in such intricate religious documents.