A Novel Methodology for Topic Identification in Hadith

AUTHORS: Sania Aftar, Luca Gagliardelli, Amina El Ganadi, Federico Ruozzi, Sonia Bergamaschi

WORK PACKAGE: WP 5 – DIGITAL MAKTABA

URL: https://ceur-ws.org/Vol-3643/paper12.pdf

Keywords: Topic Modeling, Hadith, Neural Topic Model

Abstract

In this paper, we present our preliminary work on developing a novel neural-based approach named
RoBERT2VecTM, aimed at identifying topics within the “Matn” of “Hadith”. This approach focuses on
semantic analysis, showing potential to outperform current state-of-the-art models. Despite the
availability of various models for topic identification, many struggle with multilingual datasets. Furthermore,
some models have limitations in discerning deep semantic meanings, as they are not trained for languages such as
Arabic. Considering the sensitive nature of Hadith texts, where topics are often complexly interleaved,
careful handling is imperative. We anticipate that RoBERT2VecTM will offer substantial improvements
in understanding contextual relationships within texts, a crucial aspect for accurately identifying topics
in such intricate religious documents.
