Call for Applications RESILIENCE: Transnational Access Fellowships 2025-2026
RESILIENCE, the European cross-disciplinary research infrastructure serving the study of religion, launches its fifth call for applications for Transnational Access Fellowships. The call will be open from March 13 to May 1, 2025. Scholars at all career levels can apply for a short research visit to a host of their choice, under the conditions set by that host.
The RESILIENCE TNA program aims to provide scholars with direct, fast, and effective access to research materials. Host institutions grant access to manuscripts, rare books, and documents, while also offering a unique match-making service that connects researchers with experts in their field.
TNA users can visit institutions in a country other than their own, benefiting from free access to collections, tailored institutional support, and guidance from local scholars. This enables faster resource access and more efficient research.
A typical TNA visit lasts two weeks, allowing researchers to meet experts such as curators, conservators, and restorers.
Benefits for TNA Users @ITSERR
The five ITSERR consortium partners, which serve as TNA hosts within RESILIENCE, offer attractive conditions for a TNA fellowship, including coverage of the costs of flights and accommodation with full or half board. Read more about ITSERR’s offering here.
Keywords: Generative AI, Computer Graphics, Denoising Diffusion Probabilistic Model, Gaussian Splatting, NeRF, Signed Distance Field, Video Reconstruction, Deep Learning, Machine Learning, Artificial Intelligence, Text-to-3D, Image-to-3D, Urban Environment, Score Distillation Sampling
Abstract The reconstruction of large-scale real outdoor environments is crucial for promoting the adoption of Extended Reality (XR) in industrial and entertainment sectors. This task often requires significant resources such as depth cameras, LiDAR sensors, drones, and others, alongside traditional data processing pipelines like Structure-from-Motion (SfM), which demand extensive computational resources and thus prevent real-time processing. Additional constraints arise from the limited accessibility of the aforementioned resources. While 3D laser scanners (e.g., LiDAR) are precise and fast, they are expensive, often bulky (especially the high-quality models), and their effectiveness is contingent on the type of environment being scanned. Depth sensors offer a more affordable and compact alternative; however, due to their limited range, they are ideal only for indoor settings. Photogrammetry, while capable of producing high-quality results at a lower cost, can be time-consuming and computationally intensive. It also suffers from limited accuracy, strong dependence on lighting conditions, and the need for numerous photos from various angles, which are not always easy to obtain. (…)
Spatio-Temporal 3D Reconstruction from Frame Sequences and Feature Points
Abstract Reconstructing a large real environment is a fundamental task to promote eXtended Reality adoption in industrial and entertainment fields. However, the short range of depth cameras, the sparsity of LiDAR sensors, and the huge computational cost of Structure-from-Motion pipelines prevent scene replication in near real time. To overcome these limitations, we introduce a spatio-temporal diffusion neural architecture, a generative AI technique that fuses temporal information (i.e., a short temporally-ordered list of color photographs, like sparse frames of a video stream) with an approximate spatial resemblance of the explored environment. Our aim is to modify an existing 3D diffusion neural model to produce a Signed Distance Field volume from which a 3D mesh representation can be extracted. Our results show that the hallucination approach of diffusion models is an effective methodology where a fast reconstruction is a crucial target.
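The final step mentioned in the abstract, extracting a 3D mesh from the predicted Signed Distance Field volume, can be illustrated with a short sketch. This is not the authors' code: it assumes the SDF is available as a dense NumPy array and uses standard marching cubes on its zero level set.

```python
# Minimal sketch (assumption: the diffusion model outputs a dense SDF volume).
# Negative values lie inside surfaces, positive values outside.
import numpy as np
from skimage import measure

def sdf_to_mesh(sdf: np.ndarray, voxel_size: float = 1.0):
    """Run marching cubes on the zero level set of the SDF volume."""
    verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
    return verts * voxel_size, faces, normals

# Toy example: a sphere of radius 20 voxels inside a 64^3 volume.
grid = np.mgrid[0:64, 0:64, 0:64].astype(np.float32)
center = np.array([32.0, 32.0, 32.0]).reshape(3, 1, 1, 1)
sdf = np.linalg.norm(grid - center, axis=0) - 20.0

verts, faces, normals = sdf_to_mesh(sdf)
print(verts.shape, faces.shape)  # vertex and triangle counts of the extracted mesh
```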
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
Abstract An increasing number of pretrained Large Language Models (LLMs) are being released, though the majority are predominantly designed for English. While they can often handle other languages due to contamination or some degree of multilingual pretraining data, English-centric LLMs are not optimized for non-English languages. This leads to inefficient encoding (high token ‘fertility’) and slower inference times for those languages. In this work, we explore various vocabulary adaptation techniques to tailor English LLMs to the Italian language. We introduce Semantic Alignment Vocabulary Adaptation (SAVA), a novel method that learns a neural mapping to accomplish vocabulary substitution and achieves state-of-the-art performance on several downstream tasks. We adapted two LLMs: Mistral-7b-v0.1, reducing token fertility by 25%, and Llama-3.1-8b, optimizing the vocabulary and reducing the number of parameters by 1 billion. We show that, after the adaptation of the vocabulary, these models can recover their performance with a relatively limited stage of continual training on the target language. Finally, we test the adapted models’ capabilities on several multi-choice and generative tasks.
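SAVA is described as learning a neural mapping for vocabulary substitution; the exact procedure is not reproduced here. The sketch below shows only the general idea behind such adaptations, deliberately simplified to a least-squares (non-neural) map: embeddings for a new Italian-oriented vocabulary are initialized by aligning a helper embedding space to the target LLM's embedding space on shared tokens. All tensors are stand-ins.

```python
# Hedged sketch of vocabulary adaptation by embedding alignment (NOT SAVA itself).
import torch

def fit_linear_map(src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
    """Least-squares map W such that src @ W ~ tgt, fitted on tokens shared by
    the old and new vocabularies (src: [n, d_src], tgt: [n, d_tgt])."""
    return torch.linalg.lstsq(src, tgt).solution

def init_new_embeddings(helper_new, helper_shared, llm_shared):
    """Map every new-vocabulary embedding from the helper space into the LLM space."""
    W = fit_linear_map(helper_shared, llm_shared)
    return helper_new @ W

# Toy dimensions: 1000 shared tokens, helper dim 384, LLM dim 4096, 500 new tokens.
helper_shared = torch.randn(1000, 384)
llm_shared = torch.randn(1000, 4096)
helper_new = torch.randn(500, 384)

new_llm_embeddings = init_new_embeddings(helper_new, helper_shared, llm_shared)
print(new_llm_embeddings.shape)  # torch.Size([500, 4096])
```

In practice the aligned embeddings would replace the LLM's input/output embedding rows for the new vocabulary before the continual-training stage mentioned in the abstract.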
Wordnet and Word Ladders: Climbing the abstraction taxonomy with LLMs
Abstract WordNet has long served as a benchmark for approximating the mechanisms of semantic categorization in the human mind, particularly through its hierarchical structure of word synsets, most notably the IS-A relation. However, these semantic relations have traditionally been curated manually by expert lexicographers, relying on external resources like dictionaries and corpora. In this paper, we explore whether large language models (LLMs) can be leveraged to approximate these hierarchical semantic relations, potentially offering a scalable and more dynamic alternative for maintaining and updating the WordNet taxonomy. This investigation addresses the feasibility and implications of automating this process with LLMs by testing a set of prompts encoding different sociodemographic traits, and finds that adding age and job information to the prompt affects the model’s ability to generate text in agreement with hierarchical semantic relations, while gender does not have a statistically significant impact.
The Invalsi Benchmarks: measuring the Linguistic and Mathematical understanding of Large Language Models in Italian
Abstract While Italian is a high-resource language, there are few Italian-native benchmarks to evaluate generative Large Language Models (LLMs) in this language. This work presents three new benchmarks: Invalsi MATE to evaluate models’ performance on mathematical understanding in Italian, Invalsi ITA to evaluate language understanding in Italian, and Olimpiadi MATE for more complex mathematical understanding. The first two benchmarks are based on the Invalsi tests, which are administered to students aged between 6 and 18 within the Italian school system and have been validated by several experts in teaching and pedagogy; the third comes from the Italian high-school math Olympics. We evaluate 10 powerful language models on these benchmarks and find that they are bound by 71% accuracy on Invalsi MATE, achieved by Llama 3.1 70b instruct, and by 88% on Invalsi ITA. For both Invalsi MATE and Invalsi ITA we compare LLMs with the average performance of Italian students, showing that Llama 3.1 is the only one to outperform them on Invalsi MATE, while most models do so on Invalsi ITA. We then show that Olimpiadi MATE is more challenging than Invalsi MATE: the highest accuracy, achieved by Llama 3.1 405b instruct, is 45%.
ABRICOT – ABstRactness and Inclusiveness in COntexT: A CALAMITA Challenge
Keywords: Abstraction, Inclusiveness, Context, LLM evaluation, Italian Language Models
Abstract The ABRICOT Task is designed to evaluate Italian language models on their ability to understand and assess the abstractness and inclusiveness of language, two nuanced features that humans naturally convey in everyday communication. Unlike binary categorizations such as abstract/concrete or inclusive/exclusive, these features exist on a continuous spectrum with varying degrees of intensity. The task is based on a manual collection of sentences that present the same noun phrase (NP) in different contexts, allowing its interpretation to vary between the extremes of abstractness and inclusiveness. This challenge aims to verify how LLMs perceive subtle linguistic variations and their implications in natural language.
INVALSI – Mathematical and Language Understanding in Italian: A CALAMITA Challenge
Keywords: Mathematical Understanding, Language Understanding, Invalsi, Large Language Models, Italian Language Models
Abstract While Italian is a high-resource language, there are few Italian-native benchmarks to evaluate the generative abilities of Language Models (LMs) in this language. This work presents two new benchmarks: Invalsi MATE to evaluate models’ performance on mathematical understanding in Italian and Invalsi ITA to evaluate language understanding in Italian. These benchmarks are based on the Invalsi tests, which are administered to students aged between 6 and 18 within the Italian school system. These tests are prepared by expert pedagogists and have the explicit goal of testing average students’ performance over time across Italy. Therefore, the questions are well written, appropriate for the age of the students, and developed with the goal of assessing skills that are essential in the learning process, ensuring that the benchmark proposed here measures key knowledge for undergraduate students. Invalsi MATE is composed of 420 questions about mathematical understanding; these questions range from simple money-counting problems to Cartesian geometry questions, e.g., determining whether a point belongs to a given line. They are divided into 4 different types: scelta multipla (multiple choice), vero/falso (true/false), numero (number), completa frase (fill the gap). Invalsi ITA is composed of 1279 questions regarding language understanding; these questions involve both the ability to extract information and answer questions about a text passage and questions about grammatical knowledge. They are divided into 4 different types: scelta multipla (multiple choice), binaria (binary), domanda aperta (open question), altro (other). We evaluate 4 powerful language models, both English-first and tuned for Italian, and find that the best accuracy on Invalsi MATE is 55%, while the best accuracy on Invalsi ITA is 80%.
AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
Abstract Large Language Models (LLMs) are increasingly used as ‘content farm’ models (CFMs), to generate synthetic text that could pass for real news articles. This is already happening even for languages that do not have high-quality monolingual LLMs. We show that fine-tuning Llama (v1), mostly trained on English, on as little as 40K Italian news articles, is sufficient for producing news-like texts that native speakers of Italian struggle to identify as synthetic. We investigate three LLMs and three methods of detecting synthetic texts (log-likelihood, DetectGPT, and supervised classification), finding that they all perform better than human raters, but they are all impractical in the real world (requiring either access to token likelihood information or a large dataset of CFM texts). We also explore the possibility of creating a proxy CFM: an LLM fine-tuned on a similar dataset to one used by the real ‘content farm’. We find that even a small amount of fine-tuning data suffices for creating a successful detector, but we need to know which base LLM is used, which is a major challenge. Our results suggest that there are currently no practical methods for detecting synthetic news-like texts ‘in the wild’, while generating them is too easy. We highlight the urgency of more NLP research on this problem.
Is CLIP the main roadblock for fine-grained open-world perception?
Abstract Modern applications increasingly demand flexible computer vision models that adapt to novel concepts not encountered during training. This necessity is pivotal in emerging domains like extended reality, robotics, and autonomous driving, which require the ability to respond to open-world stimuli. A key ingredient is the ability to identify objects based on free-form textual queries defined at inference time – a task known as open-vocabulary object detection. Multimodal backbones like CLIP are the main enabling technology for current open-world perception solutions. Despite performing well on generic queries, recent studies highlighted limitations on the fine-grained recognition capabilities in open-vocabulary settings – i.e., for distinguishing subtle object features like color, shape, and material. In this paper, we perform a detailed examination of these open-vocabulary object recognition limitations to find the root cause. We evaluate the performance of CLIP, the most commonly used vision-language backbone, against a fine-grained object-matching benchmark, revealing interesting analogies between the limitations of open-vocabulary object detectors and their backbones. Experiments suggest that the lack of fine-grained understanding is caused by the poor separability of object characteristics in the CLIP latent space. Therefore, we try to understand whether fine-grained knowledge is present in CLIP embeddings but not exploited at inference time due, for example, to the unsuitability of the cosine similarity matching function, which may discard important object characteristics. Our preliminary experiments show that simple CLIP latent-space re-projections help separate fine-grained concepts, paving the way towards the development of backbones inherently able to process fine-grained details. The code for reproducing these experiments is available at https://github.com/lorebianchi98/FG-CLIP.
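The re-projection idea can be pictured with a schematic sketch (not the paper's code; the embeddings below are random stand-ins for pre-computed CLIP features, and the projection is left untrained for brevity): cosine matching is performed either directly in the CLIP latent space or after a learned linear re-projection.

```python
# Illustrative sketch: cosine matching of CLIP-style embeddings, optionally
# preceded by a learned linear re-projection of the latent space.
import torch
import torch.nn.functional as F

class ReProjection(torch.nn.Module):
    """A simple linear re-projection of the latent space (would be trained in practice)."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)

def match(img_emb, text_embs, reproj=None):
    if reproj is not None:
        img_emb, text_embs = reproj(img_emb), reproj(text_embs)
    sims = F.normalize(img_emb, dim=-1) @ F.normalize(text_embs, dim=-1).T
    return sims.argmax(dim=-1)  # index of the best-matching caption

img_emb = torch.randn(1, 512)    # stand-in for an image embedding
text_embs = torch.randn(4, 512)  # stand-ins for fine-grained captions (hard negatives)
print(match(img_emb, text_embs, ReProjection(512)))
```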
Same or Different? Diff-Vectors for Authorship Analysis
Keywords: deep learning, machine learning, information retrieval, computer science, data mining, support vector, logistic regression, artificial intelligence, supervised learning
Abstract In this article, we investigate the effects on authorship identification tasks (including authorship verification, closed-set authorship attribution, and closed-set and open-set same-author verification) of a fundamental shift in how to conceive the vectorial representations of documents that are given as input to a supervised learner. In “classic” authorship analysis, a feature vector represents a document, the value of a feature represents (an increasing function of) the relative frequency of the feature in the document, and the class label represents the author of the document. We instead investigate the situation in which a feature vector represents an unordered pair of documents, the value of a feature represents the absolute difference in the relative frequencies (or increasing functions thereof) of the feature in the two documents, and the class label indicates whether the two documents are from the same author or not. This latter (learner-independent) type of representation has been occasionally used before, but has never been studied systematically. We argue that it is advantageous, and that, in some cases (e.g., authorship verification), it provides a much larger quantity of information to the training process than the standard representation. The experiments that we carry out on several publicly available datasets (including one that we make available here for the first time) show that feature vectors representing pairs of documents (which we here call Diff-Vectors) bring about systematic improvements in the effectiveness of authorship identification tasks, and especially so when training data are scarce (as is often the case in real-life authorship identification scenarios). Our experiments tackle same-author verification, authorship verification, and closed-set authorship attribution; while DVs are naturally geared for solving the 1st, we also provide two novel methods for solving the 2nd and 3rd that use a solver for the 1st as a building block. The code to reproduce our experiments is open-source and available online.
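A minimal sketch of the Diff-Vector construction described above (simplified feature set and weighting, for illustration only): each unordered pair of documents is encoded as the absolute difference of their relative feature frequencies, labelled by whether the two documents share the author.

```python
# Sketch: building Diff-Vectors from document pairs.
import numpy as np
from itertools import combinations
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log",
        "a cat and a dog", "the mat and the log"]
authors = ["A", "A", "B", "B"]

counts = CountVectorizer().fit_transform(docs).toarray().astype(float)
rel_freq = counts / counts.sum(axis=1, keepdims=True)     # relative frequencies

pairs, labels = [], []
for i, j in combinations(range(len(docs)), 2):
    pairs.append(np.abs(rel_freq[i] - rel_freq[j]))       # the Diff-Vector
    labels.append(int(authors[i] == authors[j]))          # same-author label

X, y = np.vstack(pairs), np.array(labels)
print(X.shape, y)  # one vector per unordered pair of documents
```

Because every unordered pair of training documents yields one training example, the number of examples grows quadratically with the number of documents, which is consistent with the abstract's observation that the representation is especially helpful when training data are scarce.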
A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift
Abstract The standard technique for predicting the accuracy that a classifier will have on unseen data (classifier accuracy prediction – CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is often violated in many real-world scenarios. When such violations occur (i.e., in the presence of dataset shift), the estimates returned by CV are unreliable. In this paper we propose a CAP method specifically designed to address prior probability shift (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. By solving a system of n² independent linear equations, with n the number of classes, our method estimates the n² entries of the contingency table of the test data, and thus allows estimating any specific evaluation measure. Since a key step in this method involves predicting the class priors of the test data, we further observe a connection between our method and the field of “learning to quantify”. Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our method tends to outperform existing CAP methods.
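The system of n² equations is not spelled out in the abstract; the sketch below shows one plausible instantiation of the overall idea (an assumption for illustration, not necessarily the paper's exact formulation): combine class-conditional classifier rates measured on held-out training data with test-set priors predicted by a quantifier to obtain the n² expected entries of the test contingency table, from which any evaluation measure can be derived.

```python
# Hedged sketch: estimating the n x n contingency table of the test set under PPS.
import numpy as np

def estimate_contingency(cond_rates: np.ndarray, test_priors: np.ndarray,
                         n_test: int) -> np.ndarray:
    """cond_rates[i, j] ~ P(pred=j | true=i) measured on held-out data;
    test_priors[i] ~ P(true=i) on the test set, predicted by a quantifier.
    Returns expected counts of (true=i, pred=j) among n_test test items."""
    return n_test * test_priors[:, None] * cond_rates

cond_rates = np.array([[0.9, 0.1],    # classifier behavior per true class
                       [0.2, 0.8]])
test_priors = np.array([0.3, 0.7])    # shifted class priors on the test set

table = estimate_contingency(cond_rates, test_priors, n_test=1000)
accuracy = np.trace(table) / table.sum()   # any measure can be read off the table
print(table)
print(accuracy)
```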
The Questio de aqua et terra: A Computational Authorship Verification Study
Abstract The Questio de aqua et terra is a cosmological treatise traditionally attributed to Dante Alighieri. However, the authenticity of this text is controversial, due to discrepancies with Dante’s established works and to the absence of contemporary references. This study investigates the authenticity of the Questio via computational authorship verification (AV), a class of techniques which combine supervised machine learning and stylometry. We build a family of AV systems and assemble a corpus of 330 13th- and 14th-century Latin texts, which we use to comparatively evaluate the AV systems through leave-one-out cross-validation. Our best-performing system achieves high verification accuracy (F1=0.970) despite the heterogeneity of the corpus in terms of textual genre. The key contribution to the accuracy of this system is shown to come from Distributional Random Oversampling (DRO), a technique specially tailored to text classification which is here used for the first time in AV. The application of the AV system to the Questio returns a highly confident prediction concerning its authenticity. These findings contribute to the debate on the authorship of the Questio, and highlight DRO’s potential in the application of AV to cultural heritage.
«Vi fui presente e vidi». Contributi per uno studio dei diari di pellegrinaggio a Roma tra genere letterario e documento storico
Keywords: Pilgrimage, Diaries, Rome, Literature, History
Abstract The document analyzes pilgrimage diaries to Rome, focusing on their value as both a literary genre and historical documents. By examining various texts, including medieval itineraries and travelers’ accounts, the author explores the unique characteristics of these diaries, highlighting how they combine elements of personal narrative with geographical and historical descriptions. The concept of pilgrimage is examined not only as a physical journey but also as a spiritual path, and the importance of religious destinations such as Rome and Jerusalem is discussed. The work raises questions about the nature and evolution of pilgrimage diaries, seeking to understand the motivations and experiences of pilgrims through the centuries.
ABBIE: Attention-Based BI-Encoders for Predicting Where to Split Compound Sanskrit Words
Keywords: Word Segmentation, Sanskrit Language, Sandhi Rule, Bi-Encoders, Attention.
Abstract
Sanskrit is a highly composite language, morphologically and phonetically complex. One of the major challenges in processing Sanskrit is the splitting of compound words that are merged phonetically. Recognizing the exact location of splits in a compound word is difficult since several possible splits can be found, but only a few of them are semantically meaningful. This paper proposes a novel deep learning method that uses two bi-encoders and a multi-head attention module to predict the valid split location in Sanskrit compound words. The two bi-encoders process the input sequence in direct and reverse order respectively. The model learns the character-level context in which the splitting occurs by exploiting the correlation between the direct and reverse dynamics of the character sequence. The results of the proposed model are compared with a state-of-the-art technique that adopts a bidirectional recurrent network to solve the same task. Experimental results show that the proposed model correctly identifies where the compound word should be split into its components in 89.27% of cases, outperforming the state-of-the-art technique. The paper also proposes a dataset developed from the repository of the Digital Corpus of Sanskrit (DCS) and the University of Hyderabad (UoH) corpus.
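The following is an illustrative architecture sketch only: the encoder type, dimensions, and output head are assumptions for exposition, not the authors' exact ABBIE model. It shows the general pattern described in the abstract: two character-level encoders read the word in direct and reverse order, a multi-head attention module correlates the two dynamics, and a per-position head scores candidate split points.

```python
# Hedged architecture sketch of a bi-encoder split-point predictor.
import torch
import torch.nn as nn

class SplitPredictor(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 64, heads: int = 4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.fwd = nn.GRU(dim, dim, batch_first=True)
        self.bwd = nn.GRU(dim, dim, batch_first=True)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, chars: torch.Tensor) -> torch.Tensor:
        x = self.emb(chars)
        h_fwd, _ = self.fwd(x)                        # direct order
        h_bwd, _ = self.bwd(torch.flip(x, dims=[1]))  # reverse order
        h_bwd = torch.flip(h_bwd, dims=[1])           # re-align positions
        fused, _ = self.attn(h_fwd, h_bwd, h_bwd)     # correlate both dynamics
        return self.head(fused).squeeze(-1)           # one split logit per position

model = SplitPredictor(vocab_size=60)
logits = model(torch.randint(0, 60, (2, 12)))         # batch of 2 words of length 12
print(logits.shape)                                    # torch.Size([2, 12])
```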
Benchmarking BERT-based Models for Latin: A Case Study on Biblical References in Ancient Christian Literature
Abstract Transformer-based language models like BERT have revolutionized Natural Language Processing (NLP) research, but their application to historical languages remains underexplored. This paper investigates the adaptation of BERT-based embedding models for Latin, a language central to the study of the sacred texts of Christianity. Focusing on Jerome’s Vulgate, pre-Vulgate Latin translations of the Bible, and patristic commentaries such as Augustine’s De Genesi ad litteram, we address the challenges posed by Latin’s complex syntax, specialized vocabulary, and historical variations at the orthographic, morphological, and semantic levels. In particular, we propose fine-tuning existing BERT-based embedding models on annotated Latin corpora, using self-generated hard negatives to improve performance in detecting biblical references in early Christian literature in Latin. Experimental results demonstrate the ability of BERT-based models to identify citations of and allusions to the Bible(s) in ancient Christian commentaries while highlighting the complexities and challenges of this field. By integrating NLP techniques with humanistic expertise, this work provides a case study on intertextual analysis in Latin patristic works. It underscores the transformative potential of interdisciplinary approaches, advancing computational tools for sacred text studies and bridging the gap between philology and computational analysis.
Data extraction from 3D scanning: post-processing filtering for analytic and informative models of small archaeological finds
Abstract Modern 3D scanners based on the structured-light principle are opening up new possibilities for creating detailed models (polygon populations) with micrometric resolution. Such highly detailed models, in turn, allow specific investigations. This work focuses on 3D scanning and post-processing analysis/filtering of Ancient Near East finds, especially seals and cuneiform clay tablets: fragile artefacts that can hold a lot of semantic information beyond transliteration, e.g., seal impressions (figurative and textual sealings), fingerprint evidence, retracing, and erased text. Behind the ease of use of portable structured-light scanners hides enormous potential for feature extraction and processing. Metric analysis (e.g., deviation analysis) coupled with the application of the MSII (Multi-Scale Integral Invariant) filter enhances data extraction, changing the overall perception of the details of the archaeological artefact.
Medieval Sanctuaries and Miraculous Images and Relics: Tracing the Gaze through Eye Trackers
Abstract This article is part of the research activities of the PNRR ITSERR project, which seeks to apply new digital technologies to religious studies. Specifically focusing on gaze studies, we utilised Aria eye trackers provided by Meta to the team of computer engineers at the University of Modena and Reggio Emilia (Italy), with whom this study is being carried out. These devices can record the gaze of users who wear them, as well as identify the objects or spatial elements being observed, the user’s location, and the duration of their focus. Adopting an interdisciplinary approach, the article explores the application of this technology to Catholic sacred spaces, specifically two sanctuaries in the Tuscan-Emilian Apennines: the Sanctuary of Our Lady of Bismantova (Reggio Emilia) and that of Saints Pellegrino and Bianco in Alpe (Modena). By observing and analysing the gaze patterns of ten users – varying in age, profession, and religious orientation – the study examines how individuals engage with these sacred contexts, with particular attention to the Marian image and the relics of the saints.
The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding
Abstract Recent advancements in large vision-language models have enabled visual object detection in open-vocabulary scenarios, where object classes are defined in free-text formats during inference. In this paper we aim to probe the state-of-the-art methods for open-vocabulary object detection to determine to what extent they understand fine-grained properties of objects and their parts. To this end we introduce an evaluation protocol based on dynamic vocabulary generation to test whether models detect, discern, and assign the correct fine-grained description to objects in the presence of hard-negative classes. We contribute a benchmark suite of increasing difficulty that probes different properties like color, pattern, and material. We further enhance our investigation by evaluating several state-of-the-art open-vocabulary object detectors using the proposed protocol and find that most existing solutions, which shine in standard open-vocabulary benchmarks, struggle to accurately capture and distinguish finer object details. We conclude the paper by highlighting the limitations of current methodologies and exploring promising research directions to overcome the discovered drawbacks. Data and code are available at https://lorebianchi98.github.io/FG-OVD.
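The dynamic-vocabulary idea can be pictured with a toy sketch (hypothetical attribute lists and caption template; this is not the benchmark's actual generation code): hard negatives are created by swapping a single attribute in the positive fine-grained description, and the detector is asked to rank the positive first.

```python
# Toy illustration of dynamic vocabulary generation with hard negatives.
import random

COLORS = ["red", "blue", "green", "black"]

def make_vocabulary(positive: str, attribute: str, n_negatives: int = 3):
    """Return [positive, negative_1, ..., negative_k] for one detection query."""
    alternatives = [c for c in COLORS if c != attribute]
    negatives = [positive.replace(attribute, alt)
                 for alt in random.sample(alternatives, n_negatives)]
    return [positive] + negatives

print(make_vocabulary("a red wooden chair", "red"))
```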
Ubiquity. Il design della comunicazione nel progetto ITSERR
Abstract Within the Italian Strengthening of ESFRI RI Resilience (ITSERR) project, Ubiquity is a research platform developed for detecting literal and non-literal quotations of the Bible and the Quran in later exegetic Greek, Latin, and Arabic commentaries. The objective of Ubiquity’s team, which is made up of humanists, computer scientists, and designers, is to study and visualize data from sacred texts and to interact with them through visual components belonging to analogue and digital infographic systems. This widespread availability of skills for designing material and immaterial artefacts could be a great support for religious studies and scientific research.
μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context
Abstract Regesta are catalogs of summaries of other documents and, in some cases, are the only source of information about the content of such full-length documents. For this reason, they are of great interest to scholars in many social sciences and humanities fields. In this work, we focus on the Regesta Pontificum Romanorum, a large collection of papal registers. Regesta are visually rich documents, where the layout is as important as the text content in conveying the contained information through the structure, and are inherently multi-page documents. Among the Digital Humanities techniques that can help scholars efficiently exploit regesta and other documental sources in the form of scanned documents, Document Parsing has emerged as a task to process document images and convert them into machine-readable structured representations, usually in a markup language. However, current models focus on scientific and business documents, and most of them consider only single-page documents. To overcome this limitation, in this work we propose μgat, an extension of the recently proposed Nougat document parsing architecture, which can handle elements spanning beyond single-page limits. Specifically, we adapt Nougat to process a larger, multi-page context, consisting of the previous and the following page, while parsing the current page. Experimental results, both qualitative and quantitative, demonstrate the effectiveness of our proposed approach also in the case of the challenging Regesta Pontificum Romanorum.
Alfie: Democratising RGBA Image Generation With No $$$
Abstract Designs and artworks are ubiquitous across various creative fields, requiring graphic design skills and dedicated software to create compositions that include many graphical elements, such as logos, icons, symbols, and art scenes, which are integral to visual storytelling. Automating the generation of such visual elements improves graphic designers’ productivity, democratizes and innovates the creative industry, and helps generate more realistic synthetic data for related tasks. These illustration elements are mostly RGBA images with irregular shapes and cutouts, facilitating blending and scene composition. However, most image generation models are incapable of generating such images, and achieving this capability requires expensive computational resources, specific training recipes, or post-processing solutions. In this work, we propose a fully automated approach for obtaining RGBA illustrations by modifying the inference-time behavior of a pre-trained Diffusion Transformer model, exploiting the prompt-guided controllability and visual quality offered by such models with no additional computational cost. We force the generation of entire subjects without sharp croppings, whose background is easily removed for seamless integration into design projects or artistic scenes. We show with a user study that, in most cases, users prefer our solution over generating and then matting an image, and we show that our generated illustrations yield good results when used as inputs for composite scene generation pipelines. The code is publicly released.
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Abstract Diffusion models have become the State-of-the-Art for text-to-image generation, and increasing research effort has been dedicated to adapting the inference process of pretrained diffusion models to achieve zero-shot capabilities. An example is the generation of panorama images, which has been tackled in recent works by combining independent diffusion paths over overlapping latent features, which is referred to as joint diffusion, obtaining perceptually aligned panoramas. However, these methods often yield semantically incoherent outputs and trade off diversity for uniformity. To overcome this limitation, we propose the Merge-Attend-Diffuse operator, which can be plugged into different types of pretrained diffusion models used in a joint diffusion setting to improve the perceptual and semantic coherence of the generated panorama images. Specifically, we merge the diffusion paths, reprogramming self- and cross-attention to operate on the aggregated latent space. Extensive quantitative and qualitative experimental analysis, together with a user study, demonstrate that our method maintains compatibility with the input prompt and visual quality of the generated images while increasing their semantic coherence. We release the code at https://github.com/aimagelab/MAD.
Binarizing Documents by Leveraging both Space and Frequency
Abstract Document Image Binarization is a well-known problem in Document Analysis and Computer Vision, although it is far from being solved. One of the main challenges of this task is that documents generally exhibit degradations and acquisition artifacts that can greatly vary throughout the page. Nonetheless, even when dealing with a local patch of the document, taking into account the overall appearance of a wide portion of the page can ease the prediction by enriching it with semantic information on the ink and background conditions. In this respect, approaches able to model both local and global information have been proven suitable for this task. In particular, recent applications of Vision Transformer (ViT)-based models, able to model short and long-range dependencies via the attention mechanism, have demonstrated their superiority over standard Convolution-based models, which instead struggle to model global dependencies. In this work, we propose an alternative solution based on the recently introduced Fast Fourier Convolutions, which overcomes the limitation of standard convolutions in modeling global information while requiring fewer parameters than ViTs. We validate the effectiveness of our approach via extensive experimental analysis considering different types of degradations.
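As a rough illustration of why Fourier-domain convolutions can model global context with few parameters, the sketch below implements a simplified spectral branch (the full Fast Fourier Convolution also includes a local branch and fusion between the two, which are omitted here). A 1x1 convolution applied to the spectrum gives every output pixel a receptive field covering the whole patch, which is what lets a binarization model use the overall ink and background appearance when predicting a local region.

```python
# Simplified sketch of a spectral (FFT-based) convolution block.
import torch
import torch.nn as nn

class SpectralBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # operate on concatenated real/imaginary parts of the spectrum
        self.conv = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")               # (b, c, h, w//2+1), complex
        spec = torch.cat([spec.real, spec.imag], dim=1)       # (b, 2c, h, w//2+1), real
        spec = torch.relu(self.conv(spec))                    # mixing in the frequency domain
        real, imag = spec.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        return torch.fft.irfft2(spec, s=(h, w), norm="ortho") # back to the pixel domain

out = SpectralBlock(8)(torch.randn(1, 8, 64, 64))
print(out.shape)  # torch.Size([1, 8, 64, 64])
```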
VATr++: Choose Your Words Wisely for Handwritten Text Generation
Abstract Styled Handwritten Text Generation (HTG) has received significant attention in recent years, propelled by the success of learning-based solutions employing GANs, Transformers, and, preliminarily, Diffusion Models. Despite this surge in interest, there remains a critical yet understudied aspect – the impact of the input, both visual and textual, on the HTG model training and its subsequent influence on performance. This work extends the VATr (Pippi et al. 2023) Styled-HTG approach by addressing the pre-processing and training issues that it faces, which are common to many HTG models. In particular, we propose generally applicable strategies for input preparation and training regularization that allow the model to achieve better performance and generalization capabilities. Moreover, in this work, we go beyond performance optimization and address a significant hurdle in HTG research – the lack of a standardized evaluation protocol. In particular, we propose a standardization of the evaluation protocol for HTG and conduct a comprehensive benchmarking of existing approaches. By doing so, we aim to establish a foundation for fair and meaningful comparisons between HTG strategies, fostering progress in the field.
Dreams, Texts, and Truths: Augustine on Hermeneutics and Oneirocriticism
Keywords: Augustine, dreams, oneirocriticism, oneirology, Bible, biblical exegesis, allegory, Tertullian, Origen, Philo of Alexandria, Passio Perpetuae et Felicitatis, early Christian literature, Artemidorus of Daldis
Abstract In the Greek and Roman worlds, oneirocriticism is hermeneutics and presupposes an epistemology – these and other cognate fields of inquiry being involved in a continuous process of social, political, and religious change. The present paper explores the relationship between dreams and hermeneutics in a meaningful passage of Augustine’s twelve-book commentary On the Literal Meaning of Genesis (De Genesi ad litteram) – a work rightly considered the most important testimony to the Christian cosmology of antiquity and the Middle Ages – in which the greatest of the Latin Church Fathers establishes a parallel between the interpretation of dreams and that of sacred texts. By elucidating the cultural background of Augustine’s understanding of dream images as cognitive phenomena that underlie both crucial passages of the Bible and the common experience of humans – both the soul and the body, both natural and supernatural powers – this paper sheds new light upon Augustine’s reaction to the materialism and literalism of Tertullian and early Christian communities, his reception of the allegorical method of Origen and the Alexandrian school, and his mystical embracing of Neoplatonic theories of knowledge. Indeed, Augustine turns out to be perfectly aware of many Greco-Roman and early Christian debates on oneirology and hermeneutical methods, and while he fiercely warns against the belief that the revelation of the Bible can be superseded or contradicted by the individual revelations of dreams, he strives to put together an original paradigm of natural philosophy, cognitive psychology, and symbolic interpretation, in an attempt to give dreams a definite place in the order of things.
Aëriae Animae: Souls and Elements from the Roman Cosmos to the Christian Afterworld
Keywords: Augustine, early Christian psychology, soul, body, four elements, M. Terentius Varro, Antiochus of Ascalon, Middle Platonism, Neoplatonism, Bible, Stoicism, demonology, philosophy of nature, theology
Abstract It has been widely recognized that until the fourth century AD Christians discussed freely about the source and the nature of the soul – the cases of Origen and Tertullian being emblematic of this situation in the East and in the West, respectively. It was only in the fourth century AD – after the so-called conversion of Constantine, with the Church’s increasing entanglement with political and social power and the emergence of a new generation of Platonizing intellectuals from the ranks of the upper class – that Christian bishops and theologians inaugurated a new discourse on the soul, its transcendent origin, immaterial constitution, and immortal destiny, which entailed the banishment and repression of earlier alternative visions. In the present paper, I shall be exploring an episode in this crucial historical transition, which, though limited in scope, can shed light upon the long-standing interactions between Greco-Roman theories of matter, elements, and principles, on the one hand, and Christian ideas of the soul and the afterworld, on the other. I am going to focus on the treatise On the City of God (De Civitate Dei) by Augustine of Hippo, who is usually regarded as one of the most decisive and influential figures in what can be called the Neoplatonic turn of fourth-century AD Christian eschatology. It is too often forgotten that throughout his long engagement with the issue of the nature and origin of the soul Augustine maintained an agnostic position, which is faithfully mirrored in all his writings. Indeed, I shall attempt to show that Augustine’s troubled reflection on the soul – on what he repeatedly terms as the ‘extremely obscure question of the soul’ (obscurissimam de anima quaestionem) – includes a meaningful dialogue with Book 16 of Varro’s Divine Antiquities (Antiquitates Rerum Divinarum) and its theory that the four elements of the cosmos host four different kinds of souls. I will investigate the philosophical pedigree of Varro’s cosmological-cum-psychological doctrine, with its recognizable mixture of Platonic and Stoic notions, arguing that Varro’s teacher, the Middle Platonist philosopher Antiochus of Ascalon, is its most likely source. However, far from restricting myself to an exercise in Quellenforschung, I shall claim that the Varronian theory reported in Book 7 of Augustine’s City of God should be read in light of Augustine’s sustained reception of the Platonic tradition in Book 8 of the same work, where the view that the body of demons is made up of air is endorsed by Augustine and attests to his serious pondering of the role of the natural elements in the emergence of a creature’s essence.
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Abstract The conventional training approach for image captioning involves pre-training a network using teacher forcing and subsequent fine-tuning with Self-Critical Sequence Training to maximize hand-crafted captioning metrics. However, when attempting to optimize modern and higher-quality metrics like CLIP-Score and PAC-Score, this training method often encounters instability and fails to acquire the genuine descriptive capabilities needed to produce fluent and informative captions. In this paper, we propose a new training paradigm termed Direct CLIP-Based Optimization (DiCO). Our approach jointly learns and optimizes a reward model that is distilled from a learnable captioning evaluator with high human correlation. This is done by solving a weighted classification problem directly inside the captioner. At the same time, DiCO prevents divergence from the original model, ensuring that fluency is maintained. DiCO not only exhibits improved stability and enhanced quality in the generated captions but also aligns more closely with human preferences compared to existing methods, especially in modern metrics. Additionally, it maintains competitive performance in traditional metrics.
The Revolution of Multimodal Large Language Models: A Survey
Abstract Connecting text and visual modalities plays an essential role in generative intelligence. For this reason, inspired by the success of large language models, significant research efforts are being devoted to the development of Multimodal Large Language Models (MLLMs). These models can seamlessly integrate visual and textual modalities, while providing a dialogue-based interface and instruction-following capabilities. In this paper, we provide a comprehensive review of recent visual-based MLLMs, analyzing their architectural choices, multimodal alignment strategies, and training techniques. We also conduct a detailed analysis of these models across a wide range of tasks, including visual grounding, image generation and editing, visual understanding, and domain-specific applications. Additionally, we compile and describe training datasets and evaluation benchmarks, conducting comparisons among existing models in terms of performance and computational requirements. Overall, this survey offers a comprehensive overview of the current state of the art, laying the groundwork for future MLLMs.
Pixels of Faith: Exploiting Visual Saliency to Detect Religious Image Manipulation
Keywords: Gaze-assisted AI, Human Attention, Deepfake Detection, Religious Studies
Abstract The proliferation of generative models has revolutionized various aspects of daily life, bringing both opportunities and challenges. This paper tackles a critical problem in the field of religious studies: the automatic detection of partially manipulated religious images. We address the discrepancy between human and algorithmic capabilities in identifying fake images, particularly those visually obvious to humans but challenging for current algorithms. Our study introduces a new testing dataset for religious imagery and incorporates human-derived saliency maps to guide deep learning models toward perceptually relevant regions for fake detection. Experiments demonstrate that integrating visual attention information into the training process significantly improves model performance, even with limited eye-tracking data. This human-in-the-loop approach represents a significant advancement in deepfake detection, particularly for preserving the integrity of religious and cultural content. This work contributes to the development of more robust and human-aligned deepfake detection systems, addressing critical challenges in the era of widespread generative AI technologies.
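One possible way to inject human-derived saliency maps into training is sketched below; this is given as an assumption for illustration and is not necessarily the paper's exact formulation. The saliency map re-weights a pixel-wise detection loss so that regions humans fixated on when spotting the manipulation contribute more to the gradient.

```python
# Hedged sketch: saliency-weighted loss for manipulation localization.
import torch
import torch.nn.functional as F

def saliency_weighted_loss(pred_map, target_map, saliency, alpha=1.0):
    """pred_map/target_map: (B,1,H,W) manipulation masks (logits / binary);
    saliency: (B,1,H,W) human fixation density in [0,1]."""
    per_pixel = F.binary_cross_entropy_with_logits(pred_map, target_map, reduction="none")
    weights = 1.0 + alpha * saliency          # emphasize fixated regions
    return (weights * per_pixel).mean()

loss = saliency_weighted_loss(torch.randn(2, 1, 64, 64),
                              torch.randint(0, 2, (2, 1, 64, 64)).float(),
                              torch.rand(2, 1, 64, 64))
print(loss.item())
```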
Isometric Sets of Words and Generalizations of the Fibonacci Cubes
Abstract The hypercube Q_n is a graph whose 2^n vertices can be associated with all binary words of length n in such a way that adjacent vertices receive words that differ in exactly one symbol. Given a word f, the subgraph Q_n(f) is defined by selecting all vertices not containing f as a factor. A word f is said to be isometric if Q_n(f) is an isometric subgraph of Q_n, i.e., it preserves the distances between the remaining vertices. The graphs Q_n(f) were defined and studied as a generalization of the Fibonacci cubes Q_n(11). Isometric words have been completely characterized using combinatorial methods for strings.
We introduce the notion of isometric sets of words with the aim of capturing further interesting cases in the scenario of isometric subgraphs of the hypercubes. We prove some combinatorial properties and study special interesting cases.
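For small dimensions the defining property can be checked directly, as in the illustrative sketch below: Q_n(f) is an isometric subgraph of Q_n exactly when the graph distance between any two f-avoiding words equals their Hamming distance.

```python
# Brute-force isometry check of Q_n(f) inside the hypercube Q_n (small n only).
from itertools import product
from collections import deque

def is_isometric(f: str, n: int) -> bool:
    nodes = []
    for bits in product("01", repeat=n):
        w = "".join(bits)
        if f not in w:                       # keep only vertices avoiding f as a factor
            nodes.append(w)
    node_set = set(nodes)

    def neighbors(w):
        for i in range(n):
            v = w[:i] + ("1" if w[i] == "0" else "0") + w[i + 1:]
            if v in node_set:
                yield v

    def bfs_dist(src, dst):
        seen, queue = {src}, deque([(src, 0)])
        while queue:
            w, d = queue.popleft()
            if w == dst:
                return d
            for v in neighbors(w):
                if v not in seen:
                    seen.add(v)
                    queue.append((v, d + 1))
        return None                          # unreachable inside Q_n(f)

    hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
    return all(bfs_dist(u, v) == hamming(u, v)
               for i, u in enumerate(nodes) for v in nodes[i + 1:])

print(is_isometric("11", 5))    # Fibonacci cube Q_5(11): expected True
print(is_isometric("010", 5))   # "010" has a 2-error overlap (prefix "01" vs. suffix "10")
```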
Density of Ham- and Lee- non-isometric k-ary Words
Keywords: Isometric words, Overlap with errors, Hamming and Lee distance, Density
Abstract Isometric k-ary words have been defined with reference to the Hamming and the Lee distances. A word is non-isometric if and only if it has a prefix at distance 2 from the suffix of the same length; such a prefix is called a 2-error overlap. The limit density of isometric binary words based on the Hamming distance has been evaluated by Klavžar and Shpectorov, obtaining that about 8% of all binary words are isometric. In this paper, the issue is addressed for k-ary words with reference to the Hamming and the Lee distances. Actually, the only meaningful case of Lee-isometric k-ary words is when k = 4. It is proved that, as the length of the words increases, the limit density of quaternary Ham-isometric words is around 17%, while the limit density of quaternary Lee-isometric words is even bigger, about 30%. The results are obtained using combinatorial methods and algorithms for counting the number of k-ary isometric words.
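The characterization stated in the abstract lends itself to a direct brute-force count for small word lengths (Hamming case only; the Lee case uses a different distance and is not covered by this sketch):

```python
# Sketch: a word is Ham-non-isometric iff some proper prefix is at Hamming
# distance exactly 2 from the suffix of the same length (a "2-error overlap").
from itertools import product

def has_2_error_overlap(w: str) -> bool:
    n = len(w)
    for l in range(1, n):                          # proper prefix/suffix of length l
        dist = sum(a != b for a, b in zip(w[:l], w[n - l:]))
        if dist == 2:
            return True
    return False

def isometric_density(k: int, length: int) -> float:
    alphabet = "0123456789"[:k]
    words = ["".join(w) for w in product(alphabet, repeat=length)]
    isometric = [w for w in words if not has_2_error_overlap(w)]
    return len(isometric) / len(words)

for k in (2, 4):
    print(k, isometric_density(k, length=8))       # rough finite-length densities
```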
Using large language models to create narrative events
Abstract Narratives play a crucial role in human communication, serving as a means to convey experiences, perspectives, and meanings across various domains. They are particularly significant in scientific communities, where narratives are often utilized to explain complex phenomena and share knowledge. This article explores the possibility of integrating large language models (LLMs) into a workflow that, exploiting the Semantic Web technologies, transforms raw textual data gathered by scientific communities into narratives. In particular, we focus on using LLMs to automatically create narrative events, maintaining the reliability of the generated texts. The study provides a conceptual definition of narrative events and evaluates the performance of different smaller LLMs compared to the requirements we identified. A key aspect of the experiment is the emphasis on maintaining the integrity of the original narratives in the LLM outputs, as experts often review texts produced by scientific communities to ensure their accuracy and reliability. We first perform an evaluation on a corpus of five narratives and then on a larger dataset comprising 124 narratives. LLaMA 2 is identified as the most suitable model for generating narrative events that closely align with the input texts, demonstrating its ability to generate high-quality narrative events. Prompt engineering techniques are then employed to enhance the performance of the selected model, leading to further improvements in the quality of the generated texts.
Sensitive Topics Retrieval in Digital Libraries: A Case Study of ḥadīṯ collections
Abstract The advent of Large Language Models (LLMs) has led to the development of new Question-Answering (QA) systems based on Retrieval-Augmented Generation (RAG) to incorporate query-specific knowledge at inference time. In this paper, the trustworthiness of RAG systems is investigated, particularly focusing on the performance of their retrieval phase when dealing with sensitive topics. This issue is particularly relevant as it could hinder a user’s ability to analyze sections of the available corpora, effectively biasing any subsequent research. To mimic a specialised library possibly containing sensitive topics, a ḥadīṯ dataset has been curated using an ad-hoc framework called Question-Classify-Retrieve (QCR), which automatically assesses the performance of document retrieval by operating in three main steps: Question Generation, Passage Classification, and Passage Retrieval. Different sentence embedding models for document retrieval were tested, showing a significant performance gap between sensitive and non-sensitive topics compared to the baseline. In real-world applications this would mean relevant documents placed lower in the retrieval list, leading to the presence of irrelevant information or the absence of relevant ones in the case of a lower cut-off.
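The retrieval-gap measurement behind this finding can be illustrated with a small sketch (assumed data, not the QCR framework itself): for each question, record the rank at which the relevant passage is retrieved under cosine similarity, then compare rank distributions between sensitive and non-sensitive topics.

```python
# Illustrative sketch: rank of the relevant passage under cosine-similarity retrieval.
import numpy as np

def rank_of_relevant(query_emb, passage_embs, relevant_idx):
    sims = passage_embs @ query_emb / (
        np.linalg.norm(passage_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    order = np.argsort(-sims)                                  # best match first
    return int(np.where(order == relevant_idx)[0][0]) + 1      # 1-based rank

rng = np.random.default_rng(0)
passages = rng.normal(size=(100, 384))     # stand-in sentence embeddings
query, relevant = rng.normal(size=384), 42
print(rank_of_relevant(query, passages, relevant))
```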
Text-to-SQL with Large Language Models: Exploring the Promise and Pitfalls
Keywords: Large Language Models, Text-to-SQL, Relational Databases, SQL
Abstract The emergence of Large Language Models (LLMs) represents a fundamental change in the ever-evolving field of natural language processing (NLP). Over the past few years, the enhanced capabilities of these models have led to their widespread use across various fields, in both practical applications and research contexts. In particular, as data science intersects with LLMs, new research opportunities and insights emerge, notably in translating text into Structured Query Language (Text-to-SQL). The application of this technology to such a task poses a unique set of opportunities and related issues that have significant implications for information retrieval. This discussion paper delves into these intricacies and limitations, focusing on challenges that jeopardise efficacy and reliability. This research investigates scalability, accuracy, and the concerning issue of hallucinated responses, questioning the trustworthiness of LLMs. Furthermore, we point out the limits of the current usage of test datasets created for research purposes in capturing real-world complexities. Finally, we consider the performance of Text-to-SQL with LLMs from different perspectives. Our investigation identifies the key challenges faced by LLMs and proposes viable solutions to facilitate the exploitation of these models to advance data retrieval, bridging the gap between academic research and real-world application scenarios.
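One of the reliability concerns raised above, hallucinated queries, is often mitigated by grounding the prompt in the schema and validating the generated SQL before trusting it. The sketch below is a minimal, hedged illustration of that pattern; `generate_sql` is a placeholder for any LLM call, not a real API, and the schema is a toy example.

```python
# Hedged sketch: schema-grounded Text-to-SQL with validation on an in-memory SQLite DB.
import sqlite3

SCHEMA = "CREATE TABLE books(id INTEGER, title TEXT, year INTEGER);"

def build_prompt(question: str) -> str:
    return f"Schema:\n{SCHEMA}\nQuestion: {question}\nSQL:"

def is_valid_sql(query: str) -> bool:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {query}")   # parse and plan without running
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

def generate_sql(prompt: str) -> str:
    # placeholder for an actual LLM call; returns a canned query for illustration
    return "SELECT title FROM books WHERE year > 2000;"

print(is_valid_sql(generate_sql(build_prompt("Which books were published after 2000?"))))
```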
Automatic Lemmatization of Old Church Slavonic Language Using A Novel Dictionary-Based Approach
Abstract Old Church Slavonic (OCS) is an ancient language that poses unique challenges for natural language processing. Currently, there is a lack of Python libraries devised for the analysis of OCS texts. This research not only fills a crucial gap in the computational treatment of the OCS language but also produces valuable resources for scholars in historical linguistics, cultural studies, and the humanities, supporting further research in the field of ancient language processing. The main contribution of this work is the development of an algorithm for the lemmatization of OCS texts based on a learned dictionary. The approach can deal with ancient languages without the need for prior linguistic knowledge. Using a dataset of more than 330K OCS words and their corresponding lemmas, the approach integrates the algorithm and the dictionary efficiently to achieve accurate lemmatization on test data.
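The paper's algorithm is not reproduced here; the sketch below only illustrates the general dictionary-plus-fallback strategy that a learned-dictionary lemmatizer typically follows, with a handful of toy entries: exact lookups are served from the dictionary, and unseen forms fall back to suffix-rewriting rules induced from the dictionary itself.

```python
# Hedged sketch of dictionary-based lemmatization with a suffix-rule fallback.
from collections import Counter

lemma_dict = {"словесе": "слово", "словомъ": "слово", "богомъ": "богъ"}  # toy entries

# Induce (word-suffix -> lemma-suffix) replacements from the dictionary.
suffix_rules = Counter()
for form, lemma in lemma_dict.items():
    common = 0
    for a, b in zip(form, lemma):
        if a != b:
            break
        common += 1
    suffix_rules[(form[common:], lemma[common:])] += 1

def lemmatize(word: str) -> str:
    if word in lemma_dict:                         # exact dictionary hit
        return lemma_dict[word]
    for (src, dst), _ in suffix_rules.most_common():
        if src and word.endswith(src):             # fallback: suffix rewriting
            return word[: -len(src)] + dst
    return word                                    # unknown form: return unchanged

print(lemmatize("словомъ"))   # dictionary hit
print(lemmatize("духомъ"))    # handled by the rule fallback
```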
Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images
Abstract Creating high-quality and realistic images is now possible thanks to the impressive advancements in image generation. A description in natural language of your desired output is all you need to obtain breathtaking results. However, as the use of generative models grows, so do concerns about the propagation of malicious content and misinformation. Consequently, the research community is actively working on the development of novel fake detection techniques, primarily focusing on low-level features and possible fingerprints left by generative models during the image generation process. In a different vein, in our work we leverage human semantic knowledge and investigate whether it can be incorporated into fake image detection frameworks. To achieve this, we collect a novel dataset of partially manipulated images using diffusion models and conduct an eye-tracking experiment to record the eye movements of different observers while viewing real and fake stimuli. A preliminary statistical analysis is conducted to explore the distinctive patterns in how humans perceive genuine and altered images. Statistical findings reveal that, when perceiving counterfeit samples, humans tend to focus on more confined regions of the image, in contrast to the more dispersed observational pattern observed when viewing genuine images.
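One simple way to quantify how "confined" a viewer's gaze is, and to test whether the difference between real and fake stimuli is significant, is sketched below. The dispersion measure and the test are assumptions for illustration, not necessarily the paper's exact analysis, and the fixation data are synthetic.

```python
# Illustrative sketch: gaze dispersion per stimulus and a two-sample comparison.
import numpy as np
from scipy.stats import ttest_ind

def gaze_dispersion(fixations: np.ndarray) -> float:
    """fixations: (N, 2) array of (x, y) fixation coordinates on one stimulus.
    Returns the mean distance of fixations from their centroid."""
    return float(np.linalg.norm(fixations - fixations.mean(axis=0), axis=1).mean())

rng = np.random.default_rng(1)
real = [gaze_dispersion(rng.normal(0, 50, (30, 2))) for _ in range(20)]  # synthetic data
fake = [gaze_dispersion(rng.normal(0, 35, (30, 2))) for _ in range(20)]
print(ttest_ind(real, fake))   # is the difference in dispersion significant?
```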
RoBERT2VecTM: A Novel Approach for Topic Extraction in Islamic Studies
Abstract Investigating “Hadith” texts, crucial for theological studies and Islamic jurisprudence, presents challenges due to the linguistic complexity of Arabic, such as its complex morphology. In this paper, we propose an innovative approach to address the challenges of topic modeling in Hadith studies by utilizing the Contextualized Topic Model (CTM). Our study introduces RoBERT2VecTM, a novel neural-based approach that combines the RoBERTa transformer model with Doc2Vec, specifically targeting the semantic analysis of “Matn” (the actual content). The methodology outperforms many traditional state-of-the-art NLP models by generating more coherent and diverse Arabic topics. The diversity of the generated topics allows for further categorization, deepening the understanding of discussed concepts. Notably, our research highlights the critical impact of lemmatization and stopwords in enhancing topic modeling. This breakthrough marks a significant stride in applying NLP to non-Latin languages and opens new avenues for the nuanced analysis of complex religious texts.
The Impact of Generative AI on Islamic Studies: Case Analysis of “Digital Muhammad ibn Isma’il Al-Bukharī”
Abstract The emergence of large language models (LLMs) such as ChatGPT, LLaMA, Gemini, and Claude has transformed natural language processing (NLP) tasks by demonstrating remarkable capabilities in generating fluent and contextually appropriate responses. This paper examines the current state of LLMs, their applications, inherent challenges, and potential future directions necessitating multidisciplinary collaboration. A key focus is the application of generative AI in Islamic studies, particularly in managing sensitive content such as the Ahadith (corpus of sayings, actions, and approvals attributed to the Prophet Muḥammad). We detail the customization and refinement of the AI model, “Digital Muḥammad ibn Ismail Al-Bukhari,” designed to provide accurate responses based on the Sahih Al-Bukhari collection. Our methodology includes rigorous dataset curation, preprocessing, model customization, and evaluation to ensure the model’s reliability. Strategies to mitigate hallucinations involve implementing context-aware constraints, regular audits, and continuous feedback loops to maintain adherence to authoritative texts and correct biases. Findings indicate a significant reduction in hallucinations, though challenges such as residual biases and handling ambiguous queries persist. This research underscores the importance of recognizing LLMs’ limitations and highlights the need for collaborative efforts in fine-tuning these models with authoritative texts. It offers a framework for the cautious application of generative AI in Islamic studies, emphasizing continuous improvements to enhance AI reliability.
A Novel Methodology for Topic Identification in Hadith
Keywords: Topic Modeling, Hadith, Neural Topic Model
Abstract
In this paper, we present our preliminary work on developing a novel neural-based approach named RoBERT2VecTM, aimed at identifying topics within the “Matn” of “Hadith”. This approach focuses on semantic analysis, showing potential to outperform current state-of-the-art models. Despite the availability of various models for topic identification, many struggle with multilingual datasets. Furthermore, some models have limitations in discerning deep semantic meanings, as they are not trained for languages such as Arabic. Considering the sensitive nature of Hadith texts, where topics are often complexly interleaved, careful handling is imperative. We anticipate that RoBERT2VecTM will offer substantial improvements in understanding contextual relationships within texts, a crucial aspect for accurately identifying topics in such intricate religious documents.
Trends, Applications, and Challenges in Human Attention Modelling
Keywords: Humans and AI: General, Humans and AI: HAI: Applications
Abstract Human attention modelling has proven, in recent years, to be particularly useful not only for understanding the cognitive processes underlying visual exploration, but also for providing support to artificial intelligence models that aim to solve problems in various domains, including image and video processing, vision-and-language applications, and language modelling. This survey offers a reasoned overview of recent efforts to integrate human attention mechanisms into contemporary deep learning models and discusses future research directions and challenges. For a comprehensive overview of the ongoing research, refer to our dedicated repository available at https://github.com/aimagelab/awesome-human-visual-attention.
«Verranno giorni…» nel “Vangelo di Luca”: l’influenza di “Geremia LXX” sulle profezie di Gesù riguardanti la distruzione di Gerusalemme
Keywords: Gospel of Luke, Septuagint, Jeremiah, Lamentations, Destruction of Jerusalem, Prophetic, Language and Literature, Intertextuality
Abstract This study investigates the prophecies of Jesus on the destruction of Jerusalem and the Temple as they appear in the Gospel of Luke (13:34–35, and especially 19:41–44; 21:5–6, 20–24; 23:28–31) in light of their intertextual relationship with passages or texts from Scripture. The analysis focuses on how certain terms or expressions of the prophetic language of Jeremiah, and to a lesser extent of Lamentations, are borrowed through the Septuagint version (e.g., ἡμέραι ἔρχονται), recombined, and modified by Luke. This research, however, is not only lexical and comparative but also enters the exegetical field. It explores the reasons for and meaning of the use of LXX Jeremiah in these particular passages of the Gospel of Luke, where Jesus himself is speaking in the midst of the impending catastrophe.
Intertestualità tra Bibbie e antichi commentari cristiani: l’esempio di simul nel De Genesi ad litteram di Agostino
Keywords: Intertextuality; Biblical Quotations; Augustine; De Genesi ad litteram; Genesis (OT Book); Patristic Exegesis
Abstract This contribution presents a case study that, on the basis of some occurrences of the adverb simul in Augustine’s De Genesi ad litteram, allows us to illustrate the classification system we adopt to map the intertextual relationships between known Greek and Latin versions of the Bible and some patristic texts. This taxonomy has been set up within the framework of two research projects, joined together within the European research infrastructure for Religious Studies “RESILIENCE-RI”. After a methodological introduction based on the state of the art, the workflow is explained and, finally, the concrete example of the adverb simul is shown, focusing on the use of some passages from Genesis 1 and Sirach 18:1 in Augustine’s commentary.
Digital Dark Ages: The Role of Medieval Corpora in the Context of the Digital Humanities and Religious Studies
Keywords: Middle Ages, Digital Humanities, Religious Studies
Abstract In recent years, the debate on the role and methodologies of the digital humanities has seen considerable development, including in the specific – but disciplinarily vast – domain of Religious Studies. Even if the debate is recent, its premises rest on epistemological questions and assumptions whose history is important to outline. In this context, a great contribution can be provided by research conducted on medieval textual corpora. Through the study of some cases, from Roberto Busa’s Index Thomisticus up to ongoing research projects, this contribution presents some trends and specificities of the analysis and publication of medieval sources in the digital environment. The aim is to discuss the innovations and limits of this research field and its possible contribution to the ongoing debate on digital religious studies.
Moving beyond the Content: 3D Scanning and Post-Processing Analysis of the Cuneiform Tablets of the Turin Collection
Keywords: 3D scanning; cuneiform tablets; digital imaging; fingerprints; MSII; sealings
Abstract
This work focuses on how 3D scanning methodologies and post-processing analyses may support a deeper investigation of cuneiform tablets beyond their written content. The dataset proposed herein is a key part of the archaeological collection preserved in the Musei Reali of Turin, Italy; these archaeological artefacts enclose further important semantic information that can be extracted through detailed 3D documentation and 3D model filtering. In fact, this scanning process is a fundamental tool for a better reading of sealing impressions beneath the cuneiform text, as well as for understanding micrometric evidence of the fingerprints of scribes. Most of the seal impressions were made before the writing (like a watermark) and are thus not detectable to the naked eye, due to the cuneiform signs above them as well as the state of preservation. In this regard, 3D scanning and post-processing analysis can help in the study of these nearly invisible features impressed on the tablets. For this reason, this work also addresses how 3D analyses may support the identification of unperceived and almost invisible features concealed in clay tablets. Analysis of fingerprints and of the depths of the signs can tell us about the workers’ strategies and the people behind the artefacts. Three-dimensional models generated inside the Artec 3D ecosystem, via the Space Spider scanner and Artec Studio software, were further investigated by applying specific filters and shaders. Digital light manipulation can reveal, through the dynamic displacement of light and shadows, particular details that can be analysed in depth with specific post-processing operations: for example, the MSII (multi-scale integral invariant) filter is a powerful tool for revealing hidden and unperceived features such as fingerprints and sealing impressions (stratigraphically below the cuneiform signs). Finally, the collected data will be handled in a twofold way: in an open-access repository and through a common data environment (CDE) to aid the data exchange process for project collaborators and common users.
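As a rough, hedged illustration of multi-scale surface filtering in the spirit of MSII (not the MSII implementation used in the paper), the sketch below computes integral mean-curvature responses at several radii on a scanned mesh and maps them to vertex colors; the file names and radii are hypothetical.

```python
# Multi-scale surface analysis sketch on a scanned tablet mesh (illustrative only).

import numpy as np
import trimesh

mesh = trimesh.load("cuneiform_tablet.ply")   # hypothetical scan exported from Artec Studio

# Integral-invariant-like signal: mean curvature measured inside balls of increasing
# radius; small radii respond to fingerprints and shallow sealings, larger radii to
# the cuneiform wedges themselves.
radii = [0.2, 0.5, 1.0]                        # in mesh units (e.g. mm), assumed
signals = np.stack([
    trimesh.curvature.discrete_mean_curvature_measure(mesh, mesh.vertices, r)
    for r in radii
])

# Combine the scales and map the response to vertex colors for visual inspection.
response = signals.std(axis=0)
mesh.visual.vertex_colors = trimesh.visual.interpolate(response, color_map="viridis")
mesh.export("tablet_feature_map.ply")
```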
Call for Applications for ITSERR TNA Fellowships 2024-2025
ITSERR launches its first call for applications for the Transnational Access (TNA) programme!
Foreign scholars (European and non-European) and PhD students – individuals or teams – working in the field of Religious Studies or in the field of IT for Digital Humanities can apply for a funded research fellowship to visit the special libraries, collections, archives and IT laboratories of the ITSERR TNA network of hosts.
ABOUT
ITSERR is the Italian project aimed at enhancing the RESILIENCE Research Infrastructure, whose high interdisciplinarity is intended to meet the evolving needs of the Religious Studies scientific community by enriching the diversity, quality and innovation of knowledge in this specific domain. The project is developed thanks to the collaboration between humanists (experts in History, Philology and Christian, Islamic and Jewish Studies, but also archaeologists working on topics and sites related to the domain of Religious Studies) and IT engineers.
Through its Transnational Access Programme, ITSERR offers support and expertise for research stays at leading Italian universities and research centres holding relevant library and archival collections for Religious Studies, together with outstanding IT laboratories for advancing the development of Digital Humanities.
European and non-European scholars and PhD students are invited to apply for a funded research fellowship to visit the special libraries, collections, archives and IT laboratories of ITSERR TNA hosts.
The call will be open from June 30th to August 19th, 2024.
HOW DOES ITSERR TNA WORK?
The funded ITSERR TNA programme foresees both ingoing and outgoing mobility.
Foreign scholars (European and non-European) and PhD students – individuals or teams – working in the field of Religious Studies or in the field of IT for DH can apply for funded TNA within the nodes of the ITSERR network (ingoing mobility).
Scholars and PhD students from institutions of the ITSERR consortium can apply for funded TNA within the framework of the RESILIENCE TNA calls to stay at one of the TNA Hosts of the RESILIENCE RI (outgoing mobility). More detailed information on the next RESILIENCE TNA Call for Applications can be found here: https://www.resilience-ri.eu/cfa-tna/
Funded TNA for research teams is a specific feature of the ITSERR project, as it ensures collaboration and supports the development of the research led by each team. Please note that this opportunity is dedicated exclusively to ingoing mobility within the Italian network of ITSERR hosts.
Please note that this first call for applications is dedicated exclusively to ingoing mobility.
ITSERR TNA fellowships are funded by NextGenerationEU – National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) and awarded on a competitive basis, according to specific eligibility criteria (see below).
WHAT ITSERR TNA OFFERS
Beyond effective access, both physical and virtual, to the unique collections, IT laboratories and expertise of leading Italian universities and research institutions, ITSERR TNA covers the following costs:
Economy flights
Accommodation close to the host institution (when full-board accommodation is booked, the cost of daily subsistence is also covered)
Free access to collections and services
The funded TNA foresees the following durations for the research stay at ITSERR TNA hosts:
one or two weeks, according to the kind of research conducted and the instruments to be used;
up to one month for individual fellows or teams participating in the technical development of the project and in the making of its services.
SELECTION AND ELIGIBILITY CRITERIA
To benefit from the support of funded ITSERR TNA, applicants are requested to submit proposals that will be evaluated by the Peer Review Committee based on the following selection criteria:
Quality of the research conducted by the applicant
Scientific merit of the proposed project
Feasibility of the proposed project
Personal statement
Proposals for accessing our programme are eligible if they meet the following criteria:
Transnationality: to promote transnational research, TNA fellows must be affiliated with institutions in a country other than that of the TNA site
All TNA fellows must ensure open access to publications arising from their TNA stay
Balance of the heterogeneity of applicants according to age, gender, provenance and research/professional status
Team applications: interdisciplinarity and heterogeneity in the composition of the team will be encouraged
To apply for funded ITSERR TNA, it is necessary to follow the procedure described at the bottom of this web page.
Please note that you will need an ORCID iD. If you don’t already have one, please go to orcid.org to obtain your own in a few clicks.
ITSERR TNA HOSTS
ITSERR is a consortium of 5 partners including 4 Italian Universities and the National Research Council (CNR).
All these centres and libraries grant access to their collections, documents and rare books, as well as to their virtual sources and to IT laboratories of excellence. Within the consortium, each partner contributes, through specific Institutes and Departments, to the development of the activities of the project's Work Packages (WPs). To discover more about each WP, see our website: https://www.itserr.it/index.php/itserr-its-work-packages/
Italian National Research Council (CNR)
The Italian National Research Council (CNR) is the largest public research institution in Italy, performing multidisciplinary activities of excellence dedicated to knowledge. As the leading member of the consortium, the CNR is involved in all WPs through the Institute “A. Faedo” (Pisa), particularly in WP2, WP3, WP4, WP5, WP7, WP8 and WP12 of the ITSERR project.
Institute of Information Science and Technologies (ISTI) “A. Faedo” – Pisa
The Institute of Information Science and Technologies (ISTI) is an institute located in the CNR Research Area of Pisa, committed to scientific excellence and to playing an active role in technology transfer. Here a crucial role is played by the Laboratory of Artificial Intelligence for Media and Humanities, which investigates the advances in the state of the art in the Artificial Intelligence field, specifically addressing applications to digital media and digital humanities, taking into account issues related to scalability.
UNIMORE is a public university, a primary venue for free research and education and a place for the development and critical processing of knowledge. Moreover, it offers facilities for internationally sponsored research activities through its libraries, collections and laboratories.
Two departments of this prestigious university are involved in the ITSERR project through WP4, WP5, WP6, WP8 and WP12.
This Department is characterized by a wide transversality of its disciplines, on the model of the most prestigious international Schools of Education, through the advancement and dissemination of scientific knowledge. Its structure is open to new challenges in the field of education, cooperating with the main organizations at local, regional, national and international level.
This Department represents a prestigious centre of research whose activities cover all the scientific and fundamental areas of Engineering. In particular, some of the Department's activities are dedicated to Computer Vision and Multimedia and to the analysis of digital images of manuscripts and other sources of cultural heritage.
UniPa is the fifth largest university in Italy, covering all fields of study and fostering an interdisciplinary approach. With its 16 Departments, 21 libraries and a museum system, together with a campus close to the city centre, UniPa offers many facilities to researchers.
As one of the leading members of the consortium, UniPa is involved in all WPs; in particular, five of its Departments take part in the research led by WP3, WP4, WP5, WP6, WP7 and WP8 of the ITSERR project.
The Department is focused on the need to broaden research pathways in the humanistic field and to identify areas of intervention by strengthening the synergy between different disciplinary traditions. Its research areas are both “classical” (archaeology, antiquity sciences, art history) and ethno-anthropological (cultural anthropology, ethnology, ethnomusicology), with a strong connection to the local territory and to the processes of conservation and enhancement of Cultural Heritage.
The Department of Law (DiGi) is one of the Departments of Excellence of UniPa. The DiGi hosts approximately 120 structured research projects, whose fields extend to all areas of legal study, including bordering areas (political economy, sociology) whose study is crucial for legal disciplines.
The Department has developed research on text analysis, data-driven AI, machine learning, deep learning, natural language processing, interaction design, human-computer interaction, combinatorics on words, algorithms on strings, and data structures and algorithms.
The Department of Engineering is focused on the implementation of inter- and multi-disciplinary research, responding to the challenges posed by complex real-life scenarios by efficiently leveraging the multi-disciplinary expertise of its staff.
The Department has a long tradition of studies in Industrial Design and on the theories and methods of Design, including techniques and tools for the representation of its functional and formal characteristics. Great attention is given to the development of professional competences.
UniOr is the home of research teams working in the fields of antiquity studies, literature, philosophy, pedagogy, psychology, law, economics and social sciences, with highly relevant libraries and collections.
This Department of excellence is engaged in both education and research activities. Moreover, its specialist libraries, based at the sixteenth-century Palazzo Corigliano in Piazza San Domenico, boast truly precious collections, including the ancient books of the original Chinese College. This Department is involved in WP10 of the ITSERR project.
UniTo is one of the largest Italian universities, open to international research and training. There are 22 libraries spread over 32 locations, together with 2 university museums. UniTo is also connected to the network of local museums, whose subjects range from ancient Egypt to contemporary art.
Department of Historical Studies
This Department promotes an interdisciplinary approach to learning and research, being ranked in the top 5% of national departmental structures. Teaching and research explore a wide range of areas of historical studies, from archaeology to the history of religions, with a global reach covering the ancient Middle East, the Mediterranean area, the origins of Europe, the Americas, Asia and Africa. The Department is involved in WP9 of the ITSERR project.
ITSERR TNA fellows should aim to publish the results of their research within a clear timeline, preferably in open-access ISI- or SCOPUS-refereed journals with substantial academic impact. Moreover, the support of the EU and of the National Recovery and Resilience Plan (PNRR), as well as the use of ITSERR services, must be clearly acknowledged in the resulting academic publications.
WHO CAN APPLY?
European and non-European scholars and PhD students whose research is conducted within the field of Religious Studies or within the field of IT, especially for the development of software and tools for Digital Humanities, can apply for a fellowship regardless of their academic status.
APPLICATIONS
The ITSERR TNA programme foresees the launch of three calls for proposals, on:
June 30th, 2024
December 10th, 2024
June 10th, 2025
The application process for the first call for proposals closed on August 19th, 2024.
Please note that all applications will be processed after September 1st, 2024.
HOW TO APPLY
All documents (application form and a 2-page CV in the Europass template, combined into a single PDF file) must be submitted to tna@itserr.it
To apply to our calls for proposals, please download the application form:
Abstract Isometric words combine the notion of edit distance with properties of words not appearing as factors in other words. An edit distance is a metric between words that quantifies how two words differ by counting the number of edit operations needed to transform one word into the other. A word f is said to be isometric with respect to an edit distance if, for any pair of f-free words u and v, there exists a transformation of minimal length from u into v via the related edit operations such that all the intermediate words are also f-free. The adjective “isometric” comes from the fact that, if the Hamming distance is considered (i.e., only replacement operations are used), then isometric words are connected with the definition of isometric subgraphs of hypercubes. We discuss known results, some interesting generalizations and open problems.
Hypercubes and Isometric Words Based on Swap and Mismatch Distance
Keywords: Swap and mismatch distance, Isometric words, Hypercube
Abstract The hypercube of dimension n is the graph whose vertices are the 2^n binary words of length n, with an edge between two of them if they have Hamming distance 1. We consider an edit distance based on swaps and mismatches, to which we refer as tilde-distance, and define the tilde-hypercube with edges linking words at tilde-distance 1. Then, we introduce and study some isometric subgraphs of the tilde-hypercube obtained by using special words called tilde-isometric words. The subgraphs keep only the vertices that avoid a given tilde-isometric word as a factor. An infinite family of tilde-isometric words is described; they are isometric with respect to the tilde-distance, but not to the Hamming distance. In the case of the word 11, the subgraph is called the tilde-Fibonacci cube, as a generalization of the classical Fibonacci cube. The tilde-hypercube and the tilde-Fibonacci cube can be recursively defined, and the same holds for the number of their edges. This allows an asymptotic estimation of the number of edges in the tilde-Fibonacci cube, in comparison to the total number in the tilde-hypercube.
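For readers who want to experiment, the sketch below builds small tilde-hypercubes and the tilde-Fibonacci cube under one common formulation of the swap-and-mismatch distance (a mismatch costs 1, swapping two adjacent distinct symbols costs 1); the exact definitions in the paper may differ in detail.

```python
# Tilde-hypercube construction sketch (illustrative formulation of the tilde-distance).

from itertools import combinations, product
from typing import Optional

def tilde_distance(u: str, v: str) -> int:
    """Swap+mismatch edit distance between equal-length binary words (simple DP)."""
    assert len(u) == len(v)
    d = [0] * (len(u) + 1)
    for i in range(1, len(u) + 1):
        d[i] = d[i - 1] + (u[i - 1] != v[i - 1])          # mismatch at position i
        if (i >= 2 and u[i - 1] == v[i - 2] and u[i - 2] == v[i - 1]
                and u[i - 1] != u[i - 2]):
            d[i] = min(d[i], d[i - 2] + 1)                # swap of adjacent distinct symbols
    return d[len(u)]

def tilde_hypercube_edges(n: int, avoid: Optional[str] = None):
    """Edges of the tilde-hypercube of dimension n; optionally keep only vertices
    avoiding `avoid` as a factor (e.g. '11' for the tilde-Fibonacci cube)."""
    words = ["".join(w) for w in product("01", repeat=n)]
    if avoid is not None:
        words = [w for w in words if avoid not in w]
    return [(u, v) for u, v in combinations(words, 2) if tilde_distance(u, v) == 1]

# Edge counts for small dimensions: full tilde-hypercube vs. tilde-Fibonacci cube.
for n in range(1, 6):
    print(n, len(tilde_hypercube_edges(n)), len(tilde_hypercube_edges(n, avoid="11")))
```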
Isometric Words Based on Swap and Mismatch Distance
Keywords: Swap and mismatch distance, Isometric words, Overlap with errors
Abstract An edit distance is a metric between words that quantifies how two words differ by counting the number of edit operations needed to transform one word into the other. A word f is said to be isometric with respect to an edit distance if, for any pair of f-free words u and v, there exists a transformation of minimal length from u to v via the related edit operations such that all the intermediate words are also f-free. The adjective “isometric” comes from the fact that, if the Hamming distance is considered (i.e., only mismatches), then isometric words define some isometric subgraphs of hypercubes. We consider the case of the edit distance with swaps and mismatches. We compare it with the case of mismatches only and prove some properties of isometric words that are related to particular features of their overlaps.
Measuring fairness under unawareness of sensitive attributes: A quantification-based approach
Keywords: Algorithms, Models, Decision Making, Group Fairness, Demographic Attributes, Data Minimisation, Privacy, Fairness Measurement, Sensitive Attributes, Quantification, Supervised Learning, Prevalence Estimates, Distribution Shifts, Demographic Parity, Classifier Fairness
Abstract Algorithms and models are increasingly deployed to inform decisions about people, inevitably affecting their lives. As a consequence, those in charge of developing these models must carefully evaluate their impact on different groups of people and favour group fairness, that is, ensure that groups determined by sensitive demographic attributes, such as race or sex, are not treated unjustly. To achieve this goal, the availability (awareness) of these demographic attributes to those evaluating the impact of these models is fundamental. Unfortunately, collecting and storing these attributes is often in conflict with industry practices and legislation on data minimisation and privacy. For this reason, it can be hard to measure the group fairness of trained models, even from within the companies developing them. In this work, we tackle the problem of measuring group fairness under unawareness of sensitive attributes, by using techniques from quantification, a supervised learning task concerned with directly providing group-level prevalence estimates (rather than individual-level class labels). We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem, as they are robust to inevitable distribution shifts while at the same time decoupling the (desirable) objective of measuring group fairness from the (undesirable) side effect of allowing the inference of sensitive attributes of individuals. More in detail, we show that fairness under unawareness can be cast as a quantification problem and solved with proven methods from the quantification literature. We show that these methods outperform previous approaches to measure demographic parity in five experimental protocols, corresponding to important challenges that complicate the estimation of classifier fairness under unawareness.
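As a concrete, hedged example of the quantification machinery involved, the snippet below implements Adjusted Classify & Count, a standard prevalence estimator that corrects the raw positive-prediction rate with the classifier's validation TPR and FPR; it illustrates the general idea rather than the paper's exact protocols.

```python
# Adjusted Classify & Count (ACC) sketch: estimate class prevalence in an
# unlabelled set from a classifier's binary predictions.

import numpy as np

def acc_prevalence(pred_test: np.ndarray,
                   pred_val: np.ndarray,
                   y_val: np.ndarray) -> float:
    """Estimate the prevalence of the positive class in an unlabelled test set."""
    cc = pred_test.mean()                         # raw classify-and-count estimate
    tpr = pred_val[y_val == 1].mean()             # validation true positive rate
    fpr = pred_val[y_val == 0].mean()             # validation false positive rate
    if tpr == fpr:                                # degenerate classifier: fall back to CC
        return float(cc)
    return float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))
```

The group-level prevalence estimates produced this way can then be compared across demographic groups without ever assigning a sensitive attribute to any individual, which is the decoupling the abstract describes.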
Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri
Abstract Recent advancements in Digital Document Restoration (DDR) have led to significant breakthroughs in analyzing highly damaged written artifacts. Among those, there has been an increasing interest in applying Artificial Intelligence techniques for virtually unwrapping and automatically detecting ink on the Herculaneum papyri collection. This collection consists of carbonized scrolls and fragments of documents, which have been digitized via X-ray tomography to allow the development of ad-hoc deep learning-based DDR solutions. In this work, we propose a modification of the Fast Fourier Convolution operator for volumetric data and apply it in a segmentation architecture for ink detection on the challenging Herculaneum papyri, demonstrating its suitability via deep experimental analysis. To encourage the research on this task and the application of the proposed operator to other tasks involving volumetric data, we will release our implementation (https://github.com/aimagelab/vffc).
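The released code is linked above; as an independent, hedged sketch of the general idea, the block below applies a pointwise convolution in the 3D Fourier domain of a feature volume, which is the core mechanism of a volumetric spectral (Fourier) convolution. It is not the authors' operator, only an illustration.

```python
# Volumetric spectral-convolution block sketch (illustrative, PyTorch).

import torch
import torch.nn as nn

class SpectralBlock3D(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Pointwise conv mixes channels in the frequency domain (real+imag stacked).
        self.freq_conv = nn.Conv3d(channels * 2, channels * 2, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:       # x: (B, C, D, H, W)
        size = x.shape[-3:]
        spec = torch.fft.rfftn(x, dim=(-3, -2, -1), norm="ortho")
        spec = torch.cat([spec.real, spec.imag], dim=1)        # (B, 2C, D, H, W//2+1)
        spec = self.act(self.freq_conv(spec))
        real, imag = spec.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        out = torch.fft.irfftn(spec, s=size, dim=(-3, -2, -1), norm="ortho")
        return out + x                                         # residual connection

# Toy X-ray sub-volume: batch 1, 8 feature channels, 16x64x64 voxels.
volume = torch.randn(1, 8, 16, 64, 64)
print(SpectralBlock3D(8)(volume).shape)
```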
How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning
Work Package: All ITSERR WPs using Artificial Intelligence
Keywords: Document synthesis, Historical document analysis, Handwriting recognition, Synthetic data
Abstract Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on both modern and historical manuscripts in large benchmark datasets. Nonetheless, those models struggle to obtain the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting. This issue is very relevant for valuable but small collections of documents preserved in historical archives, for which obtaining sufficient annotated training data is costly or, in some cases, unfeasible. To overcome this challenge, a possible solution is to pretrain HTR models on large datasets and then fine-tune them on small single-author collections. In this paper, we take into account large, real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model. Through extensive experimental analysis, also considering the amount of fine-tuning lines, we give a quantitative indication of the most relevant characteristics of such data for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as little as five real fine-tuning lines.
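A minimal sketch of the pretrain-then-fine-tune recipe is shown below: a pretrained CTC-based HTR model is adapted to a single writer with a handful of transcribed lines and a small learning rate. The model, dataset and hyperparameters are illustrative assumptions rather than the paper's setup.

```python
# Single-writer fine-tuning sketch for a pretrained CTC-based HTR model.

import torch
from torch.utils.data import DataLoader, Dataset

def finetune_single_writer(pretrained_model: torch.nn.Module,
                           few_lines: Dataset,
                           epochs: int = 50,
                           lr: float = 1e-5) -> torch.nn.Module:
    """Adapt a pretrained HTR model to one writer using a handful of line images."""
    loader = DataLoader(few_lines, batch_size=2, shuffle=True)          # e.g. 5 real lines
    optimizer = torch.optim.AdamW(pretrained_model.parameters(), lr=lr)  # small lr limits forgetting
    ctc = torch.nn.CTCLoss(blank=0, zero_infinity=True)
    pretrained_model.train()
    for _ in range(epochs):
        for images, targets, target_lengths in loader:
            log_probs = pretrained_model(images)        # expected shape (T, B, num_classes)
            input_lengths = torch.full((images.size(0),), log_probs.size(0), dtype=torch.long)
            loss = ctc(log_probs, targets, input_lengths, target_lengths)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return pretrained_model
```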
Handwritten Text Generation from Visual Archetypes
Abstract Generating synthetic images of handwritten text in a writer-specific style is a challenging task, especially in the case of unseen styles and new words, and even more when these latter contain characters that are rarely encountered during training. While emulating a writer’s style has been recently addressed by generative models, the generalization towards rare characters has been disregarded. In this work, we devise a Transformer-based model for Few-Shot styled handwritten text generation and focus on obtaining a robust and informative representation of both the text and the style. In particular, we propose a novel representation of the textual content as a sequence of dense vectors obtained from images of symbols written as standard GNU Unifont glyphs, which can be considered their visual archetypes. This strategy is more suitable for generating characters that, despite having been seen rarely during training, possibly share visual details with the frequently observed ones. As for the style, we obtain a robust representation of unseen writers’ calligraphy by exploiting specific pre-training on a large synthetic dataset. Quantitative and qualitative results demonstrate the effectiveness of our proposal in generating words in unseen styles and with rare characters more faithfully than existing approaches relying on independent one-hot encodings of the characters.
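As a hedged illustration of the content representation, the sketch below renders each character as a GNU Unifont glyph image and uses the flattened pixels as a dense "visual archetype" vector; the font path and glyph size are assumptions.

```python
# Visual-archetype content representation sketch: characters as glyph-image vectors.

import numpy as np
from PIL import Image, ImageDraw, ImageFont

def text_to_archetypes(text: str,
                       font_path: str = "unifont.ttf",   # hypothetical local copy of GNU Unifont
                       size: int = 16) -> np.ndarray:
    """Return a (len(text), size*size) array of flattened glyph images in [0, 1]."""
    font = ImageFont.truetype(font_path, size)
    vectors = []
    for ch in text:
        canvas = Image.new("L", (size, size), color=0)
        ImageDraw.Draw(canvas).text((0, 0), ch, fill=255, font=font)
        vectors.append(np.asarray(canvas, dtype=np.float32).ravel() / 255.0)
    return np.stack(vectors)

# Rare characters (accented letters, non-Latin glyphs) still yield meaningful vectors,
# because they share strokes with frequently observed glyphs.
print(text_to_archetypes("żal").shape)   # (3, 256)
```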
Bridging Islamic Knowledge and AI: Inquiring ChatGPT on Possible Categorizations for an Islamic Digital Library (full paper)
Keywords: Libraries and Archives in CH, Digital Libraries and Religious Archives, ChatGPT, Islamic studies, Arabic script languages, Islamic knowledge classification, Islamic subjects
Abstract This research evaluates the capabilities of ChatGPT in assisting with the categorization of an Islamic digital library, exploiting incremental Machine Learning and Transfer Learning techniques. Noticeably, ChatGPT showcased a remarkable familiarity with Islamic knowledge, evident in its ability to classify subjects hierarchically based on their importance, from Qur’anic Studies to Modern Islamic Thought. The library aimed to cater to a diverse Arabic Islamic audience, with collections sourced from varied digital donations. Despite ChatGPT’s commendable proficiency, several challenges arose, with interpretability, generalization, and the hallucination issue standing out as the most critical obstacles.
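A minimal, hedged sketch of the kind of prompting involved is given below: a chat model is asked to place a work within a small set of Islamic subject categories. The model name and any categories beyond those cited in the abstract are assumptions, not the study's protocol.

```python
# Subject-categorization prompting sketch (illustrative only).

from openai import OpenAI

# Qur'anic Studies and Modern Islamic Thought come from the abstract; the others
# are illustrative placeholders.
CATEGORIES = ["Qur'anic Studies", "Hadith Studies", "Fiqh", "Modern Islamic Thought"]

def categorize(title: str, description: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    prompt = (
        "You are helping catalogue an Arabic Islamic digital library. "
        f"Choose the single most appropriate category among {CATEGORIES} "
        "for the following work and answer with the category name only.\n"
        f"Title: {title}\nDescription: {description}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",                 # assumed model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()
```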
Knowledge extraction, management and long-term preservation of non-Latin cultural heritages – Digital Maktaba project presentation
Keywords: Cultural heritages, Non-Latin alphabets, Knowledge extraction, Machine Learning, Natural Language Processing, Big data management, Long-term preservation, Big data integration, Named Entity Recognition
Abstract The services provided by today’s cutting-edge digital library systems may benefit from new technologies that can improve cataloguing efficiency and cultural heritage preservation and accessibility. Below, we introduce the recently started Digital Maktaba (DM) project, which proposes a new model for knowledge extraction and semi-automatic cataloguing in the context of digital libraries that contain documents in non-Latin scripts (e.g., Arabic). Since DM involves a large amount of unorganized data from several sources, particular emphasis will be placed on topics such as big data integration, big data analysis and long-term preservation. This project aims to create an innovative workflow for the automatic extraction of information and metadata and for a semi-automated cataloguing process, by exploiting Machine Learning, Natural Language Processing, Artificial Intelligence and data management techniques to provide a system capable of speeding up, enhancing and supporting the librarian’s work. We also report on some promising results obtained through a preliminary proof-of-concept experimentation. (Short paper, discussion paper)
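As a hedged illustration of one stage of such a workflow, the sketch below runs Arabic OCR on a title page and applies naive metadata heuristics; Tesseract is used here as a stand-in for the project's combination of recognition engines, and the heuristics are illustrative only.

```python
# Title-page OCR and naive metadata extraction sketch (illustrative only).

import re
import pytesseract
from PIL import Image

def extract_title_page_metadata(image_path: str) -> dict:
    # Requires the Tesseract 'ara' language pack; Eastern Arabic numerals are not handled here.
    text = pytesseract.image_to_string(Image.open(image_path), lang="ara")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Naive heuristics: longest of the first lines as a title guess, a 4-digit year if present.
    title = max(lines[:5], key=len) if lines else ""
    year = re.search(r"\b1[0-9]{3}\b", text)
    return {
        "raw_text": text,
        "title_guess": title,
        "year_guess": year.group(0) if year else None,
    }
```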
Knowledge Extraction and Cross-Language Data Integration in Digital Libraries
Keywords: Data Integration, Cross-Language Record Linkage, Knowledge Extraction, Long-term Preservation
Abstract Digital Humanities (DH) is an interdisciplinary field that has grown rapidly in recent years, requiring the creation of an efficient and uniform platform capable of managing various types of data in several languages. This paper presents the research objectives and methodologies of my PhD project: the creation of a novel framework for Knowledge Extraction and Multilingual Data Integration in the context of digital libraries in non-Latin languages, in particular Arabic, Persian and Azerbaijani. The research began with the Digital Maktaba (DM) project and continued within the PNRR ITSERR infrastructure, in which the DBGroup participates. The project aims to develop a two-component framework consisting of a Knowledge Extraction Subsystem and a Data Integration Subsystem. The case study is based on the DM project, which seeks to create a flexible and efficient digital library for preserving and analyzing multicultural heritage documents by exploiting available and ad-hoc created datasets, Explainable Machine Learning, Natural Language Processing (NLP) technologies and Data Integration approaches. Key challenges and future developments in Knowledge Extraction and Data Integration are examined, which involve leveraging the MOMIS system for Data Integration tasks and adopting a microservices-based architecture for the effective implementation of the system. The goal is to provide a versatile platform for organizing and integrating various data sources and languages, thereby fostering a more inclusive and accessible global perspective on cultural and historical artefacts and encouraging collaboration in building an expanding knowledge base.
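As a hedged sketch of cross-language record linkage (independent of the MOMIS-based integration mentioned above), the snippet below embeds bibliographic records with a multilingual encoder and matches them by cosine similarity; the model name and the threshold are assumptions.

```python
# Cross-language record-linkage sketch via multilingual sentence embeddings.

from sentence_transformers import SentenceTransformer, util

def link_records(records_a, records_b, threshold: float = 0.8):
    """Return (index_a, best_index_b, similarity) for matches above the threshold."""
    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    emb_a = model.encode(records_a, convert_to_tensor=True, normalize_embeddings=True)
    emb_b = model.encode(records_b, convert_to_tensor=True, normalize_embeddings=True)
    sim = util.cos_sim(emb_a, emb_b)                 # (len(a), len(b)) similarity matrix
    return [
        (i, int(sim[i].argmax()), float(sim[i].max()))
        for i in range(len(records_a))
        if float(sim[i].max()) >= threshold
    ]
```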
A tool for semiautomatic cataloguing of an islamic digital library: a use case from the Digital Maktaba project (short paper)
Keywords: Cultural heritage, Digital Library, Islamic sciences, Arabic script OCR, Information extraction, Output alignment, Page layout analysis, Semiautomatic cataloguing, Software tool usage demo.
Abstract Digital Maktaba (DM) is an interdisciplinary project to create a digital library of texts in non-Latin alphabets (Arabic, Persian, Azerbaijani). The dataset is made available by the digital library heritage of the “La Pira” library on the history and doctrines of Islam, based in Palermo, which is the hub of the Foundation for Religious Sciences (FSCIRE, Bologna). Establishing protocols for the creation, maintenance and cataloguing of historical content in non-Latin alphabets is the long-term goal of DM. The first step of this project was to create an innovative workflow for the automatic extraction of information and metadata from the title pages of Arabic-script texts. The Optical Character Recognition (OCR) tool uses various recognition systems, text processing techniques and corpora in order to provide accurate extraction and metadata of document content. In this paper we address the ongoing development of this novel tool and, for the first time, we present a demo of the current version designed for the extraction and cataloguing process, showing a use case on an Arabic book frontispiece. In particular, we delve into the details of the tool workflow for automatically converting and uploading PDFs from the digital library, for the automatic extraction of cataloguing metadata, and for the (currently semiautomatic) cataloguing process. We also briefly discuss future prospects and the many additional features that we are planning to develop.
Novel Perspectives for the Management of Multilingual and Multialphabetic Heritages through Automatic Knowledge Extraction: The DigitalMaktaba Approach
Keywords: digital libraries; minority languages; humanistic informatics; computer archiving; intercultural communication
Abstract The linguistic and social impact of multiculturalism can no longer be neglected in any sector, creating the urgent need for systems and procedures for managing and sharing cultural heritages in both supranational and multi-literate contexts. In order to achieve this goal, text sensing appears to be one of the most crucial research areas. The long-term objective of the DigitalMaktaba project, born from an interdisciplinary collaboration between computer scientists, historians, librarians, engineers and linguists, is to establish procedures for the creation, management and cataloguing of archival heritage in non-Latin alphabets. In this paper, we discuss the currently ongoing design of an innovative workflow and tool in the area of text sensing, for the automatic extraction of knowledge and cataloguing of documents written in non-Latin languages (Arabic, Persian and Azerbaijani). The current prototype leverages different OCR, text processing and information extraction techniques in order to provide both a highly accurate extracted text and rich metadata content (including automatically identified cataloguing metadata), overcoming typical limitations of current state-of-the-art approaches. The initial tests provide promising results. The paper includes a discussion of future steps (e.g., AI-based techniques further leveraging the extracted data/metadata and making the system learn from user feedback) and of the many foreseen advantages of this research, both from a technical and a broader cultural-preservation and sharing point of view.
Structured-Light Scanning and Metrological Analysis for Archaeology: Quality Assessment of Artec 3D Solutions for Cuneiform Tablets
Abstract This paper deals with a metrological and qualitative evaluation of two Artec 3D structured-light scanners: Micro and Space Spider. As part of a larger European project called ITSERR, these scanners are tested on the reconstruction of small archaeological artefacts, in particular cuneiform tablets of different dimensions. For this reason, Micro and Space Spider are compared across the entire workflow, from preparatory work to post-processing. In this context, three replica cuneiform tablets serve as examples on which the Artec scanners have to prove their worth. Metric analyses based on distance maps, RMSE calculations and density analyses are carried out to understand the metrological differences between these tools. The creation of 3D models of cuneiform tablets is the first step in developing a virtual environment suitable for sharing the archaeological collection with collaborators and other users. The inclusion of semantic information through specific ontologies will be the next step in this important project.
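As a hedged illustration of the metric comparison described above, the sketch below computes a cloud-to-cloud distance map and its RMSE between two already registered scans of the same tablet; the file names are hypothetical.

```python
# Cloud-to-cloud distance map and RMSE sketch between two registered scans.

import numpy as np
import trimesh
from scipy.spatial import cKDTree

scan_a = trimesh.load("tablet_micro.ply")    # hypothetical Micro scan
scan_b = trimesh.load("tablet_spider.ply")   # hypothetical Space Spider scan

# Distance from every vertex of scan A to its nearest neighbour on scan B
# (assumes both scans are already aligned in the same reference frame).
tree = cKDTree(scan_b.vertices)
distances, _ = tree.query(scan_a.vertices)

rmse = float(np.sqrt(np.mean(distances ** 2)))
print(f"mean distance: {distances.mean():.4f}  RMSE: {rmse:.4f}  max: {distances.max():.4f}")
```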
Preserving and conserving culture: first steps towards a knowledge extractor and cataloguer for multilingual and multi-alphabetic heritages
Abstract Managing and sharing cultural heritages also in supranational and multi-literate contexts is a very hot research topic. In this paper we discuss the research we are conducting in the DigitalMaktaba project, presenting the first steps for designing an innovative workflow and tool for the automatic extraction of knowledge from documents written in multiple non-Latin languages (Arabic, Persian and Azerbaijani languages). The tool leverages different OCR, text processing techniques and linguistic corpora in order to provide both a highly accurate extracted text and a rich metadata content, overcoming typical limitations of current state-of-the-art systems; this will enable in the near future the development of an automatic cataloguer which we hope will ultimately help in better preserving and conserving culture in such a demanding scenario.