The medieval period, spanning roughly the 9th to the 15th century, produced thousands of artistic writings that are now fondly referred to as “manuscripts”.
A medieval manuscript is a codex (pl. codices), meaning a book made of pages bound between two boards. Ancient scribes, by contrast, wrote on scrolls that were stored in boxes.
These ancient scrolls survive only in occasional fragments, as a scroll is especially vulnerable to physical degradation. The pages of codices, on the other hand, are protected by their covers and have a much greater chance of survival. Thus, medieval books survive in large numbers: more books survive from the Middle Ages than works in any other artistic medium.
Scholars refer to the handmade books of the Middle Ages as manuscripts. Books that contain artistic decoration are called illuminated manuscripts. Manuscripts that survive from the European Middle Ages are generally religious books that reflect the canon, doctrine and practices of Christianity, though Jewish and Muslim books and other types of books survive from this period as well.
As the growing number and variety of research papers reporting scientific analyses of medieval, African and Renaissance manuscripts show, the role of analytical methods as indispensable tools for the comprehensive study of manuscripts is no longer in question. The complex mathematical methods used by Calatroni et al. for the digital ‘restoration’ and interpretation of manuscripts, together with the additional multimedia content linked to their published article, serve to bring cross-disciplinary research on medieval manuscripts fully into the digital era.
With a growing number of analytical methods at our disposal, and the promise of more numerous and increasingly sensitive ones to come, the integration of scientific analysis into the study of illuminated manuscripts seems destined to become standard practice.
Optical Character Recognition (OCR), one of the earliest areas of artificial intelligence research, converts written text on a page (typed, handwritten, or printed) into machine-readable text data for further processing. With this technology, scanned copies of delicate Western medieval manuscripts are first passed through computer vision techniques that detect characters one by one.
Afterwards, an image classification algorithm is used to group clusters of characters and classify them into meaningful words and sentences.
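The grouping step can be illustrated with a minimal Python sketch. It is not the classification algorithm itself: it swaps in a simple spacing heuristic that splits a line of detected characters wherever the gap between neighbouring boxes is wide. The CharBox type, the group_into_words function, the gap_factor threshold and the sample coordinates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CharBox:
    """One detected character on a single text line (hypothetical type)."""
    char: str     # the recognised character
    x: float      # left edge of its bounding box, in pixels
    width: float  # width of its bounding box, in pixels


def group_into_words(boxes, gap_factor=0.6):
    """Split a line of character boxes into words at wide horizontal gaps.

    A gap wider than gap_factor times the mean character width is treated
    as a word boundary. The threshold is an assumption, not a tuned value.
    """
    if not boxes:
        return []
    ordered = sorted(boxes, key=lambda b: b.x)
    mean_width = sum(b.width for b in ordered) / len(ordered)
    words = [ordered[0].char]
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > gap_factor * mean_width:
            words.append(cur.char)   # wide gap: start a new word
        else:
            words[-1] += cur.char    # narrow gap: extend the current word
    return words


# Made-up coordinates for the two words "in deo".
line = [
    CharBox("i", 0, 4), CharBox("n", 5, 8),
    CharBox("d", 25, 8), CharBox("e", 34, 8), CharBox("o", 43, 8),
]
print(group_into_words(line))
```

A real manuscript page would also need line segmentation and tolerance for the uneven spacing of handwriting, which this sketch leaves out.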
To correct mistakes made in the character recognition phase, a Natural Language Processing (NLP) algorithm is then applied, using probabilistic approaches to estimate the correct words and sentences even where characters were missed.
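One common way to frame this kind of probabilistic correction is a noisy-channel model, sketched below in Python under stated assumptions. Each candidate word is scored by how frequent it is in a reference vocabulary and by how many character edits separate it from the OCR output. The tiny Latin word list, the error_rate value and the function names are illustrative assumptions, not part of any particular OCR system.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def correct(token, counts, error_rate=0.05, max_distance=2):
    """Return the most probable vocabulary word for an OCR token.

    Score = log P(word) + distance * log(error_rate), so each extra edit
    makes a candidate less likely. Tokens with no candidate within
    max_distance edits are returned unchanged.
    """
    total = sum(counts.values())
    best, best_score = token, -math.inf
    for word, n in counts.items():
        d = edit_distance(token, word)
        if d > max_distance:
            continue
        score = math.log(n / total) + d * math.log(error_rate)
        if score > best_score:
            best, best_score = word, score
    return best


# Toy vocabulary with made-up frequencies; a real system would use a
# large corpus of transcribed text in the manuscript's language.
vocab = Counter("dominus deus dominus domus deus dominus".split())
print(correct("domnus", vocab))  # a character was missed by the OCR step
```

Here "domnus" is one edit away from both "dominus" and "domus", and the more frequent "dominus" wins. Production systems usually also take the neighbouring words into account rather than scoring each word in isolation.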
Furthermore, deep learning algorithms can be used to improve the performance of OCR. A model is trained on sample images to recognise text in different fonts and font sizes, on coloured backgrounds, and in skewed or misaligned documents, as is the case with numerous medieval manuscripts, so that it reads text with greater accuracy and detects and corrects errors at the same time.
Here are some repositories for digitised medieval manuscripts for you to explore:
Universiteit Leiden – Digital Manuscripts in Classrooms
Biblissima – IIIF Collections of Manuscripts and Rare Books
Calatroni L, d’Autume M, Hocking R, Panayotova S, Parisotto S, Ricciardi P, Schönlieb C-B. Unveiling the invisible: mathematical methods for restoring and interpreting illuminated manuscripts. Herit Sci. 2018;6:56. doi.org/10.1186/s40494-018-0216-z