Innovation Award for AI-Based Processing of János Arany's Official Documents
One of the first and most significant achievements of the National Laboratory for Digital Heritage (DH-LAB) is the development of a handwriting recognition model that has made it possible to search János Arany's official documents. The project team received the Social Innovation Award from the Ministry of Culture and Innovation for this groundbreaking development.

DH-LAB's development, recognized with the Social Innovation Award, makes János Arany's official documents, held in the Library of the Hungarian Academy of Sciences, searchable. Thanks to this innovation, the invaluable collection is now accessible to both researchers and the general public. The project participants received the award from the Ministry of Culture and Innovation on November 13.
At the event, Balázs Balogh, Director General of the HUN-REN Research Centre for the Humanities (HUN-REN BTK), accepted the award on behalf of the institution from Deputy Minister Róbert Zsigó. Key contributors to the winning development included HUN-REN BTK staff members Gábor Palkó, Zsófia Fellegi, and Barbara Bobák, as well as Norbert Fekete and István Szekrényes from the Laboratory team. Alongside the DH-LAB project, this year’s other award recipient was the Open-Air Ethnographic Museum's dementia program.
DH-LAB is developing methodologies for applying AI tools optimized for the Hungarian language in public collections. It does so in collaboration with the Institute for Literary Studies at HUN-REN BTK, the University of Miskolc, the Department of Digital Humanities at ELTE BTK TI, and students from the Department of Artificial Intelligence at ELTE IK. The work follows the principles of open science and runs on the partners' own hardware infrastructure. One of the first and most significant outcomes of this joint effort was the handwriting recognition model that earned the innovation award.

Results and Impact of the Handwriting Recognition Project
The development is innovative because it is the first large-scale, AI-based handwriting recognition initiative in Hungary, carried out by Hungarian digital humanities experts and AI specialists using local expertise and computational tools. The project produced a general handwriting recognition model that public collections can use freely. As a result, an almost unlimited number of 19th-century Hungarian handwritten documents can be processed in the future, including texts that have so far not been fully integrated into the nation's cultural heritage.
The developed methodology could serve as a foundation for further innovations in various market sectors, such as AI-based corporate document processing. The technology is being integrated into the workflows of outstanding research infrastructures with NKFIH certification, including the DH-LAB-QULTO joint research infrastructure and the HUN-REN BTK's EtnoLab project.
"In the first decades of the 21st century, two closely related and parallel trends can be observed in the fields of culture and science. On one hand, artificial intelligence (AI) is transforming and replacing traditional cultural practices to an extent that was previously unimaginable. On the other hand, due to both the digitization of cultural heritage and the vast volume of digitally created materials, databases and data networks are being generated on an unprecedented scale," said Gábor Palkó, Project Leader and Senior Research Fellow at the HUN-REN BTK Institute for Literary Studies.
"In the discourse on digital heritage, handwritten manuscripts - the 'real' manuscripts - are being overshadowed by printed or digitally created materials that are easier to process and publish. These manuscripts cannot be made searchable with general models that fail to consider the specific characteristics of a given document group. This issue is particularly pronounced for languages like Hungarian, where AI tools are less effective compared to major world languages spoken by hundreds of millions. As a result, handwritten Hungarian documents are significantly underrepresented in the realm of digital cultural heritage," the researcher added. "One of the primary tasks of the Digital Heritage National Laboratory project is to address these challenges," he emphasized.
The award-winning project will also be showcased to the public at the Science Expo, a Hungarian scientific exhibition held at the Museum of Fine Arts from November 21 to 23.

Arany's Official Documents and Their Significance
The renowned Hungarian writer and poet János Arany became a regular member of the Hungarian Academy of Sciences (MTA) in 1859, where he later served as Secretary General. His administrative work in this role was of immense importance: he defined and established the operational framework of the Academy, transforming it into one of Hungary's most significant scientific institutions. The corpus of his official documents is substantially larger than previously known, encompassing approximately 9,200 documents, equivalent to about 30,000 manuscript photographs.
To train their Handwritten Text Recognition (HTR) model, the experts compiled a corpus that includes 200 pages of János Arany's handwriting, along with texts by his secretary, Adorján Ring, and by nearly 30 additional hands. The model was trained on a total of 874 transcribed manuscript pages and achieved a character error rate (CER) of less than 5%. The official documents are currently being published in the repository of the Library of the Hungarian Academy of Sciences, where they will be available as searchable PDFs.
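For readers unfamiliar with the metric, a character-level error rate is commonly computed as the edit (Levenshtein) distance between the model's output and a human transcription, divided by the length of the human transcription. The sketch below illustrates that common definition only. It is not the project's evaluation code, and the function names and sample strings are hypothetical.

```python
import unicodedata


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the hypothesis
                current[j - 1] + 1,      # extra character in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the reference length (assumed definition of CER)."""
    # Normalize so that accented letters such as "á" count as one character
    # regardless of how they were encoded.
    reference = unicodedata.normalize("NFC", reference)
    hypothesis = unicodedata.normalize("NFC", hypothesis)
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one missing accent in an 11-character reference.
sample_cer = character_error_rate("Arany János", "Arany Janos")
print(f"CER: {sample_cer:.3f}")
```

Under this definition, an error rate below 5% means fewer than five character edits per 100 characters of reference text, on average.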
