We are pleased to announce that TEKLIA will be presenting four distinct research papers at the International Conference on Document Analysis and Recognition (ICDAR) in 2023. ICDAR is a leading forum for scientists and practitioners in the field of document analysis and recognition, a discipline that is becoming increasingly crucial in the era of digital transition.
The upcoming 17th edition of ICDAR, to be held in San José, California, USA, from 21 to 26 August 2023, is an important opportunity for TEKLIA. This event is an excellent platform for us to showcase our cutting-edge research and engage with industry peers and academics in the field.
Here are the four research papers that TEKLIA will be presenting at ICDAR 2023, all exploring the intersection of deep learning and document recognition:
Large Scale Genealogical Information Extraction From Handwritten Quebec Parish Records by Solène Tarride, Martin Maarand, Mélodie Boillet, James McGrath, Eugénie Capel, Hélène Vézina, and Christopher Kermorvant
This paper presents a machine learning-based workflow for extracting genealogical information from handwritten Quebec parish registers. The method, applied to over two million pages of 19th- and 20th-century registers, recognised 3.2 million acts, 74% of which were found to be valid and complete.
Presented at the Document NLP 1 - Document NLP Session, Monday, August 23, 2023, 10:50-12:30
Handwritten Text Recognition from Crowdsourced Annotations by Solène Tarride, Tristan Faine, Mélodie Boillet, Harold Mouchère, and Christopher Kermorvant
This paper investigates training models for handwriting recognition using multiple imperfect transcriptions. Conducted on the municipal registers of the city of Belfort (1790-1946), the study found consensus or multiple transcription training to be viable methods, while quality-based selection was found to introduce bias.
Presented at the 7th International Workshop on Historical Document Imaging and Processing (HIP’23), August 25, 2023, 9:10 am
SIMARA: A Database for Key-Value Information Extraction from Full-Page Handwritten Documents by Solène Tarride, Mélodie Boillet, Jean-François Moufflet, and Christopher Kermorvant
This paper presents a new database for extracting information from historical manuscript documents. The dataset promotes research on segmentation-free information extraction systems.
Presented at Poster Session 2, Wednesday, August 23, 2023, 2:30-4:00
Key-Value Information Extraction from Full Handwritten Pages by Solène Tarride, Mélodie Boillet, and Christopher Kermorvant
This paper presents a Transformer-based method for extracting information from digitised handwritten documents. The method combines feature extraction, handwriting recognition and named entity recognition in a single model, and it outperforms traditional two-step methods.
Presented at Document NLP 2 - Information Extraction Session, Monday, August 21, 2023, 4:00-6:00
Don't miss this unique opportunity to meet the TEKLIA team at ICDAR 2023. Mélodie, Solène and Christopher will be in San José for the entire week, eager to share insights from our research and explore potential collaborations. We look forward to engaging discussions on the future of deep learning in document recognition. Join us!