In collaboration with Teklia, the Belfort city archives have launched a pilot project to automatically transcribe all the registers of the city's municipal council. The project aims to process 18,500 pages of minutes of various kinds, drawn up between 1790 and 1946, and has two objectives.

Old town - Belfort Tourism

Artificial Intelligence to enable full text search

Through this innovative project, Teklia deploys its HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition) technology for the processing of digitised documents.

The deliberations of the municipal council are an important source for the history of the city. These documents comprise the minutes of the council's deliberations, as well as lists of councillors, convocations and agendas, drawn up over a period of more than 150 years.


The HTR and OCR models are two recognition models designed for handwritten and printed documents respectively. To work properly on handwritten historical documents, they must be trained on transcriptions obtained by manually annotating a sample of document pages. Once trained, a model can process new pages: it detects the lines of text and transcribes them automatically.

Line detection on one of the scanned pages and handwriting recognition

Once the transcriptions of all the pages of the municipal council's deliberations have been validated, they will be published with free access on the website of the City Archives, and full-text search will be possible. Everyone will thus be able not only to consult these pages, but also to search them for specific information.
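To illustrate what full-text search over transcribed pages involves, here is a minimal sketch of an inverted index in Python. It is not a description of the search system used on the City Archives website: the page identifiers, the sample transcriptions and the function names are all hypothetical and were invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (accented letters are kept)."""
    return re.findall(r"\w+", text.lower())


def build_index(pages):
    """Map each word to the set of page identifiers whose transcription contains it."""
    index = defaultdict(set)
    for page_id, transcription in pages.items():
        for word in tokenize(transcription):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page identifiers that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical page identifiers and transcriptions, invented for this example.
pages = {
    "register_1_page_1": "Le conseil municipal délibère sur la réparation du pont",
    "register_1_page_2": "Liste des conseillers présents à la séance",
    "register_2_page_1": "Le conseil approuve le budget de la ville",
}

index = build_index(pages)
print(search(index, "conseil municipal"))  # ['register_1_page_1']
```

The index only matches whole words, so a query for "conseil" does not return the page that contains "conseillers". A real search service would typically add stemming, tolerance for spelling variants and ranking of results.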

Citizens at the heart of the process

Beyond its innovative technical aspect, this project gives citizens the opportunity to get involved in local life.

The implementation of a participatory campaign

For handwriting recognition to work well, it is essential to train the HTR models on good-quality data. The data for this project will be collected through the participatory platform Callico, where volunteers will manually transcribe parts of documents, then correct and validate the automatic transcriptions produced once the models have been trained.
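As an illustration of the correction step, the sketch below compares an automatic transcription with a volunteer's corrected version, using Python's standard difflib module. It does not show how Callico works internally: the two sample lines and the function names are hypothetical, and the similarity ratio is only a rough indicator of how much a volunteer had to change.

```python
from difflib import SequenceMatcher


def similarity(automatic, corrected):
    """Return a ratio between 0 and 1; 1.0 means the volunteer changed nothing."""
    return SequenceMatcher(None, automatic, corrected).ratio()


def changed_spans(automatic, corrected):
    """List (operation, automatic_text, corrected_text) for each span that was changed."""
    matcher = SequenceMatcher(None, automatic, corrected)
    return [
        (tag, automatic[i1:i2], corrected[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical line of text: the automatic output has lost its accents.
automatic = "Le conseil municipal delibere"
corrected = "Le conseil municipal délibère"

print(round(similarity(automatic, corrected), 3))
for operation, before, after in changed_spans(automatic, corrected):
    print(operation, repr(before), "->", repr(after))
```

A line whose ratio is close to 1 needed few corrections, while a low ratio points to a page where the model struggled and where more manually transcribed training data may help.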

Discovering the history of the city of Belfort

The aim of this project is above all to allow citizens not only to take part in its implementation, but also to discover how a municipal council works and how the City of Belfort has evolved over the centuries.

We hope that this pilot initiative will arouse the interest of other municipalities.

If you are responsible for municipal archives and have already digitised the pages of your municipal council's deliberations, we can transcribe them automatically, whether the texts are handwritten or printed: Contact us.


Photo credit:

Belfort Tourisme, City of Belfort - www.belfort-tourisme.com/