RIMES

Presentation

The RIMES database (Reconnaissance et Indexation de données Manuscrites et de fac similÉS / Recognition and Indexing of handwritten documents and faxes) was created to evaluate automatic recognition and indexing systems for handwritten letters. Of particular interest are letters sent by mail or fax from individuals to companies or administrations.

The database was collected by asking volunteers to write handwritten letters in exchange for gift vouchers. Each volunteer was given a fictitious identity (of the same sex as their real one) and up to 5 scenarios. Each scenario paired one of 9 realistic topics with a target: an administration or a service provider (telephone, electricity, bank, insurance). The topics were: change of personal data (address, bank account); request for information; opening and closing (customer account); change of contract or order; complaint (poor quality of service...); payment difficulties (request for delay, tax exemption...); reminder; claim with other circumstances. The volunteers wrote a letter from this information in their own words. The layout was free; the only requirements were to use white paper and to write legibly in black ink.

The campaign was a success: more than 1,300 people contributed to the RIMES database, each writing up to 5 letters. The resulting database contains 12,723 pages, corresponding to 5,605 letters of two to three pages each.
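As a quick sanity check on these figures, the short sketch below derives the average letter length and an upper bound on letters per writer. It uses only the counts quoted above; the variable names are illustrative, and 1,300 is a lower bound on the number of writers, so the per-writer figure is a ceiling rather than an exact value.

```python
# Counts quoted in the RIMES description.
pages = 12_723
letters = 5_605
min_writers = 1_300  # "more than 1,300", so this is a lower bound

# Average number of pages per letter.
pages_per_letter = pages / letters

# Upper bound on the average number of letters per writer
# (the true value is lower, since there were more than 1,300 writers).
max_letters_per_writer = letters / min_writers

print(f"{pages_per_letter:.2f} pages per letter on average")
print(f"at most {max_letters_per_writer:.2f} letters per writer on average")
```

The average of roughly 2.27 pages per letter suggests that two-page letters are more common than three-page ones, and the per-writer bound stays below the 5 scenarios each volunteer could receive.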

RIMES in evaluations:

The database has been used for several competitions with different tasks and with official train/dev/test splits.

See Papers with Code for a list of publications using the RIMES database.

RIMES evaluation data:

You can download the data used in the evaluations.

RIMES evaluation data on HuggingFace:

We have published the RIMES evaluation data at line level on HuggingFace.
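A minimal sketch of loading the line-level data with the `datasets` library follows. The repository identifier is not stated here, so it is left as a required argument; the helper name `load_rimes_lines` and the default split name are assumptions for illustration, and the actual split names should be checked on the dataset page.

```python
def load_rimes_lines(repo_id: str, split: str = "train"):
    """Load one split of a line-level dataset from the Hugging Face Hub.

    repo_id is the "<organisation>/<dataset-name>" identifier shown on the
    dataset page; it is deliberately not hard-coded here.
    The split name is an assumption: check which splits the dataset exposes.
    """
    # Imported lazily so this module can be imported without the
    # third-party dependency installed (pip install datasets).
    from datasets import load_dataset

    return load_dataset(repo_id, split=split)


# Hypothetical usage (requires network access and the real identifier):
# lines = load_rimes_lines("<organisation>/<dataset-name>", split="train")
# print(lines[0])
```

Keeping the identifier as a parameter avoids baking a guessed name into the code; the column names of each record likewise depend on how the dataset was published.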

The complete RIMES database:

The complete RIMES database is available on Zenodo.

RIMES PyLaia model:

We have trained a PyLaia model on RIMES and published it on HuggingFace.