The RIMES database (Reconnaissance et Indexation de données Manuscrites et de fac similÉS / Recognition and Indexation of handwritten documents and faxes) was created to evaluate automatic recognition and indexing systems for handwritten letters. Of particular interest are letters such as those sent by mail or fax by individuals to companies or administrations.
The database was collected by asking volunteers to write handwritten letters in exchange for gift vouchers. Each volunteer was given a fictitious identity (of the same sex as their own) and up to 5 scenarios. Each scenario was chosen from among 9 realistic topics: change of personal data (address, bank account), request for information, opening and closing (customer account), change of contract or order, complaint (e.g., poor quality of service), payment difficulties (e.g., request for delay, tax exemption), reminder, claim with other circumstances. Each scenario also specified a target: an administration or a service provider (telephone, electricity, bank, insurance). The volunteers wrote a letter with this information in their own words. The layout was free; the only requirements were to use white paper and to write legibly in black ink.
The campaign was a success, with more than 1,300 people contributing to the RIMES database by writing up to 5 letters each. The resulting RIMES database contains 12,723 pages, corresponding to 5,605 letters of two to three pages each.
RIMES in evaluations:
The database has been used in several competitions covering different tasks, each with official train/dev/test splits:
- ICFHR 2008 Competition
- ICDAR 2009 Competition (word level)
- ICDAR 2011 Competition (word and line level)
See Papers with Code for a list of publications using the RIMES database.
RIMES data:
You can download the data used in the evaluations:
- ICDAR 2011 line level: RIMES-2011-Lines.zip
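
Once the archive is downloaded, a quick way to see what it contains is to count its files by extension before extracting anything. The following is a minimal sketch that uses only the Python standard library. The function name summarize_archive is hypothetical, and the sketch makes no assumption about the internal layout of RIMES-2011-Lines.zip, which is not described here.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count the files in a zip archive by extension (directories are skipped).

    Hypothetical helper for illustration; it only reads the archive index
    and does not extract or interpret any file.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Zip member names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(no extension)"] += 1
    return dict(counts)


# Example call, assuming the archive sits in the current directory:
# print(summarize_archive("RIMES-2011-Lines.zip"))
```

The resulting counts show which file types are present (for example images versus annotation files), which helps when deciding how to load the official splits.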