
Callico
Callico is the annotation and validation platform for digitised documents developed by TEKLIA. We use it in all our projects to generate training data for our Deep Learning models. It is available as open source.
- Callico on GitLab
Deep Learning libraries and tools
We publish and maintain our code as open source on GitLab.
- Doc-UFCN, a library for detecting objects in scanned documents. See it on GitLab
- PyLaia, a handwriting recognition library
- Nerval, a library for evaluating named entity extraction. See it on GitLab
- DISS, a document image segmentation scoring library. See it on GitLab
Open deep learning models
We publish our models with free access on Hugging Face.
- Handwriting recognition models for PyLaia
- Document Layout Analysis models for Doc-UFCN
- Named entity recognition models for spaCy
Data tools
- Transkribus client and PAGE XML parser
- Virtual keyboard as a web extension for eScriptorium
Arkindex tools
Open-source tools to interact with Arkindex, the document processing platform
- Arkindex command line client: a command-line interface to an Arkindex instance. See it on PyPI and GitLab
- Arkindex API client: a Python library to communicate with the Arkindex API. See it on PyPI and GitLab
- Arkindex Export: a library for exploring and using Arkindex exports in SQLite format. See it on PyPI and GitLab
- Arkindex base worker: a base class for integrating processing algorithms in Arkindex. See it on PyPI and GitLab
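Because Arkindex exports are SQLite files, they can be opened with any SQLite client before reaching for a dedicated library. The sketch below is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module rather than the Arkindex Export library, the helper name `list_tables` is hypothetical, and nothing is assumed about the export's schema beyond it being a valid SQLite database.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of the user tables in an SQLite file, sorted by name.

    Hypothetical helper for illustration; it is not part of any Arkindex package.
    """
    # closing() makes sure the connection is released even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    # Each row is a 1-tuple holding the table name.
    return [name for (name,) in rows]
```

Calling `list_tables` with the path to a downloaded export would print the tables available for further queries; the actual table and column names depend on the export itself.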
Public databases
- The RIMES database: handwritten documents in French.
- NorHand: a dataset for handwritten text recognition in Norwegian. See our paper.
- SIMARA: a dataset of handwritten index cards.