Arkindex is the platform developed by Teklia for the automatic processing of large collections of scanned documents. Arkindex offers the following features:
- Document management: import and organize images of documents from files or IIIF manifests
- Manual annotation: annotate images with zones, transcriptions and classifications
- Automatic layout analysis: automatically detect document components (texts, images, graphics, etc.)
- Automatic text recognition: OCR for printed documents and HTR for handwritten documents
- Automatic named-entity extraction: identify persons, places, organizations, etc.
Arkindex is based on IIIF (https://iiif.io/) for images and is fully accessible through a REST API.
Try it: https://demo.arkindex.org
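Because images are served through IIIF, a region of a page can be requested with a plain URL. The sketch below builds such a URL following the IIIF Image API pattern (identifier/region/size/rotation/quality.format). The server address and image identifier are placeholders chosen for illustration, not actual Arkindex endpoints, and the `iiif_image_url` helper is hypothetical.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API URL.

    `region` is either the string "full" or an (x, y, width, height)
    tuple in pixels. `size="max"` is the IIIF Image API 3.0 keyword;
    servers implementing version 2 expect "full" instead.
    """
    if region != "full":
        x, y, w, h = region
        region = f"{x},{y},{w},{h}"
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Placeholder server and identifier, for illustration only.
url = iiif_image_url(
    "https://iiif.example.org/iiif",
    "collection/page-001.jpg",
    region=(100, 200, 640, 480),
)
print(url)
```

Any IIIF-compliant viewer or HTTP client can consume a URL of this shape, which is what makes zones defined on a page easy to crop and display.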
Annotation Interface
The annotation interface allows you to:
- define zones on the images with rectangles or polygons
- define the type and class of the zones
- add a manual transcription of the text in the zone
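To make the three operations above concrete, here is a minimal sketch of how an annotated zone could be represented in memory. The `Zone` class, its field names and the `rectangle` helper are assumptions made for illustration; they do not describe Arkindex's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (x, y) in image pixels


@dataclass
class Zone:
    """Hypothetical annotated zone: a shape, a type, a class, a transcription."""
    polygon: List[Point]
    zone_type: str
    zone_class: Optional[str] = None
    transcription: Optional[str] = None

    def bounding_box(self) -> Tuple[int, int, int, int]:
        """Return the enclosing rectangle as (x, y, width, height)."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def rectangle(x: int, y: int, width: int, height: int) -> List[Point]:
    """Express a rectangle as a four-point polygon (clockwise from top-left)."""
    return [(x, y), (x + width, y), (x + width, y + height), (x, y + height)]


# A rectangular zone with a type, a class and a manual transcription.
line = Zone(
    polygon=rectangle(100, 200, 640, 48),
    zone_type="text_line",
    zone_class="handwritten",
    transcription="Example transcription",
)

# A free-form polygon zone without a transcription.
stamp = Zone(polygon=[(10, 10), (50, 20), (40, 60), (5, 40)], zone_type="stamp")

print(line.bounding_box())
print(stamp.bounding_box())
```

Storing rectangles as four-point polygons keeps a single representation for both shapes, at the cost of a small helper to build them.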

Document classification
- Visual and text-based classification
- Custom class definition and training
- Multi-label classification with confidence levels
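In a multi-label setting, a document can receive several classes at once, each with its own confidence. A common way to consume such output is to keep every class whose confidence reaches a threshold. The sketch below illustrates that idea only; the `select_labels` function, the class names and the scores are made up for the example and do not come from Arkindex.

```python
from typing import Dict, List, Tuple


def select_labels(scores: Dict[str, float],
                  threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Keep every class whose confidence is at least `threshold`.

    Unlike single-label classification, zero, one or several classes
    may be returned. Results are sorted by decreasing confidence.
    """
    kept = [(label, conf) for label, conf in scores.items() if conf >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Illustrative confidences for one document (made-up values).
scores = {"handwritten": 0.81, "letter": 0.92, "invoice": 0.07}

print(select_labels(scores))
print(select_labels(scores, threshold=0.9))
```

Raising the threshold trades recall for precision: fewer classes are kept, but each is more likely to be correct.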

Automatic layout analysis
- Text line detection
- Document structure detection
- Graphical component detection


Automatic Text Recognition (HTR/OCR)
- Printed text and handwritten text recognition
- Historical and modern handwriting
- European, Arabic and Chinese languages supported


Processing steps
{{ carousel() }}
High-quality software
We aim to produce high-quality software at Teklia: you can view the details of our latest releases on this website.