Features
Arkindex is the platform developed by Teklia for the automatic processing of large collections of scanned documents. Arkindex offers the following features:
- Document management: import and organize images of documents from files (JPEG, TIFF, PNG), PDFs or IIIF manifests. See our video.
- Manual annotation: annotate images with:
  - zones for elements on the image, each with a type and a position
  - text transcriptions at any level (page, paragraph, line, word)
  - classifications
  - metadata
- Arkindex is fully integrated with Callico, for advanced collaborative annotation campaigns
- Automatic layout analysis: automatically detect document components (texts, images, graphics, etc.)
- Automatic text recognition: for printed (OCR) and handwritten (HTR) documents. See our video.
- Automatic named-entity extraction: identify persons, places, organizations, etc.
- Automatic photo cropping, tagging and description
✅ See our comparison page: Arkindex versus the other platforms
Arkindex Workflow Management Capabilities
Arkindex offers extensive capabilities, unmatched by its competitors, for managing complex workflows tailored to your document processing needs:
- Customisable Workflow Design: Arkindex gives you the freedom to define complex workflows tailored to your unique processing requirements. From layout analysis and classification to text recognition (OCR/HTR), named entity recognition and metadata generation, you can curate each step to achieve your desired outcome.
- Real-time monitoring: Monitor the progress of each task in your workflow in real time. Arkindex shows an estimated completion time for each step, so you can make informed decisions and adjust resources as needed.
- Error Analysis & Rerun: Not all processes run perfectly every time. Arkindex understands this and provides tools to analyse any errors that may occur in your workflow. Once identified, you can easily rerun processes for those specific elements, ensuring consistency and accuracy.
- Flexible Processing Nodes: To accommodate different infrastructure requirements, Arkindex lets you distribute your processing tasks across multiple nodes, whether on-premises, in a cloud environment or on high-performance computing clusters using SLURM.
- Seamless integration with custom & open source components: Arkindex is not limited to its built-in functionality. You can define your processing steps using your own proprietary code or any of the many open source components available. Docker support makes these components easy to integrate.
Arkindex is based on IIIF (https://iiif.io/) for images and is fully accessible through a REST API.
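As an illustration of what IIIF-based image access makes possible, the sketch below builds a IIIF Image API URL that crops a rectangular zone out of a page image, following the `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` pattern defined by the IIIF Image API. The function name `iiif_region_url`, the server address and the image identifier are hypothetical and not taken from Arkindex; the `max` size keyword assumes a server implementing IIIF Image API 2.1 or later (older servers use `full`).

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, x, y, width, height,
                    size="max", rotation=0, quality="default", fmt="jpg"):
    """Build a IIIF Image API URL that crops a rectangular pixel region.

    Hypothetical helper for illustration; not part of any Arkindex library.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Region is expressed in pixels as x,y,w,h from the top-left corner.
    region = f"{x},{y},{width},{height}"
    # Percent-encode the identifier so slashes inside it are not
    # interpreted as URL path separators.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Example with a made-up server and identifier: crop a text-line zone.
url = iiif_region_url("https://iiif.example.org/iiif",
                      "collection/page_001.jpg", 100, 200, 800, 120)
print(url)
# https://iiif.example.org/iiif/collection%2Fpage_001.jpg/100,200,800,120/max/0/default.jpg
```

Because the crop is expressed in the URL itself, a zone annotated with a position on a page can be displayed or processed without storing a separate image file for it.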
Arkindex can be used in the cloud or installed on-premises.
Try it: https://demo.arkindex.org
Projects done with Arkindex:
- ICRC - Collaborative transcription and handwriting recognition applied to prisoners' handwritten lists
- Automatic recognition of 100 years of French Census: the SOCFACE project
- SIMARA: automatic conversion of finding aids with handwriting recognition
Arkindex code and releases:
We aim to produce high-quality open-source software at Teklia:
- access the source code of the Python backend, the Vue.js frontend, and more tools and documentation related to Arkindex
- you can view the details of our latest releases on this website.