Today we are happy to release the source code of our crowdsourced annotation platform Callico, under the AGPL-v3 license! 🎉

Callico interface for entering metadata or extracting information

We started working on Callico a year and a half ago to provide a clean interface, as simple to use as possible, for our annotation needs.
Teklia needs a lot of annotated data (line positions, text transcriptions, named entities found in text or images, and more), and some of our clients do not always have such datasets available.

Initially, we managed small annotation campaigns through our main product, Arkindex, but this proved impossible to scale unless we implemented some kind of campaign management to create and assign annotation tasks to users. We chose to create a new product instead of extending Arkindex, because the features needed are far from its core mission (processing millions of documents).

We also looked into existing alternatives such as FromThePage, Transcrire, Madoc, TACT or even Amazon Mechanical Turk, but they were either incomplete (not covering all our annotation needs), too expensive for mid-sized to large projects, or hard to integrate with our existing tools. You can read a more detailed comparison on our blog.

Callico has been used in production for over a year, and has gained some traction lately among our clients. We have run several successful public annotation campaigns, which went smoothly even with hundreds of simultaneous annotators.

Callico is now capable of managing large annotation campaigns and tracking annotator progress across a variety of tasks (text transcription, polygon segmentation, image classification, entity creation from text or images, polygon grouping and more). You can read more about all the supported modes here.


If you are interested in this project, you can learn more about Callico with the official documentation. We already have setup instructions if you want to self-host your own Callico instance.
If you are a developer or just want to check out the source code, it's available on GitLab, where you can file issues for any bugs or feature requests!
This is the real beauty of open source. :)