We are happy to announce that a new Arkindex release is available. You can explore Arkindex and try out the newest features on our demo instance, demo.arkindex.org.

Annotations

To help distinguish between many elements of different types on the same image, each element type can now be assigned a distinct color.

Colors on element types in the project details page

The colors can be edited by project admins on the project details page, and are displayed when viewing or annotating elements.

Elements colored by types in Arkindex

Processes

Workflows, which sat between processes and tasks, have been removed. Tasks are now linked directly to processes, so the workflow_id fields in the API have been replaced with process_id, and the RetrieveWorkflow API endpoint has been removed. These were rarely used by external code, so the change should have very little impact on most users. It removes a significant amount of technical debt and will allow us to make further improvements to processes in the future.
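For external code that has to talk to instances on both sides of this change, a small compatibility helper can hide the field rename. The following is only a minimal sketch: the shape of the task payload (a plain dictionary) and the helper name task_parent_ref are assumptions made for illustration, not part of the Arkindex API client.

```python
from typing import Any, Mapping, Optional, Tuple


def task_parent_ref(task: Mapping[str, Any]) -> Optional[Tuple[str, str]]:
    """Return ("process", id) or ("workflow", id) for a task payload.

    `task` is assumed to be the decoded JSON of a task; the exact payload
    shape is hypothetical. The kind is returned alongside the ID because a
    workflow ID from an older instance is not a process ID.
    """
    # Newer instances: tasks point directly to their process.
    if task.get("process_id") is not None:
        return ("process", str(task["process_id"]))
    # Older instances: tasks still point to an intermediate workflow.
    if task.get("workflow_id") is not None:
        return ("workflow", str(task["workflow_id"]))
    return None
```

Returning the kind together with the ID keeps callers from treating an old workflow ID as if it were a process ID.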

Processes now have a start date, which is set whenever the process starts or is retried. This start date is displayed on the processes list page. It makes it easier to find processes that have been running for a very long time, or to tell how long a process took once it has completed.
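As an illustration of what the start date makes possible, the sketch below computes how long a process ran, or has been running so far. It assumes the dates are available as ISO 8601 strings with an explicit UTC offset; the function and parameter names (process_duration, started, finished) are hypothetical and do not reflect the actual API field names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def process_duration(
    started: str,
    finished: Optional[str] = None,
    now: Optional[datetime] = None,
) -> timedelta:
    """Return the elapsed time between a start date and an end date.

    `started` and `finished` are assumed to be ISO 8601 strings with a
    UTC offset such as "+00:00". When `finished` is missing, the process
    is treated as still running and `now` (default: current UTC time)
    is used as the end date.
    """
    start = datetime.fromisoformat(started)
    if finished is not None:
        end = datetime.fromisoformat(finished)
    else:
        end = now or datetime.now(timezone.utc)
    return end - start
```

For example, a process started at 10:00 and finished at 12:30 on the same day yields a duration of two and a half hours, and the same function can be used to flag unfinished processes whose duration exceeds a chosen threshold.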

Workers processes can now filter their elements by class, which is particularly helpful for workers that report states, such as low annotation quality, as a classification on each element.

Datasets

Datasets, which were introduced in the previous release, can now be managed on each project via the project details page.

Dataset management in the project details page

Additionally, a new process mode called Dataset allows running workers on one or more datasets instead of elements. This process mode can currently only be used with the API: the CreateElementsWorkflow endpoint has been renamed to CreateProcess, and you can choose between the Workers and Dataset modes when creating a process.

Models

As part of our work to make on-premise deployments easier and simplify the creation of new workers, a new arkindex worker publish command is available in the Arkindex CLI (version 0.2.7+) to publish any Docker image as a worker without using Git repository imports. This can make it easier to share a worker with a third party, reduce disk space usage when the same worker is used on multiple Arkindex instances, and allow continuous integration services to publish workers automatically.

The compatible workers declared for each model are now used to filter the displayed workers when configuring Workers and Training processes. To prevent issues with models that did not previously declare their workers, or that are missing a worker, users may turn off this filter and select any worker instead.

On Training processes, by default, only the workers of a Training worker type are displayed. This filter can be removed or set to a different type at any time.

Additionally, an error in the Django admin affecting models that did not have any compatible worker or description set has been fixed, and deleting elements that were used as train, test or validation folders in a training process no longer causes errors.

S3 imports

When importing from an S3 bucket, the import task will now detect any ZIP archives and try to extract them automatically before importing. For this feature to work, system administrators will need to grant Arkindex write access to their S3 buckets.
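The general idea of detecting and expanding archives before an import can be sketched with Python's standard zipfile module. This is not the Arkindex import task itself: it works on an in-memory mapping of object keys to contents instead of a real S3 bucket, and the function name expand_archives and the naming of extracted keys are assumptions made for illustration.

```python
import io
import zipfile
from typing import Dict


def expand_archives(objects: Dict[str, bytes]) -> Dict[str, bytes]:
    """Replace each ZIP archive in a {key: content} mapping with its files.

    Archives are detected from their content, not from their extension.
    Extracted files are stored under a prefix derived from the archive key
    (a hypothetical naming scheme); other objects are kept unchanged.
    """
    expanded: Dict[str, bytes] = {}
    for key, content in objects.items():
        buffer = io.BytesIO(content)
        if not zipfile.is_zipfile(buffer):
            # Not an archive: keep the object as it is.
            expanded[key] = content
            continue
        # "scans.zip" becomes the "scans/" prefix for extracted files.
        prefix = key.rsplit(".", 1)[0]
        with zipfile.ZipFile(buffer) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                expanded[f"{prefix}/{info.filename}"] = archive.read(info)
    return expanded
```

In a real bucket, the extracted files have to be written back next to the archive, which is why write access is needed.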

Performance

The ListElementNeighbors API endpoint, which powers the previous and next buttons to navigate between elements in the same parent, has been simplified and optimized and can now support parents with a large number of child elements.

Some additional constraints introduced in Arkindex 1.4.2 to improve data integrity on the element hierarchy caused severe performance issues on some Arkindex instances, especially during deletions, and were found to use a significant amount of disk space. Those constraints have been replaced with more flexible and optimized triggers, which guarantee the same level of integrity while avoiding the performance issues.

Finally, some database indexes that were found to be unnecessary have been removed.

Security

Access to all API endpoints related to Ponos agents, tasks, artifacts and farms has been tightened, and the documentation of each endpoint has been updated to make those rights clearer. Those restrictions have been enabled in part by the new task authentication system introduced in Arkindex 1.4.1.

Additionally, the worker and worker version creation endpoints have been restricted to Git repository processes, and workers may only be created on the repository of the authenticated process.

Misc

  • Removed the RetrieveDataImport endpoint, which had been deprecated since Arkindex 1.3.3.
  • Clarified the documentation of element and transcription IDs on CreateElementTranscriptions.
  • Updated Arkindex exports to output ImageServer and TranscriptionEntity IDs as integers instead of strings.
  • Fixed an error when searching for OAuth credentials in the Django admin by user email.
  • Restarting a single task that used a GPU now allows a different GPU to be assigned.