A new release is available for Arkindex instances.

You can learn more about Arkindex in its official documentation.

Important changes

In the API, all the process-related endpoints and URLs have been renamed: mentions of "dataimport" were replaced by "process". For example, the endpoint to list Arkindex processes is now ListProcesses instead of ListDataImports, and the API URL has changed from /api/v1/imports/ to /api/v1/process/. As a consequence, these endpoints are now under a process section in the API documentation.
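
For client scripts that still use the old names, the rename can be sketched as a small migration helper. This is only an illustration: the helper names are hypothetical, the single operation mapping is the one stated above, and the assumption that every path below the old prefix keeps its suffix unchanged should be checked against the API documentation.

```python
OLD_PREFIX = "/api/v1/imports/"
NEW_PREFIX = "/api/v1/process/"

# Only this operation rename is stated in the release notes; extend the
# mapping after checking the API documentation for the other endpoints.
RENAMED_OPERATIONS = {"ListDataImports": "ListProcesses"}


def migrate_path(path: str) -> str:
    """Rewrite an old process API path to the new prefix.

    Assumption: the part of the path after the prefix is unchanged.
    """
    if path.startswith(OLD_PREFIX):
        return NEW_PREFIX + path[len(OLD_PREFIX):]
    return path


def migrate_operation(name: str) -> str:
    """Return the new operation name, or the name itself if unknown."""
    return RENAMED_OPERATIONS.get(name, name)
```

Unknown paths and operation names are returned untouched, so the helper can be applied to every call a script makes.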

Process

  • It is now possible to specify a shm_size parameter in a worker version's docker configuration. This increases the shared memory available in /dev/shm, which some workers require for training.
  • Users with contributor-level access to workers (or any user, for public workers) can now start training processes for these workers.
  • It is now possible to create parameters in the worker's user configuration that take a list of booleans as input, using the bool subtype.
A list of booleans
  • There have been some improvements to the worker activity feature:

    • When a process is retried, its attached worker activities are re-initialized: existing worker activities are re-assigned to the new process, and missing worker activities are re-created (for example, if new elements were created).
    • This activity initialization can "steal" worker activities from other, older processes with the same worker, worker version and configuration; an older process, however, cannot "steal" a newer process' worker activities.
    • A worker activity's timeout (1 hour by default) is now always calculated from when it was marked as started, rather than from when it was created. This also applies when the process was retried and the activity was marked as started again.
    • When a task fails during a process, the started worker activities are all marked as error.
  • Our worker execution system, Ponos, also underwent a few improvements:

    • A bug was fixed which prevented Ponos agents from properly freeing up space after downloading Docker images, causing hosts to run out of space.
    • Empty worker runs (no tasks) are now handled when retrieving a workflow's state, instead of causing an error. This situation could occur when starting a workflow while the database is under heavy load.
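
The worker activity rules described above can be sketched as follows. This is a minimal model, not the Arkindex implementation: the Activity class and both functions are hypothetical, the worker version is assumed to identify the worker, and "older" is assumed to mean an earlier process creation time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_TIMEOUT = timedelta(hours=1)  # default stated in the release notes


@dataclass
class Activity:
    """Hypothetical stand-in for a worker activity; not the Arkindex model."""
    worker_version: str
    configuration: Optional[str]
    process_created: datetime  # creation time of the owning process
    started: Optional[datetime] = None


def can_steal(activity: Activity, worker_version: str,
              configuration: Optional[str], process_created: datetime) -> bool:
    """A process may take over an activity only from an older process that
    uses the same worker version and configuration."""
    return (
        activity.worker_version == worker_version
        and activity.configuration == configuration
        and activity.process_created < process_created
    )


def is_timed_out(activity: Activity, now: datetime,
                 timeout: timedelta = DEFAULT_TIMEOUT) -> bool:
    """The timeout runs from the started timestamp, never from creation."""
    if activity.started is None:
        return False
    return now - activity.started > timeout
```

Because the comparison on process creation time is strict, an older process can never take activities back from a newer one.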

API

  • The ListElementMetaData endpoint (API documentation) now has a load_parents option, which allows it to retrieve an element's parents' metadata along with its own.
  • HTML sanitization (which removed characters like & or >) is now restricted to Markdown metadata, instead of being applied to all metadata. Read more about the rules that apply to the different metadata types in the Arkindex documentation.
  • The bulk creation endpoint CreateElementTranscriptions now supports adding a confidence score to the created elements, not just the transcriptions.
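
The narrower sanitization rule can be sketched as below. The function name is hypothetical, "markdown" and "text" are assumed to be the type identifiers, and html.escape is only a stand-in for whatever sanitizer Arkindex actually applies (it escapes characters rather than removing them).

```python
import html
from typing import Callable


def clean_metadata_value(
    metadata_type: str,
    value: str,
    sanitize: Callable[[str], str] = html.escape,  # stand-in sanitizer
) -> str:
    """Sanitize Markdown metadata only; leave every other type untouched."""
    if metadata_type == "markdown":
        return sanitize(value)
    return value
```

With this rule, a text metadata such as "Smith & Sons" is stored as entered, while the same value in a Markdown metadata still goes through the sanitizer.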

Frontend

  • The Arkindex frontend now supports TypeScript, allowing developers to type variables so that mistakes are caught earlier and code is easier to maintain.
  • The element path display in the element header when browsing elements with multiple parents has been improved: the different paths stay displayed in the same order when going from one child element to another.
An element with multiple parent paths
  • The Search form and view have been refactored to solve a number of issues.
  • The batch deletion mode is now disabled when the annotation panel is closed.
  • The file import interface's error handling has been improved.
  • The name editing field is now closed when navigating to another element.
  • Users are now blocked from accessing the process configuration and element filtering pages for unconfigurable processes.

Transkribus import

The polygons created by a Transkribus import are now trimmed to fit inside their image if they overflow it, and ignored if they are outside the image's bounds, instead of triggering an error.
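
A minimal sketch of this behaviour, assuming that trimming means clamping each point to the image bounds and that "outside" means the polygon's bounding box does not overlap the image. The function name and signature are hypothetical, and the actual import may clip polygons differently.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def trim_polygon(polygon: List[Point], width: int, height: int) -> Optional[List[Point]]:
    """Clamp a polygon to an image of the given size.

    Returns None when the polygon is empty or lies entirely outside the
    image, so that the caller can skip it instead of raising an error.
    """
    if not polygon:
        return None
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    # The bounding box does not overlap the image: ignore the polygon.
    if max(xs) <= 0 or min(xs) >= width or max(ys) <= 0 or min(ys) >= height:
        return None
    # Clamp every coordinate into [0, width] x [0, height].
    return [(min(max(x, 0), width), min(max(y, 0), height)) for x, y in polygon]
```

Point-wise clamping keeps the number of points unchanged, which is simpler than true polygon clipping but can slightly distort shapes that cross a corner of the image.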