A new release is available for Arkindex instances. You can test it on our demo instance: demo.arkindex.org

Project rights

Arkindex now offers rights management on projects that is available to all users. Previously, only instance administrators could grant users access to specific projects.

As a regular user, you can now:

  • Create a group with specific users, using their email address to find them;
  • Give read-only, read & write or full administrative access to users or groups on your projects;
  • Revoke any rights at any time;
  • View all the groups you belong to.
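The three access levels above can be pictured as an ordered scale. The following is a minimal sketch only: the `AccessLevel` names, the `effective_level` helper and the rule that the highest level among a user's direct and group rights wins are all assumptions made for illustration, not a description of how Arkindex resolves rights.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical ordering of the levels named in these notes."""
    NONE = 0
    READ = 1   # read-only
    WRITE = 2  # read & write
    ADMIN = 3  # full administrative access


def effective_level(user_email, user_groups, user_rights, group_rights):
    """Return the highest level granted directly or through any group.

    Assumption: the most permissive right wins. The real resolution
    rule is not stated in the release notes.
    """
    levels = [user_rights.get(user_email, AccessLevel.NONE)]
    levels += [group_rights.get(group, AccessLevel.NONE) for group in user_groups]
    return max(levels)


# Illustrative data: a direct read-only right plus write access via a group.
user_rights = {"alice@example.com": AccessLevel.READ}
group_rights = {"transcribers": AccessLevel.WRITE}
level = effective_level(
    "alice@example.com", ["transcribers"], user_rights, group_rights
)
```

Under these assumptions, `level` is `AccessLevel.WRITE`, and revoking the group right would drop it back to `AccessLevel.READ`.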

View the groups you belong to on your profile page

Manage who can view your project (users or groups)

We plan further developments for this solution, expanding the new rights management system to processes and workers over the next few releases.

PDF text extraction

When you import a PDF file into Arkindex, its text content is now automatically extracted and converted into Line elements.

PDF text is displayed as Lines with their content

So it's now possible to use those Line elements directly in your Machine Learning workflows!

Drop DataSource

This is the last step in a refactoring that spanned several months: the last part of the previous Machine Learning workflow system is gone. All Machine Learning results are now linked to workers.

Processes

We have made two changes to make it easier to use our workflow system in a production environment:

  • You can now name processes, using any naming convention you choose. It should be easier to categorize your experiments and large imports. An Edit button is available on every process details page.
  • It's now possible to restart a failed or completed task without restarting the whole process, by clicking on the restart button.

You can edit or restart a task from its details page.

Performance enhancements

A few changes and corrections have been made on the backend and frontend to speed up navigation through elements: the website should feel more responsive, especially on large projects.

The most notable changes are described in the section below.

API Changes

  • Added a with_zone query parameter that controls whether each element's zone is returned when listing children, parents or a project's elements: this allows us to request less data in special cases. It is enabled by default to avoid a breaking change.
  • Removed element-related attributes from ListElementEntities.
  • Exposed the worker_version attribute when retrieving metadata.
  • Added an attribute to filter ListTranscriptions by element_type.
  • Added a query parameter to return project information in ListElementChildren and ListElementParents.
  • Added the ListElementMetadata endpoint to list all metadata for an element: metadata is no longer listed in RetrieveElement.
  • Removed the deprecated ListElement endpoint.
  • Removed the thumbnail_put_url attribute in ListElementChildren and ListElementParents.
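As an illustration of the with_zone parameter, the sketch below builds a request URL that asks for an element's children without their zones. Only the parameter name comes from these notes: the URL path, the lowercase `false` encoding of the boolean and the `build_children_url` helper are assumptions, so check the API documentation of your instance before relying on them.

```python
from urllib.parse import urlencode


def build_children_url(base_url, element_id, with_zone=True):
    """Build a URL for listing an element's children.

    The path layout and the boolean encoding are assumptions made for
    this sketch. with_zone defaults to True, mirroring the default
    described in the release notes.
    """
    path = f"api/v1/elements/{element_id}/children/"  # assumed path
    query = urlencode({"with_zone": str(with_zone).lower()})
    return f"{base_url.rstrip('/')}/{path}?{query}"


# "ELEMENT_ID" is a placeholder, not a real identifier.
url = build_children_url("https://demo.arkindex.org", "ELEMENT_ID", with_zone=False)
```

Because the parameter is enabled by default, existing clients that never send it should keep receiving zones; only clients that explicitly opt out get the lighter response.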

Frontend changes

  • Elements are automatically hidden when their removal is requested.
  • The transcription modal has been widened to view large transcription lines.
  • Added a header action to select/unselect an element.
  • Worker version details are now displayed on a metadata entry when it has one.

Bugfixes

  • Allowed agents to download artifacts.
  • Restricted access to the TaskUpdate and DownloadArtifact endpoints.
  • Generated a random hook token for fake repositories.
  • The ListSelection, AddSelection, RemoveSelection, ValidateClassification and RejectClassification endpoints now require a user with a verified email.
  • The UpdateEntity, PartialUpdateEntity and DestroyEntity endpoints now require a user with a verified email.
  • Restricted read access for public groups.
  • Fixed a stale read on worker version creation that caused some Git imports to fail.