We are pleased to announce the release of Arkindex version 0.14.1, enabled on our demo instance.
Arkindex now relies on the PostGIS database extension for PostgreSQL. This will allow us to develop new features that take advantage of its geographical computing capabilities.
We already use these capabilities to improve performance and data checks on image zones.
Arkindex now checks and requires that you have Maintainer or Owner access to a GitLab repository when using the Add repository feature. This access is required to enable the webhook on the project.
You can always fork a repository to gain ownership.
We have worked a lot to improve the new workers system based on Git repositories.
It's now possible to see precisely which elements are used when starting a new Process. You can pick an element, then load all its children, and finally filter by a specific type (to pick all the Page elements nested under a Volume, for example).
<embed alt="elements" embedtype="image" format="fullwidth" id="91"/>
This selection is directly used by the ML workers (simplifying our previous implementation, and thus avoiding bugs!).
You can again split the workload into chunks (up to 10), so that different workers run in parallel. This feature was missing in 0.14.0 compared with the previous workflow system.
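To illustrate the idea of chunking, here is a minimal sketch of how a list of selected elements could be distributed across parallel workers. It is not the Arkindex implementation: the function name `split_into_chunks`, the round-robin strategy, and the sample element names are assumptions made for the example; only the upper bound of 10 comes from the text above.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_CHUNKS = 10  # upper bound mentioned in these release notes


def split_into_chunks(items: Sequence[T], chunks: int) -> List[List[T]]:
    """Distribute items across `chunks` lists of near-equal size.

    Hypothetical helper: the distribution strategy is an assumption.
    """
    if not 1 <= chunks <= MAX_CHUNKS:
        raise ValueError(f"chunks must be between 1 and {MAX_CHUNKS}")
    # Round-robin slicing: chunk i receives items i, i + chunks, i + 2 * chunks, ...
    return [list(items[i::chunks]) for i in range(chunks)]


# Sample data: seven made-up page identifiers split across three workers
elements = [f"page-{n}" for n in range(7)]
for index, chunk in enumerate(split_into_chunks(elements, 3), start=1):
    print(f"chunk {index}: {chunk}")
```

Round-robin slicing keeps chunk sizes within one element of each other, so no worker receives a noticeably larger share of the selection.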
The Docker image build has also been enhanced:
- build a Dockerfile only once if it's shared across worker versions
- use Docker image ID to reference images instead of tags
- delete Docker images on the Ponos hosts once they are stored safely on our distribution platform
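The first two points can be pictured with a short sketch. It assumes that a shared Dockerfile is detected by hashing its content, which is only one possible approach and not necessarily what Arkindex does. The names `dockerfile_digest`, `build_images`, and `fake_build` are hypothetical, and the builder is a stand-in that never calls Docker.

```python
import hashlib
from typing import Callable, Dict, List


def dockerfile_digest(content: str) -> str:
    """Return a stable fingerprint of a Dockerfile's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build_images(
    dockerfiles: Dict[str, str],
    build: Callable[[str], str],
) -> Dict[str, str]:
    """Map each worker version to an image ID, building each distinct Dockerfile once.

    `dockerfiles` maps a hypothetical worker version name to its Dockerfile content.
    `build` is a hypothetical callable that builds an image and returns its ID.
    """
    image_id_by_digest: Dict[str, str] = {}
    image_id_by_version: Dict[str, str] = {}
    for version, content in dockerfiles.items():
        digest = dockerfile_digest(content)
        if digest not in image_id_by_digest:
            # First time this Dockerfile is seen: build it and remember the ID
            image_id_by_digest[digest] = build(content)
        # Versions are tied to an image ID rather than a mutable tag
        image_id_by_version[version] = image_id_by_digest[digest]
    return image_id_by_version


# Stand-in builder: records each call and returns a made-up ID, no Docker involved
build_calls: List[str] = []


def fake_build(content: str) -> str:
    build_calls.append(content)
    return "sha256:" + dockerfile_digest(content)


shared = "FROM python:3.8\nRUN pip install my-worker\n"
images = build_images(
    {"v1": shared, "v2": shared, "v3": "FROM python:3.9\n"},
    fake_build,
)
print(len(build_calls))              # number of builds actually triggered
print(images["v1"] == images["v2"])  # whether v1 and v2 share one image
```

Referencing images by ID rather than by tag means a worker version always points at the exact image that was built for it, even if a tag is later moved.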
Finally, you can view which worker version has produced elements, transcriptions, classifications or entities, by hovering over the name of the worker. A small popup will appear with details.
- A new endpoint CreateMLClass allows you to create a new ML class on a corpus
- Require zones for the endpoint
- Serialize the dataimport created when adding a repository, instead of returning only its ID
- The text_zone element type is added by default to all new corpora
- A new corpus is created for all new IIIF imports
- The endpoint ListMLClasses is deprecated in favor of
- The endpoint DataImportElements is deprecated in favor of
- The parameter with_transcription_sources_count on the endpoint ListElements has been removed.
- The frontend no longer exposes the previous workflow system
- Replace assertions with conditions in registration and login APIs
- Allow a confidence of 0 in CreateClassification
- Preserve Python assert statements in binary build
- Avoid stale read while retrieving elements on a new Process
- Avoid stale read on newly created data import
- Avoid returning extra rows from ListElementNeighbors
- Fix Transkribus login process before validating user access
- Prevent unhandled exception warnings on incorrect Transkribus credentials
- Prevent updating an element's zone to an invalid image
- Properly display errors on login
- Prevent 0×0 image size warnings after a store-wide reset
- Prevent duplicate repository creation
- Add a thumbnails generation task to IIIF imports only
- The top-level loading time has decreased by 30-40% after simplifying some internal queries.
Arkindex Base worker
- The base Arkindex API client has been updated to 1.0.2
- Automatic creation of missing ML classes
- Fix invalid type checks on score and confidence for falsy values
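The falsy-value issue in the last point is a common Python pitfall, and it also explains why a confidence of 0 could be rejected. The sketch below is an illustration of this class of bug, not the worker's actual code: the function name `validate_confidence` and the accepted range of 0 to 1 are assumptions.

```python
from typing import Optional, Union

Number = Union[int, float]


def validate_confidence(confidence: Optional[Number]) -> float:
    """Validate a confidence value, accepting 0 as a legitimate score.

    Hypothetical helper; the [0, 1] range is an assumption for this example.
    """
    # A check written as `if not confidence:` would also reject 0 and 0.0,
    # because both are falsy. Comparing against None avoids that.
    if confidence is None:
        raise ValueError("confidence is required")
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise TypeError("confidence must be a number")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    return float(confidence)


print(validate_confidence(0))     # accepted although 0 is falsy
print(validate_confidence(0.75))
```

Testing against `None` explicitly separates "no value given" from "a value of zero", which a plain truthiness test cannot do.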