Arkindex 1.9.1
We are happy to announce that a new Arkindex release is available. You can explore Arkindex and try out the newest features on our demo instance, demo.arkindex.org.
Project categories
Projects can now be grouped into categories. Those categories can only be edited through the administration interface, and they are only visible to instance administrators.
In future releases, those categories will allow automated actions such as deleting inactive projects. We will enable this on our demo instance in order to purge any test data uploaded by users after some time.
Worker configurations
Significant progress has been made on the new worker configuration format. The complete specification is now publicly available on our documentation website. Workers using this new format can now be validated and imported into Arkindex, but they cannot yet be executed. This release includes the changes needed for the Base Worker project to start supporting this new format.
A new teklia-worker-configuration Python package is available on PyPI. This project provides a command-line utility named worker-validator which can convert files written in the previous configuration format into the new one, as well as validate files that are already written in the new format.
A new ImportWorkerVersion API endpoint has been introduced to receive a YAML file that defines a worker in this new format, then create or update a worker and a worker version. The latest version of the Arkindex CLI can now validate and publish these new versions using the new arkindex worker import subcommand.
Workers imported through this new system can be added to a process and configured through the new configuration form, with new field types and better error validation.
Process execution
The integrity of model versions is now verified after they are downloaded and before any task starts. Any issue that occurs while downloading or checking a model version is now shown more explicitly in the task’s logs. The API changes made to allow for this also pave the way for future improvements in Enterprise Edition which could speed up processes using large models and reduce disk usage.
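As an illustration of what such an integrity check can look like, here is a minimal sketch that hashes a downloaded archive and compares it with an expected digest before any work starts. The function names are hypothetical, and the use of SHA-256 is an assumption: these notes do not state which algorithm Arkindex uses.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large archives never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected_sha256: str) -> None:
    """Raise before any task work starts if the downloaded archive is corrupt.

    Assumption: the expected digest is a SHA-256 hex string supplied by the server.
    """
    actual = file_sha256(path)
    if actual != expected_sha256:
        raise ValueError(
            f"Integrity check failed for {path.name}: "
            f"expected {expected_sha256}, got {actual}"
        )
```

Failing early with an explicit message like this is what makes a corrupted download easy to spot in logs, rather than surfacing later as an obscure model loading error.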
A new experimental StartWorkerActivity API endpoint has been introduced. This endpoint allows a worker running within a Ponos task to request the next element to process, without needing a JSON file of elements generated by an initialisation task. This will provide a more efficient way to process elements in multiple chunks, as chunks running on faster servers can request more elements than slower ones.
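The benefit of letting each chunk ask for its next element can be sketched with a small simulation. This does not call the Arkindex API: the popleft call stands in for a "next element" request, and the worker names and timings are hypothetical.

```python
import heapq
from collections import deque


def simulate_pull(elements, seconds_per_element):
    """Simulate workers that each request the next element as soon as they are free.

    `seconds_per_element` maps a hypothetical worker name to its processing time.
    Returns a dict of worker name -> list of elements it processed.
    """
    pending = deque(elements)
    processed = {name: [] for name in seconds_per_element}
    # Priority queue of (time at which the worker becomes free, worker name).
    free_at = [(0.0, name) for name in sorted(seconds_per_element)]
    heapq.heapify(free_at)

    while pending:
        now, name = heapq.heappop(free_at)
        # Stand-in for a "give me the next element" request to the server.
        element = pending.popleft()
        processed[name].append(element)
        heapq.heappush(free_at, (now + seconds_per_element[name], name))

    return processed


result = simulate_pull(range(12), {"fast": 1.0, "slow": 3.0})
```

In this run the faster worker ends up handling most of the elements, whereas a fixed split decided in advance would give each worker half and leave the fast one idle while the slow one finishes.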
API removals
Some deprecated or unused parts of the API have been removed in this release:
- The support for worker version IDs in UpdateWorkerActivity had been deprecated since Arkindex 1.5.1 and has now been removed.
- The hash attribute on images has been removed. To create an image through CreateImage, the path is now required.
- The s3_read_only_bucket option has been removed from image servers. This option no longer has any effect without a hash on images.
- The CreateIIIFInformation API endpoint has been removed. You can use CreateIIIFURL instead, specifying the URL of the image directly instead of having to retrieve its metadata yourself.
Bug fixes
- Confidence scores of 0% are now shown on elements and transcriptions.
- The Esc key now cancels rectangle creation or polygon editing, just like it cancels polygon creation.
- In Community Edition, executing a task will now properly add a finished date on the process when all tasks of the process are finished.
- Uploading a model version using the arkindex upload model_version command now identifies it as a .tar.zst archive, allowing the model version to work exactly as it would when uploaded through arkindex models publish.
Upgrade notes
To upgrade a development instance, follow this documentation.
To upgrade a production instance, you need to:
- Deploy this release’s Docker image: registry.gitlab.teklia.com/arkindex/backend:1.9.1
- Run the database migrations: docker exec ark-backend arkindex migrate
- Update the system workers: docker exec ark-backend arkindex update_system_workers
The main changes impacting developers and system administrators are detailed below.
Default corpus category setting
A concept of categories has been added to corpora. The data migrations will automatically create a default category to apply to each corpus.
A new default_corpus_category setting has been added to manually specify the default corpus category by its slug. The category must exist in the database; if it does not, a new W016 warning is raised. The default value for this setting is default, which is the slug for the automatically created default category.
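The rule above can be summarised in a short sketch. This is not Arkindex's implementation: the function name and the in-memory set of slugs are hypothetical stand-ins for the real database lookup, and only the W016 identifier and the default slug come from these notes.

```python
def check_default_corpus_category(configured_slug, existing_slugs):
    """Return the IDs of the warnings to report, empty when the slug is valid.

    `existing_slugs` is a hypothetical stand-in for the categories stored
    in the database.
    """
    if configured_slug in existing_slugs:
        return []
    # The configured slug does not match any existing category.
    return ["W016"]


# The migrations create a category with the slug "default",
# so the default value of the setting passes the check.
warnings_for_default = check_default_corpus_category("default", {"default"})
warnings_for_missing = check_default_corpus_category("archive", {"default"})
```

Here the first call reports nothing, while the second reports W016 because no category with the hypothetical slug "archive" exists.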