Arkindex 1.6.1

We are happy to announce that a new Arkindex release is available. You can explore Arkindex and try out the newest features on our demo instance, demo.arkindex.org.

Processes

Following the introduction of the RestartTask API endpoint in 1.6.0, process tasks are no longer overwritten when restarted. This release introduces a link between the old task and the restarted one, using a new original_task_id attribute on tasks. Old tasks are suffixed to distinguish them from the new ones.

Figure 1. A process with a restarted task
Figure 2. Task logs header with details

It is now possible to create a new process from the elements on which a previous process failed, thanks to a new CreateProcessFailures API endpoint. This action is available from the Workers Activity view of a process.

Figure 3. Access the worker activity view from a process

From this page, you can create a new process using all the elements on which the previous process failed. You can then configure this process (selecting workers, etc.) just like any other process. You can also add all the elements on which the process failed to your selection.

Figure 4. Create a new process from failed elements

As part of our ongoing effort to convert internal Arkindex tasks into workers, for more flexibility and easier maintainability, the elements initialisation task at the start of all worker processes has been converted into a worker itself. This does not change anything when you create a worker process, as this step is still added automatically. A new setting has been added for instance administrators, to specify the version of the elements initialisation worker to use.

When viewing the configuration of a process that has already been started and is no longer configurable, the advanced settings are now retrieved from the process and displayed correctly.

Figure 5. Advanced settings of a started process

[Enterprise Edition] The handling of hardware requirements for tasks run on supercomputers with Slurm has been updated. Ponos agents now have a mode, which can be either Slurm or Docker. For agents in Slurm mode, the hardware reporting requirement has been removed, as reporting the state of a Slurm frontend host alone does not make sense.

The length of time after which a process task is considered expired, and can be deleted as part of cleanup, can now be configured on Arkindex instances using a new setting.

Machine learning workers and models

  • Creating a new model from the models list no longer takes you to a new page, but opens a model creation modal instead. This is the same behavior as when you create a new worker from the worker list.

  • The ValidateModelVersion endpoint has been removed, and merged into (Partial)UpdateModelVersion. This endpoint can now be used to make a model version available for processes.

  • Archived models and workers that are not linked to any existing results on an Arkindex instance are now deleted by the cleanup tasks, after a delay that can be set using dedicated settings and that defaults to 30 days.

  • The ListWorkerActivities API endpoint now supports new filters: worker_version_id, model_version_id and worker_configuration_id.
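
As a minimal sketch of how the new ListWorkerActivities filters could be combined into a query string: only the three filter names come from this release; the helper name worker_activity_query is hypothetical, and the placeholder UUID is illustrative.

```python
from urllib.parse import urlencode

# Filter names taken from the release notes; everything else is illustrative.
NEW_FILTERS = ("worker_version_id", "model_version_id", "worker_configuration_id")


def worker_activity_query(**filters):
    """Build a query string from the new ListWorkerActivities filters (hypothetical helper)."""
    unknown = set(filters) - set(NEW_FILTERS)
    if unknown:
        raise ValueError(f"Unsupported filters: {sorted(unknown)}")
    # Skip filters left unset, and keep a stable order for readability.
    params = [
        (name, filters[name])
        for name in NEW_FILTERS
        if filters.get(name) is not None
    ]
    return urlencode(params)


# Placeholder UUID, not a real worker version.
print(worker_activity_query(worker_version_id="11111111-1111-1111-1111-111111111111"))
```

The resulting string would be appended to the endpoint URL of your instance; filters left unset are simply omitted.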

User profile and group management

The profile management page, which can be accessed from the e-mail address dropdown menu in the main navbar, now properly allows users to manage their profile. You can update your display name, as well as change your password.

Figure 6. User profile management

The e-mail verification procedure has been updated as well, and now relies on a new VerifyEmail API endpoint.

[Enterprise Edition] Creating a new user group from the groups management tab no longer takes you to a new page, but opens a creation modal instead.

UX improvements

  • The action buttons placement on all modals has been homogenized across Arkindex: they are now placed in the bottom right corner.

  • The transcription panel is now displayed for folder-type elements as well as elements with images, as transcriptions can sometimes be aggregated at a parent element’s level.

  • [Enterprise Edition] The management of ML Classes within a project is now restricted to project administrators, as is the case for other similar project properties (element types, allowed metadata, entity types).

  • [Enterprise Edition] A number of buttons and fields that were not previously disabled when a user did not have sufficient rights to perform an action are now correctly disabled.

Command Line Interface

The CLI has its own dedicated documentation.

  • It is now possible to send a worker name and a worker description when publishing a worker version using the arkindex worker publish command.

  • The arkindex elements ml-splits command, which is used to create datasets for machine learning, has been updated to use set_names instead of sets. This is a follow-up after the changes introduced in 1.6.0.

  • The arkindex export entities command now supports filtering the exported entities by parent element type and worker version ID.

  • An obsolete filter for deprecation warnings that no longer exist was removed.

  • Obsolete fields were removed from the worker configuration parser.

Bugfixes

  • The code that generates thumbnails for elements in the Arkindex frontend has been updated to handle IIIF URLs that contain query parameters.

  • The frontend now properly handles an error that occurred when attempting to delete an element that is part of a dataset.

  • [Enterprise Edition] A bug in the CreateS3Import API endpoint, which prevented S3 imports from being assigned to farms and from starting, has been fixed.

Code clean-up

  • The obsolete EntityRole and EntityLink data models have been removed. This required a new version of the Arkindex Export library, as these tables were previously exported. However, older exports can still be re-imported into Arkindex: these tables are simply ignored.

  • As part of our ongoing work to remove Git support from Arkindex, the repository and revision management API endpoints have been removed, as well as the frontend view listing repositories.

Upgrade notes

To upgrade a development instance, follow this documentation.

To upgrade a production instance, you need to:

  • Deploy this release’s Docker image: registry.gitlab.teklia.com/arkindex/backend:1.6.1

  • Run the database migrations: docker exec ark-backend arkindex migrate

The main changes impacting developers and system administrators are detailed below.

Elements initialisation worker

The major change introduced by this release for system administrators and developers is the elements initialisation worker. The elements initialisation at the start of every Workers process used to be an internal Arkindex task, and has been converted to a worker. In order to be able to start workers processes on your Arkindex instance, you need to:

  • Set the docker.init_elements_image setting in your configuration file: it defaults to registry.gitlab.teklia.com/arkindex/workers/init-elements:latest but specifying a tag is recommended, as latest does not guarantee stability.

  • Create the corresponding worker version on your Arkindex instance. If no worker version with the Docker image set in the settings exists on the instance, the system checks display a warning when the backend starts up. You can create the worker version using the worker version publishing command in the CLI, or through the frontend:

    • Go to the workers list page, by clicking on Workers in the user dropdown menu (where your email address is displayed in the main navbar). Use the Create button to create a new worker. You can, for example, name it Elements Initialisation Worker, and set the type as init_worker.

    • Select your new worker in the workers list, and from the versions list on the right, use the Create button to create a new version. Set the Docker image from your settings as the Docker image reference, and {"docker": {"command": "worker-init-elements"}} as the configuration.

If you do not have an elements initialisation worker version correctly set on your instance, you will not be able to launch any workers process.

The current recommended docker.init_elements_image setting is:

registry.gitlab.teklia.com/arkindex/workers/init-elements:0.1.0
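
Since pinning a tag is recommended over latest, a deployment script could check the configured image reference before starting the backend. The sketch below is only an illustration: the image_tag and is_pinned helpers are hypothetical, not part of Arkindex, and digest references (image@sha256:...) are not handled specifically.

```python
def image_tag(reference):
    """Return the tag of a Docker image reference, or None when no tag is given."""
    # Only look for ":" after the last "/", so a registry port is not mistaken for a tag.
    name = reference.rsplit("/", 1)[-1]
    if ":" not in name:
        return None
    return name.rsplit(":", 1)[1]


def is_pinned(reference):
    """True when the reference carries an explicit tag other than "latest"."""
    tag = image_tag(reference)
    return tag is not None and tag != "latest"


recommended = "registry.gitlab.teklia.com/arkindex/workers/init-elements:0.1.0"
default = "registry.gitlab.teklia.com/arkindex/workers/init-elements:latest"

for reference in (recommended, default):
    status = "pinned" if is_pinned(reference) else "not pinned"
    print(f"{reference}: {status}")
```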

Dedicated project exports Redis queue

Project exports can get quite big. So that they can run on dedicated servers with sufficient disk space, a new Redis queue is now reserved for project exports. You need to update your deployment to also run RQ workers on the export queue.

For Docker Compose-based deployments, see our sample docker-compose.yml.

If you do not assign RQ workers to the new export queue, no project export job will ever run on your instance.

Ponos task expiry

A new ponos.task_expiry setting is available. It defaults to 30 days. After this delay, tasks can be deleted by the arkindex cleanup command.
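
To illustrate the expiry rule, here is a minimal sketch assuming the delay is counted from a single reference date on the task. Which timestamp the cleanup command actually compares, and whether the comparison is strict, are assumptions; the is_expired helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Default delay from the release notes.
TASK_EXPIRY = timedelta(days=30)


def is_expired(reference_date, now, expiry=TASK_EXPIRY):
    """True when more than `expiry` has elapsed since `reference_date` (assumed rule)."""
    return now - reference_date > expiry


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old_task = datetime(2024, 4, 1, tzinfo=timezone.utc)     # 61 days before `now`
recent_task = datetime(2024, 5, 20, tzinfo=timezone.utc)  # 12 days before `now`

print(is_expired(old_task, now))
print(is_expired(recent_task, now))
```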

ARKINDEX_API_TOKEN deprecation

The system checks that run when you start up the Arkindex backend now show a warning if the ponos.default_env.ARKINDEX_API_TOKEN setting is set. Hard-coding an API token like this bypasses proper Ponos task authentication and introduces a potential security risk. If you have this setting in your configuration, consider removing it.
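
If you want to check a configuration yourself before upgrading, a sketch along these lines may help. It assumes the configuration file has already been parsed into nested dictionaries; the has_default_api_token helper is hypothetical and is not the backend's own system check.

```python
def has_default_api_token(config):
    """True when ponos.default_env.ARKINDEX_API_TOKEN is present in a parsed configuration."""
    # Missing or empty sections are treated as "not set".
    ponos = config.get("ponos") or {}
    default_env = ponos.get("default_env") or {}
    return "ARKINDEX_API_TOKEN" in default_env


# Illustrative configurations, not real settings.
risky = {"ponos": {"default_env": {"ARKINDEX_API_TOKEN": "hard-coded-token"}}}
clean = {"ponos": {"default_env": {}}}

for name, config in (("risky", risky), ("clean", clean)):
    if has_default_api_token(config):
        print(f"{name}: consider removing ponos.default_env.ARKINDEX_API_TOKEN")
    else:
        print(f"{name}: no hard-coded API token")
```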

Project exports backward compatibility

In 1.6.1, we removed the EntityLink and EntityRole data models. This removes the entity_link and entity_role tables from the project exports, and bumps the export version from 8 to 9. However, version 8 exports can still be imported into an upgraded Arkindex instance using arkindex load_export: these tables are ignored.
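
The backward-compatibility rule can be pictured as follows. This is only a sketch of the behavior described above, not how arkindex load_export is implemented: the tables_to_load helper and the example table list are hypothetical, and versions other than 8 and 9 are rejected here only to keep the example small.

```python
CURRENT_EXPORT_VERSION = 9

# Tables dropped in version 9, still present in version 8 exports.
IGNORED_TABLES = {"entity_link", "entity_role"}


def tables_to_load(export_version, tables):
    """Return the tables of an export that would be loaded, skipping removed ones."""
    if export_version not in (8, CURRENT_EXPORT_VERSION):
        # The release notes only describe versions 8 and 9.
        raise ValueError(f"Export version {export_version} is not covered by this sketch")
    return [table for table in tables if table not in IGNORED_TABLES]


# Illustrative table list for a version 8 export.
v8_tables = ["element", "entity", "entity_link", "entity_role", "transcription"]
print(tables_to_load(8, v8_tables))
```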