Tasks

When a process is started, its worker runs are converted into tasks.

A task is the execution of a single worker run. There may be multiple tasks per worker run when the process has been configured with chunks, when it has been retried, or when a single task has been restarted.

Task runs

The status page of each process provides an overview of the tasks of the process, grouped by run. Starting a process creates the first run, and each retry of the process creates a subsequent run.

A task run is only a number that helps with this grouping, and has no other effect on the execution of a process.

When there is only one run in a process, the runs are not displayed.

Task runs are not to be confused with worker runs.
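As a rough illustration of the grouping described above, the following sketch groups tasks by run number and only reports runs as worth displaying when there is more than one. The `Task` record, its field names, and the run numbering starting at 0 are assumptions made for this example; they are not taken from the Arkindex API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical minimal task record; the field names are assumptions.
    slug: str
    run: int


def group_by_run(tasks):
    """Group task slugs by run number, in ascending run order."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task.run].append(task.slug)
    return dict(sorted(groups.items()))


def should_display_runs(groups):
    """Runs are only displayed when a process has more than one."""
    return len(groups) > 1


tasks = [
    Task("import", run=0),      # created when the process was started
    Task("recognizer", run=0),
    Task("import", run=1),      # created when the process was retried
    Task("recognizer", run=1),
]
groups = group_by_run(tasks)
```

The run number is only used as a grouping key here, which matches its role in the status page: it does not influence how the tasks themselves execute.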

Task states

As a task gets executed, it will go through multiple states. These provide a general indication of its health.

stateDiagram
    %% Normal flow
    Unscheduled --> Pending
    Pending --> Running
    Running --> Completed

    %% Failure modes
    state "Task error" as Failed
    state "System error" as Error
    Running --> Failed
    Running --> Error
    Running --> Cancelled

    %% Manual stop
    Pending --> Stopping
    Running --> Stopping
    Stopping --> Stopped
    Unscheduled --> Stopped

    classDef dark fill:#909090,color:black
    classDef warning fill:#ffdd57,color:black
    classDef primary fill:#158cba,color:black
    classDef success fill:#28b62c,color:black
    classDef danger fill:#ff4136,color:black

    class Unscheduled dark
    class Pending,Stopping warning
    class Running primary
    class Completed success
    class Failed,Error,Cancelled,Stopped danger

Unscheduled

This task is not ready to be executed. This is the default state of a task once it gets created. If a task appears to be stuck in this state, there may be multiple causes:

This task depends on other tasks that have not yet entered a Completed state.
In Community Edition, an error occurred after creating the process and the tasks could not be scheduled.
In Enterprise Edition, for tasks that do not depend on any other task, there are no agents available to process this task yet.

Pending

All the conditions are met for this task to be started, but the task is waiting to be executed by an asynchronous worker.

If a task appears to be stuck in this state, there may be no asynchronous workers available to execute it. In particular, tasks for worker runs that require a GPU are restricted to hosts that have a GPU available, which could reduce the number of hosts that this task could run on.

Running

The task is currently being executed. Detailed updates may be visible through the logs of the task.

Completed

The task has finished executing and was successful. It cannot be restarted, but retrying the process will still re-execute it.

Task error

The task ran, but failed. The task’s logs likely contain further details on the error.

In the Arkindex API, this is also known as failed.

System error

An error occurred in the asynchronous worker or agent at any step of a task’s execution: preparing it, starting it, reporting on its status, uploading its artifacts, etc. The task’s logs likely contain further details on the error. You may need to contact a system administrator for further assistance.

In the Arkindex API, this is also known as error.

Cancelled

This task has reached its time limit and its execution has been automatically stopped.

Stopping

A user has requested that this task be stopped. The asynchronous worker or agent that executes this task either has not yet received this instruction, or is busy stopping the task.

Stopped

The task has been stopped after a manual request.
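The states and transitions described in this section can be sketched as a small lookup table. This is an illustrative model only, not Arkindex code: the values `failed` and `error` are the API names given above, while the other string values are assumed to follow the same lowercase pattern. Only the transitions drawn in the state diagram are included, and restarts and retries are not modelled.

```python
from enum import Enum


class State(Enum):
    UNSCHEDULED = "unscheduled"  # assumed API name
    PENDING = "pending"          # assumed API name
    RUNNING = "running"          # assumed API name
    COMPLETED = "completed"      # assumed API name
    FAILED = "failed"            # "Task error" in the interface
    ERROR = "error"              # "System error" in the interface
    CANCELLED = "cancelled"      # assumed API name
    STOPPING = "stopping"        # assumed API name
    STOPPED = "stopped"          # assumed API name


# Transitions taken from the state diagram; states without an entry
# have no outgoing transition.
TRANSITIONS = {
    State.UNSCHEDULED: {State.PENDING, State.STOPPED},
    State.PENDING: {State.RUNNING, State.STOPPING},
    State.RUNNING: {
        State.COMPLETED,
        State.FAILED,
        State.ERROR,
        State.CANCELLED,
        State.STOPPING,
    },
    State.STOPPING: {State.STOPPED},
}


def can_transition(current, target):
    """Return True when the diagram allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())


def is_final(state):
    """Return True when the diagram shows no way out of this state."""
    return not TRANSITIONS.get(state)
```

For example, `can_transition(State.PENDING, State.RUNNING)` holds, while `can_transition(State.COMPLETED, State.RUNNING)` does not, which is consistent with a completed task not being restartable.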

Environment variables

Some environment variables are automatically defined by Arkindex and provided to tasks depending on the context in which they execute.

PONOS_DATA: Directory where a task can find artifacts from the tasks that it depends on. One subdirectory is created for each parent task, named after that task’s UUID. A symbolic link also makes each of these directories accessible by the slug of the task.

Another subdirectory exists with the current task’s UUID, along with a symbolic link named current. Any file created within this directory will be saved as an artifact of this task after it finishes.
PONOS_TASK: The UUID of the current task.
ARKINDEX_PROCESS_ID: UUID of the process that this task is executing for.
ARKINDEX_WORKER_RUN_ID: UUID of the worker run of this task. In some cases, tasks may not have a defined worker run, so this variable may be missing.
ARKINDEX_API_URL: URL to the root of the Arkindex API. This variable can be detected and used automatically by the Python API client.
ARKINDEX_API_CSRF_COOKIE: Name of the cookie used for cross-site request forgery protection in the Arkindex API. This variable can be detected and used automatically by the Python API client.

ARKINDEX_TASK_TOKEN: A token to be used for authentication with the API. This variable can be detected and used automatically by the Python API client.
ARKINDEX_TASK_CHUNK: When the process has been started with chunks, set to the chunk number of the current task. For example, in a process with 2 chunks, two duplicate tasks are created, with ARKINDEX_TASK_CHUNK set to 1 in the first task and 2 in the second.
TASK_ELEMENTS: Path to a JSON file that contains a list of elements to process. This is only available in inference processes, when it has not been disabled in the advanced settings, or for the thumbnail generation system worker after a file import.
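As a minimal sketch of how a task might read the variables listed above, the helper below collects them into a dictionary. The function names and the returned keys are hypothetical. It assumes that PONOS_DATA and PONOS_TASK are always set and treats every other variable as optional, since the variables provided depend on the context. The structure of each entry in the TASK_ELEMENTS file is not described here, so the entries are returned as-is.

```python
import json
import os
from pathlib import Path


def load_task_context(env=os.environ):
    """Collect the task-related variables into a plain dictionary."""
    data_dir = Path(env["PONOS_DATA"])
    chunk = env.get("ARKINDEX_TASK_CHUNK")
    elements_path = env.get("TASK_ELEMENTS")
    return {
        "task_id": env["PONOS_TASK"],
        "process_id": env.get("ARKINDEX_PROCESS_ID"),
        # May be missing: some tasks have no defined worker run.
        "worker_run_id": env.get("ARKINDEX_WORKER_RUN_ID"),
        # Only set when the process was started with chunks.
        "chunk": int(chunk) if chunk is not None else None,
        "data_dir": data_dir,
        # Files written here are saved as artifacts of the current task.
        "artifact_dir": data_dir / "current",
        "elements_path": Path(elements_path) if elements_path else None,
    }


def load_elements(context):
    """Return the list stored in the TASK_ELEMENTS file, or an empty list."""
    path = context["elements_path"]
    if path is None:
        return []
    with path.open() as f:
        return json.load(f)


def parent_artifacts(context, parent):
    """Directory of a parent task's artifacts, by UUID or by slug."""
    return context["data_dir"] / parent
```

Passing the environment as a parameter keeps the helper easy to exercise outside of a real task, for instance with a plain dictionary in a unit test.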

S3 ingestion processes

The following environment variables are only set in the context of an S3 ingestion process.

INGEST_S3_ACCESS_KEY: Access key ID for authentication to the S3 bucket to import from. Set from the ingest.access_key_id setting.
INGEST_S3_SECRET_KEY: Secret access key for authentication to the S3 bucket to import from. Set from the ingest.secret_access_key setting.
INGEST_S3_ENDPOINT: A custom endpoint to use to connect to the S3 bucket, instead of Amazon Web Services. Set from the ingest.endpoint setting. When this setting is missing, the variable will not be set.
INGEST_S3_REGION: Region to use to connect to the S3 bucket through Amazon Web Services. Set from the ingest.region setting. When this setting is missing, the variable will not be set.
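The sketch below shows one way a task could turn these variables into connection parameters, leaving out the endpoint and region when they are not set. The function name is hypothetical, and the keyword names are chosen to match the arguments accepted by `boto3.client`; the use of boto3 is an assumption, as no S3 library is named here.

```python
import os


def s3_client_kwargs(env=os.environ):
    """Build S3 connection parameters from the ingestion variables."""
    kwargs = {
        "aws_access_key_id": env["INGEST_S3_ACCESS_KEY"],
        "aws_secret_access_key": env["INGEST_S3_SECRET_KEY"],
    }
    # Both variables below are absent when the matching setting is missing.
    endpoint = env.get("INGEST_S3_ENDPOINT")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    region = env.get("INGEST_S3_REGION")
    if region:
        kwargs["region_name"] = region
    return kwargs


# With boto3 installed (an assumption), the result could be used as:
#   import boto3
#   s3 = boto3.client("s3", **s3_client_kwargs())
```

Omitting the optional keys, rather than passing empty values, lets the client library fall back to its own defaults for the endpoint and region.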

Time limit

A maximum execution time limit can be configured to control the use of infrastructure resources. There are three levels of granularity on the time limit:

The ponos.maximum_task_ttl setting allows system administrators to define a default system-wide time limit.
A time limit can be configured per project using the administration interface. When it is set, it will override the system-wide limit for all tasks created in all processes within this project.
The time limit can be updated for individual tasks, after they have been created, using the administration interface. This limit will be reused for any restarts of this task.

When the time limit of a task is set to zero, the execution time becomes unlimited.
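The three levels above can be read as a simple precedence rule, sketched below. This is an illustration rather than the actual implementation: the function and parameter names are hypothetical, `None` stands for "not configured at this level", and expressing limits in seconds is an assumption.

```python
def effective_time_limit(system_limit, project_limit=None, task_limit=None):
    """Pick the most specific configured limit: task, then project, then system."""
    if task_limit is not None:
        return task_limit
    if project_limit is not None:
        return project_limit
    return system_limit


def has_exceeded(limit, elapsed):
    """A limit of zero means the execution time is unlimited."""
    return limit != 0 and elapsed > limit
```

For instance, with a system-wide limit of 3600 and a project limit of 7200, a task in that project gets 7200 unless its own limit is updated. Setting the task limit to 0 removes the limit for that task only.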

It is recommended to set the system-wide time limit to the strictest limit given to the least trusted users of an instance, then to intervene per project or task to increase those limits as needed. This is particularly important to prevent abuse on instances that allow registrations.

Agents

This feature is only available in Enterprise Edition. In Community Edition, tasks have no agents.

In Enterprise Edition, tasks are executed by Ponos agents. The agent to which a task is assigned is shown next to the task’s name.

In the process list, the View agents button in the top-right corner opens a separate page. This page provides an overview of the hardware resources available on each agent, as well as the tasks assigned to it.

Farms

This feature is only available in Enterprise Edition. In Community Edition, there are no agents and no farms.

Every agent is part of exactly one farm. A farm provides an authentication token called a seed that allows each agent to self-register on an Arkindex instance, so that agents can be deployed on the fly.

When configuring a process, a farm can be set on the process through the advanced settings. The tasks in this process will run on any agent within this farm.

When no farm has been explicitly configured, a default farm is used. The ponos.default_farm option in the backend configuration sets which farm is used by default.

Farms have access rights. Users will need guest access to a farm in order to use it in a process.

On an instance that allows registrations, system administrators can grant access to the default farm to a default group to allow users to execute processes immediately after registration.