Datasets

Several operations are available through subcommands of teklia-dan dataset:

teklia-dan dataset entities

Extract entities from an Arkindex export. See the dedicated page for more details.

teklia-dan dataset tokens

Generate a YAML file containing entities and their token(s), used to train a DAN model. See the dedicated page for more details.

teklia-dan dataset extract

Extract a dataset from an Arkindex export. See the dedicated page for more details.

teklia-dan dataset download

Download the images of a dataset. See the dedicated page for more details.

teklia-dan dataset language-model

Build language model resources for a dataset. See the dedicated page for more details.

teklia-dan dataset analyze

Analyze datasets and display statistics. See the dedicated page for more details.