Evaluation
Description
Use the `teklia-dan evaluate` command to evaluate a trained DAN model.
To evaluate DAN on your dataset:
- Create a JSON configuration file. You can base it on the training configuration. Refer to the dedicated page for a description of parameters.
- Run `teklia-dan evaluate --config path/to/your/config.json`.
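The two steps above can be sketched in Python. This is a minimal illustration, not part of DAN: the helper names (`deep_update`, `write_evaluation_config`, `evaluate_command`) and the file names are hypothetical, and the only configuration key touched is `training.data.batch_size`, which this page mentions in its note on worst predictions.

```python
import json
import tempfile
from pathlib import Path


def deep_update(base: dict, overrides: dict) -> dict:
    """Return a new dict where `overrides` is merged into `base` recursively."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged


def write_evaluation_config(training_config: Path, evaluation_config: Path, overrides: dict) -> dict:
    """Copy the training configuration, apply overrides, and save the result as JSON."""
    config = json.loads(training_config.read_text(encoding="utf-8"))
    config = deep_update(config, overrides)
    evaluation_config.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


def evaluate_command(evaluation_config: Path) -> list[str]:
    """Build the command line documented on this page."""
    return ["teklia-dan", "evaluate", "--config", str(evaluation_config)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical, heavily trimmed training configuration.
        training = Path(tmp) / "training.json"
        training.write_text(
            json.dumps({"training": {"output_folder": "outputs", "data": {"batch_size": 4}}}),
            encoding="utf-8",
        )
        evaluation = Path(tmp) / "evaluation.json"
        write_evaluation_config(training, evaluation, {"training": {"data": {"batch_size": 1}}})
        print(" ".join(evaluate_command(evaluation)))
```

With `teklia-dan` installed, the list returned by `evaluate_command` can be passed to `subprocess.run(command, check=True)`. Which keys actually need to differ between training and evaluation depends on your setup, so treat the override shown here as an example only.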
This will, for each evaluated split:
- Create a YAML file with the evaluation results in the `results` subfolder of the `training.output_folder` indicated in your configuration.
- Print a metrics Markdown table in the console (see the HTR example below).
- Print a Nerval metrics Markdown table in the console, if the `dataset.tokens` parameter is defined in your configuration (see the HTR and NER example below).
- Print the 5 worst predictions in the console (see the examples below).
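To locate the YAML files after a run, something like the sketch below may help. It only relies on what this page states: the files live in the `results` subfolder of `training.output_folder`. The helper name `find_result_files` is hypothetical, and the file extensions (`.yaml` or `.yml`) are an assumption, since the exact file names are not given here.

```python
import json
from pathlib import Path


def find_result_files(config_path: Path) -> list[Path]:
    """List YAML files under <training.output_folder>/results."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # A relative output folder is resolved against the current working directory.
    results_dir = Path(config["training"]["output_folder"]) / "results"
    if not results_dir.is_dir():
        return []
    # Assumption: result files use a .yaml or .yml extension.
    return sorted(
        path
        for path in results_dir.iterdir()
        if path.is_file() and path.suffix in {".yaml", ".yml"}
    )
```

Calling `find_result_files(Path("path/to/your/config.json"))` returns an empty list if the evaluation has not been run yet.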
The display of the worst predictions does not support batch evaluation. If the `training.data.batch_size` parameter is not equal to 1, the WER displayed for each prediction is the WER of the whole batch, not of the individual image.
| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `--config` | Path to the configuration file. | | |
| | Distance threshold for the match between gold and predicted entity during Nerval evaluation. | | |
| | Where to save evaluation results in JSON format. | | |
| | Sets to evaluate. Defaults to | | |
Examples
HTR evaluation
    #### DAN evaluation

    | Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) |
    | :---: | :-----------: | :-------: | :-----------: | :-------: | :----------------: |
    | train |       x       |     x     |       x       |     x     |         x          |
    |  dev  |       x       |     x     |       x       |     x     |         x          |
    | test  |       x       |     x     |       x       |     x     |         x          |

    #### 5 worst prediction(s)

    |   Image name   |  WER  | Alignment between ground truth - prediction |
    | :------------: | :---: | :-----------------------------------------: |
    | <image_id>.png |   x   |                      x                      |
    |                |       |                      |                      |
    |                |       |                      x                      |
HTR and NER evaluation
    #### DAN evaluation

    | Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) |  NER  |
    | :---: | :-----------: | :-------: | :-----------: | :-------: | :----------------: | :---: |
    | train |       x       |     x     |       x       |     x     |         x          |   x   |
    |  dev  |       x       |     x     |       x       |     x     |         x          |   x   |
    | test  |       x       |     x     |       x       |     x     |         x          |   x   |

    #### Nerval evaluation

    ##### train

    |   tag   | predicted | matched | Precision | Recall |  F1   | Support |
    | :-----: | :-------: | :-----: | :-------: | :----: | :---: | :-----: |
    | Surname |     x     |    x    |     x     |   x    |   x   |    x    |
    |   All   |     x     |    x    |     x     |   x    |   x   |    x    |

    ##### dev

    |   tag   | predicted | matched | Precision | Recall |  F1   | Support |
    | :-----: | :-------: | :-----: | :-------: | :----: | :---: | :-----: |
    | Surname |     x     |    x    |     x     |   x    |   x   |    x    |
    |   All   |     x     |    x    |     x     |   x    |   x   |    x    |

    ##### test

    |   tag   | predicted | matched | Precision | Recall |  F1   | Support |
    | :-----: | :-------: | :-----: | :-------: | :----: | :---: | :-----: |
    | Surname |     x     |    x    |     x     |   x    |   x   |    x    |
    |   All   |     x     |    x    |     x     |   x    |   x   |    x    |

    #### 5 worst prediction(s)

    |   Image name   |  WER  | Alignment between ground truth - prediction |
    | :------------: | :---: | :-----------------------------------------: |
    | <image_id>.png |   x   |                      x                      |
    |                |       |                      |                      |
    |                |       |                      x                      |