Kaldi

The atr cer kaldi command computes the Character Error Rate (CER) and the Word Error Rate (WER) using the same calculation method as Kaldi. For example, spaces are ignored in the CER calculation.
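To make the "spaces are ignored" rule concrete, here is a minimal sketch of how such scores can be computed. It assumes that both rates are Levenshtein distances divided by the reference length and accumulated over all lines; the function names `edit_distance` and `error_rates` are hypothetical and are not taken from the tool.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rates(truths, preds):
    """Return (CER %, WER %) accumulated over all line pairs."""
    char_errors = char_total = word_errors = word_total = 0
    for truth, pred in zip(truths, preds):
        ref_words, hyp_words = truth.split(), pred.split()
        # Assumption: spaces are dropped before the character comparison.
        ref_chars, hyp_chars = "".join(ref_words), "".join(hyp_words)
        char_errors += edit_distance(ref_chars, hyp_chars)
        char_total += len(ref_chars)
        word_errors += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
    return 100 * char_errors / char_total, 100 * word_errors / word_total


cer, wer = error_rates(["the cat sat"], ["the cat sit"])
print(f"CER: {cer:.2f}%  WER: {wer:.2f}%")
```

In this sketch a missing space costs nothing at the character level but still changes the word sequence, so `error_rates(["a b"], ["ab"])` gives a CER of 0 and a WER of 100.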

Parameters

This command will compute and display scores (CER and WER) by split.

| Parameter | Description | Type | Default |
|-----------|-------------|------|---------|
| --train_truth_file | Path to the train truth file. | Path | |
| --val_truth_file | Path to the validation truth file. | Path | |
| --test_truth_file | Path to the test truth file. | Path | |
| --pred_file | Path to the prediction file. | Path | |
| --preprocess_list | A list of preprocessing functions to be applied on truth and predicted texts. Accepted preprocessing functions are: "ignore_case", "ignore_punct", "ignore_numbers", "escape_punct". | List[str] | [] |
| --confidence_scores | Whether the prediction file includes confidence scores. | bool | False |

The user must provide a truth file for each split to evaluate and a prediction file, all in PyLaia format. Examples of expected input files are available in tests/examples/.
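As an illustration of the input format, the sketch below reads lines into a dictionary keyed by identifier. It assumes each line looks like `<id> <text>`, or `<id> <confidence> <text>` when confidence scores are present; this layout, the sample identifiers, and the helper names `parse_line` and `load_lines` are assumptions for illustration, so check the files in tests/examples/ for the actual layout.

```python
def parse_line(line, confidence_scores=False):
    """Split one line into (identifier, text).

    Assumed layout: "<id> <text>", or "<id> <confidence> <text>"
    when confidence_scores is True.
    """
    parts = line.rstrip("\n").split(" ", 2 if confidence_scores else 1)
    identifier = parts[0]
    if confidence_scores:
        # Drop the confidence value so it is not scored as text.
        text = parts[2] if len(parts) > 2 else ""
    else:
        text = parts[1] if len(parts) > 1 else ""
    return identifier, text


def load_lines(lines, confidence_scores=False):
    """Build an {identifier: text} dict from an iterable of lines."""
    return dict(parse_line(line, confidence_scores) for line in lines if line.strip())


truth = load_lines(["page/line_01 hello world\n"])
pred = load_lines(["page/line_01 0.93 hello wordl\n"], confidence_scores=True)
print(truth)
print(pred)
```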

Four preprocessing functions are available:

  • "ignore_case": Convert the text to lowercase before computing error rates,

  • "ignore_punct": Ignore the punctuation before computing error rates,

  • "ignore_numbers": Ignore all numbers before computing error rates,

  • "escape_punct": Consider punctuation characters as separate words.
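One plausible reading of these four functions is sketched below. The exact rules used by the tool (which characters count as punctuation, whether whole numeric tokens or single digits are dropped) are not stated here, so the regular expressions and the `preprocess` helper are assumptions.

```python
import re
import string

# Assumption: "punctuation" means ASCII punctuation only.
PUNCT = re.escape(string.punctuation)


def ignore_case(text):
    return text.lower()


def ignore_punct(text):
    return re.sub(f"[{PUNCT}]", "", text)


def ignore_numbers(text):
    # Assumption: every digit is removed, wherever it appears.
    return re.sub(r"\d", "", text)


def escape_punct(text):
    # Surround each punctuation mark with spaces so it becomes its own word.
    return re.sub(f"([{PUNCT}])", r" \1 ", text)


PREPROCESSORS = {
    "ignore_case": ignore_case,
    "ignore_punct": ignore_punct,
    "ignore_numbers": ignore_numbers,
    "escape_punct": escape_punct,
}


def preprocess(text, preprocess_list):
    """Apply the named functions in order, then normalize whitespace."""
    for name in preprocess_list:
        text = PREPROCESSORS[name](text)
    return " ".join(text.split())


print(preprocess("Hello, World 42!", ["ignore_case", "ignore_punct", "ignore_numbers"]))
print(preprocess("Hello, World!", ["escape_punct"]))
```

The same functions would be applied to both the truth and the predicted text, so that a difference in case or punctuation alone is not counted as an error.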

Examples

Compute scores on a single split

To compute scores on a single split, run the following command:

atr cer kaldi --test_truth_file tests/examples/truth_test.txt --pred_file tests/examples/pred_test.txt

Expected output:

WARNING:utils.py:Input files sizes are not equal: TRUTH (29) but PRED = 27
WARNING:utils.py:2 lines are in the truth dict but missing in the prediction dict:
 - naf/0023182e-5b60-42d6-af90-1b665ccacf0d_038_adc93773-37ce-4e71-b7be-bcbd74508798
 - naf/00586988-8313-43fb-a419-3d7ddd895c30_018_6285195c-39ca-41fb-a7e9-774585cbf079
| Split   |   CER (%) |   WER (%) |   Support |
|---------|-----------|-----------|-----------|
| test    |      4.47 |     20.21 |        27 |

Note that warnings will be issued if the prediction and truth file sizes do not match. In this case, two transcriptions are missing from the prediction file.
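The mismatch check described above can be sketched as follows. The Support value of 27 in the output suggests that only lines present in both files are scored, but that is an inference; the `align` helper and its messages are hypothetical and do not reproduce the tool's logging.

```python
def align(truth, pred):
    """Pair up texts by identifier and report truth lines without a prediction."""
    missing = sorted(set(truth) - set(pred))
    if len(truth) != len(pred):
        print(f"size mismatch: truth has {len(truth)} lines, prediction has {len(pred)}")
    for identifier in missing:
        print(f"missing from prediction: {identifier}")
    # Assumption: only identifiers found in both files are scored.
    pairs = [(truth[i], pred[i]) for i in truth if i in pred]
    return pairs, missing


truth = {"a": "first line", "b": "second line", "c": "third line"}
pred = {"a": "first line", "c": "third lime"}
pairs, missing = align(truth, pred)
print(f"support: {len(pairs)}")
```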

Compute scores on a single split with confidence scores

To compute scores on a single split when the prediction file includes confidence scores, run the following command:

atr cer kaldi --test_truth_file tests/examples/truth_test.txt \
              --pred_file tests/examples/pred_test_conf.txt \
              --confidence_scores

The --confidence_scores argument ensures that confidence scores are not treated as predicted text.

Expected output:

WARNING:utils.py:Input files sizes are not equal: TRUTH (29) but PRED = 27
WARNING:utils.py:2 lines are in the truth dict but missing in the prediction dict:
 - naf/0023182e-5b60-42d6-af90-1b665ccacf0d_038_adc93773-37ce-4e71-b7be-bcbd74508798
 - naf/00586988-8313-43fb-a419-3d7ddd895c30_018_6285195c-39ca-41fb-a7e9-774585cbf079
| Split   |   CER (%) |   WER (%) |   Support |
|---------|-----------|-----------|-----------|
| test    |      4.47 |     20.21 |        27 |

Compute scores on all splits

To compute scores on multiple splits, run the following command:

atr cer kaldi --test_truth_file tests/examples/multiple_datasets/truth_test.txt \
              --val_truth_file tests/examples/multiple_datasets/truth_val.txt \
              --train_truth_file tests/examples/multiple_datasets/truth_train.txt \
              --pred_file tests/examples/multiple_datasets/pred_all.txt

Expected output:

| Split   |   CER (%) |   WER (%) |   Support |
|---------|-----------|-----------|-----------|
| train   |      4.02 |     20.33 |        70 |
| val     |     10.64 |     30.83 |        70 |
| test    |     11.03 |     37.11 |        70 |

Compute scores on all splits with preprocessing

To compute scores on multiple splits with preprocessing (ignoring numbers and punctuation), run the following command:

atr cer kaldi --test_truth_file tests/examples/multiple_datasets/truth_test.txt \
              --val_truth_file tests/examples/multiple_datasets/truth_val.txt \
              --train_truth_file tests/examples/multiple_datasets/truth_train.txt \
              --pred_file tests/examples/multiple_datasets/pred_all.txt \
              --preprocess_list "ignore_numbers" "ignore_punct"

Expected output:

| Split   |   CER (%) |   WER (%) |   Support |
|---------|-----------|-----------|-----------|
| train   |      3.83 |     13.99 |        70 |
| val     |     10.09 |     23.5  |        70 |
| test    |     10.7  |     27.93 |        70 |