Article and section separation
The `newspaper-eval article` command provides a set of metrics to evaluate the quality of article and section detection, based on surface coverage.
To learn more about the options of this command, run `newspaper-eval article --help`.
Purpose
This command evaluates the alignment between predicted and ground truth articles and sections by performing the following steps:
- Matching process for sections and articles:
  - Compute an Intersection over Union (IoU) matrix between all predicted and reference zones.
  - Use the Hungarian matching algorithm to pair predicted and ground truth articles with an IoU greater than 0.5.
- Metric computation:
  - Compute precision, recall, and F1 score based on the matched pairs.
  - Compute the mean IoU across all matched predictions.
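The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's implementation: zones are assumed to be axis-aligned boxes `(x_min, y_min, x_max, y_max)`, the names `box_iou`, `optimal_pairs` and `evaluate` are hypothetical, and an exhaustive search over one-to-one assignments stands in for the Hungarian algorithm (it returns an assignment with the same maximal total IoU, but is only practical for a handful of zones).

```python
from itertools import permutations


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def optimal_pairs(iou):
    """One-to-one (prediction, reference) pairs maximizing the total IoU.

    Brute-force stand-in for the Hungarian algorithm: every assignment of the
    smaller side to the larger side is enumerated and the best one is kept.
    """
    n_pred = len(iou)
    n_ref = len(iou[0]) if iou else 0
    if n_pred == 0 or n_ref == 0:
        return []
    if n_pred <= n_ref:
        candidates = (
            list(enumerate(perm)) for perm in permutations(range(n_ref), n_pred)
        )
    else:
        candidates = (
            [(i, j) for j, i in enumerate(perm)]
            for perm in permutations(range(n_pred), n_ref)
        )
    return max(candidates, key=lambda pairs: sum(iou[i][j] for i, j in pairs))


def evaluate(predicted, reference, threshold=0.5):
    """Precision, recall, F1 and mean IoU over pairs with IoU above threshold."""
    iou = [[box_iou(p, r) for r in reference] for p in predicted]
    matched = [iou[i][j] for i, j in optimal_pairs(iou) if iou[i][j] > threshold]
    precision = len(matched) / len(predicted) if predicted else 0.0
    recall = len(matched) / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    mean_iou = sum(matched) / len(matched) if matched else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mean_iou": mean_iou}


# Made-up zones: two predictions overlap a reference, the third matches nothing.
predicted = [(0, 0, 10, 10), (20, 0, 30, 10), (50, 50, 60, 60)]
reference = [(0, 0, 10, 8), (21, 0, 30, 10)]
print(evaluate(predicted, reference))
```

In this toy case two of the three predictions are matched, so precision is 2/3 and recall is 2/2, and the mean IoU is averaged over the two matched pairs only.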
Parameters
The list of parameters is detailed in this section.
| Parameter | Description | Type | Default |
|---|---|---|---|
| `--label-dir` | Path to the directory containing JSON label files. | | |
| `--prediction-dir` | Path to the directory containing JSON prediction files. | | |
| `--config` | Path to the configuration file with mapping classes. | | |
| | Minimum IoU threshold to use for matching. | | |
| `--from-journal` | Whether to load files using the Journal format. | | |
| `--per-sample` | Whether to evaluate metrics for each newspaper page. | | |
| `--save-csv-path` | Path to a CSV file used to save the evaluation results. | | |
| | Whether to allow partial match between the files in | | |
Examples
Basic evaluation
Run the following command to compute metrics:
newspaper-eval article --label-dir data/labels/ \
--prediction-dir data/predictions/ \
--config configs/finlam.yaml \
--from-journal
The command outputs:
INFO Loading labels...
INFO Loading prediction...
INFO The dataset is complete and valid.
INFO Evaluation:
| Level | Precision (%) | Recall (%) | F1 (%) | mIOU (%) | count predicted | count target |
| :-----: | :-----------: | :--------: | :----: | :------: | :-------------: | :----------: |
| article | 62.07 | 64.29 | 63.16 | 80.90 | 58 | 56 |
| section | 73.08 | 67.86 | 70.37 | 80.36 | 26 | 28 |
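The percentages in this table can be reproduced from the counts. The sketch below assumes 36 matched articles and 19 matched sections; these values are not printed by the command and are inferred here from the reported precision and predicted counts. The helper name `detection_scores` is hypothetical.

```python
def detection_scores(matched, predicted, target):
    """Return (precision, recall, F1) as percentages rounded to two decimals."""
    precision = matched / predicted if predicted else 0.0
    recall = matched / target if target else 0.0
    # F1 is the harmonic mean of precision and recall, i.e. 2 * TP / (pred + target).
    f1 = 2 * matched / (predicted + target) if predicted + target else 0.0
    return tuple(round(100 * value, 2) for value in (precision, recall, f1))


# Matched counts are inferred, not taken from the tool's output.
print(detection_scores(36, 58, 56))  # article row: (62.07, 64.29, 63.16)
print(detection_scores(19, 26, 28))  # section row: (73.08, 67.86, 70.37)
```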
Evaluation per sample
To compute metrics for each page, use the `--per-sample` option:
newspaper-eval article --label-dir data/labels/ \
--prediction-dir data/predictions/ \
--config configs/finlam.yaml \
--from-journal \
--per-sample
The command outputs:
INFO Loading labels...
INFO Loading prediction...
INFO The dataset is complete and valid.
INFO Evaluation:
| Level | Precision (%) | Recall (%) | F1 (%) | mIOU (%) | count predicted | count target |
| :-----: | :-----------: | :--------: | :----: | :------: | :-------------: | :----------: |
| article | 62.07 | 64.29 | 63.16 | 80.90 | 58 | 56 |
| section | 73.08 | 67.86 | 70.37 | 80.36 | 26 | 28 |
INFO Per sample evaluation:
| Sample | Level | Precision (%) | Recall (%) | F1 (%) | mIOU (%) | count predicted | count target |
| :------------: | :-----: | :-----------: | :--------: | :----: | :------: | :-------------: | :----------: |
| 4100130_1.json | article | 93.33 | 82.35 | 87.50 | 80.29 | 15 | 17 |
| 4100130_2.json | article | 63.64 | 53.85 | 58.33 | 81.67 | 11 | 13 |
| 4100130_3.json | article | 46.88 | 57.69 | 51.72 | 81.12 | 32 | 26 |
| 4100130_1.json | section | 93.33 | 82.35 | 87.50 | 80.29 | 15 | 17 |
| 4100130_2.json | section | 50.00 | 44.44 | 47.06 | 82.91 | 8 | 9 |
| 4100130_3.json | section | 33.33 | 50.00 | 40.00 | 71.13 | 3 | 2 |
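In this example, the global article row is consistent with pooling the counts of all pages before computing the ratios, rather than averaging the per-page percentages. The sketch below checks this for precision; the per-page matched counts (14, 7, 15) are inferred from the table and are an assumption, as is the variable naming.

```python
# (matched, predicted, target) per page for the article level.
# Matched counts are inferred from the reported precision, not printed by the tool.
article_pages = {
    "4100130_1.json": (14, 15, 17),
    "4100130_2.json": (7, 11, 13),
    "4100130_3.json": (15, 32, 26),
}

matched = sum(m for m, _, _ in article_pages.values())
predicted = sum(p for _, p, _ in article_pages.values())

# Pooled (micro) precision: one ratio over the summed counts.
pooled_precision = round(100 * matched / predicted, 2)

# Averaged (macro) precision: mean of the per-page ratios.
per_page = [100 * m / p for m, p, _ in article_pages.values()]
averaged_precision = round(sum(per_page) / len(per_page), 2)

print(pooled_precision)    # 62.07, same as the global article row
print(averaged_precision)  # differs from the global row
```

Pages with many predictions therefore weigh more in the global row than pages with few.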
Evaluation and saving results to CSV
To save metrics in a CSV file, use the `--save-csv-path` option:
newspaper-eval article --label-dir data/labels/ \
--prediction-dir data/predictions/ \
--config configs/finlam.yaml \
--from-journal \
--save-csv-path metrics.csv
The command outputs:
INFO Loading labels...
INFO Loading prediction...
INFO The dataset is complete and valid.
INFO Evaluation:
| Level | Precision (%) | Recall (%) | F1 (%) | mIOU (%) | count predicted | count target |
| :-----: | :-----------: | :--------: | :----: | :------: | :-------------: | :----------: |
| article | 62.07 | 64.29 | 63.16 | 80.90 | 58 | 56 |
| section | 73.08 | 67.86 | 70.37 | 80.36 | 26 | 28 |
INFO Saving metrics to CSV: metrics.csv.
This creates a new `metrics.csv` file:
Level,Precision (%),Recall (%),F1 (%),mIOU (%),count predicted,count target
article,62.07,64.29,63.16,80.90,58,56
section,73.08,67.86,70.37,80.36,26,28
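The saved file can then be loaded for further analysis. The following sketch uses only the Python standard library and embeds the CSV content shown above so that it runs on its own; the function name `load_metrics` is hypothetical, and with a real file the handle would come from `open("metrics.csv", newline="")`.

```python
import csv
import io

# Same content as the metrics.csv file shown above.
CSV_TEXT = """Level,Precision (%),Recall (%),F1 (%),mIOU (%),count predicted,count target
article,62.07,64.29,63.16,80.90,58,56
section,73.08,67.86,70.37,80.36,26,28
"""


def load_metrics(handle):
    """Map each level name to a dict of its metric values, parsed as floats."""
    metrics = {}
    for row in csv.DictReader(handle):
        level = row.pop("Level")
        metrics[level] = {name: float(value) for name, value in row.items()}
    return metrics


metrics = load_metrics(io.StringIO(CSV_TEXT))
print(metrics["article"]["F1 (%)"])
print(metrics["section"]["count target"])
```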