Datasets
Description
Use the teklia-layout-reader inference command to predict the reading order on a dataset.
| Parameter | Description | Type | Default |
|---|---|---|---|
| --dataset | Path to the local LayoutReader dataset directory. The directory must contain … | | |
| --split | Dataset split to use. Must match the name of a corresponding archive in the … | | |
| --model | Name of the LayoutReader checkpoint dataset. | | |
| --output | Output directory where results will be saved. If … | | |
| --with-classes | Whether to use the zone classes for the prediction. | | |
| --with-separators | Whether to use the separators for the prediction. | | |
| --images | Path to the images (required to visualize predictions). | | |
| --visualize | Whether to visualize the predicted reading order. Plots will be saved in … | | |
Examples
Predict reading order on bounding boxes
This command will predict the reading order on the test set of the given dataset, using only bounding box coordinates.
teklia-layout-reader inference --dataset finlam_dataset/ \
--split test \
--model checkpoint_finlam_best/ \
--output output_finlam
The prediction file (predictions.json) will be saved in the --output directory.
It contains the following information for each sample:
- "boxes": the list of input boxes.
- "classes": the list of input classes.
- "separators": the list of input separators.
- "predicted_order": the predicted reading order.
- "target_order": the target reading order (only when predicting on an annotated dataset).
- "average_relative_distance": the average absolute difference between the predicted index and the annotated index for each box (only when predicting on an annotated dataset).
- "visualization": the path to the visualization image (only when --visualize is used).
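The "average_relative_distance" definition above can be illustrated with a minimal Python sketch. It assumes that "predicted_order" and "target_order" are lists with one entry per box, where entry i is the reading-order index assigned to box i. The function name and the sample values are hypothetical, and the actual implementation in teklia-layout-reader may differ.

```python
def average_relative_distance(predicted_order, target_order):
    """Mean absolute difference between predicted and annotated indices.

    Assumption: both lists have one entry per box, and entry i holds the
    reading-order index assigned to box i.
    """
    if len(predicted_order) != len(target_order):
        raise ValueError("predicted_order and target_order must have the same length")
    if not predicted_order:
        return 0.0
    total = sum(abs(p - t) for p, t in zip(predicted_order, target_order))
    return total / len(predicted_order)


# Hypothetical sample using the documented keys; the values are made up.
sample = {
    "boxes": [[0, 0, 10, 10], [20, 0, 30, 10], [0, 20, 10, 30]],
    "predicted_order": [0, 2, 1],
    "target_order": [0, 1, 2],
}
score = average_relative_distance(sample["predicted_order"], sample["target_order"])
print(round(score, 3))  # 0.667
```

A value of 0 means every box was placed at its annotated position; larger values mean boxes were placed further from their annotated positions on average.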
Predict reading order on bounding boxes using additional information (classes / separators)
By default, LayoutReader makes decisions based on box coordinates alone, but it can also use additional information:
- --with-classes: LayoutReader also uses the class of each zone (e.g. title, subtitle, body, advertisement).
- --with-separators: LayoutReader also uses the positions of vertical and horizontal separators.
teklia-layout-reader inference --dataset finlam_dataset/ \
--split test \
--model checkpoint_finlam_best/ \
--output output_finlam \
--with-classes \
--with-separators
Predict reading order on pre-ordered bounding boxes
The input boxes can be pre-sorted to reduce processing errors. Four methods are available via --sort-method:
- sortxy_by_columns
- sortxy
- sortyx
- random
By default, the boxes are not sorted and are taken in the order in which they appear in the dataset.
teklia-layout-reader inference --dataset finlam_dataset/ \
--split test \
--model checkpoint_finlam_best/ \
--output output_finlam \
--sort-method random
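To give an intuition for what pre-sorting does, here is a rough Python sketch of three of the methods. It rests on assumptions that the source does not confirm: boxes are taken to be [x1, y1, x2, y2] lists, sortxy is read as "sort by x, then y", and sortyx as "sort by y, then x". The helper presort_boxes is hypothetical and is not part of teklia-layout-reader; sortxy_by_columns is left out because its column-detection logic is not described here.

```python
import random


def presort_boxes(boxes, method, seed=None):
    """Return a reordered copy of boxes (assumed format: [x1, y1, x2, y2])."""
    if method == "sortxy":
        # Assumed meaning: order by left edge first, then by top edge.
        return sorted(boxes, key=lambda b: (b[0], b[1]))
    if method == "sortyx":
        # Assumed meaning: order by top edge first, then by left edge.
        return sorted(boxes, key=lambda b: (b[1], b[0]))
    if method == "random":
        # Shuffle a copy so the caller's list is left untouched.
        shuffled = list(boxes)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unsupported sort method: {method}")


# Made-up boxes: one in a right column, two stacked in a left column.
boxes = [[50, 10, 90, 40], [0, 60, 40, 90], [0, 10, 40, 40]]
by_xy = presort_boxes(boxes, "sortxy")
by_yx = presort_boxes(boxes, "sortyx")
print(by_xy)  # [[0, 10, 40, 40], [0, 60, 40, 90], [50, 10, 90, 40]]
print(by_yx)  # [[0, 10, 40, 40], [50, 10, 90, 40], [0, 60, 40, 90]]
```

Under these assumptions, sortxy reads the left column top to bottom before moving right, while sortyx reads each row left to right before moving down.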
Predict and visualize
Use the --visualize option to plot the predicted reading order on the images.
teklia-layout-reader inference --dataset finlam_dataset/ \
--split test \
--model checkpoint_finlam_best/ \
--output output_finlam \
--with-classes \
--with-separators \
--images finlam_dataset/images/test/ \
--visualize
This will create visualizations in your output directory: