Introduction

Hugin is meant to be used in two scenarios:

  • as a standalone tool driven by an experiment configuration file
  • as a library in your code

Both scenarios share the same concepts; the main difference is that the standalone tool connects all the Hugin components together.

Standalone

Using Hugin involves two steps:

  • training
  • prediction

Both steps are driven using dedicated configuration files. The configuration files are normal YAML files referencing various Hugin components.

These configuration files allow the end user to customize pre-processing, model and post-processing operations.

Training

The training process involves the preparation of a training scenario configuration file. This configuration file is composed of several sections, particularly:

  • Global configuration (the configuration key)
  • Data source specification (the data_source key)
  • Trainer specification (the trainer key)

Global Configuration

Currently, in this section (the configuration key in the YAML file) you can specify:

  • model_path: a string specifying the “workspace” used for saving the model, and depending on the backend it will hold checkpoints, metrics, etc. This string allows interpolation of trainer attributes.

An example configuration specification could be:

configuration:
 model_path: "/home/user/experiments/{name}"

Data Source Specification

The data source is intended for locating the data we wish to use in our experiments. Hugin ships with multiple data source implementations, particularly:

  • FileSystemLoader: capable of recursively scanning a directory for input files and grouping them together according to a specified pattern.
  • FileLoader: capable of reading file names from an input file. The main purpose of this loader is to support GDAL Virtual File Systems (see the hedged FileLoader sketch below), for example:
    • /vsicurl/: for retrieving files using cURL (HTTP, FTP, etc.)
    • /vsis3/: for retrieving files from AWS S3
    • /vsigs/: for retrieving files from Google Cloud Storage

The data source that should be used is introduced using the data_source key in the YAML file and is an explicit reference to the data source implementation.

The aforementioned data sources can have the following configuration options:

  • data_pattern (mandatory): used for specifying a regular expression matching the files that should be taken into consideration
  • id_format (mandatory): used for constructing a scene id used by Hugin for identifying a particular scene. This option is similar in spirit to the SQL GROUP BY clause
  • type_format (mandatory): used for identifying the various potential types of data in a scene
  • validation_percent (optional): used for specifying the percentage of scenes that should be kept for validation purposes (e.g. 0.2 holds out 20% of the scenes)
  • randomise (optional, default: `False`): asks the data source to provide the scenes to the other components in a randomized order
  • persist_file (optional): specifies a path where the data source should save the detected files. If the file already exists, it is used as the source for further operations. The main benefit of this option is the ability to reuse the same training/validation split between multiple runs.
  • input_source (mandatory): specifies a location for loading the data. For the FileSystemLoader it represents a directory that should be scanned. For FileLoader it represents an input text file listing all files that should be taken into consideration (one file path per line)

An example configuration for loading the data from the SpaceNet5 competition:

data_source: !!python/object/apply:hugin.io.FileSystemLoader
 kwds:
   data_pattern: '(?P<category>[0-9A-Za-z_]+)_AOI_(?P<location>\d+(_[A-Za-z0-9]+)+)_(?P<type>(PS-MS|PS-RGB|MS|PAN))_(?P<idx>[A-Za-z0-9]+)(?P<gti>_GTI)?\.(?P<extension>(tif|tiff|png|jpg|jp2))$'
   id_format: '{location}-{idx}'
   type_format: '{type}{gti}'
   validation_percent: 0.2
   randomise: True
   persist_file: "/storage/spacenet5/split1.yaml"
   input_source: "/storage/spacenet5"
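
For remotely hosted data, a FileLoader can be pointed at a text file listing GDAL Virtual File System paths. A hedged sketch, assuming FileLoader accepts the same grouping options as FileSystemLoader (the pattern and paths are illustrative):

data_source: !!python/object/apply:hugin.io.FileLoader
 kwds:
   # illustrative pattern; adapt it to the names listed in the input file
   data_pattern: '(?P<name>[0-9A-Za-z_]+)_(?P<type>(PS-MS|PS-RGB|MS|PAN))\.(?P<extension>tif)$'
   id_format: '{name}'
   type_format: '{type}'
   # one path per line, e.g. /vsis3/my-bucket/scenes/scene1_PAN.tif
   input_source: "/storage/spacenet5/files.txt"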

Model Configuration

This section configures the effective training operation.

The primary key specifying the training operation is the trainer key in the YAML file. Currently, Hugin only supports raster operations (handling images of various kinds) through the RasterSceneTrainer.

The RasterSceneTrainer implementation offers multiple features, such as:

  • Tiling (subsampling): splitting input scenes into multiple smaller tiles. This is particularly useful for large inputs that cannot fit in GPU memory. Hugin supports overlapping tiles using a specified stride (e.g. a 1300×1300 scene tiled with a 256×256 window and a stride of 100 yields 11 window positions per axis, i.e. 121 overlapping tiles).
  • Co-registration: synchronizing input tiles from the various components forming a scene (e.g. a scene might be composed of data at multiple resolutions: for WorldView-3 we might have a panchromatic channel with 0.31 m spatial resolution and multi-spectral data with 1.24 m per pixel)
  • Pre-processing: applying a series of preprocessing operations on the data before it is ingested by the models. Supported operations include standardization, augmentation, etc.

The RasterSceneTrainer assembles the data according to a user-specified mapping and feeds it to a model implementation specified by the user. Both the mapping and the model implementation are discussed in the following sections.

The options supported by the RasterSceneTrainer are:

  • name (mandatory): specifies a name for the trainer. This name is used in multiple locations, particularly for identifying the model in the experiment workspace (discussed in Global Configuration)
  • window_size (optional): specifies the size of the sliding window used for subsampling. If omitted, Hugin assumes it equals the size of a randomly picked scene
  • stride_size (optional): specifies the stride size to be used in case subsampling is needed. If omitted, it is inferred from the window size
  • mapping (mandatory): specifies how the input to the model should be assembled. This configuration might be shared between training and prediction. It is further discussed in the Mapping section
  • model (mandatory): specifies the model to be used for training (a minimal trainer skeleton follows below)
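
For orientation, here is a minimal trainer skeleton condensed from the complete example later in this section; the model and mapping entries are elided, as both are discussed next:

trainer: !!python/object/apply:hugin.engine.scene.RasterSceneTrainer
  kwds:
    name: example_trainer
    window_size: [256, 256]
    stride_size: 100
    model: # a model backend reference, e.g. a KerasModel (see Model)
    mapping: # the input/target assembly (see Mapping)
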
Mapping

The mapping concept is further discussed in the Mapping section. One specific requirement related to training is the presence of the target mapping. It is needed for specifying the expected output (ground truth) from the various machine learning models.

Model

This configuration option specifies the model to be trained. It is a reference to one of the backend implementations offered by Hugin:

  • KerasModel: The backend supporting running Keras based models
  • SkLearnStandardizer: A custom backend based on SciKit-Learn for training a SciKit-Learn data standardizer
  • SciKitLearnModel: A backend supporting models compliant with the SciKit-Learn interface (ToDo)

Keras Model

The KerasModel implementation allows running models defined using Keras. It exposes the following options:

  • name (mandatory): Option specifying the name of the model
  • model_path (optional): The location of the trained model. If it exists, it is loaded and training resumes from the loaded state. This is particularly useful for transfer learning
  • model_builder (mandatory): The function to be called for building the model
  • loss (mandatory): The loss function to be used by Keras during training. Any Keras loss or user-defined function can be referenced
  • optimizer (optional): Optimizer function to be used during training. Any Keras optimizer can be referenced
  • batch_size (mandatory): The batch size to be used for feeding the data to the model
  • epochs (mandatory): The maximum number of epochs to run
  • metrics (optional): A list of metrics to be computed during training
  • checkpoint (optional): If defined, it enables model checkpointing according to the specified configuration (a short sketch follows after this list). It allows setting the following options:
    • save_best_only (default: False): Saves only the best model
    • save_weights_only (default: False): Saves only the model weights
    • mode (valid options: auto, min, max): Save models based on either the maximization or the minimization of the monitored quantity. This only applies when save_best_only is enabled
    • monitor: quantity to be monitored (e.g. val_loss or any user-defined metric)
  • enable_multi_gpu (optional, default=False): enable multiple GPU usage
  • num_gpus (optional): number of GPUs to be used by Keras
  • callbacks (optional): list of Keras callbacks to be enabled, composed of Keras callbacks or compatible user-defined callbacks.
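
As referenced above, a checkpoint block that keeps only the best weights according to validation loss could be sketched, using only the options listed above, as:

     checkpoint:
       monitor: val_loss
       mode: min
       save_best_only: True
       save_weights_only: True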

An example configuration:

 model: !!python/object/apply:hugin.engine.keras.KerasModel
   kwds:
     name: keras_model1
     model_builder: sn5.models.wnet.wnetv9:build_wnetv9
     batch_size: 200
     epochs: 9999
     metrics:
       - accuracy
       - !!python/name:hugin.tools.utils.dice_coef
       - !!python/name:hugin.tools.utils.jaccard_coef
     loss: categorical_crossentropy
     checkpoint:
       monitor: val_loss
     enable_multi_gpu: True
     num_gpus: 4
     optimizer: !!python/object/apply:keras.optimizers.Adam
       kwds:
         lr: !!float 0.0001
         beta_1: !!float 0.9
         beta_2: !!float 0.999
         epsilon: !!float 1e-8
     callbacks:
       - !!python/object/apply:keras.callbacks.EarlyStopping
         kwds:
           monitor: 'val_dice_coef'
           min_delta: 0
           patience: 40
           verbose: 1
           mode: 'auto'
           baseline: null
           restore_best_weights: False

Limitations

  • Hugin assumes all scenes have an equal size per data type (e.g. all multispectral data has the same size).
  • Hugin only supports square sliding windows. This is expected to be fixed in an upcoming version.
  • Hugin only supports the same stride size both horizontally and vertically.

Example Experiment

A complete example configuration is depicted below:

configuration:
 model_path: "/home/user/experiments/{name}"
data_source: !!python/object/apply:hugin.io.FileSystemLoader
 kwds:
   data_pattern: '(?P<category>[0-9A-Za-z_]+)_AOI_(?P<location>\d+(_[A-Za-z0-9]+)+)_(?P<type>(PS-MS|PS-RGB|MS|PAN))_(?P<idx>[A-Za-z0-9]+)(?P<gti>_GTI)?\.(?P<extension>(tif|tiff|png|jpg|jp2))$'
   id_format: '{location}-{idx}'
   type_format: '{type}{gti}'
   validation_percent: 0.2
   randomise: True
   persist_file: "/storage/spacenet5/split1.yaml"
   input_source: "/storage/spacenet5"
trainer: !!python/object/apply:hugin.engine.scene.RasterSceneTrainer
         kwds:
           name: raster_keras_trainerv2
           stride_size: 100
           window_size: [256, 256]
           model: !!python/object/apply:hugin.engine.keras.KerasModel
             kwds:
               name: keras_model1
               model_builder: sn5.models.wnet.wnetv9:build_wnetv9
               batch_size: 200
               epochs: 9999
               metrics:
                 - accuracy
                 - !!python/name:hugin.tools.utils.dice_coef
                 - !!python/name:hugin.tools.utils.jaccard_coef
               loss: categorical_crossentropy
               checkpoint:
                 monitor: val_loss
               enable_multi_gpu: True
               num_gpus: 4
               optimizer: !!python/object/apply:keras.optimizers.Adam
                 kwds:
                   lr: !!float 0.0001
                   beta_1: !!float 0.9
                   beta_2: !!float 0.999
                   epsilon: !!float 1e-8
               callbacks:
                 - !!python/object/apply:keras.callbacks.EarlyStopping
                   kwds:
                     monitor: 'val_dice_coef'
                     min_delta: 0
                     patience: 40
                     verbose: 1
                     mode: 'auto'
                      baseline: null
                     restore_best_weights: False
           mapping:
             inputs:
               input_1:
                 primary: True
                 channels:
                   - [ "PAN", 1 ]
                 window_size: [256, 256]
               input_2:
                 window_size: [64, 64]
                 channels:
                   - [ "MS", 1 ]
                   - [ "MS", 5 ]
                   - [ "MS", 4 ]
                   - [ "MS", 8 ]
             target:
               output_1:
                 channels:
                   - [ "PAN_GTI", 1 ]
                 preprocessing:
                   - !!python/object/apply:hugin.io.loader.BinaryCategoricalConverter
                     kwds:
                       do_categorical: False

Assuming that the above configuration is saved in a file named experiment.yaml, training can be started as follows:

hugin train --config experiment.yaml

Prediction

Similarly to training, the prediction process involves the creation of a prediction configuration file. The configuration file is similar to the training one and involves:

  • Data source specification (the data_source key)
  • Predictor configuration (the predictor key)
  • Output configuration (the output key)

Data Source Specification

The data source specification is identical to the Data Source Specification used during training.

Predictor Configuration

This section of the configuration file is aimed at configuring the predictors handling the raster files. The predictors handle the tiling of the input image (if needed) and feed the data to the machine learning models, assembling the overall prediction.

Currently we provide the following raster-based predictors:

  • RasterScenePredictor: providing the core raster scene handling, delegating the prediction to a trained model
  • AvgEnsembleScenePredictor: provides ensembling across multiple instances of RasterScenePredictor (a hedged sketch follows below)
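
The exact options accepted by AvgEnsembleScenePredictor are not detailed here; the sketch below is an assumption-laden illustration in which a hypothetical predictors kwd receives the RasterScenePredictor instances whose outputs are averaged:

predictor: !!python/object/apply:hugin.engine.scene.AvgEnsembleScenePredictor
  kwds:
    name: ensemble_predictor
    # `predictors` is an assumed option name, not confirmed by this document
    predictors:
      - !!python/object/apply:hugin.engine.scene.RasterScenePredictor
        kwds:
          name: predictor_1
          # model, mapping, window_size, etc. as described in the next section
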
RasterScenePredictor

The RasterScenePredictor is the prediction-time counterpart of the RasterSceneTrainer, providing similar capabilities.

The options provided by the RasterScenePredictor are:

  • name (mandatory): specifies a name for the predictor
  • window_size (optional): specifies the size of the sliding window used for subsampling. If omitted, Hugin assumes it equals the size of a randomly picked scene
  • stride_size (optional): specifies the stride size to be used in case subsampling is needed. If omitted, it is inferred from the window size
  • mapping (mandatory): specifies how the input to the model should be assembled. This configuration might be shared between training and prediction. It is further discussed in the Mapping section
  • model (mandatory): specifies the model to be used for prediction

Mapping

The mapping concept is further discussed in the Mapping section. During the prediction process the presence of the target mapping is optional; if provided, it will be used for computing performance metrics.

Model

This configuration option specifies the model used for prediction. It is a reference to one of the backend implementations offered by Hugin:

  • KerasModel: The backend supporting running Keras based models
  • IdentityModel: Dummy model returning as prediction its input
  • SciKitLearnModel: A backend supporting models compliant with the SciKit-Learn interface (ToDo)

Keras Model

The model configuration is identical to the one described in Keras Model, with the difference that most arguments are ignored, with the exception of batch_size.

Example configuration
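
A minimal sketch adapted from the ISPRS prediction example later in this document (the model path is illustrative):

model: !!python/object/apply:hugin.engine.keras.KerasModel
  kwds:
    name: keras_predictor
    model_path: /path/to/trained/model.hdf5   # illustrative path
    model_builder: hugin.models.unet.unetv14:unet_v14
    batch_size: 20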

Output configuration

This configuration section is responsible for exporting the predictions.

Hugin supports multiple exporters:

  • RasterIOSceneExporter: exporter dumping the prediction output into geo-referenced TIFF files
  • GeoJSONExporter: exporter vectorizing prediction masks and writing them out as GeoJSON files
  • MultipleFormatExporter: a compound exporter allowing exporting in multiple formats

RasterIO Exporter

The RasterIO Exporter provides the ability to export geo-referenced TIFF files. Exported files inherit the SRS of a specified component of the scene.

The options supported by the exporter are:

  • srs_source_component (optional): the component of the scene that should be the source of the SRS and coordinates
  • filename_pattern (optional, default: “{scene_id}.tif”): the filename pattern that should be used for newly created files
  • rasterio_creation_options (optional): Options updating various RasterIO/GDAL profile options. See RasterIO Profile for more detailed information.
  • rasterio_options (optional): Options controlling the RasterIO environment. See RasterIO Environment for more detailed information.

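A sketch combining these options; the creation options shown (compress, tiled) are common GDAL GTiff profile keys and are given purely as an illustration:

output: !!python/object/apply:hugin.engine.scene.RasterIOSceneExporter
  kwds:
     srs_source_component: 'RGB'
     filename_pattern: '{scene_id}.tif'
     rasterio_creation_options:
       compress: lzw
       tiled: True
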
Multiple Format Exporter

This exporter allows exporting predictions in multiple formats by wrapping the other supported exporters.

The options supported by the exporter are:

  • exporters (optional): a list of exporters. Each exporter will be triggered separately for each prediction.

An example configuration for an exporter could be:

output: !!python/object/apply:hugin.engine.scene.RasterIOSceneExporter
  kwds:
     filename_pattern: '{scene_id}.tif'
     srs_source_component: 'RGB'
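
A MultipleFormatExporter wrapping the exporter above might look like the following hedged sketch (the hugin.engine.scene.MultipleFormatExporter path is an assumption, mirroring the other exporters):

output: !!python/object/apply:hugin.engine.scene.MultipleFormatExporter
  kwds:
     exporters:
       - !!python/object/apply:hugin.engine.scene.RasterIOSceneExporter
         kwds:
            filename_pattern: '{scene_id}.tif'
            srs_source_component: 'RGB'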

Mapping

The data mapping functionality represents one of the core features of Hugin. It is used by the RasterSceneTrainer and RasterScenePredictor for assembling input data that is sent to the underlying models.
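
As a condensed illustration drawn from the SpaceNet5 example above: a mapping declares named model inputs and targets, each entry listing the channels to assemble as [component type, band] pairs; individual entries may override window_size, and entries may attach pre-processing operations:

mapping:
  inputs:
    input_1:
      primary: True              # marks this input as the primary component
      window_size: [256, 256]
      channels:
        - [ "PAN", 1 ]           # band 1 of the PAN component
  target:
    output_1:
      channels:
        - [ "PAN_GTI", 1 ]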

ISPRS Dataset Example

The following example illustrates how to use Hugin for running training and predictions on the ISPRS benchmark dataset for 2D semantic labeling on the city of Potsdam.

This configuration will take into account only the RGB input and the GTI (or label) output provided in the ISPRS dataset. The U-Net version that ships with Hugin will be used for training.

Training

configuration:
  model_path: "/home/alex/experiments/{name}"
data_source: !!python/object/apply:hugin.io.FileSystemLoader
  kwds:
    data_pattern: '(top)_(?P<city>[A-Za-z]+)_(?P<cm>[0-9]+)_(?P<area>[0-9]+)_(?P<type>[A-Za-z]+)\.tif$'
    id_format: '{cm}_{area}'
    type_format: '{type}'
    input_source: '/mnt/ISPRS/training/'
    persist_file: '/tmp/hugin-isprs-cache.yaml'
    validation_percent: 0.2
    randomise: True

trainer: !!python/object/apply:hugin.engine.scene.RasterSceneTrainer
  kwds:
    name: isprs_model_example
    stride_size: 100
    window_size: [256, 256]
    model: !!python/object/apply:hugin.engine.keras.KerasModel
      kwds:
        name: keras_model
        model_path: "/home/alex/experiments/{name}"
        model_builder: hugin.models.unet.unetv14:unet_v14
        batch_size: 20
        epochs: 2
        metrics:
          - accuracy
          - !!python/name:hugin.tools.utils.dice_coef
          - !!python/name:hugin.tools.utils.jaccard_coef
        loss: categorical_crossentropy
        checkpoint:
          monitor: val_loss
        num_gpus: 1
        optimizer: !!python/object/apply:keras.optimizers.Adam
          kwds:
            lr: !!float 0.0001
            beta_1: !!float 0.9
            beta_2: !!float 0.999
            epsilon: !!float 1e-8
        callbacks:
          - !!python/object/apply:keras.callbacks.EarlyStopping
            kwds:
              monitor: 'val_dice_coef'
              min_delta: 0
              patience: 40
              verbose: 1
              mode: 'auto'
              baseline: null
              restore_best_weights: False
    mapping:
      inputs:
        input_1:
          primary: True
          channels:
            - [ "RGB", 1 ]
            - [ "RGB", 2 ]
            - [ "RGB", 3 ]
      target:
        output_1:
          window_shape: [256, 256]
          stride: 100
          channels:
            - [ "GTI", 1 ]
          preprocessing:
            - !!python/object/apply:hugin.io.loader.BinaryCategoricalConverter
              kwds:
                do_categorical: False

After this, we can start training our U-Net variant with Hugin by simply running:

hugin train --config ./etc/usecases/train_isprs.yaml

Predictions

After training a model, running predictions with Hugin is straightforward:

data_source: !!python/object/apply:hugin.io.FileSystemLoader
  kwds:
    data_pattern: '(top)_(?P<city>[A-Za-z]+)_(?P<cm>[0-9]+)_(?P<area>[0-9]+)_(?P<type>[A-Za-z]+)\.tif$'
    id_format: '{cm}_{area}'
    type_format: '{type}'
    input_source: '/mnt/ISPRS/training/'

predictor: !!python/object/apply:hugin.engine.scene.RasterScenePredictor
  kwds:
    name: isprs_predictor
    model: !!python/object/apply:hugin.engine.keras.KerasModel
      kwds:
        name: keras_predictor
        model_path: /storage/syno1/SpaceNet-Roads/alex-train/models/unetv14_spacenetroads_vegas_tiles_twentypercent_adam_cat_crossentropy/thor.sage.ieat.ro-tardis.hdf5
        model_builder: hugin.models.unet.unetv14:unet_v14
    stride_size: 256
    window_size: [256, 256]
    mapping:
      inputs:
        input_1:
          primary: True
          channels:
            - [ "RGB", 1 ]
            - [ "RGB", 2 ]
            - [ "RGB", 3 ]
    prediction_merger: !!python/name:hugin.engine.core.NullMerger

output: !!python/object/apply:hugin.engine.scene.RasterIOSceneExporter
  kwds:
     destination: "/home/alex/potsdam_predictions"
     filename_pattern: '{scene_id}.tif'

Then, to run the predictions, you only have to specify the path to the configuration file, the directory to load the data from, and the directory to save the predictions to.

hugin predict --config ./etc/usecases/predict_isprs.yaml --input-dir /mnt/ISPRS/prediction/ \
--output-dir /home/alex/potsdam_predictions