#!/bin/bash
__doc__='
This shell script serves as an executable example for how to train and
evaluate a fusion model on SMART project data.

This tutorial assumes you have:

1. Set up the project DVC repo.

2. Registered the location of your DVC repo with geowatch_dvc.

3. Pulled the appropriate dataset (in this case Drop4) and unzipped the
   annotations.

4. A script that predicts the features you would like to test.

5. Installed the IARPA metrics code:

   # Clone this repo and pip install it into your watch environment
   https://gitlab.kitware.com/smart/metrics-and-test-framework

See these docs for details:

    ../docs/getting_started_dvc.rst
    ../docs/access_dvc_repos.rst
    ../docs/using_geowatch_dvc.rst

This tutorial will cover:

1. Predicting your features.
2. Training a fusion model with your features.
3. Packaging your fusion model checkpoints.
4. Evaluating your fusion model against the baseline.
'

DATA_DVC_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=auto)
EXPT_DVC_DPATH=$(geowatch_dvc --tags='phase2_expt' --hardware=auto)
echo "
EXPT_DVC_DPATH=$EXPT_DVC_DPATH
DATA_DVC_DPATH=$DATA_DVC_DPATH
"

__doc_compute_feature__='
Your predict command must specify:

1. The path to the input kwcoco file.

2. The path to the output kwcoco file which will contain your features.
   (Avoid requiring that other output paths are specified. Use default paths
   that are relative to the directory of the output kwcoco file.)

3. The path to your model(s).

4. Any other CLI parameters needed to configure details of feature prediction.

You will have to specify the exact details for your features, but as an
example we provide a script that predicts invariant features, provided your
machine has enough resources (you need over 100GB of RAM as of 2022-12-21; we
would like to fix this in the future). You will also need to ensure the
referenced model is pulled from the experiments DVC repo.
'

compute_features(){
    # A bash function that runs invariant prediction on a kwcoco file.
    SRC_KWCOCO_FPATH=$1
    DST_KWCOCO_FPATH=$2
    python -m geowatch.tasks.invariants.predict \
        --input_kwcoco="$SRC_KWCOCO_FPATH" \
        --output_kwcoco="$DST_KWCOCO_FPATH" \
        --pretext_package_path="$EXPT_DVC_DPATH"/models/uky/uky_invariants_2022_12_05/TA1_pretext_model/pretext_package.pt \
        --input_space_scale=30GSD \
        --window_space_scale=30GSD \
        --patch_size=256 \
        --do_pca 0 \
        --patch_overlap=0.0 \
        --num_workers="2" \
        --write_workers 2 \
        --tasks before_after pretext
}

# Compute your features on the train and validation datasets
compute_features \
    "$DATA_DVC_DPATH"/Drop4-BAS/data_train.kwcoco.json \
    "$DATA_DVC_DPATH"/Drop4-BAS/data_train_invariants.kwcoco.json

compute_features \
    "$DATA_DVC_DPATH"/Drop4-BAS/data_vali.kwcoco.json \
    "$DATA_DVC_DPATH"/Drop4-BAS/data_vali_invariants.kwcoco.json

# After your model predicts the outputs, you should be able to use the
# geowatch visualize tool to inspect your features. The specific channels you
# select will depend on the output of your predict script.
python -m geowatch visualize "$DATA_DVC_DPATH"/Drop4-BAS/data_vali_invariants.kwcoco.json \
    --channels "invariants.5:8,invariants.8:11,invariants.14:17" --stack=only --workers=avail --animate=True \
    --draw_anns=False
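
# As an additional sanity check, you can dump dataset statistics to confirm
# that your new channels were actually written into the output kwcoco file.
# This is only a sketch: the exact fields printed by the stats command may
# differ between geowatch versions.
python -m geowatch stats "$DATA_DVC_DPATH"/Drop4-BAS/data_vali_invariants.kwcoco.json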
# shellcheck disable=SC2016
__doc_data_splits__='
Because only some of the regions actually need 100GB to compute the
invariants, it is possible to split the train and validation kwcoco files
into a single kwcoco file per video and run the compute_features function on
each output individually.

.. code:: bash

    python -m geowatch.cli.split_videos \
        --src "$DATA_DVC_DPATH/Drop4-BAS/data_train.kwcoco.json" \
              "$DATA_DVC_DPATH/Drop4-BAS/data_vali.kwcoco.json" \
        --dst_dpath "$DATA_DVC_DPATH/Drop4-BAS/"

In fact, if your feature prediction script is registered with the
prepare_teamfeats tool, then you can schedule prediction to run on all of
them individually. You can specify a pattern as the input to this tool.

.. code:: bash

    python -m geowatch.cli.queue_cli.prepare_teamfeats \
        --base_fpath \
            "$DATA_DVC_DPATH/Drop4-BAS/data_train_*.kwcoco.json" \
            "$DATA_DVC_DPATH/Drop4-BAS/data_vali_*.kwcoco.json" \
        --expt_dpath="$EXPT_DVC_DPATH" \
        --with_landcover=0 \
        --with_materials=0 \
        --with_invariants=0 \
        --with_invariants2=1 \
        --with_depth=0 \
        --gres=0, --workers=1 --backend=tmux --run=1

You can then union any custom set of regions into a train and a validation
kwcoco file for the subsequent steps.

.. code:: bash

    DATA_DVC_DPATH=$(geowatch_dvc --tags=phase2_data --hardware=auto)
    EXPT_DVC_DPATH=$(geowatch_dvc --tags=phase2_expt --hardware=auto)
    kwcoco union \
        --src $DATA_DVC_DPATH/Drop4-BAS/*_train_*_uky_invariants*.kwcoco.json \
        --dst $DATA_DVC_DPATH/Drop4-BAS/combo_train_I2.kwcoco.json
    kwcoco union \
        --src $DATA_DVC_DPATH/Drop4-BAS/*_vali_*_uky_invariants*.kwcoco.json \
        --dst $DATA_DVC_DPATH/Drop4-BAS/combo_vali_I2.kwcoco.json

We recognize that this is currently a pain point, but we hope that the
existing tools make it somewhat easier to solve or work around problems, and
we hope that our tooling improves to make this even easier in the future.
'

__doc_run_fusion__='
Now that we have train and validation kwcoco datasets that contain our
computed features, we can train or fine-tune a fusion model.

The following is a set of baseline settings that you should start with. We
also encourage you to try other hyperparameter settings to maximize the
effectiveness of your features, but you should at least train once with this
configuration as a baseline.
'

# Set according to your hardware requirements
# TODO: expose the unused GPU script and use that.
export CUDA_VISIBLE_DEVICES=0
DATA_DVC_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware='auto')
EXPT_DVC_DPATH=$(geowatch_dvc --tags='phase2_expt' --hardware='auto')
DATASET_CODE=Drop4-BAS
KWCOCO_BUNDLE_DPATH=$DATA_DVC_DPATH/$DATASET_CODE

# You should specify a unique name for your experiment.
# This name will be used by default in reports generated by the watch mlops tools.
EXPERIMENT_NAME=Drop4_BAS_my_feature_experiment_$(date --iso-8601)

# These are the paths to the kwcoco files that should contain your features
TRAIN_FPATH=$KWCOCO_BUNDLE_DPATH/data_train_invariants.kwcoco.json
VALI_FPATH=$KWCOCO_BUNDLE_DPATH/data_vali_invariants.kwcoco.json

# The pretrained state should be checked out of DVC. This is the best BAS
# model as of 2022-12-21; we will partially initialize a subset of the network
# with these weights.
PRETRAINED_STATE="$EXPT_DVC_DPATH"/models/fusion/Drop4-BAS/packages/Drop4_TuneV323_BAS_30GSD_BGRNSH_V2/package_epoch0_step41.pt.pt
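
# If the pretrained package has not been pulled from the experiments DVC repo
# yet, fetch it before continuing. This is a minimal sketch: whether this
# exact path is DVC-tracked on its own or as part of a larger directory
# depends on your checkout, so adjust the pull target accordingly.
if [ ! -f "$PRETRAINED_STATE" ]; then
    (cd "$EXPT_DVC_DPATH" && dvc pull models/fusion/Drop4-BAS/packages/Drop4_TuneV323_BAS_30GSD_BGRNSH_V2)
fi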

# You can use the model_stats command to inspect details about any fusion model.
geowatch model_stats "$PRETRAINED_STATE"

# shellcheck disable=SC2016
__doc_channel_conf__='
When training a fusion model, you must specify a channel configuration. By
default we recommend including your features as a separate "stream" in
addition to the original six raw bands.

Remember, early fused channels are separated with a pipe (|) and late fused
channel groups are separated with a comma. This means that in the sensorchan
configuration you should separate your channels from the raw channels with a
comma, e.g.

    blue|green|red|nir|swir16|swir22,invariants.0:17

By default each channel is assumed to exist in every sensor. You can specify
which channels belong to which sensors by prefixing a group. For instance:

    (S2,L8):(blue|green|red|nir|swir16|swir22),(S2):(invariants.0:17)

The above uses S2 and L8 raw bands, but only adds the invariants from
Sentinel-2 images.

You may try early fusing your features with the RGB channels, or any more
complex input channel scheme, but you must train the simple late fused
network as a baseline.
'

CHANNELS="(S2,L8):(blue|green|red|nir|swir16|swir22),(S2,L8):(invariants.0:17)"

# We recommend this training directory layout to differentiate
# training runs on different machines / from different people.
WORKDIR=$EXPT_DVC_DPATH/training/$HOSTNAME/$USER
DEFAULT_ROOT_DIR=$WORKDIR/$DATASET_CODE/runs/$EXPERIMENT_NAME

MAX_STEPS=10000
TARGET_LR=5e-5
python -m geowatch.tasks.fusion fit --config "
data:
    num_workers             : 3
    train_dataset           : $TRAIN_FPATH
    vali_dataset            : $VALI_FPATH
    channels                : '$CHANNELS'
    time_steps              : 5
    chip_dims               : '224,224'
    batch_size              : 2
    window_space_scale      : 10GSD
    input_space_scale       : 10GSD
    output_space_scale      : 10GSD
    dist_weights            : false
    neg_to_pos_ratio        : 0.1
    time_sampling           : soft2-contiguous-hardish3
    time_span               : '3m-6m-1y'
    use_centered_positives  : true
    normalize_inputs        : 128
    temporal_dropout        : 0.5
    resample_invalid_frames : 1
    quality_threshold       : 0.8
model:
    class_path: MultimodalTransformer
    init_args:
        name                   : $EXPERIMENT_NAME
        arch_name              : smt_it_stm_p8
        tokenizer              : linconv
        decoder                : mlp
        stream_channels        : 16
        saliency_weights       : 1:70
        class_loss             : focal
        saliency_loss          : focal
        global_change_weight   : 0.00
        global_class_weight    : 0.00
        global_saliency_weight : 1.00
lr_scheduler:
    class_path: torch.optim.lr_scheduler.OneCycleLR
    init_args:
        max_lr: $TARGET_LR
        total_steps: $MAX_STEPS
        anneal_strategy: linear
        pct_start: 0.05
optimizer:
    class_path: torch.optim.Adam
    init_args:
        lr: $TARGET_LR
        weight_decay: 1e-3
        betas:
            - 0.9
            - 0.99
trainer:
    accumulate_grad_batches: 8
    default_root_dir        : $DEFAULT_ROOT_DIR
    accelerator             : gpu
    devices                 : 0,
    #devices                : 0,1
    #strategy               : ddp
    check_val_every_n_epoch: 1
    enable_checkpointing: true
    enable_model_summary: true
    log_every_n_steps: 5
    logger: true
    max_steps: $MAX_STEPS
    num_sanity_val_steps: 0
    replace_sampler_ddp: true
    track_grad_norm: 2
initializer:
    init: $PRETRAINED_STATE
"

# The result of training will be a list of checkpoints in the lightning
# output directory.
ls "$DEFAULT_ROOT_DIR"/lightning_logs/*/checkpoints/*.ckpt

# To use them we need to ensure they are packaged.
# Let's assume we have a checkpoint. This command grabs the first one it
# finds; you should be more selective about the one(s) you choose.
CHECKPOINT_FPATH=$(for i in "$DEFAULT_ROOT_DIR"/lightning_logs/*/checkpoints/*.ckpt; do printf '%s\n' "$i"; break; done)
echo "CHECKPOINT_FPATH = $CHECKPOINT_FPATH"

# Repackage it as such. (This command may change in the future to make this
# easier / more robust, but it should work in this context.)
python -m geowatch.mlops.repackager "$CHECKPOINT_FPATH"

# That should have written a .pt package with a similar name. To make this
# bash script work, we will just glob for a package and assume it is the one
# we want.
PACKAGE_FPATH=$(for i in "$DEFAULT_ROOT_DIR"/lightning_logs/*/checkpoints/*.pt; do printf '%s\n' "$i"; break; done)
echo "PACKAGE_FPATH = $PACKAGE_FPATH"
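
# It is worth double checking that the repackaged model loads correctly. A
# quick sanity check is to run model_stats on the new package, just as we did
# for the pretrained state above:
geowatch model_stats "$PACKAGE_FPATH"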

__doc_eval__='
Now we have a trained, packaged model that is aware of your team features.
The goal is to use it to demonstrate an improvement in the IARPA scores. This
can be done using the mlops framework.

You can specify multiple values for an option to grid search over the
Cartesian product of all settings. You should at the least include your model
and the baseline model to determine if your features are driving an
improvement in the scores.
'

DATA_DVC_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=auto)
EXPT_DVC_DPATH=$(geowatch_dvc --tags='phase2_expt' --hardware=auto)

BASELINE_PACKAGE_FPATH="$EXPT_DVC_DPATH"/models/fusion/Drop4-BAS/packages/Drop4_TuneV323_BAS_30GSD_BGRNSH_V2/package_epoch0_step41.pt.pt
geowatch model_stats "$BASELINE_PACKAGE_FPATH"

# NOTE:
# The schedule evaluation script originally ran on a single coco file that
# contains all of the validation regions. A more stable way to run the system
# is to split the larger validation dataset into a single kwcoco file per
# region, and then run on all regions separately.
python -m geowatch.cli.split_videos "$DATA_DVC_DPATH"/Drop4-BAS/data_vali_invariants.kwcoco.json

python -m geowatch.mlops.schedule_evaluation \
    --params="
        matrix:
            bas_pxl.package_fpath:
                - $PACKAGE_FPATH
                - $BASELINE_PACKAGE_FPATH
            bas_pxl.channels:
                - 'auto'
            bas_pxl.test_dataset:
                - $DATA_DVC_DPATH/Drop4-BAS/data_vali_KR_R001_uky_invariants.kwcoco.json
                - $DATA_DVC_DPATH/Drop4-BAS/data_vali_KR_R002_uky_invariants.kwcoco.json
                - $DATA_DVC_DPATH/Drop4-BAS/data_vali_US_R007_uky_invariants.kwcoco.json
            bas_pxl.chip_dims: auto
            bas_pxl.chip_overlap: 0.3
            bas_pxl.window_space_scale: auto
            bas_pxl.output_space_scale: auto
            bas_pxl.input_space_scale: auto
            bas_pxl.time_span: auto
            bas_pxl.time_sampling: auto
            bas_poly.moving_window_size: null
            bas_poly.thresh:
                - 0.1
            bas_pxl.enabled: 1
            bas_poly.enabled: 1
            bas_poly_eval.enabled: 1
            bas_pxl_eval.enabled: 1
            bas_poly_viz.enabled: 1
    " \
    --root_dpath="$EXPT_DVC_DPATH/_evaluations" \
    --devices="0," --queue_size=1 \
    --backend=tmux --queue_name "demo-queue" \
    --pipeline=bas \
    --run=1

### NOTE:
# The above script assumes that your bashrc activates the appropriate
# virtualenv by default. If this is not the case you will need to specify an
# additional argument to `geowatch.mlops.schedule_evaluation`, namely
# ``--virtualenv_cmd``. For instance, if you have a conda environment named
# "watch", you would add ``--virtualenv_cmd="watch"`` to the command.

__doc_mlops__='
This script will run through the entire BAS pipeline and output results in
the "root_dpath". The names of the outputs are chosen based on a hash of the
configuration, which enables us to reuse existing results. Symlinks are set
up so that it is clear which previous steps a specific result relied on.

The important part is that there will be a folder for each pipeline node. The
bas_poly_eval node holds the IARPA evaluation results and stores the metrics
we are interested in. For now you should manually inspect those results, but
in the future the mlops framework will contain a way to aggregate and analyze
results automatically.
'
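
# A rough way to start that manual inspection is to locate the evaluation
# node directories under the mlops root. This is only a sketch: the exact
# directory layout and metric file names depend on your geowatch version, so
# treat the glob below as an assumption to adapt.
find "$EXPT_DVC_DPATH/_evaluations" -type d -name "*bas_poly_eval*" 2>/dev/null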