===============================
Running The System In Smartflow
===============================

.. note::

    This file is currently a working document and contains a lot of notes.

This document goes over how to run the GeoWATCH system in smartflow. The
outline of this document is:

* `Building the Docker Image <SectionBuildDocker_>`__

  + `Building the Pyenv Base Image <BuildPyenv_>`__
  + `Building the GeoWATCH Image <BuildGeowatch_>`__
  + `Baking Models into the Image <BakeModel_>`__
  + `Updating Existing Images <UpdateImage_>`__

* `Submit a DAG <SubmitDAG_>`__

  + `Debugging a DAG <DebugDAGS_>`__

* `Old Notes <OldNotes_>`__


Prerequisites
=============

Be sure you have:

* `Setup the AWS CLI <../environment/getting_started_aws.rst>`_
* `Setup the kubectl CLI <../environment/getting_started_kubectl.rst>`_
* Setup smartflow


Section 0: NOTICE OF AUTOMATION
===============================

This entire process has been scripted and lives in the
`watch-smartflow-dags repo <https://gitlab.kitware.com/smart/watch-smartflow-dags>`_.
The ``prepare_system.sh`` script is the main driver.

TODO: We should document how to use that script here instead of these manual
instructions.

Other scripts of interest are:

* ``run_smartflow_dags.py`` - This is used by ``prepare_system.sh`` to trigger
  DAGs, and it can be used as a standalone script to trigger DAGs that have
  already been uploaded.

* ``pull_results.py`` - This pulls results down from smartflow DAG runs and
  can optionally summarize metrics.


.. _SectionBuildDocker:

Section 1: The GeoWATCH Docker Image
====================================

In this section we will go over how to build the docker image used in a
submission.

There are two images that need to be built: first the
`pyenv dockerfile <../../../../dockerfiles/pyenv.Dockerfile>`_, and then the
`watch dockerfile <../../../../dockerfiles/watch.Dockerfile>`_ that builds on
top of it. The heredocs in these files provide further instructions.

We will also need to add models to the image.

.. _BuildPyenv:

Building the Pyenv Base Image
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here we will go over the basic use case for a specific version of Python and
its dependencies. We will use Python 3.11 with strict dependencies.

We will assume that your watch repo is in ``~/code/watch``; if it is not, then
change the environment variable:

.. code:: bash

    export WATCH_REPO_DPATH=$HOME/code/watch
    export STAGING_DPATH=$HOME/temp/container-staging

    # The name/tag of the image we will create
    export PYENV_IMAGE=pyenv:3.11.2

    # The Python version we want
    export PYTHON_VERSION=3.11.2

    # Create a directory on the host for context
    mkdir -p $STAGING_DPATH

    # For pyenv we will use an empty directory
    mkdir -p $STAGING_DPATH/empty

    DOCKER_BUILDKIT=1 docker build --progress=plain \
        -t $PYENV_IMAGE \
        --build-arg PYTHON_VERSION=$PYTHON_VERSION \
        -f $WATCH_REPO_DPATH/dockerfiles/pyenv.Dockerfile \
        $STAGING_DPATH/empty

    # Optional: push this image to the kitware and smartgitlab registries

    # optional: Push to smartgitlab
    docker tag $PYENV_IMAGE registry.smartgitlab.com/kitware/$PYENV_IMAGE
    docker push registry.smartgitlab.com/kitware/$PYENV_IMAGE

    # optional: Push to gitlab.kitware.com
    docker tag $PYENV_IMAGE gitlab.kitware.com:4567/smart/watch/$PYENV_IMAGE
    docker push gitlab.kitware.com:4567/smart/watch/$PYENV_IMAGE

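If a suitable pyenv image has already been pushed to one of the registries
above, you can pull it instead of rebuilding it locally. This is a hedged
sketch: it assumes an image with this exact tag was previously pushed, which
may not be the case.

.. code:: bash

    # Only works if someone previously pushed this tag (see the push commands above)
    docker pull registry.smartgitlab.com/kitware/pyenv:3.11.2

    # Give it the short local name used by the rest of this document
    docker tag registry.smartgitlab.com/kitware/pyenv:3.11.2 pyenv:3.11.2
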
Now that the pyenv image ``pyenv:3.11.2`` exists locally, we can quickly test
it:

.. code:: bash

    export PYENV_IMAGE=pyenv:3.11.2

    # Hello world should always run
    docker run --runtime=nvidia -it $PYENV_IMAGE echo "hello world"

    # Ensure the right python is exposed by default
    docker run --runtime=nvidia -it $PYENV_IMAGE python --version

    # if you have a GPU you can run
    docker run --runtime=nvidia -it $PYENV_IMAGE nvidia-smi

.. _BuildGeowatch:

Building the GeoWATCH Image
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now we build the watch image on top of the pyenv image. To do this cleanly we
will make a fresh clone of your local repo, which ensures you don't
accidentally bake in any secrets or other large files.

.. code:: bash

    export WATCH_REPO_DPATH=$HOME/code/watch
    export STAGING_DPATH=$HOME/temp/container-staging
    export PYENV_IMAGE=pyenv:3.11.2
    export WATCH_VERSION=$(python -c "import watch; print(watch.__version__)")
    export BUILD_STRICT=1

    # A descriptive name for our watch image
    PYENV_TAG_SUFFIX=$(python -c "print('$PYENV_IMAGE'.replace(':', ''))")
    if [[ "$BUILD_STRICT" == "1" ]]; then
        export WATCH_IMAGE=watch:$WATCH_VERSION-strict-$PYENV_TAG_SUFFIX
    else
        export WATCH_IMAGE=watch:$WATCH_VERSION-loose-$PYENV_TAG_SUFFIX
    fi
    echo "
    ===========
    WATCH_REPO_DPATH = $WATCH_REPO_DPATH
    STAGING_DPATH    = $STAGING_DPATH
    WATCH_VERSION    = $WATCH_VERSION
    PYENV_IMAGE      = $PYENV_IMAGE
    BUILD_STRICT     = $BUILD_STRICT
    -----------
    WATCH_IMAGE=$WATCH_IMAGE
    ===========
    "

    # Create a directory on the host for context
    mkdir -p $STAGING_DPATH

    # For watch we make a fresh clone of our local repo to use as the context
    [ -d $STAGING_DPATH/watch ] && rm -rf $STAGING_DPATH/watch
    git clone --origin=host-$HOSTNAME $WATCH_REPO_DPATH/.git $STAGING_DPATH/watch

    DOCKER_BUILDKIT=1 docker build --progress=plain \
        -t "$WATCH_IMAGE" \
        --build-arg "BUILD_STRICT=$BUILD_STRICT" \
        --build-arg "BASE_IMAGE=$PYENV_IMAGE" \
        -f $STAGING_DPATH/watch/dockerfiles/watch.Dockerfile \
        $STAGING_DPATH/watch

    # Optional: push this image to the kitware and smartgitlab registries

    # optional: Push to smartgitlab
    docker tag $WATCH_IMAGE registry.smartgitlab.com/kitware/$WATCH_IMAGE
    docker push registry.smartgitlab.com/kitware/$WATCH_IMAGE

    # optional: Push to gitlab.kitware.com
    docker tag $WATCH_IMAGE gitlab.kitware.com:4567/smart/watch/$WATCH_IMAGE
    docker push gitlab.kitware.com:4567/smart/watch/$WATCH_IMAGE

It is a good idea to run some tests to ensure the image built properly:

.. code:: bash

    # Hello world should always run
    docker run --runtime=nvidia -it $WATCH_IMAGE echo "hello world"

    # Ensure the right python is exposed by default
    docker run --runtime=nvidia -it $WATCH_IMAGE python --version

    # Ensure the watch module is exposed by default
    docker run --runtime=nvidia -it $WATCH_IMAGE geowatch --version

    # if you have a GPU you can run
    docker run --runtime=nvidia -it $WATCH_IMAGE nvidia-smi

    # run the full test suite
    docker run --runtime=nvidia -it $WATCH_IMAGE ./run_tests.py

You may wish to upload this base image to the smartgitlab registry. Because we
still need to bake in models, this step is optional, but it is useful if you
want to build the base image on one machine and then bake in the models on a
different machine.

.. code:: bash

    # Push the container to smartgitlab
    docker tag $WATCH_IMAGE registry.smartgitlab.com/kitware/$WATCH_IMAGE
    docker push registry.smartgitlab.com/kitware/$WATCH_IMAGE

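On the second machine, the pushed image can then be pulled back down and given
the same local name before continuing with the model-baking step below. This
is a minimal sketch; it assumes the tag shown here matches what you actually
pushed.

.. code:: bash

    # On the machine where the models will be baked in
    export WATCH_IMAGE=watch:0.4.5-strict-pyenv3.11.2   # use the tag you pushed
    docker pull registry.smartgitlab.com/kitware/$WATCH_IMAGE

    # Retag so the rest of these instructions can refer to the short name
    docker tag registry.smartgitlab.com/kitware/$WATCH_IMAGE $WATCH_IMAGE
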
.. _BakeModel:

How to Bake a Model into a Pyenv Dockerfile
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Assuming that you have already built a pyenv-based watch docker image, we will
now add models to it.

.. code:: bash

    # Set this to the name of the pyenv watch image that you built
    IMAGE_NAME=watch:0.4.5-strict-pyenv3.11.2
    NEW_IMAGE_NAME=${IMAGE_NAME}-models-2023-03-28
    echo $NEW_IMAGE_NAME

    # These are more models than we really need, but it will let us reuse this
    # image for more experiments
    MODELS_OF_INTEREST="
        models/fusion/Drop6-MeanYear10GSD/packages/Drop6_TCombo1Year_BAS_10GSD_split6_V42_cont2/Drop6_TCombo1Year_BAS_10GSD_split6_V42_cont2_epoch3_step941.pt
        models/fusion/Drop4-BAS/packages/Drop4_BAS_2022_12_15GSD_BGRN_V10/Drop4_BAS_2022_12_15GSD_BGRN_V10_v0_epoch0_step0.pt
        models/fusion/Drop4-BAS/packages/Drop4_BAS_15GSD_BGRNSH_invar_V8/Drop4_BAS_15GSD_BGRNSH_invar_V8_epoch=16-step=8704.pt
        models/fusion/Drop4-BAS/packages/Drop4_TuneV323_BAS_30GSD_BGRNSH_V2/package_epoch0_step41.pt.pt
        models/fusion/Drop4-SC/packages/Drop4_tune_V30_8GSD_V3/Drop4_tune_V30_8GSD_V3_epoch=2-step=17334.pt.pt
        models/uky/uky_invariants_2022_03_21/pretext_model/pretext_pca_104.pt
        models/uky/uky_invariants_2022_12_17/TA1_pretext_model/pretext_package.pt
        models/landcover/sentinel2.pt
    "

    DVC_EXPT_DPATH=$(geowatch_dvc --tags='phase2_expt' --hardware=auto)

    # Ensure the models of interest are pulled locally on your machine
    (cd $DVC_EXPT_DPATH && dvc pull -r aws $MODELS_OF_INTEREST)

    # We are also going to bake the metrics and data DVC repos into the image
    # for completeness
    DVC_DATA_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=auto)
    METRICS_REPO_DPATH=$(python -c "import iarpa_smart_metrics, pathlib; print(pathlib.Path(iarpa_smart_metrics.__file__).parent.parent)")

    # Run the base image as a container so we can put stuff into it.
    # We will use DVC to facilitate the transfer to keep things consistent.
    # We mount our local experiment directory, and pull relevant files.
    docker run \
        --volume $DVC_EXPT_DPATH:/host-smart_expt_dvc:ro \
        --volume $DVC_DATA_DPATH:/host-smart_data_dvc:ro \
        --volume $METRICS_REPO_DPATH:/host-metrics_repo:ro \
        -td --name temp_container $IMAGE_NAME

    docker exec -t temp_container pip install dvc
    docker exec -t temp_container mkdir -p /root/data
    docker exec -t temp_container git clone /host-smart_expt_dvc/.git /root/data/smart_expt_dvc
    docker exec -t temp_container git clone /host-smart_data_dvc/.git /root/data/smart_data_dvc
    docker exec -t temp_container git clone /host-metrics_repo/.git /root/code/metrics-and-test-framework

    docker exec -w /root/data/smart_expt_dvc -t temp_container \
        dvc remote add host /host-smart_expt_dvc/.dvc/cache

    # Workaround DVC Issue by removing aws remote
    # References: https://github.com/iterative/dvc/issues/9264
    docker exec -w /root/data/smart_expt_dvc -t temp_container \
        dvc remote remove aws

    # Pull in relevant models you want to bake into the container.
    # These are specified relative to the experiment DVC repo.
    docker exec -w /root/data/smart_expt_dvc -t temp_container \
        dvc pull --remote host $MODELS_OF_INTEREST

    # Save the modified container as a new image
    docker commit temp_container $NEW_IMAGE_NAME

    # Cleanup the temp container
    docker stop temp_container
    docker rm temp_container

    # Push the container to smartgitlab
    docker tag $NEW_IMAGE_NAME registry.smartgitlab.com/kitware/$NEW_IMAGE_NAME
    docker push registry.smartgitlab.com/kitware/$NEW_IMAGE_NAME
    echo $NEW_IMAGE_NAME

    # optional: Push to gitlab.kitware.com
    docker tag $NEW_IMAGE_NAME gitlab.kitware.com:4567/smart/watch/$NEW_IMAGE_NAME
    docker push gitlab.kitware.com:4567/smart/watch/$NEW_IMAGE_NAME

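It is also worth a quick sanity check that the models actually landed inside
the new image. This is a minimal check assuming the
``/root/data/smart_expt_dvc`` layout used above; the specific files listed are
just examples from the pull list.

.. code:: bash

    # The baked-in checkpoints should be real files, not unresolved DVC links
    docker run -it $NEW_IMAGE_NAME ls -lh /root/data/smart_expt_dvc/models/landcover/sentinel2.pt

    # Spot check the total size of one of the baked-in model directories
    docker run -it $NEW_IMAGE_NAME du -sh /root/data/smart_expt_dvc/models/fusion
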
.. _UpdateImage:

Update An Existing Image
^^^^^^^^^^^^^^^^^^^^^^^^

Say you need to make a small change to the code, but don't want to rebuild the
entire image. We can handle that case by mounting the latest repos onto the
container, pointing the in-container repo remotes at them, pulling the latest
code, and committing the change as a new image.

.. code:: bash

    export WATCH_REPO_DPATH=$HOME/code/watch
    export DVC_EXPT_DPATH=$(geowatch_dvc --tags='phase2_expt' --hardware=auto)

    IMAGE_NAME=watch:0.4.5-strict-pyenv3.11.2-models-2023-03-28
    NEW_IMAGE_NAME=watch:0.4.5-strict-pyenv3.11.2-models-2023-03-28-v04

    # Start a container from the image with the host repos mounted read-only
    docker run \
        --volume $DVC_EXPT_DPATH:/host-smart_expt_dvc:ro \
        --volume $WATCH_REPO_DPATH:/host-watch_repo:ro \
        -td --name temp_container $IMAGE_NAME

    docker exec -w /root/code/watch -t temp_container \
        git remote add host /host-watch_repo/.git

    docker exec -w /root/code/watch -t temp_container \
        git pull host dev/0.4.5

    # Save the modified container as a new image
    docker commit temp_container $NEW_IMAGE_NAME
    docker stop temp_container
    docker rm temp_container

    # Push the container to smartgitlab
    echo $NEW_IMAGE_NAME
    docker tag $NEW_IMAGE_NAME registry.smartgitlab.com/kitware/$NEW_IMAGE_NAME
    docker push registry.smartgitlab.com/kitware/$NEW_IMAGE_NAME


.. _SubmitDAG:

How to Submit a DAG
===================

..
    SeeAlso:
    ~/code/watch-smartflow-dags/KIT_TA2_PREEVAL10_PYENV_V13.py
    ~/code/watch-smartflow-dags/KIT_TA2_PREEVAL10_V13.py

We maintain the airflow DAGs in the
`watch-smartflow-dags repo <https://gitlab.kitware.com/smart/watch-smartflow-dags>`_.
Ensure that you have the DAG repo:

.. code:: bash

    # This is the repo containing the smartflow dags
    git clone git@gitlab.kitware.com:smart/watch-smartflow-dags.git $HOME/code/watch-smartflow-dags

Choose a DAG file and modify it as necessary.

.. note::

    TODO: Describe in more detail

Once you have a DAG file ready, upload it to AWS via:

.. code:: bash

    # The path to our DAG repo
    LOCAL_DAG_DPATH=$HOME/code/watch-smartflow-dags

    # The name of the DAG file we edited
    DAG_FNAME=KIT_TA2_PREEVAL10_PYENV_V13.py

    # Upload the DAG file to AWS
    aws s3 --profile iarpa cp $LOCAL_DAG_DPATH/$DAG_FNAME \
        s3://smartflow-023300502152-us-west-2/smartflow/env/kitware-prod-v4/dags/$DAG_FNAME

If you have not already done so, ensure that the smartflow web service is
being forwarded to your machine:

.. code:: bash

    kubectl -n airflow port-forward service/airflow-webserver 2746:8080

Now, navigate to the airflow GUI in your browser at ``localhost:2746/home``,
which can be done via the command:

.. code:: bash

    # Not working?
    python -c "import webbrowser; webbrowser.open('https://localhost:2746/home', new=1)"


Building / Modifying a DAG
==========================

Our smartflow DAGs are built as sequences of smartflow CLI commands that wrap
our local CLI commands. These smartflow CLI commands live in
`geowatch/cli/smartflow <../../../../geowatch/cli/smartflow>`_. Each of these
uses fsspec to grab a manifest of available assets from an S3 bucket, which
points to the data the task could use. It is the script's job to pull the
data, perform the computation, print debugging info, and push results and
debug data back to a new output bucket. In other words, each task's S3 prefix
acts as the hand-off point to the next task.

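As an illustration of that hand-off, you can follow a run from outside the
cluster by listing the S3 prefixes a task reads from and writes to. The
prefixes below are hypothetical placeholders; the real input and output
locations are configured per task in the DAG file.

.. code:: bash

    # Hypothetical prefixes; the actual locations are set per task in the DAG file
    INPUT_PREFIX=s3://smartflow-023300502152-us-west-2/work/example-run/previous-step
    OUTPUT_PREFIX=s3://smartflow-023300502152-us-west-2/work/example-run/this-step

    # A task starts by reading the manifest of assets its predecessor published ...
    aws s3 --profile iarpa ls "$INPUT_PREFIX/"

    # ... and finishes by pushing results plus a new manifest for the next task
    aws s3 --profile iarpa ls "$OUTPUT_PREFIX/"
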
See [ComputeInstanceTypes]_ for details on available instance types.

References:

.. [ComputeInstanceTypes] https://smartgitlab.com/blacksky/smartflow/-/blob/main/docs/Framework/Smartflow-Framework.md#selecting-compute-resources-for-tasks

A table of compute resources as of 2024-05-26 is::

    | Purpose             | RAM/CPU ratio | GPUs     | Accelerator       | SSD*         | SSD Size                 | Price Range ($/hr) |
    |--------------------:|--------------:|---------:|------------------:|-------------:|-------------------------:|-------------------:|
    | general             | 4G/core       |          |                   | No           |                          | 0.10-9.677         |
    | gpu-aws-inferentia1 | 2G/core       | 1, 4, 16 | inferentia1       | No           |                          | 0.228-4.721        |
    | gpu-aws-inferentia2 | 2G/core       | 1, 6, 12 | inferentia2       | No           |                          | 1.967-12.99        |
    | gpu-aws-trainium1   | 4G/core       | 1, 16    | trainium1         | Yes          | 1 x 475 GB, 4 x 1900 GB  | 1.34-21.50         |
    | gpu-nvidia-a10g     | 4G/core       | 1, 4, 8  | nvidia-a10g       | Yes          | 1 x 250 GB - 2 x 3800 GB | 1.006-16.288       |
    | gpu-nvidia-t4       | 4G/core       | 1, 4, 8  | nvidia-tesla-t4   | Yes          | 1 x 125 GB - 2 x 900 GB  | 0.526-7.824        |
    | gpu-nvidia-v100     | 7.625G/core   | 1, 4, 8  | nvidia-tesla-v100 | If Requested | 2 x 900 GB               | 3.06-31.212        |
    | high-cpu            | 2G/core       |          |                   | If Requested | 1 x 50 GB - 4 x 1900 GB  | 0.085-6.4512       |
    | high-mem            | 4G/core       |          |                   | If Requested | 1 x 75 GB - 4 x 1900 GB  | 0.1068-5.34        |


.. _RunningDAGS:

Running DAGS
^^^^^^^^^^^^

In the GUI you can simply search for your DAG and hit the run button.

To programmatically interact with airflow on the command line, you need to
exec into the airflow scheduler pod.

.. code:: bash

    JQ_QUERY='.items[] | select(.metadata.name | startswith("airflow-scheduler-")) | .metadata.name'
    AIRFLOW_SCHEDULER_POD_NAME=$(kubectl -n airflow get pods -o json | jq -r "$JQ_QUERY")
    echo "AIRFLOW_SCHEDULER_POD_NAME=$AIRFLOW_SCHEDULER_POD_NAME"

    # Get a shell into the scheduler to run airflow commands
    kubectl -n airflow exec -it pods/$AIRFLOW_SCHEDULER_POD_NAME -- /bin/bash

    # Inside the airflow shell
    echo '
    airflow dags list
    airflow dags list -o json > dags.json
    airflow dags list-jobs

    # To run a dag you need to trigger and unpause it.
    airflow dags trigger kit_ta2_preeval10_pyenv_t29_batch_AE_R001
    airflow dags unpause kit_ta2_preeval10_pyenv_t29_batch_AE_R001

    airflow dags trigger kit_ta2_preeval10_pyenv_t29_batch_KW_R001
    airflow dags unpause kit_ta2_preeval10_pyenv_t29_batch_KW_R001

    REGION_IDS=("KR_R002" "KR_R001" "NZ_R001")
    for REGION_ID in "${REGION_IDS[@]}"; do
        echo "trigger $REGION_ID"
        airflow dags trigger kit_ta2_preeval10_pyenv_t29_batch_$REGION_ID
        airflow dags unpause kit_ta2_preeval10_pyenv_t29_batch_$REGION_ID
    done

    REGION_IDS=("KR_R002" "KR_R001" "NZ_R001" "KW_R001" "AE_R001")
    for REGION_ID in "${REGION_IDS[@]}"; do
        echo "trigger $REGION_ID"
        airflow dags trigger kit_ta2_preeval10_pyenv_t31_batch_$REGION_ID
        airflow dags unpause kit_ta2_preeval10_pyenv_t31_batch_$REGION_ID
    done

    # Status queries
    airflow dags list-jobs -d kit_ta2_preeval10_pyenv_t33_post1_batch_KR_R001 -o yaml
    airflow dags list-runs -d kit_ta2_preeval10_pyenv_t33_post1_batch_KR_R001 -o yaml
    airflow dags list-runs -d kit_eval_11_rerun_batch_AE_R001 -o yaml
    '

    ### Alternative - execute commands from the local shell
    # Oddly this tends to send outputs with color codes that we need to strip out.
    JQ_QUERY='.items[] | select(.metadata.name | startswith("airflow-scheduler-")) | .metadata.name'
    AIRFLOW_SCHEDULER_POD_NAME=$(kubectl -n airflow get pods -o json | jq -r "$JQ_QUERY")
    export AIRFLOW_SCHEDULER_POD_NAME

    kubectl -n airflow exec -it pods/$AIRFLOW_SCHEDULER_POD_NAME -- airflow dags list -o json > dags.json
    cat dags.json | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2};?)?)?[mGK]//g" | cat > dags_nocolor.json

    airflow dag_state kit_eval_11_rerun_batch_AE_C002

    # Note:
    # This idea will be further developed in ~/code/watch-smartflow-dags/monitor_dags.py
    python -c "if True:
        import json
        import pathlib
        import cmd_queue
        import ubelt as ub

        # Build a pattern to identify the jobs you want to run
        import xdev
        pattern = xdev.MultiPattern.coerce([
            'kit_eval_11_rerun_batch*'
            #f'kit_ta2_preeval10_pyenv_t{t}*'
            #for t in [31, 35]
        ])

        # FIXME: the json can be output with an error, need to strip it.
        text = pathlib.Path('dags_nocolor.json').read_text()
        data = json.loads(text[86:])

        valid_rows = []
        for item in data:
            if pattern.match(item['dag_id']):
                valid_rows.append(item)

        if 0:
            # Query the status of the selected dags
            import os
            AIRFLOW_SCHEDULER_POD_NAME = os.environ['AIRFLOW_SCHEDULER_POD_NAME']
            prefix = f'kubectl -n airflow exec -it pods/{AIRFLOW_SCHEDULER_POD_NAME} -- '

            import base64
            # easy-to-represent char encoding of the strip-ansi pattern
            pat = base64.b32decode(b'DNOFWKC3GAWTSXL3GEWDG7JIHNNTALJZLV5TCLBSPU5T6KJ7FE7VW3KHJNOQ====').decode('utf8')
            import re
            pat = re.compile(pat)

            from watch.utils.util_yaml import Yaml
            row_to_states = {}
            for row in valid_rows:
                dag_id = row['dag_id']
                info = ub.cmd(prefix + f'airflow dags list-runs -d {dag_id} -o yaml', shell=True)
                text = pat.sub('', info['out'])
                states = Yaml.loads(text)
                print(ub.urepr(states))
                row_to_states[dag_id] = states

            orig_row = {r['dag_id']: r for r in valid_rows}

            dag_info_rows = []
            for dag_id, states in row_to_states.items():
                row = orig_row[dag_id]
                if len(states) == 0:
                    row['status'] = None
                else:
                    mrs = states[-1]
                    row['status'] = mrs['state']
                    row['execution_date'] = mrs['execution_date']
                    row['run_id'] = mrs['run_id']
                    row['start_date'] = mrs['start_date']
                    row['end_date'] = mrs['end_date']
                dag_info_rows.append(row)

            import pandas as pd
            df = pd.DataFrame(dag_info_rows)
            import rich
            rich.print(df)
            num_need_run = pd.isna(df['status']).sum()
            num_running = (df['status'] == 'running').sum()
            print(f'num_need_run={num_need_run}')
            print(f'num_running={num_running}')

        import pandas as pd
        df = pd.DataFrame(valid_rows)
        import rich
        rich.print(df)

        # Build a cmd-queue with the commands to execute
        queue = cmd_queue.Queue.create(backend='serial')
        prefix = 'kubectl -n airflow exec -it pods/$AIRFLOW_SCHEDULER_POD_NAME -- '
        for item in data:
            if pattern.match(item['dag_id']):
                print(item['dag_id'])
                queue.submit(prefix + 'airflow dags trigger ' + item['dag_id'])
                queue.submit(prefix + 'airflow dags unpause ' + item['dag_id'])

        # It is a good idea to comment out the run to check that you
        # are doing what you want to do before you actually execute.
        queue.print_commands()
        queue.run()
    "


.. _DebugDAGS:

Debugging DAGS
^^^^^^^^^^^^^^

Here is a useful command to get a list of running pods that contain jobs.

.. code:: bash

    kubectl -n airflow get pods

Given a pod id, there are several useful commands:

.. code:: bash

    # Pod logs
    kubectl -n airflow logs pods/{pod_addr}

    # Exec into a pod
    kubectl -n airflow exec -it pods/{pod_addr} -- bash

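A few other stock ``kubectl`` commands are frequently useful when a task
misbehaves. Nothing here is smartflow-specific, and the file path in the copy
example is a hypothetical placeholder.

.. code:: bash

    # Follow the logs of a running task
    kubectl -n airflow logs -f pods/{pod_addr}

    # If a pod has restarted, inspect the logs from the previous attempt
    kubectl -n airflow logs pods/{pod_addr} --previous

    # Copy a file out of a running pod for local inspection (hypothetical path)
    kubectl -n airflow cp {pod_addr}:/tmp/debug_output.json ./debug_output.json
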
Here is a snippet to automatically list pods and allow you to select one to
exec into:

.. code:: bash

    kubectl -n airflow get pods

    # Find your POD_ADDR
    # POD_ADDR=site-cropped-kwcoco-6254ac27fab04f0b8eb302ac19b09745
    # kubectl -n airflow exec -it pods/$POD_ADDR -- bash

    # Script to list and exec into a running pod
    python -c "if True:
        import json
        import pandas as pd
        import rich
        import ubelt as ub
        info = ub.cmd('kubectl -n airflow get pods -o json')
        data = json.loads(info['out'])
        from dateutil.parser import isoparse
        from datetime import datetime as datetime_cls
        utc_now = datetime_cls.utcnow()
        rows = []
        for item in data['items']:
            restart_count = sum([cs['restartCount'] for cs in item['status']['containerStatuses']])
            start_time = item['status']['startTime']
            start_dt = isoparse(start_time)
            utc_now = utc_now.replace(tzinfo=start_dt.tzinfo)
            age_delta = utc_now - start_dt
            row = {
                'name': item['metadata']['name'],
                'status': item['status']['phase'],
                'startTime': start_time,
                'restarts': restart_count,
                'age': str(age_delta),
            }
            rows.append(row)
        df = pd.DataFrame(rows)
        rich.print(df.to_string())
        import rich.prompt
        ans = rich.prompt.Prompt.ask('which one?', choices=list(map(str, df.index.to_list())))
        idx = int(ans)
        pod_addr = df.iloc[idx]['name']
        ub.cmd(f'kubectl -n airflow exec -it pods/{pod_addr} -- bash', system=True)
    "


.. _OldNotes:

Old Notes
=========

How to Bake a Model into a Dockerfile (OLD)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Must be run in the repo root.
* Ensure whatever variant of the repo you want to run is checked out.
* Need a base directory with a model in ``./models``.

.. code:: bash

    DOCKER_BUILDKIT=1 \
        docker build --build-arg BUILD_STRICT=1 -f dockerfiles/ta2_features.Dockerfile . \
        --tag registry.smartgitlab.com/kitware/watch/ta2:post-jan31-invariant-rescaled-debug4

In the DAG, the model path needs to be changed to point to the newly baked-in
model. The container then needs to be pushed to smartgitlab.


Running DAGs After Containers Are Pushed (OLD)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now we edit a DAG file for airflow.

..
    git clone git@gitlab.kitware.com:smart/watch-smartflow-dags.git

Choose a DAG file in ``~/code/watch-smartflow-dags/`` and edit it to give it a
unique name, e.g. ``~/code/watch-smartflow-dags/KIT_TA2_20221121_BATCH.py``.

* Change the name of the file and then change ``EVALUATION`` to a unique
  string to name it what you want.

* Change the image names / tags, e.g.
  ``image="registry.smartgitlab.com/kitware/watch/ta2:Ph2Nov21EvalBatch"``.
  These are all "pod tasks" created via ``create_pod_task``.

* ``purpose`` describes the type of node the task runs on. For a subset of
  valid options see:
  https://smartgitlab.com/blacksky/smartflow/-/blob/118140a81362c5721b5e9bb65ab967fb8bd28163/CHANGELOG.md

* Make the cpu limit a bit less than what is available on the pod.

* Copy the DAG to smartflow S3:

.. code:: bash

    aws s3 --profile iarpa cp Kit_DatasetGeneration.py s3://smartflow-023300502152-us-west-2/smartflow/env/kitware-prod-v2/dags/Kit_DatasetGeneration.py

Run the port-forward service to access the airflow GUI:

.. code:: bash

    kubectl -n airflow port-forward service/airflow-webserver 2746:8080

Then navigate to ``localhost:2746/home``. Now the DAGs show up in the GUI.


SeeAlso
^^^^^^^

* Connor's Smartflow Training Nodes

* DAGs live in: https://gitlab.kitware.com/smart/watch-smartflow-dags