MDTF-diagnostics Environment variables
======================================

This page describes the environment variables that the framework will set for your diagnostic when it's run.

Overview
--------

The MDTF-diagnostics framework can be viewed as a "wrapper" for your code that handles data fetching and munging.
Your code communicates with this wrapper in two ways:

- The :doc:`settings file <./pod_settings>` is where your code talks to the framework: when you write your code,
  you document what model data your code uses (not covered on this page; follow the link for details).
- The framework "talks" to a POD through a combination of shell environment variables passed directly to the subprocess
  via the ``env`` parameter, and a ``case_info.yml`` file in ``$WORK_DIR`` that defines case-specific environment
  variables. The framework communicates **all** runtime information this way, in order to 1) pass information
  in a language-independent way, and 2) make writing diagnostics easier (i.e., the POD does not need to parse
  command-line settings).

**Note** that environment variables are always strings. Your POD will need to cast non-text data to the
appropriate type (e.g., the bounds of a case analysis time period, ``startdate`` and ``enddate``, will need to be
converted to integers).

Also note that names of environment variables are case-sensitive.
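
As a minimal sketch of the casting described above, the snippet below converts the ``startdate`` and ``enddate``
strings into ``datetime`` objects and integer years. The helper names ``parse_case_date`` and ``analysis_period``
and the sample values are hypothetical and are not part of the framework.

.. code-block:: python

  import os
  from datetime import datetime


  def parse_case_date(value):
      """Parse a yyyymmdd or yyyymmddHHMMSS string into a datetime."""
      fmt = "%Y%m%d" if len(value) == 8 else "%Y%m%d%H%M%S"
      return datetime.strptime(value, fmt)


  def analysis_period(env=os.environ):
      """Return (start, end) datetimes from the string-valued variables."""
      return parse_case_date(env["startdate"]), parse_case_date(env["enddate"])


  # Illustrative values only; in a real run the framework supplies these.
  example_env = {"startdate": "19800101", "enddate": "19841231235959"}
  start, end = analysis_period(example_env)
  first_year = int(example_env["startdate"][:4])  # plain integer cast of the year
  last_year = end.year

Passing a dictionary instead of ``os.environ`` keeps the helper easy to try outside of a framework run.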

Paths
-----
The following variables are read from the ``os.environ`` mapping:
    ``OBS_DATA``:
      Path to the top-level directory containing any observational or reference data you've provided as the author of your
      diagnostic. Any data your diagnostic uses that doesn't come from the model being analyzed should go here
      (i.e., you supply it to the framework maintainers, they host it, and the user downloads it when they install the
      framework). The framework will ensure this is copied to a local filesystem when your diagnostic is run, but this
      directory should be treated as **read-only**.

    ``POD_HOME``:
      Path to the top-level directory containing your diagnostic's source code. This will be of the form
      ``.../MDTF-diagnostics/diagnostics/<your POD's name>``. This can be used to call sub-scripts from your diagnostic's
      driver script. This directory should be treated as **read-only**.

    ``DATA_DIR``:
      Path to the model input data directory (retained for backwards compatibility with v3.5 and earlier PODs).

    ``WORK_DIR``:
      Path to your diagnostic's *working directory*, which is where all output data should be written
      (as well as any temporary files).

      The framework creates the following subdirectories within this directory:

      - ``$WORK_DIR/obs/PS`` and ``$WORK_DIR/model/PS``: All output plots produced by your diagnostic should be
        written to one of these two directories. Only files in these locations will be converted to bitmaps for
        HTML output.
      - ``$WORK_DIR/obs/netCDF`` and ``$WORK_DIR/model/netCDF``: Any output data files your diagnostic wants to make
        available to the user should be saved to one of these two directories.
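
The output locations above can be assembled from ``WORK_DIR`` along the following lines. This is only a sketch:
the helper ``output_path``, the file names, and the fallback directory are hypothetical, and the framework itself
does not provide such a function.

.. code-block:: python

  import os


  def output_path(work_dir, filename, source="model", kind="PS"):
      """Build a path inside one of the framework-created output subdirectories."""
      if source not in ("model", "obs"):
          raise ValueError("source must be 'model' or 'obs'")
      if kind not in ("PS", "netCDF"):
          raise ValueError("kind must be 'PS' or 'netCDF'")
      return os.path.join(work_dir, source, kind, filename)


  # WORK_DIR is set by the framework; the fallback is only for trying this out.
  work_dir = os.environ.get("WORK_DIR", "/tmp/example_pod")
  plot_file = output_path(work_dir, "precip_mean.eps")
  data_file = output_path(work_dir, "precip_mean.nc", source="obs", kind="netCDF")

Restricting ``source`` and ``kind`` to the four documented combinations guards against writing plots to a location
the framework will not convert.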

Model run information
---------------------
``case_env_file``:
  Path to the YAML file with case-specific environment variables, obtained with
  ``os.environ['case_env_file']``. The following environment variables are loaded into a dictionary
  from the case environment file:

    ``CATALOG_FILE``:
      Path to the intake-esm catalog header JSON file used to access the data catalog of
      processed data files generated by the framework. If ``no_pp`` is specified at runtime and no custom
      preprocessing scripts are run on the input dataset, ``CATALOG_FILE`` is the path to the input data catalog
      specified with the ``DATA_CATALOG`` parameter in the runtime configuration file.

    ``CASENAME``:
      User-provided label describing each run of model data being analyzed. Single-run PODs submitted to version 3.5 and
      earlier of the framework directly access this variable with ``os.environ['CASENAME']``.

    ``startdate``, ``enddate``:
      Strings in the format ``<yyyymmdd>`` or ``<yyyymmddHHMMSS>`` describing the start and end dates of the
      analysis period for a case associated with ``CASENAME``. Single-run PODs submitted to version 3.5 and
      earlier of the framework directly access these variables with ``os.environ['startdate']`` and
      ``os.environ['enddate']``.
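
Loading the case environment file into a dictionary might look like the sketch below. It assumes the third-party
PyYAML package is installed; the helper name ``load_case_info`` is hypothetical.

.. code-block:: python

  import os

  import yaml  # PyYAML, a third-party package (an assumption of this sketch)


  def load_case_info(env=os.environ):
      """Read the YAML file named by ``case_env_file`` into a dictionary."""
      with open(env["case_env_file"], "r") as stream:
          return yaml.safe_load(stream)

A POD would call ``load_case_info()`` once at startup and look up entries such as ``CATALOG_FILE`` in the result.
How the entries are nested inside the file is not described on this page, so inspect the returned dictionary
before relying on a particular layout. Note also that ``yaml.safe_load`` converts unquoted numeric values to
numbers, so date-like entries may arrive as integers rather than strings.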

Locations of model data files
-----------------------------

Processed model data files are written to ``$WORK_DIR`` and accessed via the intake-esm catalog
output by the framework; if no preprocessing is performed, the data are accessed via the original catalog
passed to the framework at runtime. In either case, the catalog path is given by the ``CATALOG_FILE``
environment variable in the ``case_env_file``.

Names of variables and dimensions
---------------------------------

These are set depending on the data your diagnostic requests in its :doc:`settings file <./pod_settings>`. Refer to
the examples below if you're unfamiliar with how that file is organized.

Simple example
--------------

We only give the relevant parts of the :doc:`settings file <ref_settings>` below.

.. code-block:: js

  "dimensions": {
    "lat": {
      "standard_name": "latitude",
      ...
    },
    "lon": {
      "standard_name": "longitude",
      ...
    },
    "time": {
      "standard_name": "time",
      ...
    }
  },
  "varlist": {
    "pr": {
      "standard_name": "precipitation_flux",
    }
  }

The framework will set the following environment variables in the ``case_env_file``:

#. ``lat_coord``: Name of the latitude dimension in the model's native format
#. ``lon_coord``: Name of the longitude dimension in the model's native format
#. ``time_coord``: Name of the time dimension in the model's native format
#. ``pr_var``: Name of the precipitation variable
#. ``PR_FILE`` (retained for backwards compatibility): Absolute path to the file containing
   ``pr`` data, e.g. ``/dir/precip.nc``.

As with ``CASENAME``, ``startdate``, and ``enddate``, the variable-specific environment variables are
read directly from ``os.environ`` in single-run PODs from framework versions older than v4.0.
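
For a single-run POD of that kind, the variables from the simple example could be gathered as sketched below.
The helper ``legacy_pr_settings`` is hypothetical; it only collects the names listed above and reports any that
are missing.

.. code-block:: python

  import os


  def legacy_pr_settings(env=os.environ):
      """Collect the variables from the simple example into one dictionary."""
      names = ("lat_coord", "lon_coord", "time_coord", "pr_var", "PR_FILE")
      missing = [name for name in names if name not in env]
      if missing:
          raise KeyError("not set by the framework: " + ", ".join(missing))
      return {name: env[name] for name in names}

Checking for missing names up front gives a single clear error message rather than a failure partway through
the diagnostic.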