MDTF-diagnostics Environment variables
======================================

This page describes the environment variables that the framework sets for your diagnostic when it is run.

Overview
--------

The MDTF-diagnostics framework can be viewed as a "wrapper" for your code that handles data fetching and munging. Your code communicates with this wrapper in two ways:

- The :doc:`settings file <./pod_settings>` is where your code talks to the framework: when you write your code, you document what model data your code uses (not covered on this page; follow the link for details).
- The framework "talks" to a POD through a combination of shell environment variables passed directly to the subprocess via the ``env`` parameter, and by defining a ``case_info.yml`` file in ``$WORK_DIR`` with case-specific environment variables. The framework communicates **all** runtime information this way, in order to 1) pass information in a language-independent way, and 2) make writing diagnostics easier (i.e., the POD does not need to parse command-line settings).

**Note** that environment variables are always strings. Your POD will need to cast non-text data to the appropriate type (e.g., the bounds of a case analysis time period, ``startdate`` and ``enddate``, will need to be converted to integers). Also note that names of environment variables are case-sensitive.

Paths
-----

The following variables are accessed through the ``os.environ`` mapping:

``OBS_DATA``:
  Path to the top-level directory containing any observational or reference data you've provided as the author of your diagnostic. Any data your diagnostic uses that doesn't come from the model being analyzed should go here (i.e., you supply it to the framework maintainers, they host it, and the user downloads it when they install the framework). The framework will ensure this is copied to a local filesystem when your diagnostic is run, but this directory should be treated as **read-only**.

``POD_HOME``:
  Path to the top-level directory containing your diagnostic's source code. This will be of the form ``.../MDTF-diagnostics/diagnostics/``. This can be used to call sub-scripts from your diagnostic's driver script. This directory should be treated as **read-only**.

``DATA_DIR``:
  (Retained for backwards compatibility with v3.5 and earlier PODs.) Location of the model input data directory.

``WORK_DIR``:
  Path to your diagnostic's *working directory*, which is where all output data should be written (as well as any temporary files).

  The framework creates the following subdirectories within this directory:

  - ``$WORK_DIR/obs/PS`` and ``$WORK_DIR/model/PS``: All output plots produced by your diagnostic should be written to one of these two directories. Only files in these locations will be converted to bitmaps for HTML output.
  - ``$WORK_DIR/obs/netCDF`` and ``$WORK_DIR/model/netCDF``: Any output data files your diagnostic wants to make available to the user should be saved to one of these two directories.

Model run information
---------------------

``case_env_file``:
  Location of the YAML file with case-specific environment variables, obtained from ``os.environ["case_env_file"]``.

The following environment variables are loaded into a dictionary from the case environment file:

``CATALOG_FILE``:
  Path to the intake-esm catalog header JSON file used to access the data catalog of processed data files generated by the framework. If ``no_pp`` is specified at runtime, and no custom preprocessing scripts are run on the input dataset, ``CATALOG_FILE`` is the path to the input data catalog specified with the ``DATA_CATALOG`` parameter in the runtime configuration file.

``CASENAME``:
  User-provided label describing each run of model data being analyzed. Single-run PODs submitted to version 3.5 and earlier of the framework access this variable directly with ``os.environ['CASENAME']``.

``startdate``, ``enddate``:
  Strings describing the start and end dates of the analysis period for the case associated with ``CASENAME``. Single-run PODs submitted to version 3.5 and earlier of the framework access these variables directly with ``os.environ['startdate']`` and ``os.environ['enddate']``.

Locations of model data files
-----------------------------

The processed model data files are written to ``$WORK_DIR`` and accessed via the catalog named by the ``CATALOG_FILE`` environment variable in the ``case_env_file``: either the intake-esm catalog output by the framework or, if no preprocessing is performed, the original catalog passed to the framework at runtime.

Names of variables and dimensions
---------------------------------

These are set depending on the data your diagnostic requests in its :doc:`settings file <./pod_settings>`. Refer to the example below if you're unfamiliar with how that file is organized.

Simple example
--------------

We only give the relevant parts of the :doc:`settings file <./pod_settings>` below.

.. code-block:: js

  "dimensions": {
    "lat": {
      "standard_name": "latitude",
      ...
    },
    "lon": {
      "standard_name": "longitude",
      ...
    },
    "time": {
      "standard_name": "time",
      ...
    }
  },
  "varlist": {
    "pr": {
      "standard_name": "precipitation_flux",
    }
  }

The framework will set the following environment variables in the ``case_env_file``:

#. ``lat_coord``: Name of the latitude dimension in the model's native format
#. ``lon_coord``: Name of the longitude dimension in the model's native format
#. ``time_coord``: Name of the time dimension in the model's native format
#. ``pr_var``: Name of the precipitation variable
#. ``PR_FILE`` (retained for backwards compatibility): Absolute path to the file containing ``pr`` data, e.g. ``/dir/precip.nc``.

As with ``CASENAME``, ``startdate``, and ``enddate``, the variable-specific environment variables are accessed through ``os.environ`` in single-run PODs from framework versions older than v4.0.
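As a minimal sketch of the legacy single-run access pattern described on this page, the snippet below gathers the variables from the simple example into one dictionary and casts the dates to integers. The helper name ``read_case_settings`` and all values in ``example_env`` are hypothetical and chosen only for illustration; in a real run the framework sets these variables, and the function would be called with no argument so that it reads ``os.environ``.

```python
import os
from typing import Mapping, Optional


def read_case_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect legacy single-run settings from environment variables.

    Hypothetical helper: `env` defaults to os.environ, and a plain dict
    can be passed instead to try the function outside the framework.
    """
    if env is None:
        env = os.environ
    return {
        "casename": env["CASENAME"],
        # Environment variables are always strings, so cast the dates.
        "startdate": int(env["startdate"]),
        "enddate": int(env["enddate"]),
        # Names are case-sensitive: "pr_var", not "PR_VAR".
        "pr_name": env["pr_var"],
        "lat_name": env["lat_coord"],
        "lon_name": env["lon_coord"],
        "time_name": env["time_coord"],
        # PR_FILE is retained for backwards compatibility; tolerate its absence.
        "pr_file": env.get("PR_FILE"),
        # Plots for the model case are written under $WORK_DIR/model/PS.
        "plot_dir": os.path.join(env["WORK_DIR"], "model", "PS"),
    }


# Illustrative values only (assumptions, not framework defaults).
example_env = {
    "CASENAME": "example_case",
    "startdate": "19800101",
    "enddate": "19891231",
    "pr_var": "precip",
    "lat_coord": "lat",
    "lon_coord": "lon",
    "time_coord": "time",
    "PR_FILE": "/dir/precip.nc",
    "WORK_DIR": "/tmp/wk",
}

settings = read_case_settings(example_env)
print(settings["casename"], settings["startdate"], settings["enddate"])
```

Passing the mapping as an argument keeps the lookup logic separate from the process environment, which makes it easy to exercise a POD's setup code without running the full framework.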