1. GFDL-specific information

This page contains information specific to the site installation at the Geophysical Fluid Dynamics Laboratory.

1.1. Site installation

The DET team maintains a site-wide installation of the framework and all supporting data at /home/mdteam/DET/analysis/mdtf/MDTF-diagnostics. This installation is kept up to date and is accessible from both workstations and PPAN. Please contact us if it can’t accommodate your use case.

1.2. FRE-centric modes of operation

In addition to the standard, interactive method of running MDTF diagnostics as described in the rest of the documentation, the site installation provides alternative ways to run the diagnostics within GFDL’s existing workflow.

  1. Within FRE XMLs. This is done by calling the mdtf_gfdl.csh wrapper script from an <analysis> tag in the XML. Currently, FRE requires that each analysis script be associated with a single model <component>. This poses difficulties for diagnostics which use data generated by multiple components. We provide two ways to address this issue:

    1. If it’s known ahead of time that a given <component> will dominate the run time and finish last, one can call mdtf_gfdl.csh from an <analysis> tag in that component only. In this case, the framework will search all data present in the /pp/ output directory when it’s called. The <component> being used doesn’t need to generate data analyzed by the diagnostics; in this case it’s only used to schedule the diagnostics’ execution.

    2. If one doesn’t know which <component> will finish last, a more robust solution is to call mdtf_gfdl.csh --component_only from each <component> generating data to be analyzed. When the --component_only flag is set, each time the framework is called it runs only those diagnostics for which all input data is available and which haven’t already run (that is, which haven’t yet written their output to $OUTPUT_DIR).

  2. As a batch job on PPAN, managed via Slurm. This is handled by the mdtf_gfdl_interactive.csh wrapper script.

  3. Called from an interactive shell on PPAN or workstations.
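
The selection rule described for --component_only (run a diagnostic only when all of its input data is available and it has not yet written output to $OUTPUT_DIR) can be sketched as follows. This is a minimal illustration, not the framework's implementation: the function name runnable_diagnostics, the diagnostic and variable names, and the assumption that finished output appears as a subdirectory named after the diagnostic are all hypothetical.

```python
import tempfile
from pathlib import Path


def runnable_diagnostics(required_inputs, available, output_dir):
    """Return the names of diagnostics that can run now.

    required_inputs: dict mapping diagnostic name -> set of variables it needs
    available:       set of variables currently present in the /pp/ output
    output_dir:      directory standing in for $OUTPUT_DIR
    """
    ready = []
    for name, needed in required_inputs.items():
        has_all_inputs = needed <= available  # subset test
        # Assumption for this sketch: a finished diagnostic leaves a
        # subdirectory named after itself in the output directory.
        already_ran = (Path(output_dir) / name).exists()
        if has_all_inputs and not already_ran:
            ready.append(name)
    return ready


# Made-up example: pod_b has already run, pod_c is still waiting for "tos".
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pod_b").mkdir()
    ready = runnable_diagnostics(
        {"pod_a": {"tas", "pr"}, "pod_b": {"tas"}, "pod_c": {"tos"}},
        available={"tas", "pr"},
        output_dir=tmp,
    )
print(ready)
```

Because each call repeats the same check, calling the framework once per <component> eventually runs every diagnostic exactly once, regardless of the order in which components finish.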

1.3. Data retrieval options

The framework is currently configured to search for data in three types of directory hierarchies. The framework will determine which one is intended based on its input, but this choice can be overridden by passing one of the following values with the --data_manager flag:

  • The /pp/ hierarchy used by FRE (--data_manager Gfdl_PP). In this case CASE_ROOT_DIR should be set to the root of the directory hierarchy (i.e., ending in /pp).

  • The CMIP6 DRS for published data on the Unified Data Archive (--data_manager Gfdl_UDA_CMIP6). In this case CASE_ROOT_DIR should not be set, but the --model and --experiment settings should be populated.

  • The CMIP6 DRS for unpublished data on /data_cmip6. This option must be requested explicitly with --data_manager Gfdl_data_cmip6. In this case CASE_ROOT_DIR should not be set, but the --model and --experiment settings should be populated.
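
The rules above for which settings go with each --data_manager value can be summarized in a short sketch. This is only an illustration of the documented constraints, not the framework's own validation code: the function name check_data_settings, the example path, and the example model and experiment names are assumptions made for this example.

```python
def check_data_settings(data_manager, case_root_dir=None, model=None, experiment=None):
    """Return a list of problems with the settings (empty if they are consistent).

    Hypothetical helper: it only restates the rules documented above.
    """
    problems = []
    if data_manager == "Gfdl_PP":
        # FRE /pp/ hierarchy: CASE_ROOT_DIR is the root, ending in /pp.
        if not case_root_dir:
            problems.append("CASE_ROOT_DIR must be set")
        elif not case_root_dir.rstrip("/").endswith("/pp"):
            problems.append("CASE_ROOT_DIR should end in /pp")
    elif data_manager in ("Gfdl_UDA_CMIP6", "Gfdl_data_cmip6"):
        # CMIP6 DRS: data is located from --model and --experiment instead.
        if case_root_dir:
            problems.append("CASE_ROOT_DIR should not be set")
        if not model:
            problems.append("--model should be populated")
        if not experiment:
            problems.append("--experiment should be populated")
    else:
        problems.append("unknown data manager: " + data_manager)
    return problems


# Illustrative values only; the path and names below are made up for the example.
pp_ok = check_data_settings("Gfdl_PP", case_root_dir="/archive/user/exp/pp")
cmip_ok = check_data_settings("Gfdl_UDA_CMIP6", model="GFDL-CM4", experiment="historical")
cmip_bad = check_data_settings("Gfdl_data_cmip6", case_root_dir="/some/dir")
print(pp_ok, cmip_ok, cmip_bad)
```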

1.4. GFDL-specific options

In addition to the framework’s normal command-line options, the following site-specific options are recognized:

  • --GFDL-PPAN-TEMP, --GFDL_PPAN_TEMP <DIR>: If running on the GFDL PPAN cluster, set the $MDTF_GFDL_TMPDIR environment variable to this location and create temp files here. Note: must be accessible via gcp. Defaults to $TMPDIR.

  • --GFDL-WS-TEMP, --GFDL_WS_TEMP <DIR>: If running on a GFDL workstation, set the $MDTF_GFDL_TMPDIR environment variable to this location and create temp files here. The directory will be created if it doesn’t exist. Note: must be accessible via gcp. Defaults to /net2/$USER/tmp.

  • --frepp: Normally this is set by the mdtf_gfdl.csh wrapper script, and not directly by the user. If set, this flag runs the framework in “online” mode (either of the two FRE XML methods described under item 1 of section 1.2), processing data as part of the FRE pipeline.

  • --ignore-component, --ignore_component: Normally this is set by the mdtf_gfdl.csh wrapper script, and not directly by the user. If set, this flag tells the framework to search the entire /pp/ directory for model data (the first FRE XML method described under item 1 of section 1.2); by default, the search is restricted to the model component passed by FRE. Ignored if --frepp is not set.
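
How $MDTF_GFDL_TMPDIR gets its value under the two temp-directory options above can be sketched as below. This is an illustration of the documented defaults only: the function name resolve_gfdl_tmpdir, the on_ppan switch, and the example values are assumptions, and the sketch does not create the directory or check gcp accessibility.

```python
import os


def resolve_gfdl_tmpdir(on_ppan, cli_value=None, env=None):
    """Pick the value $MDTF_GFDL_TMPDIR would take under the rules above."""
    env = os.environ if env is None else env
    if cli_value:
        # --GFDL_PPAN_TEMP or --GFDL_WS_TEMP was given explicitly.
        return cli_value
    if on_ppan:
        return env["TMPDIR"]  # documented default on PPAN
    return "/net2/{}/tmp".format(env["USER"])  # documented workstation default


# Made-up environment values for illustration.
example_env = {"TMPDIR": "/scratch/job123", "USER": "jane"}
ppan_default = resolve_gfdl_tmpdir(True, env=example_env)
ws_default = resolve_gfdl_tmpdir(False, env=example_env)
ws_override = resolve_gfdl_tmpdir(False, cli_value="/local/tmp/jane", env=example_env)
print(ppan_default, ws_default, ws_override)
```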

1.5. GFDL-specific defaults

The following paths are set to more useful default values:

  • --OBS-DATA-REMOTE, --OBS_DATA_REMOTE <DIR>: Site-specific installation of observational data used by individual PODs. Defaults to /home/Oar.Gfdl.Mdteam/DET/analysis/mdtf/obs_data. If running on PPAN, this data will be copied to the current node with gcp.

  • --OBS-DATA-ROOT, --OBS_DATA_ROOT <DIR>: Local directory for observational data. Defaults to $MDTF_GFDL_TMPDIR/inputdata/obs_data, where the environment variable $MDTF_GFDL_TMPDIR is defined as described above.

  • --MODEL-DATA-ROOT, --MODEL_DATA_ROOT <DIR>: Local directory for model data. Defaults to $MDTF_GFDL_TMPDIR/inputdata/model, where the environment variable $MDTF_GFDL_TMPDIR is defined as described above.

  • --WORKING-DIR, --WORKING_DIR <DIR>: Working directory. Defaults to $MDTF_GFDL_TMPDIR/wkdir, where the environment variable $MDTF_GFDL_TMPDIR is defined as described above.

  • --OUTPUT-DIR, --OUTPUT_DIR, -o <DIR>: Destination for output files. Defaults to $HOME/mdtf_out, which will be created if it doesn’t exist.
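
The defaults listed above all derive from $MDTF_GFDL_TMPDIR, except for the output directory, which derives from $HOME. A minimal sketch of that mapping, assuming a hypothetical helper gfdl_default_paths and made-up directory values:

```python
import posixpath


def gfdl_default_paths(tmpdir, home):
    """Build the documented default paths from $MDTF_GFDL_TMPDIR and $HOME."""
    return {
        "OBS_DATA_ROOT": posixpath.join(tmpdir, "inputdata", "obs_data"),
        "MODEL_DATA_ROOT": posixpath.join(tmpdir, "inputdata", "model"),
        "WORKING_DIR": posixpath.join(tmpdir, "wkdir"),
        "OUTPUT_DIR": posixpath.join(home, "mdtf_out"),
    }


# Made-up values standing in for $MDTF_GFDL_TMPDIR and $HOME.
defaults = gfdl_default_paths("/net2/jane/tmp", "/home/jane")
for option, path in defaults.items():
    print(option, path)
```

Only OUTPUT_DIR lands outside the temporary area, so results persist after temp files are cleaned up.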