# 5. POD development guidelines

The framework itself is written in Python, and can call PODs written in any scripting language. However, Python support by the lead team will be “first among equals” in terms of priority for allocating developer resources, etc.

• To achieve portability, the MDTF cannot accept PODs written in closed-source languages (e.g., MATLAB and IDL; try Octave and GDL if possible). We also cannot accept PODs written in compiled languages (e.g., C or Fortran): installation would rapidly become impractical if users had to check compilation options for each POD.

• Python is strongly encouraged for new PODs; PODs funded through the CPO grant are requested to be developed in Python. Python version >= 3.6 is required. Official support for Python 2 was discontinued as of January 2020.

• If your POD was previously developed in NCL or R (and development is not funded through a CPO grant), you do not need to re-write existing scripts in Python 3 if doing so is likely to introduce new bugs into stable code, especially if you’re unfamiliar with Python.

• If scripts were written in closed-source languages, translation to Python 3.6 or above is required.
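As an illustration of the version requirement above, a POD's Python entry script could check the interpreter version before doing any work, so that an unsupported Python stops with a readable message instead of an obscure syntax or import error. This is only a sketch: the names `MIN_PYTHON` and `check_python_version` are hypothetical and are not part of the framework.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the guidelines above


def check_python_version(version_info=None, minimum=MIN_PYTHON):
    """Raise RuntimeError with a clear message if the interpreter is too old.

    version_info defaults to the running interpreter's sys.version_info;
    passing it explicitly makes the check easy to test.
    """
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    if current < minimum:
        raise RuntimeError(
            "This POD requires Python {}.{} or newer, but is running under {}.{}.".format(
                minimum[0], minimum[1], current[0], current[1]
            )
        )
    return current


if __name__ == "__main__":
    check_python_version()
```

The check uses `str.format` rather than f-strings so that the file still parses, and the message is still shown, under interpreters older than 3.6.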

## 5.2. Preparation for POD implementation

We assume that, at this point, you have a set of scripts, written in languages consistent with the framework’s open source policy, that a) read in model data, b) perform analysis, and c) output figures. Here are 3 steps to prepare your scripts for POD implementation.

We recommend running the framework on the sample model data again with both save_ps and save_nc in the configuration input src/default_tests.jsonc set to true. This will preserve the directories and files created by individual PODs in the output directory, which can come in handy as you go through the instructions below and help you understand how a POD is expected to write output.

## 5.5. Guidelines for testing your POD

Test before distribution. Find people (e.g., nearby postdocs/grads and members of other POD-developing groups) who are not involved in your POD’s implementation and are willing to help. Give them the tar files and point them to your GitHub repo. Ask them to try running the framework with your POD following the Getting Started instructions. Ask for comments on whether they can understand the documentation.

Test how the POD fails. Does it stop with clear errors if it doesn’t find the files it needs? What if the requested dates are not present in the model data? Can developers run it on data from another model? Here are some simple tests you should try:

• Move the inputdata directory around. Your POD should still work by simply updating the values of OBS_DATA_ROOT and MODEL_DATA_ROOT in the configuration input file.

• Try to run your POD with a different set of model data.

• If you have problems getting another set of data, try changing the files’ CASENAME and variable naming convention. The POD should work by updating CASENAME and convention in the configuration input.

• Try your POD on a different machine. Check that your POD works with a reasonable machine configuration and computational power, e.g., that it can run on a machine with 32 GB of memory and finish its computation in 10 min. Will memory and run time become a problem if someone runs your POD on model output of high spatial resolution and temporal frequency? (Memory problems can be avoided by, e.g., reading in data in segments.) Does it depend on a particular version of a certain library? Consult the lead team if there are any problems you cannot solve.
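To make "stop with clear errors" concrete, the following sketch shows one way a Python POD could validate its inputs up front. It assumes that MODEL_DATA_ROOT and CASENAME are visible to the POD as environment variables of those names; the helper names (`require_env`, `require_files`, `require_years`) and the file layout in the usage block are hypothetical and serve only as an illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            "Environment variable {!r} is not set; "
            "check the configuration input file.".format(name)
        )
    return value


def require_files(paths):
    """Raise FileNotFoundError naming every missing file, not just the first."""
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(
            "Missing required input file(s):\n  " + "\n  ".join(missing)
        )


def require_years(first_year, last_year, available_years):
    """Raise ValueError if any requested year is absent from the data."""
    available = set(available_years)
    absent = [y for y in range(first_year, last_year + 1) if y not in available]
    if absent:
        raise ValueError(
            "Requested years {}-{}, but the data lack: {}".format(
                first_year, last_year, absent
            )
        )


if __name__ == "__main__":
    model_root = require_env("MODEL_DATA_ROOT")
    casename = require_env("CASENAME")
    # Hypothetical directory layout and file name, for illustration only.
    require_files([os.path.join(model_root, casename, "example_variable.nc")])
```

Reporting all missing files at once saves a tester from fixing one path, re-running, and hitting the next error.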

## 5.6. Other tips on implementation

1. Structure of the code package: Implementing the constituent PODs in accordance with the structure described in earlier sections makes it easy to pass the package (or just part of it) to other groups.

2. Robustness to model file/variable names: Each POD should be robust to modest changes in the file/variable names of the model output; see Getting Started regarding the model data filename structure, and An example of using framework-provided environment variables and POD development checklist regarding the use of environment variables and robustness tests. This also makes it easier to apply the code package to a broader range of model output.

3. Save digested data after analysis: Saved intermediate results can be used, e.g., to save time when a substantial computation can be re-used when re-running or re-plotting diagnostics. See Step 5: Output and cleanup regarding where to save the output.

4. Self-documenting: Document the code to support maintenance and adaptation, to provide references on the scientific underpinnings, and so that the code package works out of the box without support. See POD development checklist.

5. Handle large model data: The spatial resolution and temporal frequency of climate model output have increased in recent years. Developers should therefore take into account the size of the model data relative to the available memory. For instance, the example PODs precip_diurnal_cycle and Wheeler_Kiladis analyze only the part of the available model output that falls within the period specified by the environment variables FIRSTYR and LASTYR, and the convective_transition_diag module reads in data in segments.

6. Basic vs. advanced diagnostics (within a POD): Separate the diagnostics into parts, e.g., to isolate those that might need adjustment when model performance falls outside the observed range.

7. Avoid special characters (!@#$%^&*) in file/script names.
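The two strategies mentioned in tip 5, restricting the analysis to the period given by FIRSTYR and LASTYR and reading data in segments, can be sketched as follows. This is not how the example PODs are implemented; `year_window`, `mean_in_segments` and the `read_segment` callable are hypothetical names, and `read_segment` stands in for whatever I/O routine your POD uses to load a range of years.

```python
import os


def year_window(environ=None):
    """Read the analysis period from the FIRSTYR and LASTYR environment variables."""
    if environ is None:
        environ = os.environ
    first, last = int(environ["FIRSTYR"]), int(environ["LASTYR"])
    if first > last:
        raise ValueError("FIRSTYR ({}) is later than LASTYR ({})".format(first, last))
    return first, last


def mean_in_segments(read_segment, first_year, last_year, years_per_segment=5):
    """Average values over [first_year, last_year], loading one segment at a time.

    read_segment(start, end) is a hypothetical callable that returns an iterable
    of numbers for the years start..end inclusive. Only running totals are kept
    between segments, so each segment can be released before the next is loaded.
    """
    total, count = 0.0, 0
    for start in range(first_year, last_year + 1, years_per_segment):
        end = min(start + years_per_segment - 1, last_year)
        for value in read_segment(start, end):
            total += value
            count += 1
    if count == 0:
        raise ValueError(
            "No data found between {} and {}".format(first_year, last_year)
        )
    return total / count
```

A POD would call `first, last = year_window()` and pass its own reader to `mean_in_segments`. Statistics that cannot be accumulated as simple running totals need a different reduction, but the same loop over segments applies.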

See Running the package on sample model data and the framework operation walkthrough for details on how the package is called. See the command line reference for documentation on command line options (or run mdtf --help).

Avoid making assumptions about the machine on which the framework will run beyond what’s listed here; a development priority is to interface the framework with cluster and cloud job schedulers to enable individual PODs to run in a concurrent, distributed manner.