src.cmip6 module

Code to parse CMIP6 controlled vocabularies and elements of the CMIP6 DRS.

Specifications for both were taken from the CMIP6 planning document, accessed at http://goo.gl/v1drZl; we aren’t aware of a permanent URL for this information.

The CMIP6 controlled vocabularies (lists of registered MIPs, modeling centers, etc.) are derived from data in the PCMDI/cmip6-cmor-tables repo, which is included as a git subtree under /data.

Warning

Functionality here has been added as needed for the project and is incomplete. For example, parsing subexperiments is not supported.

class src.cmip6.CMIP6_CVs(*args, **kwargs)[source]

Bases: Singleton

Interface for looking up information from the CMIP6 controlled vocabulary (CV) file.

Lookups are implemented in an ad-hoc way with util.MultiMap; a more robust solution would use sqlite.

__init__(unittest=False)[source]

Constructor. Only executed once, since this is a Singleton. Reads and parses data in CMIP6_CV.json.
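
The once-only construction described above can be illustrated with a minimal sketch. SingletonMeta and DemoCVs are hypothetical names used only for illustration; the framework's own Singleton base class may be implemented differently.

```python
class SingletonMeta(type):
    """Hypothetical metaclass: cache one instance per class."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Construct the object (and run __init__) only on the first call.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class DemoCVs(metaclass=SingletonMeta):
    """Hypothetical stand-in for a class that parses a CV file once."""
    init_count = 0

    def __init__(self):
        # Expensive file parsing would happen here, a single time.
        DemoCVs.init_count += 1


first = DemoCVs()
second = DemoCVs()  # returns the cached instance; __init__ is not re-run
```

Both names refer to the same object, so any parsed data is shared by every caller.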

is_in_cv(category, items)[source]

Determine whether each entry in items is a valid value for the CV category named by category.

Parameters:
  • category (str) – The CV category to use to validate values.

  • items (str or list of str) – Entries whose validity we’d like to check.

Returns:

Boolean or list of booleans, corresponding to the validity of the entries in items.
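
The return convention (one boolean for a string, a list of booleans for a list) might look like the following sketch. It is a standalone function operating on a toy dictionary, not the method's actual implementation; the cv argument and toy_cv contents are assumptions made for illustration.

```python
def is_in_cv(cv, category, items):
    """Hypothetical standalone check of items against cv[category]."""
    valid = cv[category]  # raises KeyError for an unknown category
    if isinstance(items, str):
        return items in valid
    return [item in valid for item in items]


# Toy vocabulary; the real CV file holds many more categories and values.
toy_cv = {"activity_id": {"CMIP", "ScenarioMIP"}}

single = is_in_cv(toy_cv, "activity_id", "CMIP")
several = is_in_cv(toy_cv, "activity_id", ["CMIP", "NotAMIP"])
```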

get_lookup(source, dest)[source]

Find the appropriate lookup table to convert values in source (keys) to values in dest (values), generating it if necessary.

Parameters:
  • source (str) – The CV category to use for the keys.

  • dest (str) – The CV category to use for the values.

Returns:

util.MultiMap providing a dict-like lookup interface, i.e., dest_value = d[source_key].
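
The idea of generating a lookup table on first use and reusing it afterwards can be sketched as follows. This uses a plain dict of sets rather than util.MultiMap, and TOY_RECORDS and the module-level cache are assumptions made for illustration only.

```python
from collections import defaultdict

# Toy records relating two CV categories; illustrative only.
TOY_RECORDS = [
    {"source_id": "CESM2", "institution_id": "NCAR"},
    {"source_id": "GFDL-CM4", "institution_id": "NOAA-GFDL"},
    {"source_id": "GFDL-ESM4", "institution_id": "NOAA-GFDL"},
]

_lookup_cache = {}


def get_lookup(source, dest):
    """Return a dict mapping each source value to a set of dest values."""
    key = (source, dest)
    if key not in _lookup_cache:
        # Build the table on the first request, then reuse it.
        table = defaultdict(set)
        for record in TOY_RECORDS:
            table[record[source]].add(record[dest])
        _lookup_cache[key] = dict(table)
    return _lookup_cache[key]


by_institution = get_lookup("institution_id", "source_id")
```

One key can map to several values, which is why each entry is a set.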

lookup(source_items, source, dest)[source]

Look up the corresponding dest values for source_items (keys).

Parameters:
  • source_items (str or list) – One or more keys.

  • source (str) – The CV category that the items in source_items belong to.

  • dest (str) – The CV category we’d like the corresponding values for.

Returns:

List of dest values corresponding to each entry in source_items.

lookup_single(source_item, source, dest)[source]

The same as lookup(), but performs the lookup for a single source_item and raises KeyError unless exactly one value is returned.
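
The difference between the two methods can be sketched with standalone functions over a toy table. The table argument, the flattened return list and the sorted ordering are assumptions for illustration and may not match the real methods.

```python
def lookup(table, source_items):
    """Collect the values for one key or for a list of keys."""
    if isinstance(source_items, str):
        source_items = [source_items]
    values = []
    for item in source_items:
        values.extend(sorted(table[item]))
    return values


def lookup_single(table, source_item):
    """Like lookup(), but require exactly one value for the key."""
    values = lookup(table, source_item)
    if len(values) != 1:
        raise KeyError(source_item)
    return values[0]


# Toy table: one institution has one model, the other has two.
toy_table = {"NCAR": {"CESM2"}, "NOAA-GFDL": {"GFDL-CM4", "GFDL-ESM4"}}

unique_model = lookup_single(toy_table, "NCAR")
```

Calling lookup_single(toy_table, "NOAA-GFDL") raises KeyError in this sketch, because two values match.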

table_id_from_freq(frequency)[source]

Specialized lookup to determine which MIP tables use data at the requested frequency.

Should really be handled as a special case of lookup().

Parameters:

frequency (CMIP6DateFrequency) – The data frequency to look up.

Returns:

List of MIP table table_id names, if any, that use data at the given frequency.

class src.cmip6.CMIP6DateFrequency(quantity, unit=None)[source]

Bases: DateFrequency

Subclass of DateFrequency to parse data frequency information as encoded in MIP tables, DRS filenames, etc.

Extends DateFrequency by recording whether the data is a climatological average, although this information is not currently used.

Reference: CMIP6 planning document page 16.
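
As a rough illustration of the kind of string being parsed, the sketch below splits a frequency such as 6hrPt or monC into a count, a unit and flags. ParsedFrequency, parse_frequency and the regular expression are hypothetical, and the set of recognized units and suffixes is an assumption rather than the class's actual grammar.

```python
import re
from typing import NamedTuple


class ParsedFrequency(NamedTuple):
    """Hypothetical container for the pieces of a frequency string."""
    quantity: int
    unit: str
    climatology: bool
    static: bool


# Assumed pattern covering strings such as "mon", "6hrPt", "monC" and "fx".
_FREQ_RE = re.compile(
    r"(?P<quantity>\d*)(?P<unit>subhr|hr|day|mon|yr|dec|fx)(?P<flag>CM|C|Pt)?"
)


def parse_frequency(text):
    match = _FREQ_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized frequency: {text!r}")
    unit = match["unit"]
    flag = match["flag"] or ""
    return ParsedFrequency(
        quantity=int(match["quantity"] or 1),  # a missing number means 1
        unit=unit,
        climatology=flag.startswith("C"),  # "C" or "CM" marks a climatology
        static=(unit == "fx"),  # "fx" denotes time-independent data
    )


six_hourly = parse_frequency("6hrPt")
```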

format()[source]

Return string representation of the object, as used in the CMIP6 DRS.

format_local()

String representation as used in the framework’s local directory hierarchy (defined in src.data_manager.DataManager.dest_path().)

classmethod from_struct(str_)

Object instantiation method used by src.util.dataclass.mdtf_dataclass() for type coercion.

property is_static

Property indicating time-independent data (e.g., fx in CMIP6 DRS.)

max = datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)
min = datetime.timedelta(days=-999999999)
resolution = datetime.timedelta(microseconds=1)

class src.cmip6.CMIP6_VariantLabel(variant_label: str = sentinel.Mandatory, realization_index: int = None, initialization_index: int = None, physics_index: int = None, forcing_index: int = None)[source]

Bases: object

regex_dataclass which represents and parses the CMIP6 DRS variant label identifier string (e.g., r1i1p1f1.)

References: https://earthsystemcog.org/projects/wip/mip_table_about, although this doesn’t document all cases used in CMIP6. See also note 8 on page 9 of the CMIP6 planning document.

variant_label: str = sentinel.Mandatory

Input to from_string(). Complete variant label identifier string (e.g., ‘r1i1p1f1’.)

realization_index: int = None

Realization index (integer following the letter r.)

initialization_index: int = None

Initialization index (integer following the letter i.)

physics_index: int = None

Physics index (integer following the letter p.)

forcing_index: int = None

Forcing index (integer following the letter f.)

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.
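
The four indices can be extracted from a label such as r1i1p1f2 with a small regular expression. This is a minimal sketch using only the standard library; parse_variant_label is a hypothetical helper, and the pattern handles only the plain r/i/p/f form, whereas the class itself may accept more cases.

```python
import re

# Assumed pattern for the plain form r<N>i<N>p<N>f<N>.
_VARIANT_RE = re.compile(
    r"r(?P<realization_index>\d+)"
    r"i(?P<initialization_index>\d+)"
    r"p(?P<physics_index>\d+)"
    r"f(?P<forcing_index>\d+)"
)


def parse_variant_label(label):
    """Return the four indices as integers, keyed by field name."""
    match = _VARIANT_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


indices = parse_variant_label("r1i1p1f2")
```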

class src.cmip6.CMIP6_MIPTable(table_id: str = sentinel.Mandatory, table_prefix: str = '', table_freq: InitVar = '', table_suffix: str = '', table_qualifier: str = '')[source]

Bases: object

regex_dataclass which represents and parses the MIP table identifier string.

Reference: https://earthsystemcog.org/projects/wip/mip_table_about, although this doesn’t document all cases used in CMIP6.

table_id: str = sentinel.Mandatory

Input to from_string(). table_id string as used in the DRS.

table_prefix: str = ''

Substring of table_id specifying modeling realm.

table_freq: InitVar = ''

Substring of table_id specifying sampling frequency.

table_suffix: str = ''

Substring of table_id specifying sampling/averaging methods.

table_qualifier: str = ''

Substring of table_id specifying sampling/averaging methods.

frequency: CMIP6DateFrequency

Frequency at which data for the table is sampled. From table_freq.

spatial_avg: str

Method used for spatial averaging, from table_qualifier. Either ‘zonal_mean’ or None.

temporal_avg: str

Method used for time averaging, from table_qualifier. Either ‘point’ or ‘interval’.

region: str

Geographic region described by the table, from table_suffix. Either ‘Antarctica’, ‘Greenland’ or None.

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.
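
The relationship between the table_id substrings and the derived fields can be sketched as below. parse_table_id and the alternation lists in the pattern are assumptions chosen to cover a few common table names; they are not the module's actual regex.

```python
import re

# Assumed, simplified pattern: prefix, frequency, suffix, qualifier.
_TABLE_RE = re.compile(
    r"(?P<table_prefix>AER|CF|LI|SI|A|E|I|L|O)?"
    r"(?P<table_freq>\d?[a-z]+)"
    r"(?P<table_suffix>ClimMon|Plev|Lev|Ant|Gre)?"
    r"(?P<table_qualifier>Pt|Z|Off)?"
)


def parse_table_id(table_id):
    match = _TABLE_RE.fullmatch(table_id)
    if match is None:
        raise ValueError(f"unrecognized table_id: {table_id!r}")
    # Unmatched optional groups become empty strings.
    fields = {key: value or "" for key, value in match.groupdict().items()}
    # Derived fields, following the attribute descriptions above.
    regions = {"Ant": "Antarctica", "Gre": "Greenland"}
    fields["region"] = regions.get(fields["table_suffix"])
    qualifier = fields["table_qualifier"]
    fields["spatial_avg"] = "zonal_mean" if qualifier == "Z" else None
    fields["temporal_avg"] = "point" if qualifier == "Pt" else "interval"
    return fields


six_hourly_plev = parse_table_id("6hrPlevPt")
```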

class src.cmip6.CMIP6_GridLabel(grid_label: str = sentinel.Mandatory, global_mean: InitVar = '', regrid: str = '', grid_number: int = 0, region: str = '', zonal_mean: InitVar = '')[source]

Bases: object

regex_dataclass which represents and parses the CMIP6 DRS grid label identifier string.

Reference: CMIP6 planning document, note 11 on page 11.

grid_label: str = sentinel.Mandatory

Input to from_string(). grid_label string as used in the DRS.

global_mean: InitVar = ''

Substring of grid_label for globally-averaged data.

regrid: str = ''

Substring of grid_label for regridded data.

grid_number: int = 0

Regridding method used (0 if native grid). Per the CMIP6 spec, the meaning of each integer is not specified and is left to individual modeling centers.

region: str = ''

Geographic region described by the grid. Either ‘Antarctica’, ‘Greenland’ or None.

zonal_mean: InitVar = ''

Substring of grid_label for zonal mean averaging.

spatial_avg: str

Method used for spatial averaging. Either ‘global_mean’, ‘zonal_mean’ or None.

native_grid: bool

Boolean, True if data is on model’s native grid.

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.
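
A grid label such as gn, gr1z or gm can be decomposed along the lines of the fields above. The sketch below is illustrative: parse_grid_label is hypothetical, and the letter codes (n/r for native or regridded, a/g for Antarctica or Greenland, z for zonal mean, m for global mean) reflect one reading of the CMIP6 spec rather than the module's actual regex.

```python
import re

# Assumed pattern: "gm", or "g" + n/r + optional digit, region and "z".
_GRID_RE = re.compile(
    r"g(?:(?P<global_mean>m)"
    r"|(?P<regrid>n|r)(?P<grid_number>\d?)(?P<region>a|g)?(?P<zonal_mean>z)?)"
)


def parse_grid_label(grid_label):
    match = _GRID_RE.fullmatch(grid_label)
    if match is None:
        raise ValueError(f"unrecognized grid_label: {grid_label!r}")
    if match["global_mean"]:
        spatial_avg = "global_mean"
    elif match["zonal_mean"]:
        spatial_avg = "zonal_mean"
    else:
        spatial_avg = None
    regions = {"a": "Antarctica", "g": "Greenland"}
    return {
        "grid_number": int(match["grid_number"] or 0),  # 0 when no digit
        "region": regions.get(match["region"]),
        "spatial_avg": spatial_avg,
        "native_grid": match["regrid"] == "n",
    }


native = parse_grid_label("gn")
```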

class src.cmip6.CMIP6_DRSDirectory(grid_label: CMIP6_GridLabel = '', global_mean: InitVar = '', regrid: str = '', grid_number: int = 0, zonal_mean: InitVar = '', table_id: CMIP6_MIPTable = '', table_prefix: str = '', table_freq: InitVar = '', table_suffix: str = '', table_qualifier: str = '', variant_label: CMIP6_VariantLabel = '', realization_index: int = None, initialization_index: int = None, physics_index: int = None, forcing_index: int = None, directory: str = sentinel.Mandatory, activity_id: str = '', institution_id: str = '', source_id: str = '', experiment_id: str = '', version_date: Date = None)[source]

Bases: CMIP6_VariantLabel, CMIP6_MIPTable, CMIP6_GridLabel

regex_dataclass which represents and parses the DRS directory path.

Reference: CMIP6 planning document, page 17.

Warning

This regex will fail on paths involving subexperiments.

directory: str = sentinel.Mandatory

Input to from_string(). Directory path string (excluding filename) as used in the DRS.

activity_id: str = ''

Activity ID (MIP) of data, as parsed from directory.

institution_id: str = ''

Institution ID of data, as parsed from directory.

forcing_index: int = None

Forcing index (integer following the letter f.)

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.

global_mean: InitVar = ''

Substring of grid_label for globally-averaged data.

grid_number: int = 0

Regridding method used (0 if native grid). Per the CMIP6 spec, the meaning of each integer is not specified and is left to individual modeling centers.

initialization_index: int = None

Initialization index (integer following the letter i.)

physics_index: int = None

Physics index (integer following the letter p.)

realization_index: int = None

Realization index (integer following the letter r.)

region: str = ''

Geographic region described by the table, from table_suffix. Either ‘Antarctica’, ‘Greenland’ or None.

regrid: str = ''

Substring of grid_label for regridded data.

table_freq: InitVar = ''

Substring of table_id specifying sampling frequency.

table_prefix: str = ''

Substring of table_id specifying modeling realm.

table_qualifier: str = ''

Substring of table_id specifying sampling/averaging methods.

table_suffix: str = ''

Substring of table_id specifying sampling/averaging methods.

zonal_mean: InitVar = ''

Substring of grid_label for zonal mean averaging.

frequency: CMIP6DateFrequency

Frequency at which data for the table is sampled. From table_freq.

spatial_avg: str

Method used for spatial averaging, from table_qualifier. Either ‘zonal_mean’ or None.

temporal_avg: str

Method used for time averaging, from table_qualifier. Either ‘point’ or ‘interval’.

native_grid: bool

Boolean, True if data is on model’s native grid.

source_id: str = ''

Source ID (model name) of data, as parsed from directory.

experiment_id: str = ''

Experiment ID of data, as parsed from directory.

variant_label: CMIP6_VariantLabel = ''

Variant label of data, as parsed from directory.

table_id: CMIP6_MIPTable = ''

MIP table of data, as parsed from directory.

grid_label: CMIP6_GridLabel = ''

Grid label of data, as parsed from directory.

version_date: Date = None

Revision date of data, as parsed from directory.
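
For orientation, a DRS directory can be split into its components as sketched below. This uses simple path splitting rather than the class's regex; parse_drs_directory, the assumed component order and the /archive root in the example are illustrative assumptions, and paths with subexperiments are not handled.

```python
from pathlib import PurePosixPath

# Assumed order of DRS directory components, after any local root.
_DIR_FIELDS = (
    "mip_era", "activity_id", "institution_id", "source_id",
    "experiment_id", "variant_label", "table_id", "variable_id",
    "grid_label", "version",
)


def parse_drs_directory(directory):
    parts = PurePosixPath(directory).parts
    if len(parts) < len(_DIR_FIELDS):
        raise ValueError(f"too few path components: {directory!r}")
    # Keep only the trailing components so a local root is ignored.
    return dict(zip(_DIR_FIELDS, parts[-len(_DIR_FIELDS):]))


# Illustrative path; "/archive" is a made-up local root.
example_dir = (
    "/archive/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308"
)
dir_fields = parse_drs_directory(example_dir)
```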

class src.cmip6.CMIP6_DRSFilename(grid_label: CMIP6_GridLabel = '', global_mean: InitVar = '', regrid: str = '', grid_number: int = 0, zonal_mean: InitVar = '', table_id: CMIP6_MIPTable = '', table_prefix: str = '', table_freq: InitVar = '', table_suffix: str = '', table_qualifier: str = '', variant_label: CMIP6_VariantLabel = '', realization_index: int = None, initialization_index: int = None, physics_index: int = None, forcing_index: int = None, filename: str = sentinel.Mandatory, variable_id: str = '', source_id: str = '', experiment_id: str = '', start_date: Date = None, end_date: Date = None)[source]

Bases: CMIP6_VariantLabel, CMIP6_MIPTable, CMIP6_GridLabel

regex_dataclass which represents and parses the DRS filename.

Reference: CMIP6 planning document, pages 14-15.

forcing_index: int = None

Forcing index (integer following the letter f.)

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.

global_mean: InitVar = ''

Substring of grid_label for globally-averaged data.

grid_number: int = 0

Regridding method used (0 if native grid). Per the CMIP6 spec, the meaning of each integer is not specified and is left to individual modeling centers.

initialization_index: int = None

Initialization index (integer following the letter i.)

physics_index: int = None

Physics index (integer following the letter p.)

realization_index: int = None

Realization index (integer following the letter r.)

region: str = ''

Geographic region described by the table, from table_suffix. Either ‘Antarctica’, ‘Greenland’ or None.

regrid: str = ''

Substring of grid_label for regridded data.

table_freq: InitVar = ''

Substring of table_id specifying sampling frequency.

table_prefix: str = ''

Substring of table_id specifying modeling realm.

table_qualifier: str = ''

Substring of table_id specifying sampling/averaging methods.

table_suffix: str = ''

Substring of table_id specifying sampling/averaging methods.

zonal_mean: InitVar = ''

Substring of grid_label for zonal mean averaging.

frequency: CMIP6DateFrequency

Frequency at which data for the table is sampled. From table_freq.

spatial_avg: str

Method used for spatial averaging, from table_qualifier. Either ‘zonal_mean’ or None.

temporal_avg: str

Method used for time averaging, from table_qualifier. Either ‘point’ or ‘interval’.

native_grid: bool

Boolean, True if data is on model’s native grid.

filename: str = sentinel.Mandatory

Input to from_string(). Filename as used in the DRS.

variable_id: str = ''

Variable name, as parsed from filename.

table_id: CMIP6_MIPTable = ''

MIP table of data, as parsed from filename.

source_id: str = ''

Source ID (model name) of data, as parsed from filename.

experiment_id: str = ''

Experiment ID of data, as parsed from filename.

variant_label: CMIP6_VariantLabel = ''

Variant label of data, as parsed from filename.

grid_label: CMIP6_GridLabel = ''

Grid label of data, as parsed from filename.

start_date: Date = None

Start date of data, as parsed from filename.

end_date: Date = None

End date of data, as parsed from filename.

date_range: DateRange

Start and end dates combined into a DateRange object.
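
The fields parsed from a DRS filename can be illustrated by splitting on underscores. This is a simplified sketch, not the class's regex: parse_drs_filename is hypothetical, dates are left as strings, and special cases such as climatology suffixes on the time range are not handled.

```python
# Assumed order of the underscore-separated filename components.
_FILE_FIELDS = (
    "variable_id", "table_id", "source_id",
    "experiment_id", "variant_label", "grid_label",
)


def parse_drs_filename(filename):
    stem, dot, extension = filename.rpartition(".")
    if not dot or extension != "nc":
        raise ValueError(f"expected a .nc file: {filename!r}")
    parts = stem.split("_")
    if len(parts) not in (6, 7):
        raise ValueError(f"unexpected number of fields: {filename!r}")
    fields = dict(zip(_FILE_FIELDS, parts[:6]))
    if len(parts) == 7:
        # Time range is "<start>-<end>"; absent for time-independent data.
        start, _, end = parts[6].partition("-")
        fields["start_date"], fields["end_date"] = start, end
    else:
        fields["start_date"] = fields["end_date"] = None
    return fields


file_fields = parse_drs_filename(
    "tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
)
```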

class src.cmip6.CMIP6_DRSPath(grid_label: CMIP6_GridLabel = '', global_mean: InitVar = '', regrid: str = '', grid_number: int = 0, zonal_mean: InitVar = '', table_id: CMIP6_MIPTable = '', table_prefix: str = '', table_freq: InitVar = '', table_suffix: str = '', table_qualifier: str = '', variant_label: CMIP6_VariantLabel = '', realization_index: int = None, initialization_index: int = None, physics_index: int = None, forcing_index: int = None, filename: CMIP6_DRSFilename = '', variable_id: str = '', source_id: str = '', experiment_id: str = '', start_date: Date = None, end_date: Date = None, directory: CMIP6_DRSDirectory = '', activity_id: str = '', institution_id: str = '', version_date: Date = None, path: str = sentinel.Mandatory)[source]

Bases: CMIP6_DRSDirectory, CMIP6_DRSFilename

regex_dataclass which represents and parses a full CMIP6 DRS path.

activity_id: str = ''

Activity ID (MIP) of data, as parsed from directory.

end_date: Date = None

End date of data, as parsed from filename.

experiment_id: str = ''

Experiment ID of data, as parsed from directory.

forcing_index: int = None

Forcing index (integer following the letter f.)

classmethod from_string(str_, *args)

Create an object instance from a string representation str_. Used by regex_dataclass() for parsing field values and automatic type coercion.

global_mean: InitVar = ''

Substring of grid_label for globally-averaged data.

grid_label: CMIP6_GridLabel = ''

Grid label of data, as parsed from directory.

grid_number: int = 0

Regridding method used (0 if native grid). Per the CMIP6 spec, the meaning of each integer is not specified and is left to individual modeling centers.

initialization_index: int = None

Initialization index (integer following the letter i.)

institution_id: str = ''

Institution ID of data, as parsed from directory.

physics_index: int = None

Physics index (integer following the letter p.)

realization_index: int = None

Realization index (integer following the letter r.)

region: str = ''

Geographic region described by the table, from table_suffix. Either ‘Antarctica’, ‘Greenland’ or None.

regrid: str = ''

Substring of grid_label for regridded data.

source_id: str = ''

Source ID (model name) of data, as parsed from directory.

start_date: Date = None

Start date of data, as parsed from filename.

table_freq: InitVar = ''

Substring of table_id specifying sampling frequency.

table_id: CMIP6_MIPTable = ''

MIP table of data, as parsed from directory.

table_prefix: str = ''

Substring of table_id specifying modeling realm.

table_qualifier: str = ''

Substring of table_id specifying sampling/averaging methods.

table_suffix: str = ''

Substring of table_id specifying sampling/averaging methods.

variable_id: str = ''

Variable name, as parsed from filename.

variant_label: CMIP6_VariantLabel = ''

Variant label of data, as parsed from directory.

version_date: Date = None

Revision date of data, as parsed from directory.

zonal_mean: InitVar = ''

Substring of grid_label for zonal mean averaging.

frequency: CMIP6DateFrequency

Frequency at which data for the table is sampled. From table_freq.

spatial_avg: str

Method used for spatial averaging, from table_qualifier. Either ‘zonal_mean’ or None.

temporal_avg: str

Method used for time averaging, from table_qualifier. Either ‘point’ or ‘interval’.

native_grid: bool

Boolean, True if data is on model’s native grid.

date_range: DateRange

Start and end dates combined into a DateRange object.

path: str = sentinel.Mandatory

Input to from_string(). Full path to data file as used in the DRS.

directory: CMIP6_DRSDirectory = ''

Directory portion of path (excluding filename), parsed as a CMIP6_DRSDirectory.

filename: CMIP6_DRSFilename = ''

Filename portion of path, parsed as a CMIP6_DRSFilename.
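
Conceptually, a full DRS path is the directory and the filename joined together, and each part is then parsed on its own. The sketch below shows only the split, using the standard library; split_drs_path and the /archive root in the example are illustrative assumptions.

```python
import posixpath


def split_drs_path(path):
    """Split a full path into (directory, filename) parts."""
    directory, filename = posixpath.split(path)
    if not directory or not filename:
        raise ValueError(f"expected a directory and a filename: {path!r}")
    return directory, filename


# Illustrative path; "/archive" is a made-up local root.
example_path = (
    "/archive/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/"
    "v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
)
drs_directory, drs_filename = split_drs_path(example_path)
```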