src.util module

Common functions and classes used in multiple places in the MDTF code. Specifically, util.py implements general functionality that’s not MDTF-specific.

class src.util._Singleton[source]

Bases: type

Private metaclass that creates a Singleton base class when called. This version is copied from https://stackoverflow.com/a/6798042 and should be compatible with both Python 2 and 3.

_instances = {}
class src.util.Singleton(*args, **kwargs)[source]

Bases: src.util.SingletonMeta

Parent class defining the Singleton pattern. We use this as a safer way to pass around global state.

classmethod _reset()[source]

Private method of all Singleton-derived classes added for use in unit testing only. Calling this method on test teardown deletes the instance, so that tests coming afterward will initialize the Singleton correctly, instead of getting the state set during previous tests.
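
The metaclass approach and the role of _reset() can be illustrated with a minimal sketch. It assumes Python 3 syntax only, and the names _SingletonMeta, SingletonSketch, and Config are hypothetical stand-ins rather than the classes in src.util.

```python
class _SingletonMeta(type):
    """Metaclass that caches one instance per class (illustrative sketch)."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Construct the instance only on the first call; reuse it afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class SingletonSketch(metaclass=_SingletonMeta):
    @classmethod
    def _reset(cls):
        # Forget the cached instance so the next call builds a fresh one.
        cls._instances.pop(cls, None)


class Config(SingletonSketch):  # hypothetical subclass holding global state
    def __init__(self, value=None):
        self.value = value


first = Config(value=1)
second = Config(value=2)  # arguments are ignored: the cached instance is returned
Config._reset()           # what a unit test would call on teardown
third = Config(value=3)   # a fresh instance, free of state from earlier tests
```

Without the _reset() call, third would be the same object as first and would still carry value 1.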

class src.util.ExceptionPropagatingThread(group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None)[source]

Bases: threading.Thread

Class to propagate exceptions raised in a child thread back to the caller thread when the child is join()ed. Adapted from https://stackoverflow.com/a/31614591.

run()[source]

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

join(timeout=None)[source]

Wait until the thread terminates.

This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception or until the optional timeout occurs.

When the timeout argument is present and not None, it should be a floating point number specifying a timeout for the operation in seconds (or fractions thereof). As join() always returns None, you must call is_alive() after join() to decide whether a timeout happened – if the thread is still alive, the join() call timed out.

When the timeout argument is not present or None, the operation will block until the thread terminates.

A thread can be join()ed many times.

join() raises a RuntimeError if an attempt is made to join the current thread as that would cause a deadlock. It is also an error to join() a thread before it has been started and attempts to do so raises the same exception.
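
One common way to get the propagation behavior described for this class is to catch the exception in run() and re-raise it in join(). The sketch below shows that idea; PropagatingThreadSketch and failing_task are hypothetical names, and the real implementation may differ in detail.

```python
import threading


class PropagatingThreadSketch(threading.Thread):
    """Illustrative stand-in; not the MDTF implementation."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._child_exception = None

    def run(self):
        try:
            super().run()
        except BaseException as exc:
            # Store the exception instead of letting it die with the thread.
            self._child_exception = exc

    def join(self, timeout=None):
        super().join(timeout)
        if self._child_exception is not None:
            # Re-raise in the thread that called join().
            raise self._child_exception


def failing_task():
    raise ValueError("raised in the child thread")


worker = PropagatingThreadSketch(target=failing_task)
worker.start()
try:
    worker.join()
except ValueError as exc:
    caught = str(exc)
```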

class src.util.MultiMap(*args, **kwargs)[source]

Bases: collections.defaultdict

Extension of the dict class that allows doing dictionary lookups from either keys or values.

Syntax for lookup from keys is unchanged, bd['key'] = 'val', while lookup from values is done on the inverse attribute and returns a set of matching keys if more than one match is present: bd.inverse['val'] = ['key1', 'key2']. See https://stackoverflow.com/a/21894086.
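
As a rough illustration of lookup in both directions, the sketch below stores a set of values per key and builds the reverse mapping on request. MultiMapSketch is a hypothetical class, and the choice to store values as sets and to expose inverse as a method are assumptions, not confirmed details of MultiMap.

```python
from collections import defaultdict


class MultiMapSketch(defaultdict):
    """Maps each key to a set of values and supports reverse lookup."""

    def __init__(self, mapping=None):
        super().__init__(set)
        for key, val in (mapping or {}).items():
            self[key].add(val)

    def inverse(self):
        # Build a value -> set-of-keys mapping from the forward mapping.
        inv = defaultdict(set)
        for key, vals in self.items():
            for val in vals:
                inv[val].add(key)
        return dict(inv)


bd = MultiMapSketch({'key1': 'val', 'key2': 'val', 'key3': 'other'})
keys_for_val = bd.inverse()['val']  # both keys that map to 'val'
```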

__init__(*args, **kwargs)[source]

Initialize MultiMap by passing an ordinary dict.

get_(key)[source]
to_dict()[source]
inverse()[source]
inverse_get_(val)[source]
class src.util.NameSpace[source]

Bases: dict

A dictionary that provides attribute-style access.

For example, d['key'] = value becomes d.key = value. All methods of dict are supported.

Note: recursive access (d.key.subkey, as in C-style languages) is not supported.

Implementation is based on https://github.com/Infinidat/munch.
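
Attribute-style access on a dict can be sketched by routing attribute lookups to item lookups. NameSpaceSketch is a hypothetical, stripped-down class in the spirit of munch; it omits the conversion helpers documented below.

```python
class NameSpaceSketch(dict):
    """dict subclass whose items are also reachable as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() and items() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


d = NameSpaceSketch()
d.key = 'value'   # same effect as d['key'] = 'value'
d['other'] = 2    # also readable as d.other
```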

toDict()[source]

Recursively converts a NameSpace back into a dictionary.

classmethod _toDict(x)[source]

Recursively converts a NameSpace back into a dictionary. Note: because dicts are not hashable, they cannot be nested in sets/frozensets.

classmethod fromDict(x)[source]

Recursively transforms a dictionary into a NameSpace via copy. Note: because dicts are not hashable, they cannot be nested in sets/frozensets.

copy() → a shallow copy of D[source]
_freeze()[source]

Return immutable representation of (current) attributes.

We do this to enable comparison of two NameSpaces, which would otherwise be done by the default method of testing whether the two objects refer to the same location in memory. See https://stackoverflow.com/a/45170549.
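
The idea of comparing objects through an immutable snapshot can be sketched as follows. The freeze function is a hypothetical helper, not the body of _freeze(); it assumes the values involved are dicts, lists, tuples, sets, or hashable scalars.

```python
def freeze(obj):
    """Return a hashable, order-independent snapshot of nested containers."""
    if isinstance(obj, dict):
        return frozenset((key, freeze(val)) for key, val in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    return obj


a = {'x': 1, 'y': {'z': [1, 2]}}
b = {'y': {'z': [1, 2]}, 'x': 1}
same = freeze(a) == freeze(b)  # equal contents give equal snapshots
```

Because the snapshot is hashable, it can also be used as a dictionary key or set member.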

src.util.strip_comments(str_, delimiter=None)[source]
src.util.read_json(file_path)[source]
src.util.parse_json(str_)[source]
src.util.write_json(struct, file_path, verbose=0, sort_keys=False)[source]

Wrapping file I/O simplifies unit testing.

Parameters
  • struct (dict) –

  • file_path (str) – path of the JSON file to write.

  • verbose (int, optional) – Logging verbosity level. Default 0.

src.util.pretty_print_json(struct, sort_keys=False)[source]

Pseudo-YAML output, intended for human-readable debugging only; not valid JSON.

src.util.find_files(src_dirs, filename_globs)[source]

Return list of files in src_dirs matching any of filename_globs.

Wraps glob.glob for the use cases encountered in cleaning up POD output.

Parameters
  • src_dirs – Directory, or a list of directories, to search for files in. The function will also search all subdirectories.

  • filename_globs – Glob, or a list of globs, for filenames to match. This is a shell globbing pattern, not a full regex.

Returns: list of paths to files matching any of the criteria.

If no files are found, the list is empty.
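
A minimal sketch of the behavior described above, using glob.glob with recursive patterns. The function name find_files_sketch and the sorted return order are assumptions; the demonstration builds a throwaway directory tree so that it is self-contained.

```python
import glob
import os
import tempfile


def find_files_sketch(src_dirs, filename_globs):
    # Accept either a single string or a list for both arguments.
    if isinstance(src_dirs, str):
        src_dirs = [src_dirs]
    if isinstance(filename_globs, str):
        filename_globs = [filename_globs]
    found = set()
    for src_dir in src_dirs:
        for pattern in filename_globs:
            # '**' with recursive=True also descends into subdirectories.
            query = os.path.join(glob.escape(src_dir), '**', pattern)
            found.update(p for p in glob.glob(query, recursive=True)
                         if os.path.isfile(p))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt'), 'c.nc'):
        open(os.path.join(tmp, rel), 'w').close()
    txt_files = [os.path.relpath(p, tmp)
                 for p in find_files_sketch(tmp, ['*.txt'])]
    png_files = find_files_sketch(tmp, '*.png')
```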

src.util.recursive_copy(src_files, src_root, dest_root, copy_function=None, overwrite=False)[source]

Copy src_files to dest_root, preserving relative subdirectory structure.

Copies a subset of files in a directory subtree rooted at src_root to an identical subtree structure rooted at dest_root, creating any subdirectories as needed. For example, recursive_copy('/A/B/C.txt', '/A', '/D') will first create the destination subdirectory /D/B and then copy /A/B/C.txt to /D/B/C.txt.

Parameters
  • src_files – Absolute path, or list of absolute paths, to files to copy.

  • src_root – Root subtree of all files in src_files. Raises a ValueError if any file in src_files is not contained in the src_root directory.

  • dest_root – Destination directory in which to create the copied subtree.

  • copy_function – Function to use to copy individual files. Must take two arguments, the source and destination paths, respectively. Defaults to shutil.copy2().

  • overwrite – Boolean, default False. If False, raise an OSError if any destination files already exist; otherwise silently overwrite.
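
The copy logic described above might look roughly like the following. recursive_copy_sketch is a hypothetical reimplementation written from the parameter descriptions, not the module's code; the demonstration mirrors the /A/B/C.txt example inside a temporary directory.

```python
import os
import shutil
import tempfile


def recursive_copy_sketch(src_files, src_root, dest_root,
                          copy_function=None, overwrite=False):
    if isinstance(src_files, str):
        src_files = [src_files]
    if copy_function is None:
        copy_function = shutil.copy2
    src_root = os.path.abspath(src_root)
    pairs = []
    # Validate everything first so that no files are copied on error.
    for src in src_files:
        rel = os.path.relpath(os.path.abspath(src), src_root)
        if rel == os.pardir or rel.startswith(os.pardir + os.sep):
            raise ValueError("{} is not under {}".format(src, src_root))
        dest = os.path.join(dest_root, rel)
        if not overwrite and os.path.exists(dest):
            raise OSError("{} already exists".format(dest))
        pairs.append((src, dest))
    for src, dest in pairs:
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        copy_function(src, dest)


with tempfile.TemporaryDirectory() as tmp:
    src_root = os.path.join(tmp, 'A')
    dest_root = os.path.join(tmp, 'D')
    os.makedirs(os.path.join(src_root, 'B'))
    src_file = os.path.join(src_root, 'B', 'C.txt')
    with open(src_file, 'w') as f:
        f.write('data')
    recursive_copy_sketch(src_file, src_root, dest_root)
    copied = os.path.isfile(os.path.join(dest_root, 'B', 'C.txt'))
    try:
        # A second copy without overwrite=True should be refused.
        recursive_copy_sketch(src_file, src_root, dest_root)
        refused = False
    except OSError:
        refused = True
```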

src.util.resolve_path(path, root_path='', env=None)[source]

Abbreviation to resolve relative paths.

Parameters
  • path (str) – path to resolve.

  • root_path (str, optional) – root path to resolve path with. If not given, resolves relative to cwd.

Returns: Absolute version of path, relative to root_path if given, otherwise relative to os.getcwd().
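
A sketch of this resolution rule, assuming POSIX-style paths. The function resolve_path_sketch is hypothetical and ignores the env argument, whose behavior is not documented here.

```python
import os


def resolve_path_sketch(path, root_path=''):
    # Absolute paths are returned as-is (after normalization).
    if os.path.isabs(path):
        return os.path.normpath(path)
    # Fall back to the current working directory when no root is given.
    if not root_path:
        root_path = os.getcwd()
    return os.path.normpath(os.path.join(root_path, path))


with_root = resolve_path_sketch('b/c.txt', '/a')
without_root = resolve_path_sketch('x.txt')
```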

src.util.check_executable(exec_name)[source]

Tests if <exec_name> is found on the current $PATH.

Parameters

exec_name (str) – Name of the executable to search for.

Returns: bool. True if the executable was found on $PATH, False otherwise.
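
In Python 3 the same check can be expressed with shutil.which from the standard library. This is a sketch of an equivalent test, not necessarily how check_executable is implemented; check_executable_sketch is a hypothetical name.

```python
import shutil


def check_executable_sketch(exec_name):
    # shutil.which returns the full path of the executable, or None.
    return shutil.which(exec_name) is not None


missing = check_executable_sketch('surely-not-an-installed-program-xyz')
```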

src.util.poll_command(command, shell=False, env=None)[source]

Runs a shell command and prints stdout in real time.

Optional ability to pass a different environment to the subprocess. See documentation for the Python2 subprocess module.

Parameters
  • command – list of command + arguments, or the same as a single string. See subprocess syntax. Note this interacts with the shell setting.

  • shell (bool, optional) – shell flag, passed to Popen, default False.

  • env (dict, optional) – environment variables to set, passed to Popen, default None.
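
Printing a subprocess's output as it arrives is usually done by reading the pipe line by line. The sketch below assumes Python 3; poll_command_sketch is a hypothetical name, and returning the exit code is an assumption, since the return value of poll_command is not documented here.

```python
import subprocess
import sys


def poll_command_sketch(command, shell=False, env=None):
    process = subprocess.Popen(
        command, shell=shell, env=env,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True, bufsize=1,  # text mode, line-buffered
    )
    with process.stdout:
        # Iterating over the pipe yields each line as soon as it is written.
        for line in process.stdout:
            print(line, end='')
    return process.wait()


exit_code = poll_command_sketch([sys.executable, '-c', 'print("hello")'])
```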

exception src.util.TimeoutAlarm[source]

Bases: Exception

src.util.run_command(command, env=None, cwd=None, timeout=0, dry_run=False)[source]

Subprocess wrapper to facilitate running a single command without starting a shell.

Note

We hope to save some process overhead by not running the command in a shell, but this means the command can’t use piping, quoting, environment variables, filename globbing, etc.

See documentation for the Python2 subprocess module.

Parameters
  • command (list of str) – List of commands to execute

  • env (dict, optional) – environment variables to set, passed to Popen, default None.

  • cwd (str, optional) – child process’s working directory, passed to Popen. Default is None, which uses the parent process’s directory.

  • timeout (int, optional) – Optionally, kill the command’s subprocess and raise a CalledProcessError if the command doesn’t finish in timeout seconds.

Returns

list of str containing output that was written to stdout by each command. Note: this is split on newlines after the fact.

Raises

CalledProcessError – If any commands return with nonzero exit code. Stderr for that command is stored in output attribute.
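
The documented contract (stdout returned as a list of lines, CalledProcessError on a nonzero exit or timeout, stderr stored in the output attribute) can be sketched with subprocess.run in Python 3. run_command_sketch is a hypothetical name, dry_run is omitted, and the return code used for a timeout is an arbitrary placeholder.

```python
import subprocess
import sys


def run_command_sketch(command, env=None, cwd=None, timeout=0):
    try:
        proc = subprocess.run(
            command, env=env, cwd=cwd,
            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout or None,  # treat 0 as "no timeout"
        )
    except subprocess.TimeoutExpired as exc:
        # -1 is a placeholder return code chosen for this sketch.
        raise subprocess.CalledProcessError(-1, command, output='timed out') from exc
    if proc.returncode != 0:
        # Mirror the documented behavior: stderr goes in the output attribute.
        raise subprocess.CalledProcessError(
            proc.returncode, command, output=proc.stderr)
    return proc.stdout.splitlines()


stdout_lines = run_command_sketch(
    [sys.executable, '-c', 'print("a"); print("b")'])
try:
    run_command_sketch(
        [sys.executable, '-c',
         'import sys; sys.stderr.write("bad"); sys.exit(3)'])
except subprocess.CalledProcessError as exc:
    failure_code, failure_output = exc.returncode, exc.output
```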

src.util.run_shell_command(command, env=None, cwd=None, dry_run=False)[source]

Subprocess wrapper to facilitate running shell commands.

See documentation for the Python2 subprocess module.

Parameters
  • command (list of str) – List of commands to execute

  • env (dict, optional) – environment variables to set, passed to Popen, default None.

  • cwd (str, optional) – child process’s working directory, passed to Popen. Default is None, which uses the parent process’s directory.

Returns

list of str containing output that was written to stdout by each command. Note: this is split on newlines after the fact, so if commands give != 1 lines of output this will not map to the list of commands given.

Raises

CalledProcessError – If any commands return with nonzero exit code. Stderr for that command is stored in output attribute.

src.util.is_iterable(obj)[source]
src.util.coerce_to_iter(obj, coll_type=<class 'list'>)[source]
src.util.coerce_from_iter(obj)[source]
src.util.filter_kwargs(kwarg_dict, function)[source]

Given a dict of kwargs, return only those kwargs accepted by function.
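
One way to do this in Python 3 is with inspect.signature. The sketch below is a hypothetical reimplementation; passing everything through when the function accepts **kwargs is an assumption about the intended behavior.

```python
import inspect


def filter_kwargs_sketch(kwarg_dict, function):
    params = inspect.signature(function).parameters.values()
    # A function taking **kwargs accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return dict(kwarg_dict)
    accepted = {
        p.name for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return {k: v for k, v in kwarg_dict.items() if k in accepted}


def example(a, b=2):  # hypothetical target function
    return a + b


filtered = filter_kwargs_sketch({'a': 1, 'c': 3}, example)
result = example(**filtered)
```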

src.util.signal_logger(caller_name, signum=None, frame=None)[source]

Look up signal name from number; see https://stackoverflow.com/a/2549950.
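
In Python 3.5 and later, the number-to-name lookup can be done with the signal.Signals enum instead of the dictionary-building approach in the linked answer. signal_name_sketch is a hypothetical helper showing only the lookup, not the logging, and the fallback string for unknown numbers is an assumption.

```python
import signal


def signal_name_sketch(signum):
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Numbers that do not correspond to a known signal.
        return "UNKNOWN({})".format(signum)


sigint_name = signal_name_sketch(signal.SIGINT)
```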