pyatoa.core.executive

A class and supporting functionality for directing multiple Managers to parallelize data processing in Pyatoa, using the built-in concurrent.futures package

>>> from pyatoa import Executive
>>> exc = Executive(event_ids=…, station_codes=…, config=…)
>>> misfits = exc.process()

Note

I was randomly getting the following error on my runs; I believe it was caused by requesting all of my cores (16) for the job. Dropping the maximum number of jobs to 8 fixed things. This will be investigated further.

concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

Module Contents

Classes

Executive

The Executive is hierarchically above Pyatoa's core class, the Manager.

class pyatoa.core.executive.Executive(event_ids, station_codes, config, max_stations=4, max_events=1, cat='+', log_level='DEBUG', cwd=None, datasets=None, figures=None, logs=None, adjsrcs=None, ds_fid_template=None)[source]

The Executive is hierarchically above Pyatoa’s core class, the Manager. It sets up a simple framework to organize and parallelize misfit quantification.

property codes[source]

Define a set of event-station codes that are used to traverse through all possible source receiver combinations.

Note

Workaround for the difficulty of passing multiple arguments through an executor: a single list of concatenated strings is passed instead, and each parallel process splits its string back into its component parts
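A hypothetical illustration of this single-string workaround, assuming the `cat` constructor argument ('+' by default) is the separator joining event id and station code (event ids and station codes below are made up):

```python
event_ids = ["2018p130600", "2019p738432"]        # hypothetical event ids
station_codes = ["NZ.BFZ.*.HH?", "NZ.KNZ.*.HH?"]  # hypothetical stations
cat = "+"  # assumed separator, matching the `cat` constructor argument

# Pack every source-receiver combination into a single string so that
# executor.map() only needs to hand one argument to each worker
codes = [f"{eid}{cat}{sta}" for eid in event_ids for sta in station_codes]

# Each worker recovers the pair by splitting on the separator
event_id, station_code = codes[0].split(cat)
```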

check()[source]

Parameter checking

process()[source]

Process all events concurrently

process_event(event_id)[source]

Process all given stations concurrently for a single event

Parameters:

event_id (str) – one value from the Executive's list of event ids, specifying a given event to process
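The two-level fan-out (events outside, stations inside, bounded by `max_events` and `max_stations`) can be sketched as follows; both functions are hypothetical stand-ins, not the actual Pyatoa implementation:

```python
from concurrent.futures import ProcessPoolExecutor


def process_station_stub(code):
    """Hypothetical stand-in for Executive.process_station(): take one
    concatenated code and return an (identifier, misfit) pair."""
    event_id, station_code = code.split("+")
    return f"{event_id}/{station_code}", 1.0  # made-up misfit value


def process_event_stub(event_id, station_codes, max_stations=4):
    """For a single event, fan out over its stations with at most
    `max_stations` concurrent workers."""
    codes = [f"{event_id}+{sta}" for sta in station_codes]
    with ProcessPoolExecutor(max_workers=max_stations) as executor:
        return dict(executor.map(process_station_stub, codes))
```

Usage: `process_event_stub("2018p130600", ["NZ.BFZ", "NZ.KNZ"])` returns a dict mapping each source-receiver identifier to its misfit.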

process_station(event_id_and_station_code)[source]

Process multiple Managers in parallel; this is the biggest time sink. IO is done in serial to get around BlockingIOError

Note

Very broad exceptions are used to keep the process running smoothly; you will need to check log messages individually to figure out if and where things did not work

Note

Employs a workaround for the inability to write to HDF5 files in parallel (BlockingIOError) by doing the processing first, and then waiting for each process to finish writing before accessing the file.

Parameters:

event_id_and_station_code (str) – a string concatenation of a given event id and station code, which will be used to process a single source receiver pair
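The catch-all behavior described in the first note can be sketched like this; `safe_process` is a hypothetical illustration of the pattern, not the library's code:

```python
def safe_process(event_id_and_station_code):
    """Hypothetical illustration of the catch-all pattern: any failure
    for a single source-receiver pair is caught and recorded so the
    remaining pairs keep processing; the per-source log then becomes the
    only record of what went wrong."""
    try:
        event_id, station_code = event_id_and_station_code.split("+")
        # ... gather data, select windows, and measure misfit here ...
        return 1.0  # made-up misfit value for a successful pair
    except Exception as e:
        # Deliberately broad: one bad pair must not bring down the pool
        print(f"processing failed for {event_id_and_station_code}: {e}")
        return None
```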

_check_rank(event_id_and_station_code)[source]

Poor man’s method for determining the processor rank for a given event. Used so that tasks that should happen only once (e.g., writing the config) are performed consistently by a single process

Parameters:

event_id_and_station_code (str) – a string concatenation of a given event id and station code, which will be used to process a single source receiver pair

Return type:

int

Returns:

rank index in Executive.codes based on event and station
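One plausible reading of this rank lookup, sketched as a guess (this is not the actual implementation): a pair's rank is its position among the codes belonging to the same event, so "rank 0" tasks run exactly once per event. The '+' separator is assumed from the `cat` default.

```python
def check_rank(codes, event_id_and_station_code):
    """Hypothetical sketch of a rank lookup: the rank of a pair is its
    index among the codes sharing the same event id."""
    event_id = event_id_and_station_code.split("+")[0]
    event_codes = [c for c in codes if c.split("+")[0] == event_id]
    return event_codes.index(event_id_and_station_code)
```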

_generate_logger(log_path)[source]

Create a log file for each source, with no stream handler and only file output. Also create a memory handler that dumps all log messages at once, rather than as they happen, allowing multiple stations to write to the same file sequentially

Parameters:

log_path (str) – path and filename to save log file
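The standard library's `logging.handlers.MemoryHandler` supports exactly this buffer-then-dump behavior. A minimal sketch of such a file-only logger (the function name and defaults are illustrative, not Pyatoa's):

```python
import logging
from logging.handlers import MemoryHandler


def generate_logger(log_path, name="source", log_level="DEBUG"):
    """Sketch of a file-only logger: no StreamHandler is attached, and a
    MemoryHandler buffers records so they reach the file in one block
    rather than interleaved as they happen."""
    logger = logging.getLogger(name)
    logger.setLevel(log_level)
    file_handler = logging.FileHandler(log_path)
    # Buffer up to 1000 records; the buffer is dumped to the file handler
    # on flush()/close(), or immediately if a CRITICAL record arrives
    memory_handler = MemoryHandler(capacity=1000,
                                   flushLevel=logging.CRITICAL,
                                   target=file_handler)
    logger.addHandler(memory_handler)
    return logger
```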