pyatoa.core.manager

A class to control workflow and temporarily store and manipulate data

Module Contents

Classes

ManagerStats

A simple dictionary that can get and set keys as attributes and has a

Manager

Pyatoas core workflow object.

exception pyatoa.core.manager.ManagerError[source]

Bases: Exception

A class-wide custom exception raised when functions fail gracefully

class pyatoa.core.manager.ManagerStats[source]

Bases: dict

A simple dictionary that can get and set keys as attributes and has a cleaner looking print statement, used for storing internal statistics in the Manager class

__setattr__(key, value)[source]

Implement setattr(self, name, value).

__getattr__(key)[source]
__str__()[source]

Return str(self).

reset()[source]

Convenience function to reset stats to None

class pyatoa.core.manager.Manager(config=None, ds=None, event=None, st_obs=None, st_syn=None, inv=None, windows=None, staltas=None, adjsrcs=None, gcd=None, baz=None)[source]

Pyatoas core workflow object.

Manager is the central workflow control object. It calls on mid and low level classes to gather data, standardize and preprocess stream objects, generate misfit windows, and calculate adjoint sources. Has a variety of internal sanity checks to ensure that the workflow stays on the rails.

property st[source]

Simplified call to return all streams available, observed and synthetic

__str__()[source]

Print statement shows available data detailing workflow

__repr__()[source]

Return repr(self).

check()[source]

(Re)check the stats of the workflow and data within the Manager.

Rechecks conditions whenever called, incase something has gone awry mid-workflow. Stats should only be set by this function.

reset()[source]

Restart workflow by deleting all collected data in the Manager, but retain dataset, event, config, so a new station can be processed with the same configuration as the previous workflow.

write_to_dataset(ds=None, choice=None)[source]

Write the data collected inside Manager to an ASDFDataSet

Parameters:
  • ds (pyasdf.asdf_data_set.ASDFDataSet or None) – write to a given ASDFDataSet. If None, will look for internal attribute self.ds to write to. Allows overwriting to new datasets

  • choice (list or None) –

    choose which internal attributes to write, by default writes all of the following: ‘event’: Event atttribute as a QuakeML ‘inv’: Inventory attribute as a StationXML ‘st_obs’: Observed waveform under tag config.observed_tag ‘st_syn’: Synthetic waveform under tag config.synthetic_tag ‘windows’: Misfit windows collected by Pyflex are stored under

    auxiliary_data.MisfitWindow

    ’adjsrcs’: Adjoint sources created by Pyadjoint are stored under

    auxiliary_data.AdjointSources

    ’config’: the Pyatoa Config object is stored under

    ’auxiliary_data.Config’ and can be used to re-load the Manager and re-do processing

write_adjsrcs(path='./', write_blanks=True)[source]

Write internally stored adjoint source traces into SPECFEM defined two-column ascii files. Filenames are based on what is expected by Specfem, that is: ‘NN.SSS.CCC.adj’

..note::

By default writes adjoint sources for ALL components if one component has an adjoint source. If an adjoint sourced doesn’t exist for a given component, it will be written with zeros. This is to satisfy SPECFEM3D requirements.

Parameters:
  • path (str) – path to save the

  • write_blanks (bool) – write zeroed out adjoint sources for components with no adjoint sources to meet the requirements of SPECFEM3D. defaults to True

load(code=None, path=None, ds=None, synthetic_tag=None, observed_tag=None, config=True, windows=False, adjsrcs=False)[source]

Populate the manager using a previously populated ASDFDataSet. Useful for re-instantiating an existing workflow that has already gathered data and saved it to an ASDFDataSet.

Note

mgmt.load() will return example data with no dataset

Warning

Loading any floating point values may result in rounding errors. Be careful to round off floating points to the correct place before using in future work.

Parameters:
  • code (str) – SEED conv. code, e.g. NZ.BFZ.10.HHZ

  • path (str) – if no Config object is given during init, the User can specify the config path here to load data from the dataset. This skips the need to initiate a separate Config object.

  • ds (None or pyasdf.asdf_data_set.ASDFDataSet) – dataset can be given to load from, will not set the ds

  • synthetic_tag (str) – waveform tag of the synthetic data in the dataset e.g. ‘synthetic_m00s00’. If None given, will use config attribute.

  • observed_tag (str) – waveform tag of the observed data in the dataset e.g. ‘observed’. If None given, will use config attribute.

  • config (bool) – load config from the dataset, defaults to True but can be set False if Config should be instantiated by the User

  • windows (bool) – load misfit windows from the dataset, defaults to False

  • adjsrcs (bool) – load adjoint sources from the dataset, defaults to False

flow(standardize_to='syn', fix_windows=False, iteration=None, step_count=None, **kwargs)[source]

A convenience function to run the full workflow with a single command. Does not include gathering. Takes kwargs related to all underlying functions.

mgmt = Manager()
mgmt.flow() == mgmt.standardize().preprocess().window().measure()
Parameters:
  • standardize_to (str) – choice of ‘obs’ or ‘syn’ to use one of the time series to standardize (resample, trim etc.) the other.

  • fix_windows (bool) – if True, will attempt to retrieve saved windows from an ASDFDataSet under the iteration and step_count tags to use during misfit quantification rather than measuring new windows

  • iteration (int or str) – if ‘fix_windows’ is True, look for windows in this iteration. If None, will check the latest iteration/step_count in the given dataset

  • step_count (int or str) – if ‘fix_windows’ is True, look for windows in this step_count. If None, will check the latest iteration/step_count in the given dataset

Raises:

ManagerError – for any controlled exceptions

flow_multiband(periods, standardize_to='syn', fix_windows=False, iteration=None, step_count=None, plot=False, **kwargs)[source]

Run the full workflow for a number of distinct period bands, returning a final set of adjoint sources generated as a summation of adjoint sources from each of these period bands. Allows for re-using windows collected from the first set of period bands to evaluate adjoint sources from the remaining period bands.

Note

Kwargs are passed through to Manager.preprocess() function only

Basic Usage

Manager.flow_multiband(periods=[(1, 5), (10, 30), (40, 100)])

Parameters:
  • periods (list of tuples) – a list of tuples that define multiple period bands to generate windows and adjoint sources for. Overwrites the Config’s internal min_period and max_period parameters. The final adjoint source will be a summation of all adjoint sources generated.

  • standardize_to (str) – choice of ‘obs’ or ‘syn’ to use one of the time series to standardize (resample, trim etc.) the other.

  • fix_windows (bool) – if True, will attempt to retrieve saved windows from an ASDFDataSet under the iteration and step_count tags to use during misfit quantification rather than measuring new windows

  • iteration (int or str) – if ‘fix_windows’ is True, look for windows in this iteration. If None, will check the latest iteration/step_count in the given dataset

  • step_count (int or str) – if ‘fix_windows’ is True, look for windows in this step_count. If None, will check the latest iteration/step_count in the given dataset

  • plot (str) – name of figure if given, will plot waveform and map for each period band and append period band to figure name plot

Return type:

tuple of dict

Returns:

(windows, adjoint_sources), returns all the collected measurements from each of the period bands

Raises:

ManagerError – for any controlled exceptions

_combine_mutliband_results(windows, adjsrcs)[source]

Function flow_multiband() generates multiple sets of adjoint sources for a variety of period bands, however the User is only interested in a single adjoint source which is the average of all of these adjoint sources.

This function will take the multiple sets of adjoint sources and sum them accordingly, returning a single set of AdjointSource objects which can be used the same as any adjsrc attribute returned from measure.

Parameters:

adjsrcs (dict of dicts) – a collection of dictionaries whose keys are the period band set in flow_multiband(periods) and whose values are dictionaries returned in Manager.adjsrcs from Manager.measure()

Return type:

(dict of Windows, dict of AdjointSource)

Returns:

a dictionary of Windows, and AdjointSource objects for each component in the componet list. Adjoint sources and misfits are the average of all input adjsrcs for the given periods range

standardize(force=False, standardize_to='syn', normalize_to=None)[source]

Standardize the observed and synthetic traces in place. Ensures Streams have the same starttime, endtime, sampling rate, npts.

Parameters:
  • force (bool) – allow the User to force the function to run even if checks say that the two Streams are already standardized

  • standardize_to (str) – allows User to set which Stream conforms to which by default the Observed traces should conform to the Synthetic ones because exports to Specfem should be controlled by the Synthetic sampling rate, npts, etc. Choices are ‘obs’ and ‘syn’.

  • normalize_to (str) – allow for normalizing the amplitudes of the two traces. Choices are: ‘obs’: normalize synthetic waveforms to the max amplitude of obs ‘syn’: normalize observed waveform to the max amplitude of syn ‘one’: normalize both waveforms so that their max amplitude is 1

preprocess(which='both', filter_=True, corners=2, remove_response=False, taper_percentage=0.05, zerophase=True, normalize_to=None, convolve_with_stf=True, half_duration=None, **kwargs)[source]

Apply a simple, default preprocessing scheme to observed and synthetic waveforms in place.

Default preprocessing tasks: Remove response (optional), rotate (optional), filter, convolve with source time function (optional)

User is free to skip this step and perform their own preprocessing on Manager.st_obs and Manager.st_syn if they require their own unique processing workflow.

Parameters:
  • which (str) – “obs”, “syn” or “both” to choose which stream to process defaults to “both”

  • filter (bool) – filter data using Config.min_period and Config.max_period with corners filter corners. Apply tapers and demeans before and after application of filter.

  • taper_percentage (float) – percentage [0, 1] of taper to apply to head and tail of the data before and after preprocessing

  • corners (int) – number of filter corners to apply if `filter`==True

  • zerophase (bool) – apply a zerophase filter (True) or not (False). Zerophase filters are run back and forth meaning no phase shift is applied, but more waveform distorition may be present.

  • remove_response (bool) – flag, remove instrument response from ‘obs’ type data using the provided inv. Defaults to False. Kwargs are passed directly to the the ObsPy remove_response function. See ObsPy docs for available options.

  • convolve_with_stf (bool) – flag, convolve ‘syn’ type data with a Gaussian source time function to mimic a finite source. Used when half half duration in seismic simulations is set to 0. Defaults to True and relies on parameters half_duration

  • half_duration (float) – Source time function half duration in units of seconds. Only used if `convolve_with_stf`==True

  • normalize_to (str) – allow for normalizing the amplitudes of the two traces. Choices are: ‘obs’: normalize synthetic waveforms to the max amplitude of obs ‘syn’: normalize observed waveform to the max amplitude of syn ‘one’: normalize both waveforms so that their max amplitude is 1

window(fix_windows=False, iteration=None, step_count=None, force=False)[source]

Evaluate misfit windows using Pyflex. Save windows to ASDFDataSet. Allows previously defined windows to be retrieved from ASDFDataSet.

Note

  • Windows are stored as dictionaries of pyflex.Window objects.

  • All windows are saved into the ASDFDataSet, even if retrieved.

  • STA/LTA information is collected and stored internally.

Parameters:
  • fix_windows (bool) – do not pick new windows, but load windows from the given dataset from ‘iteration’ and ‘step_count’

  • iteration (int or str) – if ‘fix_windows’ is True, look for windows in this iteration. If None, will check the latest iteration/step_count in the given dataset

  • step_count (int or str) – if ‘fix_windows’ is True, look for windows in this step_count. If None, will check the latest iteration/step_count in the given dataset

  • force (bool) – ignore flag checks and run function, useful if e.g. external preprocessing is used that doesn’t meet flag criteria

retrieve_windows(iteration, step_count, return_previous)[source]

Mid-level window selection function that retrieves windows from a PyASDF Dataset, recalculates window criteria, and attaches window information to Manager. No access to rejected window information.

Parameters:
  • iteration (int or str) – retrieve windows from the given iteration

  • step_count (int or str) – retrieve windows from the given step count in the given dataset

  • return_previous (bool) – if True: return windows from the previous step count in relation to the given iteration/step_count. if False: return windows from the given iteration/step_count

select_windows_plus()[source]

Mid-level custom window selection function that calls Pyflex select windows, but includes additional window suppression functionality. Includes custom Pyflex addition of outputting rejected windows, which will be used internally for plotting.

Note

Pyflex will throw a ValueError if the arrival of the P-wave is too close to the initial portion of the waveform, considered the ‘noise’ section. This happens for short source-receiver distances (< 100km).

This error becomes a PyflexError if no event/station attributes are provided to the WindowSelector

We could potentially deal with this by zero-padding the waveforms, and running select_windows() again, but for now we just raise a ManagerError and allow processing to continue

measure(force=False)[source]

Measure misfit and calculate adjoint sources using PyAdjoint.

Method for caluculating misfit set in Config, Pyadjoint expects standardized traces with the same spectral content, so this function will not run unless these flags are passed.

Returns a dictionary of adjoint sources based on component. Saves resultant dictionary to a pyasdf dataset if given.

Note

Pyadjoint returns an unscaled misfit value for an entire set of windows. To return a “total misfit” value as defined by Tape (2010) Eq. 6, the total summed misfit will need to be scaled by the number of misfit windows chosen in Manager.window().

Parameters:

force (bool) – ignore flag checks and run function, useful if e.g. external preprocessing is used that doesn’t meet flag criteria

_format_windows()[source]

In pyadjoint.calculate_adjoint_source, the window needs to be a list of lists, with each list containing the [left_window, right_window] Each window argument should be given in units of time (seconds).

Return type:

dict of list of lists

Returns:

dictionary with key related to individual components, and corresponding to a list of lists containing window start and end

plot(choice='both', save=None, show=True, corners=None, figsize=None, dpi=100, **kwargs)[source]

Plot observed and synthetics waveforms, misfit windows, STA/LTA and adjoint sources for all available components. Append information about misfit, windows and window selection. Also as subplot create a source receiver map which contains annotated information detailing src-rcv relationship like distance and BAz. Options to plot either or.

For valid key word arguments see visuals.manager_plotter and visuals.map_maker

Parameters:
  • show (bool) – show the plot once generated, defaults to False

  • save (str) – absolute filepath and filename if figure should be saved

  • corners – {lat_min, lat_max, lon_min, lon_max} corners to cut the map to, otherwise a global map is provided

  • choice (str) – choice for what to plot: * ‘wav’: plot waveform figure only * ‘map’: plot a source-receiver map only * ‘both’ (default): plot waveform and source-receiver map together

  • figsize (tuple) – optional size of the figure, set by plot()

  • dpi (int) – optional dots per inch (resolution) of figure