pyatoa.core.inspector
A class to aggregate time windows, source-receiver information and misfit using Pandas.
Module Contents
Classes
- class pyatoa.core.inspector.Inspector(tag='default', verbose=True)[source]
Bases:
pyatoa.visuals.insp_plot.InspectorPlotter
This plugin object will collect information from a Pyatoa run folder and allow the User to easily understand statistical information or generate statistical plots to help understand a seismic inversion.
Inherits plotting capabilities from InspectorPlotter class to reduce clutter
- property restarts[source]
Try to guess the indices of restarts for the convergence plot, based on misfit increases between adjacent good models as well as discontinuous misfit values between the final line search model and the subsequent initial model. Not guaranteed to catch everything, so the result may require manual review using the convergence() function.
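The first of the two criteria above (a misfit increase between adjacent accepted models) can be illustrated with a minimal plain-Python sketch. The function name `guess_restarts` and the input list are hypothetical and are not part of the Inspector API; the discontinuity check between line search and initial models is not covered here.

```python
def guess_restarts(misfits):
    """Return the indices at which misfit rises relative to the previous model.

    `misfits` is assumed to be an ordered list of misfit values for accepted
    models. This is an illustrative heuristic, not Pyatoa's implementation.
    """
    restarts = []
    for idx in range(1, len(misfits)):
        # A converging inversion should reduce misfit from model to model,
        # so an increase is flagged as a possible restart.
        if misfits[idx] > misfits[idx - 1]:
            restarts.append(idx)
    return restarts


accepted_misfits = [10.0, 8.0, 6.5, 7.2, 6.0, 5.1]  # hypothetical values
print(guess_restarts(accepted_misfits))
```

As the description notes, a heuristic like this can miss restarts, so flagged indices are best treated as candidates for manual review.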
- _get_srcrcv_from_dataset(ds)[source]
Get source and receiver information from the dataset. This includes latitude and longitude values for both, and event information including magnitude, origin time, id, etc.
Returns DataFrames for sources and receivers only if they are not already contained in the class DataFrames, to avoid duplicates.
Returns empty DataFrames if no unique info was found.
- Parameters:
ds (pyasdf.ASDFDataSet) – dataset to query for distances
- Rtype source:
pandas.core.frame.DataFrame
- Return source:
single row Dataframe containing event info from dataset
- Rtype receivers:
multiindexed dataframe containing unique station info
- _get_windows_from_dataset(ds)[source]
Get window and misfit information from dataset auxiliary data. Model and Step information should match between the two auxiliary data objects, MisfitWindows and AdjointSources.
- TODO: break this into _get_windows_from_dataset and _get_adjsrcs_from_dataset?
- Parameters:
ds (pyasdf.ASDFDataSet) – dataset to query for misfit
- Return type:
pandas.DataFrame
- Returns:
a dataframe object containing information per misfit window
- _parse_nonetype_eval(iteration, step_count)[source]
Whenever a user does not choose an iteration or step count, e.g., in plotting functions, this function defines default values based on the initial model (if neither is given), or the last step count for a given iteration (if only the iteration is given). Providing only a step count is not allowed.
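The default-resolution rules above can be sketched as follows. The function name `resolve_evaluation`, the `steps_by_iteration` mapping, and the choice of `ValueError` are assumptions made for illustration; the sketch also assumes zero-padded tags such as 'i01' and 's00' so that string ordering matches numeric ordering.

```python
def resolve_evaluation(steps_by_iteration, iteration=None, step_count=None):
    """Pick default iteration/step tags following the rules described above.

    `steps_by_iteration` is a hypothetical mapping such as
    {'i01': ['s00', 's01']}; it is not an Inspector attribute.
    """
    if iteration is None and step_count is not None:
        # Only a step count was given, which is not allowed.
        raise ValueError("step_count cannot be given without an iteration")
    if iteration is None:
        # Neither given: fall back to the initial model.
        iteration = min(steps_by_iteration)
        step_count = min(steps_by_iteration[iteration])
    elif step_count is None:
        # Only the iteration given: use its last step count.
        step_count = max(steps_by_iteration[iteration])
    return iteration, step_count


evaluations = {"i01": ["s00", "s01"], "i02": ["s00", "s01", "s02"]}
print(resolve_evaluation(evaluations))
print(resolve_evaluation(evaluations, iteration="i02"))
```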
- discover(path='./', ignore_symlinks=True)[source]
Allow the Inspector to scour through a path and find relevant files, appending them to the internal structure as necessary.
- append(dsfid, srcrcv=True, windows=True)[source]
Simple function to parse information from a pyasdf.asdf_data_set.ASDFDataSet file and append it to the current collection of information.
- extend(windows)[source]
Extend the current Inspector data frames with the windows from another Inspector. This is useful for when an inversion has been run in legs, so two individual inspectors constitute a single inversion.
Note
The current inspector is considered leg A, and the argument ‘windows’ is considered leg B. Leg B will have its iteration numbers changed to reflect this
Warning
This will only work if all the events and stations are the same. That is, only two identical inversion scenarios can be used.
- Parameters:
windows (pandas.core.frame.DataFrame or list of DataFrames) – Windows from a separate inspector object that will be used to extend the current Inspector. Can also be provided as a list of DataFrames to extend multiple times.
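To illustrate what renumbering leg B might look like, here is a plain-Python sketch. It assumes that iteration tags have the form 'i01' and that leg B is offset by the final iteration number of leg A; both the helper name `shift_iterations` and that offset rule are assumptions, and the actual renumbering in Pyatoa may differ.

```python
def shift_iterations(windows_a, windows_b):
    """Append leg B rows to leg A, renumbering leg B's iterations.

    Rows are hypothetical dicts with an 'iteration' tag such as 'i02'.
    Leg B is shifted by the last iteration number found in leg A.
    """
    last_a = max(int(row["iteration"][1:]) for row in windows_a)
    shifted = []
    for row in windows_b:
        new_row = dict(row)  # copy so that leg B itself is left untouched
        number = int(row["iteration"][1:]) + last_a
        new_row["iteration"] = f"i{number:02d}"
        shifted.append(new_row)
    return windows_a + shifted


leg_a = [{"iteration": "i01", "misfit": 5.0}, {"iteration": "i02", "misfit": 4.0}]
leg_b = [{"iteration": "i01", "misfit": 3.5}]
for row in shift_iterations(leg_a, leg_b):
    print(row)
```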
- save(path='./', fmt='csv', tag=None)[source]
Save the collected attributes to files in the format given by fmt (default 'csv') for easier re-loading.
Note
fmt == ‘hdf’ requires ‘pytables’ to be installed in the environment
- read(path='./', fmt=None, tag=None)[source]
Load previously saved attributes to avoid re-processing data.
- isolate(iteration=None, step_count=None, event=None, network=None, station=None, channel=None, component=None, keys=None, exclude=None, unique_key=None)[source]
Returns a new dataframe that is grouped by a given index. If a variable is None, defaults to returning all available values for it.
- Parameters:
event (str) – event id e.g. ‘2018p130600’ (optional)
iteration (str) – iteration e.g. ‘i00’ (optional)
step_count (str) – step count e.g. ‘s00’ (optional)
station (str) – station name e.g. ‘BKZ’ (optional)
network (str) – network name e.g. ‘NZ’ (optional)
channel (str) – channel name e.g. ‘HHE’ (optional)
component (str) – component name e.g. ‘Z’ (optional)
unique_key (str) – isolates model, event and station information, alongside a single info key, such as dlnA. Useful for looking at one variable without having to write out long lists to ‘exclude’ or ‘keys’
keys (list) – list of keys to retain in returned dataset, ‘exclude’ will override this variable, best to use them separately
exclude (list) – list of keys to remove from returned dataset
- Return type:
pandas.DataFrame
- Returns:
DataFrame with selected rows based on selected column values
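The selection semantics described above (None means "keep everything", and 'exclude' is applied after 'keys') can be sketched on a list of dictionaries. The helper `isolate_rows` and the sample rows are hypothetical stand-ins for the windows DataFrame, not the Inspector's own code.

```python
def isolate_rows(rows, keys=None, exclude=None, **selectors):
    """Select rows by column value, then trim the columns.

    A selector left as None places no restriction on that column.
    """
    selected = [
        row for row in rows
        if all(value is None or row.get(name) == value
               for name, value in selectors.items())
    ]
    if keys is not None:
        # Retain only the requested columns.
        selected = [{k: r[k] for k in keys if k in r} for r in selected]
    if exclude is not None:
        # Applied last, so 'exclude' overrides 'keys'.
        selected = [{k: v for k, v in r.items() if k not in exclude}
                    for r in selected]
    return selected


windows = [  # hypothetical window rows
    {"iteration": "i01", "step_count": "s00", "station": "BKZ", "misfit": 1.2},
    {"iteration": "i01", "step_count": "s00", "station": "TOZ", "misfit": 0.8},
    {"iteration": "i01", "step_count": "s01", "station": "BKZ", "misfit": 0.9},
]
print(isolate_rows(windows, station="BKZ", step_count=None,
                   keys=["station", "misfit"]))
```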
- nwin(level='step')[source]
Find the number of misfit windows and their cumulative length for a given iteration/step count.
Note
Neat trick to select just by station: insp.nwin(level='station').query("station == 'BFZ'")
- Parameters:
level (str) –
Level to get number of windows by. Default is ‘step’
step: to get the total window length and number of windows for the given step count.
station: to get this on a per-station basis, useful for identifying sta quality.
- Return type:
pandas.DataFrame
- Returns:
a DataFrame with indices corresponding to iter, step, columns listing the number of windows (nwin) and the cumulative length of windows in seconds (length_s)
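The aggregation behind the returned table can be sketched in plain Python. The helper `count_windows` is hypothetical, and the sketch assumes each window row carries its own `length_s` value; the result mirrors the nwin and length_s columns described above.

```python
from collections import defaultdict


def count_windows(rows):
    """Count windows and sum their lengths per (iteration, step_count)."""
    totals = defaultdict(lambda: {"nwin": 0, "length_s": 0.0})
    for row in rows:
        key = (row["iteration"], row["step_count"])
        totals[key]["nwin"] += 1
        totals[key]["length_s"] += row["length_s"]
    return dict(totals)


windows = [  # hypothetical window rows
    {"iteration": "i01", "step_count": "s00", "length_s": 10.0},
    {"iteration": "i01", "step_count": "s00", "length_s": 20.5},
    {"iteration": "i01", "step_count": "s01", "length_s": 15.0},
]
for key, value in count_windows(windows).items():
    print(key, value)
```

Grouping by an additional 'station' field in the key would give the per-station variant.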
- misfit(level='step', reset=False)[source]
Sum the total misfit for a given iteration based on the individual misfits for each misfit window, and the number of sources used. Calculated misfits are stored internally to avoid needing to recalculate each time this function is called
Note
- To get per-station misfit on a per-step basis
df = insp.misfit(level="station").query("station == 'TOZ'")
df.groupby(['iteration', 'step']).sum()
- Parameters:
- Return type:
- Returns:
total misfit for each iteration in the class
- stats(level='event', choice='mean', key=None, iteration=None, step_count=None)[source]
Calculate the per-level statistical values for DataFrame
- Parameters:
- Return type:
pandas.DataFrame
- Returns:
DataFrame containing the choice of stats for given options
- minmax(iteration=None, step_count=None, keys=None, quantities=None, pprint=True)[source]
Calculate and print the min/max values for a range of parameters for a given iteration and step count. Useful for understanding the worst/best case scenarios and their relation to the average.
- Parameters:
iteration (str) – filter for a given iteration
step_count (str) – filter for a given step count
keys (list of str) – keys to calculate minmax values for, must be a subset of Inspector.windows.keys()
quantities (list of str) – quantities to get values for, e.g. min, max, median, must be an attribute of pandas.core.series.Series
pprint (bool) – pretty print the resulting values
- Return type:
- Returns:
dictionary containing the minmax stats
- compare(iteration_a=None, step_count_a=None, iteration_b=None, step_count_b=None)[source]
Compare the misfit and number of windows on an event by event basis between two evaluations. Provides absolute values as well as differences. Final dataframe is sorted by the difference in misfit, showing the most and least improved events.
- Parameters:
- Return type:
pandas.core.frame.DataFrame
- Returns:
a sorted data frame containing the difference of misfit and number of windows between final and initial
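A minimal sketch of an event-by-event comparison follows. The helper `compare_events`, the (misfit, nwin) tuples, and the sign convention (evaluation B minus evaluation A) are assumptions for illustration; the column names are hypothetical.

```python
def compare_events(eval_a, eval_b):
    """Compare misfit and window counts per event between two evaluations.

    Each input maps an event id to a (misfit, nwin) tuple. Differences are
    taken as B minus A, so negative misfit differences mean improvement.
    """
    rows = []
    for event, (misfit_a, nwin_a) in eval_a.items():
        if event not in eval_b:
            continue  # only events present in both evaluations are compared
        misfit_b, nwin_b = eval_b[event]
        rows.append({
            "event": event,
            "misfit_a": misfit_a,
            "misfit_b": misfit_b,
            "diff_misfit": misfit_b - misfit_a,
            "diff_nwin": nwin_b - nwin_a,
        })
    # Most improved events (largest misfit reduction) come first.
    return sorted(rows, key=lambda row: row["diff_misfit"])


initial = {"event_1": (4.0, 10), "event_2": (3.0, 8), "event_3": (5.0, 12)}
final = {"event_1": (2.0, 14), "event_2": (3.5, 7), "event_3": (4.5, 12)}
for row in compare_events(initial, final):
    print(row["event"], row["diff_misfit"], row["diff_nwin"])
```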
- compare_windows(iteration_a=None, step_count_a=None, iteration_b=None, step_count_b=None)[source]
Compare individual, matching misfit windows between two evaluations.
Note
This will only work/make sense if the windows were fixed between the two evaluations, such that they share the exact same window selections.
- Parameters:
- Return type:
pandas.core.frame.DataFrame
- Returns:
a data frame containing differences of windowing parameters between final and initial models
- filter_sources(lat_min=None, lat_max=None, lon_min=None, lon_max=None, depth_min=None, depth_max=None, mag_min=None, mag_max=None, min_start=None, max_start=None)[source]
Go through misfits and windows and remove events that fall outside a certain bounding box. Return sources that fall within the box. Bounds are inclusive of given values.
- Parameters:
lat_min (float) – minimum latitude in degrees
lat_max (float) – maximum latitude in degrees
lon_min (float) – minimum longitude in degrees
lon_max (float) – maximum longitude in degrees
depth_min (float) – minimum depth of event in km, depth is positive
depth_max (float) – maximum depth of event in km, depth is positive
mag_min (float) – minimum magnitude
mag_max (float) – maximum magnitude
min_start (obspy.UTCDateTime()) – minimum origin time of event
max_start (obspy.UTCDateTime()) – maximum origin time of event
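The inclusive bounding-box test can be sketched as below, where a bound left as None imposes no limit. The helpers `within_bounds` and `select_sources`, the field names, and the sample events are all hypothetical; only latitude and magnitude are shown, and the other bounds would follow the same pattern.

```python
def within_bounds(value, lower=None, upper=None):
    """Inclusive range check; a bound of None is treated as unbounded."""
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


def select_sources(sources, lat_min=None, lat_max=None,
                   mag_min=None, mag_max=None):
    """Return ids of sources whose latitude and magnitude fall in bounds."""
    return [
        event_id for event_id, info in sources.items()
        if within_bounds(info["latitude"], lat_min, lat_max)
        and within_bounds(info["magnitude"], mag_min, mag_max)
    ]


sources = {  # hypothetical events
    "event_a": {"latitude": -39.0, "magnitude": 5.2},
    "event_b": {"latitude": -41.5, "magnitude": 4.0},
    "event_c": {"latitude": -37.0, "magnitude": 6.1},
}
print(select_sources(sources, lat_min=-41.5, lat_max=-38.0))
```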
- get_models()[source]
Return a sorted list of misfits which correspond to accepted models, label discards of the line search, and differentiate the final accepted line search evaluation from the previous iteration and the initial evaluation of the current iteration.
Note
State and status is given as: 0 == INITIAL function evaluation for the model; 1 == SUCCESS, a successful function evaluation for the model; -1 == DISCARD, a discarded trial step from the line search.
- Return type:
pandas.core.frame.DataFrame
- Returns:
a dataframe containing model numbers, their corresponding iteration, step count and misfit value, and the status of the function evaluation.
- get_srcrcv()[source]
Retrieve information regarding source-receiver pairs including distance, backazimuth and theoretical traveltimes for a 1D Earth model.
- Return type:
pandas.core.frame.DataFrame
- Returns:
separate dataframe with distance and backazimuth columns, that may be used as a lookup table
- get_unique_models(float_precision=3)[source]
Find all accepted models (status 0 or 1) that have a unique misfit value. Because some forward evaluations are repeats of the previous line search evaluation, they will effectively be the same evaluation so they can be removed
- Parameters:
float_precision (int) – nominally identical misfit values may differ after some decimal place. This value sets the decimal place at which values are truncated for comparison
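The truncate-then-compare idea can be sketched in plain Python. The helper `unique_models` and the sample records are hypothetical; the sketch keeps the first accepted model for each truncated misfit value, which is one reasonable reading of the description rather than a statement of how Pyatoa does it.

```python
import math


def unique_models(models, float_precision=3):
    """Keep accepted models (status 0 or 1) with a unique truncated misfit."""
    factor = 10 ** float_precision
    seen = set()
    unique = []
    for model in models:
        if model["status"] not in (0, 1):
            continue  # discarded line search steps are skipped
        # Truncate to an integer key to avoid comparing floats directly.
        key = math.trunc(model["misfit"] * factor)
        if key in seen:
            continue  # a repeat of an earlier evaluation
        seen.add(key)
        unique.append(model)
    return unique


models = [  # hypothetical evaluations
    {"misfit": 3.75, "status": 0},
    {"misfit": 3.1, "status": -1},
    {"misfit": 2.50012, "status": 1},
    {"misfit": 2.50049, "status": 0},
    {"misfit": 1.9, "status": 1},
]
print([m["misfit"] for m in unique_models(models)])
```

With float_precision=3, the two values near 2.500 share the same truncated key, so only the first is kept.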