w4h.layers module

The Layers module contains functions for splitting data into a layered model and for interpolating data within the layers.

w4h.layers.combine_dataset(layer_dataset, surface_elev, bedrock_elev, layer_thick, log=False)

Function to combine xarray Datasets or DataArrays into a single xarray.Dataset. Useful for adding the surface, bedrock, layer thickness, and layer datasets to one variable, for example for pickling or netCDF export.

Parameters:
layer_dataset : xr.DataArray

DataArray containing all the interpolated layer information.

surface_elev : xr.DataArray

DataArray containing surface elevation data.

bedrock_elev : xr.DataArray

DataArray containing bedrock elevation data.

layer_thick : xr.DataArray

DataArray containing layer thickness at each point in the model grid.

log : bool, default=False

Whether to log inputs and outputs to log file.

Returns:
xarray.Dataset or xarray.DataArray

Dataset with each input array stored as a separate variable, or simply a DataArray, if specified.
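
The combination step can be pictured with plain xarray. The sketch below does not call combine_dataset(); it assumes all arrays share the same x/y coordinates, and the variable names (Layers, Surface_Elev, Bedrock_Elev, Layer_Thickness) and the "Layer" dimension name are illustrative assumptions, not necessarily the names w4h uses.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 model grid; coordinate values are illustrative only
coords = {"y": [0.0, 1.0], "x": [0.0, 1.0, 2.0]}

surface_elev = xr.DataArray(np.full((2, 3), 200.0), coords=coords, dims=("y", "x"))
bedrock_elev = xr.DataArray(np.full((2, 3), 110.0), coords=coords, dims=("y", "x"))
layer_thick = (surface_elev - bedrock_elev) / 9  # nine equal-thickness layers

# Interpolated layer data with an extra layer dimension (assumed name: "Layer")
layer_data = xr.DataArray(
    np.zeros((9, 2, 3)),
    coords={"Layer": np.arange(1, 10), **coords},
    dims=("Layer", "y", "x"),
)

# Each DataArray becomes one named variable in the combined Dataset
combined = xr.Dataset(
    {
        "Layers": layer_data,
        "Surface_Elev": surface_elev,
        "Bedrock_Elev": bedrock_elev,
        "Layer_Thickness": layer_thick,
    }
)
print(list(combined.data_vars))
```

A single Dataset like this can then be written out in one step, for example with its to_netcdf() method.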

w4h.layers.get_layer_depths(df_with_depths, surface_elev_col='SURFACE_ELEV', layer_thick_col='LAYER_THICK', layers=9, log=False)

Function to calculate depths and elevations of each model layer at each well, based on surface elevation, bedrock elevation, and number of layers/layer thickness.

Parameters:
df_with_depths : pandas.DataFrame

DataFrame containing well metadata.

layers : int, default=9

Number of layers. If drift thickness was calculated with get_drift_thick(), this should match the corresponding input parameter of that function. By default 9.

log : bool, default=False

Whether to log inputs and outputs to log file.

Returns:
pandas.DataFrame

DataFrame containing new columns for depth to/elevation of layers.
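
As a rough illustration of the arithmetic described above, the sketch below computes the depth to, and elevation of, the top of each layer from a surface elevation and a per-layer thickness. It does not call get_layer_depths(); it assumes equal-thickness layers measured downward from the surface, and the output column names (DEPTH_TOP_L1, ELEV_TOP_L1, and so on) are hypothetical.

```python
import pandas as pd

# Hypothetical well table; SURFACE_ELEV and LAYER_THICK match the
# function's default column names, the values are made up
wells = pd.DataFrame(
    {
        "API_NUMBER": [1001, 1002],
        "SURFACE_ELEV": [200.0, 180.0],
        "LAYER_THICK": [10.0, 5.0],
    }
)

layers = 3
for layer in range(1, layers + 1):
    # Depth from the surface to the top of this layer (layer 1 starts at the surface)
    depth_top = wells["LAYER_THICK"] * (layer - 1)
    # Hypothetical output column names, chosen for this sketch only
    wells[f"DEPTH_TOP_L{layer}"] = depth_top
    wells[f"ELEV_TOP_L{layer}"] = wells["SURFACE_ELEV"] - depth_top

print(wells.filter(like="ELEV_TOP").to_string(index=False))
```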

w4h.layers.layer_interp(points, model_grid, layers=None, interp_kind='nearest', surface_grid=None, bedrock_grid=None, layer_thick_grid=None, drift_thick_grid=None, return_type='dataset', export_dir=None, target_col='TARG_THICK_PER', layer_col='LAYER', xcol=None, ycol=None, xcoord='x', ycoord='y', log=False, verbose=False, **kwargs)

Function to interpolate results (by default TARG_THICK_PER, the target thickness percent per layer). Converts DataFrame points to gridded data. By default, results are saved to the Model_Layer variable in the output xarray.Dataset.

This function uses the scipy.interpolate module for interpolation.

Different interpolation methods may be used by specifying interp_kind:

* ‘nearest’: Nearest neighbor (fastest). Uses scipy.interpolate.NearestNDInterpolator()
* ‘linear’: Linear interpolation. Uses scipy.interpolate.LinearNDInterpolator()
* ‘interp2d’: Spline interpolation. Uses scipy.interpolate.bisplrep()
* ‘cloughtocher’: Cubic interpolation using the Clough-Tocher method. Uses scipy.interpolate.CloughTocher2DInterpolator()
* ‘radial basis function’: Radial basis function interpolation. Uses scipy.interpolate.RBFInterpolator()

Parameters:
points : list

List of pandas DataFrames or geopandas GeoDataFrames containing the point data. Should be the resDF_list output from layer_target_thick().

model_grid : xr.DataArray or xr.Dataset

xarray DataArray or Dataset with the coordinates/spatial reference of the output grid to interpolate to.

layers : int, default=None

Number of layers for interpolation. If None, uses the length of the points list to determine the number of layers. By default None.

interp_kind : str, {‘nearest’, ‘interp2d’, ‘linear’, ‘cloughtocher’, ‘radial basis function’}

Type of interpolation to use; see the N-D scattered section of the scipy.interpolate documentation. Accepted values are those listed above (compare the “kind” column of the N-D scattered section of the table at https://docs.scipy.org/doc/scipy/tutorial/interpolate.html). By default ‘nearest’.

return_type : str, {‘dataset’, ‘dataarray’}

Type of xarray object to return, either xr.DataArray or xr.Dataset, by default ‘dataset’.

export_dir : str or pathlib.Path, default=None

Export directory for interpolated grids, using w4h.export_grids(). If None, does not export. By default None.

target_col : str, default=‘TARG_THICK_PER’

Name of column in points containing data to be interpolated, by default ‘TARG_THICK_PER’.

layer_col : str, default=‘LAYER’

Name of column containing layer number. Not currently used. By default ‘LAYER’.

xcol : str, default=None

Name of column containing x coordinates. If None, will look for ‘geometry’ column, as in a geopandas.GeoDataFrame. By default None.

ycol : str, default=None

Name of column containing y coordinates. If None, will look for ‘geometry’ column, as in a geopandas.GeoDataFrame. By default None.

xcoord : str, default=‘x’

Name of x coordinate in model_grid, used to extract x values of model_grid, by default ‘x’.

ycoord : str, default=‘y’

Name of y coordinate in model_grid, used to extract y values of model_grid, by default ‘y’.

log : bool, default=False

Whether to log inputs and outputs to log file.

**kwargs

Keyword arguments to be read directly into whichever scipy.interpolate function is designated by the interp_kind parameter.

Returns:
interp_data : xr.DataArray or xr.Dataset, depending on return_type

With return_type=‘dataset’ (the default), returns an xr.Dataset with each layer as a separate variable. With return_type=‘dataarray’, returns an xr.DataArray with the layers added as a new dimension called Layer.
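
To show what the interp_kind options do underneath, the sketch below calls two of the scipy.interpolate classes named above directly on scattered points and evaluates them on a small regular grid. It does not call layer_interp(); the point locations, values, and grid are made up for illustration.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, NearestNDInterpolator

# Hypothetical well locations (x, y) and target thickness percent for one layer
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
targ_thick_per = np.array([0.0, 1.0, 1.0, 0.0, 0.5])

# Model grid nodes to interpolate onto (3 x 3 grid covering the points)
grid_x, grid_y = np.meshgrid(np.linspace(0.0, 1.0, 3), np.linspace(0.0, 1.0, 3))

# Nearest neighbor: each grid node takes the value of the closest point
nearest = NearestNDInterpolator(xy, targ_thick_per)(grid_x, grid_y)

# Linear: barycentric interpolation on a triangulation of the points
linear = LinearNDInterpolator(xy, targ_thick_per)(grid_x, grid_y)

print(nearest.shape, linear.shape)
```

Nearest-neighbor interpolation returns a value for every grid node, while LinearNDInterpolator returns NaN outside the convex hull of the input points, which is worth keeping in mind when wells do not cover the whole model grid.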

w4h.layers.layer_target_thick(gdf, layers=9, well_id_col='API_NUMBER', return_all=False, export_dir=None, outfile_prefix=None, depth_top_col='TOP', depth_bot_col='BOTTOM', log=False, **kwargs)

Function to calculate thickness of target material in each layer at each well point. This function loops through each model layer and divides up the well intervals from the input GeoDataFrame into 4 categories:

1) Intervals that pierce the top of the model layer but end within it
2) Intervals contained entirely within the model layer
3) Intervals that begin within the model layer but pierce the bottom
4) Intervals that begin above and end below the model layer

For each category, the amount/thickness of the target material within each layer is calculated. These records are then “truth-checked” to ensure there are no duplicates and combined. The percent thickness (target thickness/layer thickness) is then calculated before the data is returned and/or exported.

Parameters:
gdf : geopandas.GeoDataFrame

GeoDataFrame containing classified data, surface elevation, bedrock elevation, layer depths, and geometry.

layers : int, default=9

Number of layers in model, by default 9.

well_id_col : str, default=‘API_NUMBER’

The name of the column used to uniquely identify each well.

return_all : bool, default=False

If True, return list of original GeoDataFrames with extra column added for target thick for each layer. If False, return list of geopandas.GeoDataFrames with only essential information for each layer.

export_dir : str or pathlib.Path, default=None

If str or pathlib.Path, the directory to which the dataframes built in this function are exported.

outfile_prefix : str, default=None

Only used if export_dir is set. Will be used at the start of the exported filenames.

depth_top_col : str, default=‘TOP’

Name of column containing data for depth to top of described well intervals.

depth_bot_col : str, default=‘BOTTOM’

Name of column containing data for depth to bottom of described well intervals.

log : bool, default=False

Whether to log inputs and outputs to log file.

Returns:
res_df and/or reslist

A list of geopandas GeoDataFrames containing only the information needed for the next stage of analysis. If return_all=True, the input data with the actual descriptions will be returned as a separate list of GeoDataFrames.
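
The four interval categories described above all reduce to the overlap between a well interval and a model layer. The sketch below is a compact way to express that overlap; it is not the library's implementation (which handles the four cases separately), and the helper name target_thickness_in_layer and the example depths are hypothetical.

```python
def target_thickness_in_layer(top, bottom, layer_top, layer_bottom):
    """Thickness of the interval [top, bottom] that lies inside the layer.

    All values are depths below the surface, increasing downward.
    """
    overlap = min(bottom, layer_bottom) - max(top, layer_top)
    return max(overlap, 0.0)  # no overlap gives zero, never a negative thickness


# One hypothetical model layer spanning 10 to 20 depth units
layer_top, layer_bottom = 10.0, 20.0

# One example interval (top, bottom) for each of the four categories
intervals = {
    "1) pierces top, ends within": (5.0, 15.0),
    "2) entirely within": (12.0, 18.0),
    "3) begins within, pierces bottom": (17.0, 25.0),
    "4) begins above, ends below": (5.0, 30.0),
}

thick = {
    name: target_thickness_in_layer(t, b, layer_top, layer_bottom)
    for name, (t, b) in intervals.items()
}

for name, value in thick.items():
    # Percent thickness = target thickness / layer thickness
    print(name, value, value / (layer_bottom - layer_top))
```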

w4h.layers.merge_metadata(data_df, header_df, well_id_col='API_NUMBER', data_cols=None, header_cols=None, auto_pick_cols=False, drop_duplicate_cols=True, log=False, verbose=False, **kwargs)

Function to merge tables, intended for merging metadata and data tables.

Parameters:
data_df : pandas.DataFrame

“Left” dataframe, intended for this purpose to be dataframe with main data, but can be anything.

header_df : pandas.DataFrame

“Right” dataframe, intended for this purpose to be dataframe with metadata, but can be anything

data_cols : list, optional

List of strings of column names, for columns to be included after join from “left” table (data table). If None, all columns are kept. By default None.

header_cols : list, optional

List of strings of column names, for columns to be included in merged table after merge from “right” table (metadata). If None, all columns are kept. By default None.

auto_pick_cols : bool, default=False

Whether to autopick the columns from the metadata table. If True, the following column names are kept: [well_id_col, ‘LATITUDE’, ‘LONGITUDE’, ‘BEDROCK_ELEV’, ‘SURFACE_ELEV’, ‘BEDROCK_DEPTH’, ‘LAYER_THICK’]. By default False.

drop_duplicate_cols : bool, optional

If True, drops duplicate columns from the tables so that columns do not get renamed upon merge, by default True.

log : bool, default=False

Whether to log inputs and outputs to log file.

**kwargs

Keyword arguments that are passed directly to pd.merge(). By default, the ‘on’ and ‘how’ parameters are defined as on=`well_id_col` and how=‘inner’.

Returns:
mergedTable : pandas.DataFrame

Merged dataframe
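
The effect of the default on=well_id_col and how=‘inner’ settings can be seen with a plain pd.merge() call. The sketch below does not call merge_metadata(); the two small tables are made up, and selecting the header columns before merging is only an approximation of what header_cols does.

```python
import pandas as pd

# Hypothetical "left" table: one row per described well interval
data_df = pd.DataFrame(
    {
        "API_NUMBER": [1, 1, 2, 3],
        "TOP": [0, 10, 0, 0],
        "BOTTOM": [10, 25, 30, 12],
    }
)

# Hypothetical "right" table: one row of metadata per well
header_df = pd.DataFrame(
    {
        "API_NUMBER": [1, 2, 4],
        "LATITUDE": [40.1, 40.2, 40.4],
        "LONGITUDE": [-88.1, -88.2, -88.4],
        "SURFACE_ELEV": [200.0, 180.0, 190.0],
    }
)

# Keep only the metadata columns of interest (must include the key column)
header_cols = ["API_NUMBER", "SURFACE_ELEV"]

# Inner merge on the well ID: only wells present in both tables survive
merged = pd.merge(data_df, header_df[header_cols], on="API_NUMBER", how="inner")
print(merged)
```

With an inner merge, well 3 (no metadata) and well 4 (no interval data) are both dropped, so it can be worth comparing row counts before and after merging.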