w4h.clean module
The clean module contains functions for cleaning the data (i.e., removing records that should not be used in further analysis).
- w4h.clean.remove_bad_depth(df_with_depth, top_col='TOP', bottom_col='BOTTOM', depth_type='depth', verbose=False, log=False)
Function to remove all records in the dataframe of well interpretations where the depth information is bad (i.e., where the bottom of the interval is nearer to the surface than the top).
- Parameters:
- df_with_depth : pandas.DataFrame
Pandas dataframe containing the well records and descriptions for each interval
- top_col : str, default='TOP'
The name of the column containing the depth or elevation for the top of each interval, by default 'TOP'
- bottom_col : str, default='BOTTOM'
The name of the column containing the depth or elevation for the bottom of each interval, by default 'BOTTOM'
- depth_type : str, {'depth', 'elevation'}, default='depth'
Whether the table is organized by depth or elevation, by default 'depth'. If depth, the top column will have smaller values than the bottom column. If elevation, the top column will have higher values than the bottom column.
- verbose : bool, default=False
Whether to print results to the terminal, by default False
- log : bool, default=False
Whether to log results to a log file, by default False
- Returns:
- pandas.DataFrame
Pandas dataframe with the records removed where the top is indicated to be below the bottom.
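The depth check described above can be illustrated with plain pandas. This is a minimal sketch of the rule, not the w4h implementation: the helper name `drop_bad_depth` is hypothetical, and it assumes that intervals whose top equals their bottom are kept.

```python
import pandas as pd

def drop_bad_depth(df, top_col="TOP", bottom_col="BOTTOM", depth_type="depth"):
    """Hypothetical stand-in for the filtering rule (not w4h's own code)."""
    if depth_type == "depth":
        # Depth increases downward, so a valid interval has top <= bottom
        keep = df[top_col] <= df[bottom_col]
    elif depth_type == "elevation":
        # Elevation decreases downward, so a valid interval has top >= bottom
        keep = df[top_col] >= df[bottom_col]
    else:
        raise ValueError("depth_type must be 'depth' or 'elevation'")
    return df[keep].reset_index(drop=True)

# The second interval has its bottom (5) above its top (10), so it is dropped
intervals = pd.DataFrame({"TOP": [0, 10, 30], "BOTTOM": [10, 5, 45]})
cleaned = drop_bad_depth(intervals)
```

With `depth_type="elevation"` the comparison flips, which is why the same table can be valid under one setting and mostly rejected under the other.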
- w4h.clean.remove_no_depth(df_with_depth, top_col='TOP', bottom_col='BOTTOM', no_data_val_table='', verbose=False, log=False)
Function to remove well intervals with no depth information.
- Parameters:
- df_with_depth : pandas.DataFrame
Dataframe containing well descriptions
- top_col : str, optional
Name of the column containing information on the top of the well intervals, by default 'TOP'
- bottom_col : str, optional
Name of the column containing information on the bottom of the well intervals, by default 'BOTTOM'
- no_data_val_table : any, optional
No-data value in the input data, used to indicate that depth data is missing. It is replaced by np.nan, by default ''
- verbose : bool, optional
Whether to print results to the console, by default False
- log : bool, default=False
Whether to log results to a log file, by default False
- Returns:
- df_with_depth : pandas.DataFrame
Dataframe with the intervals lacking depth information dropped
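The "replace the no-data value with np.nan, then drop" pattern mentioned for `no_data_val_table` can be sketched in plain pandas. The helper `drop_no_depth` below is hypothetical and is not the w4h source; in particular, it assumes a row is dropped when either the top or the bottom is missing. The `FORMATION` values are made-up sample data.

```python
import numpy as np
import pandas as pd

def drop_no_depth(df, top_col="TOP", bottom_col="BOTTOM", no_data_val=""):
    """Hypothetical sketch: treat no_data_val as missing, then drop those rows."""
    out = df.copy()
    depth_cols = [top_col, bottom_col]
    # Mark the no-data value as NaN so pandas recognizes it as missing
    out[depth_cols] = out[depth_cols].replace(no_data_val, np.nan)
    # Assumption: an interval is dropped if either depth is missing
    return out.dropna(subset=depth_cols).reset_index(drop=True)

intervals = pd.DataFrame({
    "TOP": [0, "", 20],
    "BOTTOM": [10, 20, ""],
    "FORMATION": ["clay", "sand", "till"],
})
with_depth = drop_no_depth(intervals)  # only the "clay" interval has both depths
```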
- w4h.clean.remove_no_description(df_with_descriptions, description_col='FORMATION', no_data_val_table='', verbose=False, log=False)
Function that removes all records in the dataframe containing the well descriptions where no description is given.
- Parameters:
- df_with_descriptions : pandas.DataFrame
Pandas dataframe containing the well records with their individual descriptions
- description_col : str, optional
Name of the column containing the geologic description of each interval, by default 'FORMATION'
- no_data_val_table : str, optional
The value expected if the column is empty or there is no data. These will be replaced by np.nan before being removed, by default ''
- verbose : bool, optional
Whether to print the results of this step to the terminal, by default False
- log : bool, default=False
Whether to log results to a log file, by default False
- Returns:
- pandas.DataFrame
Pandas dataframe with records with no description removed.
- w4h.clean.remove_no_topo(df_with_topo, zcol='SURFACE_ELEV', no_data_val_table='', verbose=False, log=False)
Function to remove wells that do not have topography data (needed for layer selection later).
This function is intended to be run on the metadata table after an attempt has been made to add elevations.
- Parameters:
- df_with_topo : pandas.DataFrame
Pandas dataframe containing elevation information.
- zcol : str, default='SURFACE_ELEV'
Name of the elevation column, by default 'SURFACE_ELEV'
- no_data_val_table : any, default=''
Value in the dataset that indicates no data is present (replaced with np.nan), by default ''
- verbose : bool, optional
Whether to print outputs, by default False
- log : bool, default=False
Whether to log results to a log file, by default False
- Returns:
- pandas.DataFrame
Pandas dataframe with the records that have no topography data removed.
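Because `no_data_val_table` accepts any value, the same idea applies to numeric sentinels. The sketch below is a plain-pandas illustration under stated assumptions, not the w4h implementation: the helper `drop_no_topo`, the `WELL_ID` column, and the `-9999.0` sentinel are all hypothetical.

```python
import numpy as np
import pandas as pd

def drop_no_topo(df, zcol="SURFACE_ELEV", no_data_val=""):
    """Hypothetical sketch: drop rows whose elevation is missing."""
    out = df.copy()
    # Replace the no-data value with NaN so it counts as missing
    out[zcol] = out[zcol].replace(no_data_val, np.nan)
    return out.dropna(subset=[zcol]).reset_index(drop=True)

# Metadata table where one well never received an elevation (sentinel -9999.0)
wells = pd.DataFrame({
    "WELL_ID": [101, 102, 103],          # hypothetical identifier column
    "SURFACE_ELEV": [215.4, -9999.0, 198.7],
})
with_topo = drop_no_topo(wells, no_data_val=-9999.0)
```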
- w4h.clean.remove_nonlocated(df_with_locations, xcol='LONGITUDE', ycol='LATITUDE', no_data_val_table='', verbose=False, log=False)
Function to remove wells and well intervals where there is no location information.
- Parameters:
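- df_with_locations : pandas.DataFrame
Pandas dataframe containing well descriptions and location information (e.g., Latitude/Longitude)
- xcol : str, default='LONGITUDE'
Name of the column containing the x coordinate of each well, by default 'LONGITUDE'
- ycol : str, default='LATITUDE'
Name of the column containing the y coordinate of each well, by default 'LATITUDE'
- no_data_val_table : any, default=''
Value in the dataset that indicates no data is present, by default ''
- verbose : bool, default=False
Whether to print results to the terminal, by default False
- log : bool, default=False
Whether to log results to a log file, by default False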
- Returns:
- df_with_locations : pandas.DataFrame
Pandas dataframe containing only data with location information
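As a rough illustration of location filtering with the `xcol` and `ycol` arguments, the plain-pandas sketch below drops rows where a coordinate is either the no-data value or already missing. The helper `drop_nonlocated` is hypothetical and not the w4h code, and it assumes a well needs both coordinates to be kept.

```python
import numpy as np
import pandas as pd

def drop_nonlocated(df, xcol="LONGITUDE", ycol="LATITUDE", no_data_val=""):
    """Hypothetical sketch: keep only rows with both coordinates present."""
    out = df.copy()
    coord_cols = [xcol, ycol]
    # Convert the no-data value to NaN; values that are already None/NaN stay missing
    out[coord_cols] = out[coord_cols].replace(no_data_val, np.nan)
    # Assumption: a well is dropped if either coordinate is missing
    return out.dropna(subset=coord_cols).reset_index(drop=True)

wells = pd.DataFrame({
    "LONGITUDE": [-88.2, "", -89.1, None],
    "LATITUDE": [40.1, 40.3, None, None],
})
located = drop_nonlocated(wells)  # only the first well has both coordinates
```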