Data Manipulation

DayMet data manipulation

Manipulate DayMet data structures.

DayMet data is downloaded in box mode based on watershed bounds. It can then be converted to HDF5 files that models can read.

watershed_workflow.daymet.convertToATS(dat)

Convert dictionary of Daymet datasets to daily average data in standard form.

This:

takes tmin and tmax to compute a mean
splits rain and snow precip based on mean air temp
standardizes units and names for ATS
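
The three steps above can be sketched in NumPy. This is an illustration under stated assumptions, not the library's implementation: the function name, the output keys, and the 0 °C rain/snow threshold are hypothetical, and the target units (K and m/s) are assumed to be what ATS expects. Daymet itself reports tmin and tmax in degrees Celsius and prcp in mm/day.

```python
import numpy as np

def daily_mean_and_phase_split(tmin_c, tmax_c, prcp_mm_per_day, threshold_c=0.0):
    """Hypothetical sketch: mean air temperature plus a rain/snow split."""
    tmin_c = np.asarray(tmin_c, dtype=float)
    tmax_c = np.asarray(tmax_c, dtype=float)
    prcp = np.asarray(prcp_mm_per_day, dtype=float)

    # Step 1: daily mean air temperature from tmin and tmax.
    tmean_c = 0.5 * (tmin_c + tmax_c)

    # Step 3 (units): mm/day -> m/s, and Celsius -> Kelvin (assumed targets).
    prcp_m_per_s = prcp / 1000.0 / 86400.0
    tmean_k = tmean_c + 273.15

    # Step 2: precipitation counts as rain at or above the threshold,
    # and as snow below it (threshold value is an assumption).
    is_rain = tmean_c >= threshold_c
    rain = np.where(is_rain, prcp_m_per_s, 0.0)
    snow = np.where(is_rain, 0.0, prcp_m_per_s)

    # Output keys are illustrative names only.
    return {
        "air temperature [K]": tmean_k,
        "precipitation rain [m s^-1]": rain,
        "precipitation snow [m s^-1]": snow,
    }

result = daily_mean_and_phase_split([-5.0, 2.0], [1.0, 10.0], [8.64, 86.4])
```

In this example the first day has a mean of -2 °C, so all of its precipitation is assigned to snow; the second day has a mean of 6 °C, so all of its precipitation is assigned to rain.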

Soil properties data manipulation

Functions for manipulating soil properties.

Computes soil properties such as permeability, porosity, and van Genuchten parameters given texture properties using the Rosetta model.

Also provides functions for gap filling soil data via clustering, dataframe manipulations to merge soil type regions with shared values, etc.

watershed_workflow.soil_properties.vgm_Rosetta(data)

Return van Genuchten model parameters using Rosetta v3 model.

(Zhang and Schaap, 2017 WRR)

Parameters: data (numpy.ndarray(nvar, nsamples)) – Input data.
Returns: van Genuchten model parameters
Return type: pandas.DataFrame
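
For context on what the van Genuchten parameters describe, the standard van Genuchten water retention curve can be sketched as follows. The function and argument names are illustrative and are not taken from watershed_workflow; the sketch assumes the usual Mualem constraint m = 1 - 1/n and that alpha and h use consistent units (for example 1/cm and cm).

```python
import numpy as np

def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Water content as a function of suction head h (taken as a magnitude)."""
    h = np.abs(np.asarray(h, dtype=float))
    m = 1.0 - 1.0 / n
    # Effective saturation: 1 at h = 0, decaying toward 0 as suction grows.
    s_e = (1.0 + (alpha * h) ** n) ** (-m)
    return theta_r + (theta_s - theta_r) * s_e

# Illustrative parameter values only; not output from Rosetta.
heads = np.array([0.0, 10.0, 100.0, 1.0e9])
theta = van_genuchten_theta(heads, theta_r=0.05, theta_s=0.40, alpha=0.02, n=1.5)
```

At zero suction the curve returns the saturated water content, and at very large suction it approaches the residual water content.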

watershed_workflow.soil_properties.vgm_from_SSURGO(df)

Get van Genuchten model parameters using Rosetta v3.

Parameters: df (pandas.DataFrame) – SSURGO properties dataframe, from manager_nrcs.FileManagerNRCS().get_properties()
Returns: df with new properties defining the van Genuchten model. The result may have fewer rows than df, because entries with NaN values in soil composition (for which van Genuchten parameters cannot be calculated) are dropped.
Return type: pandas.DataFrame

watershed_workflow.soil_properties.to_ATS(df)

Converts units from aggregated Rosetta standard parameters to ATS units.

watershed_workflow.soil_properties.cluster(rasters, nbins)

Given a set of raster bands, cluster the points into nbins clusters.

Returns the coloring map of the clusters. This is used to fill in missing soil property data.

Parameters:

rasters (np.ndarray((nx,ny,nbands))) – nbands rasters providing the spatial information on which to cluster.
nbins (int) – Number of bins to cluster into.

Returns:

codebook (np.ndarray((nbins,nbands))) – The nbins centroids of the clusters.
codes (np.ndarray((nx, ny), int)) – Which cluster each point belongs to.
distortion ((float, np.ndarray((nx*ny)))) – The distortion of the k-means clustering, and the distance between each observation and its nearest code.
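
To make the input and output shapes concrete, here is a minimal k-means sketch written directly in NumPy. It is not the library's implementation: the function name, the fixed iteration count, and the deterministic initialization (evenly spaced observations rather than the random starts that production k-means routines normally use) are all assumptions chosen to keep the example short and reproducible.

```python
import numpy as np

def cluster_sketch(rasters, nbins, n_iter=20):
    """Hypothetical k-means over raster bands using plain Lloyd iterations."""
    nx, ny, nbands = rasters.shape
    # Flatten the grid: one observation (row) per pixel, one column per band.
    obs = rasters.reshape(nx * ny, nbands).astype(float)

    # Deterministic initialization from evenly spaced observations (assumption).
    start = np.linspace(0, obs.shape[0] - 1, nbins).astype(int)
    codebook = obs[start].copy()

    for _ in range(n_iter):
        # Distance from every observation to every centroid: (npoints, nbins).
        dist = np.linalg.norm(obs[:, None, :] - codebook[None, :, :], axis=2)
        codes = dist.argmin(axis=1)
        for k in range(nbins):
            members = obs[codes == k]
            if len(members) > 0:  # leave empty clusters where they are
                codebook[k] = members.mean(axis=0)

    # Final assignment and per-observation distance to the nearest centroid.
    dist = np.linalg.norm(obs[:, None, :] - codebook[None, :, :], axis=2)
    codes = dist.argmin(axis=1)
    nearest = dist[np.arange(obs.shape[0]), codes]
    return codebook, codes.reshape(nx, ny), (nearest.mean(), nearest)

# Two-band raster whose left and right halves have clearly different values.
rasters = np.zeros((4, 4, 2))
rasters[:, 2:, :] = 10.0
rasters += 0.01 * np.arange(32).reshape(4, 4, 2)
codebook, codes, (distortion, distances) = cluster_sketch(rasters, nbins=2)
```

The codes array has the same (nx, ny) layout as the input rasters, so it can be used directly as a map of cluster membership.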

watershed_workflow.soil_properties.alpha_from_permeability(perm, poro)

Compute van Genuchten alpha from permeability and porosity.

Uses the relationship from Guarracino WRR 2007.

Parameters:

perm (array(double)) – Permeability, in [m^2]
poro (array(double)) – Porosity, [-]

Returns:

alpha – van Genuchten alpha, in [Pa^-1]

Return type:

array(double)
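
The unit handling behind such a conversion can be sketched as below. This is not the library's code. It assumes the closed form commonly quoted from Guarracino (2007), Ks = 4.65e4 * porosity * alpha^2, with Ks in cm/day and alpha in 1/cm; the coefficient and its units should be checked against the paper before use. The fluid constants and the function name are also assumptions.

```python
import numpy as np

RHO = 1000.0   # water density [kg m^-3] (assumed)
G = 9.80665    # gravitational acceleration [m s^-2]
MU = 1.0e-3    # dynamic viscosity of water [Pa s] (assumed, roughly 20 C)

def alpha_from_permeability_sketch(perm, poro):
    """Hypothetical sketch: van Genuchten alpha [Pa^-1] from perm [m^2] and porosity [-]."""
    perm = np.asarray(perm, dtype=float)
    poro = np.asarray(poro, dtype=float)

    # Permeability [m^2] -> saturated hydraulic conductivity [m/s] -> [cm/day].
    ks_cm_per_day = perm * RHO * G / MU * 100.0 * 86400.0

    # Invert the assumed relation Ks = 4.65e4 * poro * alpha^2 for alpha [cm^-1].
    alpha_per_cm = np.sqrt(ks_cm_per_day / (4.65e4 * poro))

    # One cm of water head corresponds to RHO * G * 0.01 Pa.
    return alpha_per_cm / (RHO * G * 0.01)

alpha = float(alpha_from_permeability_sketch(1.0e-12, 0.4))
alpha_4x = float(alpha_from_permeability_sketch(4.0e-12, 0.4))
```

Because alpha scales with the square root of permeability under this relation, quadrupling the permeability doubles alpha.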

watershed_workflow.soil_properties.get_bedrock_properties()

Simple helper function to get a one-row dataframe with bedrock properties.

Returns: Sane default bedrock soil properties.
Return type: pandas.DataFrame

watershed_workflow.soil_properties.mangle_glhymps_properties(shapes, min_porosity=0.01, max_permeability=inf, max_vg_alpha=inf)

GLHYMPS properties need their units changed and variables renamed.

Parameters:

shapes (list[dict] or list[shapely + properties]) – The raw result from FileManagerGLHYMPS.get_shapes()
min_porosity (float, optional) – Some GLHYMPS entries have 0 porosity; this sets a floor on that value. Default is 0.01.
max_permeability (float, optional) – If provided, sets a ceiling on the permeability.
max_vg_alpha (float, optional) – If provided, sets a ceiling on the vG alpha.

Returns:

The resulting properties in standard form, names, and units.

Return type:

pandas.DataFrame
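
The floor and ceiling behavior of min_porosity, max_permeability, and max_vg_alpha can be illustrated with a small NumPy sketch. The helper name and the array-based interface are hypothetical; the actual function operates on GLHYMPS shapes and also changes units and names, which is not shown here.

```python
import numpy as np

def apply_bounds_sketch(porosity, permeability, vg_alpha,
                        min_porosity=0.01, max_permeability=np.inf,
                        max_vg_alpha=np.inf):
    """Hypothetical helper showing only the clamping step."""
    # Floor: entries below min_porosity (for example 0) are raised to it.
    poro = np.maximum(np.asarray(porosity, dtype=float), min_porosity)
    # Ceilings: with the default of inf, these leave the values unchanged.
    perm = np.minimum(np.asarray(permeability, dtype=float), max_permeability)
    alpha = np.minimum(np.asarray(vg_alpha, dtype=float), max_vg_alpha)
    return poro, perm, alpha

poro, perm, alpha = apply_bounds_sketch(
    porosity=[0.0, 0.2],
    permeability=[1.0e-10, 1.0e-14],
    vg_alpha=[1.0e-3, 1.0e-5],
    max_permeability=1.0e-11,
)
```

Here the zero porosity is raised to 0.01, the first permeability is capped at 1e-11, and alpha is untouched because max_vg_alpha keeps its default of inf.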

watershed_workflow.soil_properties.drop_duplicates(df)

Search for duplicate soils which differ only by ID, and rename them, returning a new df.

Parameters:

df (pandas.DataFrame) – A data frame that contains only properties (e.g. permeability, porosity, WRM) and is indexed by some native ID.

Returns:

df_new – After this is called, df_new will:

have a new column, named by df’s index name, containing a tuple of all of the original indices that had the same properties.
be reduced in number of rows relative to df such that soil properties are now unique

Return type:

pandas.DataFrame
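
The described result can be approximated with a pandas groupby. This is a sketch of the idea rather than the library's implementation: the function name, the example column names, and the "mukey" index name are illustrative, and rows are matched by exact equality of every property column.

```python
import pandas as pd

def drop_duplicates_sketch(df):
    """Hypothetical sketch: merge rows that share all property values."""
    prop_cols = list(df.columns)
    # reset_index() names the column "index" when the index has no name.
    index_name = df.index.name if df.index.name is not None else "index"
    flat = df.reset_index()

    # Group on every property column and collect the original IDs as a tuple.
    merged = (
        flat.groupby(prop_cols, sort=False, dropna=False)[index_name]
        .agg(tuple)
        .reset_index()
    )
    # Put the tuple-of-IDs column first, followed by the properties.
    return merged[[index_name] + prop_cols]

df = pd.DataFrame(
    {"permeability": [1e-12, 1e-12, 1e-13], "porosity": [0.3, 0.3, 0.2]},
    index=pd.Index([101, 102, 103], name="mukey"),
)
df_new = drop_duplicates_sketch(df)
```

In this example the two soils with identical properties collapse into one row whose "mukey" entry is the tuple (101, 102), while the third soil keeps its own row with (103,).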