High Level API

These high-level functions attempt to “do what the user means.” They group other, lower-level functions into common sets of operations. The “default workflow” is nearly entirely built from calls to this namespace. If the user needs to deviate from the default workflow, they can then call the lower level functions.

This top level module provides functionality for getting shapes and rasters representing watershed boundaries, river networks, digital elevation models, and other GIS datasets and then processing those data sets for use in simulations.

Functions here that get shapes and rasters should be used instead of directly using the source managers, because they additionally do things like convert coordinate systems, get the data into common data structures, etc.

watershed_workflow.get_huc(source, huc, out_crs=None, digits=None)[source]

Get a HUC shape object from a given code.

Parameters:
  • source (source-type) – source object providing get_hucs()

  • huc (str) – hydrologic unit code

  • out_crs (crs-type) – Output coordinate system. Default is source’s crs.

  • digits (int, optional) – Number of digits to round coordinates to.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (Polygon) – shapely polygon for the hydrologic unit.

watershed_workflow.get_hucs(source, huc, level, out_crs=None, digits=None, **kwargs)[source]

Get shape objects for all HUCs at a level contained in huc.

Parameters:
  • source (source-type) – source object providing get_hucs()

  • huc (str or list[str]) – hydrologic unit code or codes

  • level (int) – HUC level of the requested sub-basins

  • out_crs (crs-type) – Output coordinate system. Default is source’s crs.

  • digits (int, optional) – Number of digits to round coordinates to.

  • kwargs – All additional arguments are passed to the source manager.

Returns:

  • out_crs (crs-type) – Coordinate system of all entries in out.

  • out (list(Polygon)) – List of shapely polygons for the subbasins.

watershed_workflow.get_split_form_hucs(source, huc, level=None, out_crs=None, digits=None)[source]

Get a SplitHUCs object for all HUCs at level contained in huc.

A SplitHUCs object is an object which stores a collection of polygons which share boundaries in a format that makes changing those shared boundaries possible without having to update all shapes that share the boundary.

Parameters:
  • source (source-type) – source object providing get_hucs()

  • huc (str) – hydrologic unit code

  • level (int) – HUC level of the requested sub-basins

  • out_crs (crs-type) – Output coordinate system. Default is source’s crs.

  • digits (int, optional) – Number of digits to round coordinates to.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (SplitHUCs) – Split-form HUCs object containing subbasins.

watershed_workflow.get_shapes(source, index_or_bounds=None, in_crs=None, out_crs=None, digits=None, properties=False, **kwargs)[source]

Read a shapefile.

If index_or_bounds is a bounding box, in_crs must not be None and is the crs of the bounding box.

Parameters:
  • source (str or source-type) – Filename to parse, or a source object providing the get_shapes() method.

  • index_or_bounds (int or [x_min, y_min, x_max, y_max] bounds, optional) – Filter the file, either by selecting a specific shape by index of the requested shape in the file, or providing a bounds-type tuple to select only shapes that intersect with the bounding box.

  • in_crs (crs-type, optional) – Coordinate system of the bounding box.

  • out_crs (crs-type, optional) – Coordinate system to which shapes will be warped. If not provided, the native crs of the file will be used.

  • digits (int, optional) – Number of digits to round coordinates to.

  • properties (bool, optional) – If true, also get properties from the source.

  • kwargs (dict) – All extra parameters are passed to the source manager’s function.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (list(shapely)) – List of shapely objects in the shapefile meeting the criteria.

  • out_properties (pandas dataframe) – Only if properties == True, a dataframe of corresponding properties

watershed_workflow.get_split_form_shapes(source, index_or_bounds=-1, in_crs=None, out_crs=None, digits=None)[source]

Read a shapefile.

Note that if index_or_bounds is a bounding box, in_crs must not be None and is the crs of the bounding box.

Parameters:
  • source (str or source-type) – Filename to parse, or a source object providing the get_shapes() method.

  • index_or_bounds (int or [x_min, y_min, x_max, y_max] bounds, optional) – Filter the shapes, either by selecting a specific shape by index of the requested shape in the file, or providing a bounds-type tuple to select only shapes that intersect with the bounding box.

  • in_crs (crs-type, optional) – Coordinate system of out and/or the bounding box provided.

  • out_crs (crs-type, optional) – Coordinate system to which shapes will be warped. If not provided, the native crs of the file will be used.

  • digits (int, optional) – Number of digits to round coordinates to.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (SplitHUCs) – Split-form polygons object containing subbasins.

watershed_workflow.find_huc(source, shape, in_crs, hint, shrink_factor=1e-05)[source]

Finds the smallest HUC containing shp.

Parameters:
  • source (source-type) – Source object providing a get_hucs() method.

  • shape (Polygon) – Shapely or fiona polygon on which to get the raster.

  • in_crs (crs-type) – CRS of shape.

  • hint (str) – HUC in which to start searching. This should be at least as long as the indexing file size – HUC 2 or longer for WBD, 4 or longer for NHD Plus, or 8 or longer for NHD.

  • shrink_factor (float, optional) – A fraction of the radius of shape to shrink prior for checking containment within HUCs. This fixes cases where shape is on a HUC boundary with potentially some numerical error.

Returns:

out – The smallest containing HUC.

Return type:

str

watershed_workflow.get_reaches(source, huc, bounds_or_shp=None, in_crs=None, out_crs=None, digits=None, tol=None, long=None, merge=False, presimplify=None, properties=None, include_catchments=False, **kwargs)[source]

Get reaches from hydrography source within a given HUC and/or bounding box.

Collects reach datasets within a HUC and/or a bounding box. If bounds are provided, a containing HUC must still be provided to give a hint for file downloads. If bounds are not provided, then all reaches that intersect the HUC are included.

If bounds is provided, crs must not be None and is the crs of the bounding box.

Parameters:
  • source (source-type) – Source object providing a get_hydro() method.

  • huc (str) – HUC containing reaches. If bounds are provided, a hint to help the source find the file containing the bounds. For NHD, this is a HUC4 or smaller.

  • bounds_or_shp ([xmin, ymin, xmax, ymax] bounds or shly object, optional) – Bounding box to filter the river network.

  • in_crs (crs-type, optional) – Coordinate system of the bounds.

  • out_crs (crs-type, optional) – Coordinate system of the output reaches. Default is the native crs of the source.

  • digits (int, optional) – Number of digits to round coordinates to.

  • tol (float, optional) – Tolerance used in filtering the reaches to the provided shape.

  • long (float, optional) – If a reach is longer than this value it gets filtered. Some NHD data has impoundments or other artifacts which appear as very long, perfectly straight single segment reaches.

  • merge (bool, optional) – If true, reaches are merged (via shapely.ops.linemerge), collapsing connected, non-branching reaches into a single LineString.

  • presimplify (double, optional) – If provided, reaches are simplified within the specified tolerance as soon as possible for big extents. Units are that of out_crs.

  • properties (list[str]) – A list of properties to be added to reaches ‘catchment’ for catchment geometry, and property alias names for NHDPlusFlowlineVAA and NHDPlusEROMMA table (Table 16 and 17 NHDPlus user guide)

  • include_catchments (bool, optional) – If True, adds catchment polygons for each reach in the river tree from ‘NHDPlusCatchment’ layer

  • kwargs (dict, optional) – Other arguments are passed to the file manager’s get_reaches() method.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (list(LineString)) – Reaches in the HUC and/or intersecting the bounds.

watershed_workflow.construct_rivers(reaches, method='geometry', ignore_small_rivers=0, prune_by_area=None, area_property='DivergenceRoutedDrainAreaSqKm', remove_diversions=False, remove_braided_divergences=False, tol=0.1)[source]

Create a river, which is a tree of reaches.

Note, HUCs and rivers must be in the same crs.

Parameters:
  • reaches (list(LineString)) – A list of reaches.

  • method (str, optional='geometry') –

    Provide the method for constructing rivers. Valid are:

    • ’geometry’ looks at coincident coordinates

    • ’hydroseq’ Valid only for NHDPlus data, this uses the NHDPlus VAA tables Hydrologic Sequence. If using this method, get_reaches() must have been called with both ‘HydrologicSequence’ and ‘DownstreamMainPathHydroSeq’ properties requested (or properties=True).

  • ignore_small_rivers (int, optional) – If provided and positive, removes rivers whose number of reaches is less than this value. If negative, keeps the N biggest (in number of reaches) rivers, where N is the negative of the provided value (e.g. -2 keeps the biggest 2 rivers).

  • prune_by_area (float, optional) – If provided, remove reaches whose total contributing area is less than this tol. NOTE: only valid for reaches that include a contributing area property (e.g. NHDPlus).

  • area_property (str, optional='DivergenceRoutedDrainAreaSqKm') – Name of the area property to use for determining reach CA. Note that this defines the units of prune_by_area value.

  • remove_diversions (bool, optional=False) – If true, remove diversions (see documentation of modify_rivers_remove_divergences()).

  • remove_braided_divergences (bool, optional=False) – If true, remove braided divergences (see documentation of modify_rivers_remove_divergences()).

  • tol (float, optional=0.1) – Defines what close is in the case of method == ‘geometry’

Returns:

out – A list of rivers, as River objects.

Return type:

list(river_tree.River)

watershed_workflow.simplify(hucs, rivers, waterbodies=None, simplify_hucs=0, simplify_rivers=None, simplify_waterbodies=None, prune_tol=None, merge_tol=None, snap_tol=None, snap_triple_junctions_tol=None, snap_reach_endpoints_tol=None, snap_waterbodies_tol=None, cut_intersections=False)[source]

Simplifies the HUC and river shapes.

Parameters:
  • hucs (SplitHUCs) – A split-form HUC object containing all reaches.

  • rivers (list(River)) – A list of river objects.

  • waterbodies (list(shply), optional) – A list of waterbodies.

  • simplify_hucs (float, optional) – If provided, simply the hucs by moving points at most this many units (see also shapely.simplify). Units are that of the CRS of shapes.

  • simplify_rivers (float, optional) – If provided, simply the rivers by moving points at most this many units (see also shapely.simplify). Units are that of the CRS of shapes. If not provided, use the simplify_hucs value. Provide 0 to make no changes to the rivers.

  • simplify_waterbodies (float, optional) – Simplify the waterbodies. If not provided, uses the simplify_hucs value. Provide 0 to make no changes to the waterbodies.

  • prune_tol (float, optional = simplify_rivers) – Prune leaf reaches that are smaller than this tolerance. If not provided, uses simplify_rivers value. Provide 0 to not do this step.

  • merge_tol (float, optional = simplify_rivers) – Merges reaches that are smaller than this tolerance with their downstream parent reach. Note that if there is a branchpoint at the downstream node of the small reach, it will get moved to the upstream node. If not provided, uses simplify_rivers value. Provide 0 to not do this step.

  • snap_tol (float, optional = 0.75 * simplify_rivers) – Tolerance used for snapping rivers to nearby huc boundaries. Provide 0 to not snap.

  • snap_triple_junctions_tol (float, optional = 3 * snap_tol) – Tolerance used for snapping river triple junctions to huc triple junctions.

  • snap_reach_endpoints_tol (float, optional = 2 * snap_tol) – Tolerance used for snapping reach junctions to huc boundaries.

  • snap_waterbodies_tol (float, optional = snap_tol) – If not 0, snaps waterbody and HUC nodes that are nearly coincident to be discretely coincident. Note this is not recommended; prefer to include major waterbodies in the HUC network.

  • cut_intersections (bool, optional = False) – If true, force intersections of the river network and the HUC boundary to occur at a coincident node by adding nodes as needed.

Returns:

  • rivers (list(River)) – Snap may change the rivers, so this returns the list of updated rivers.

  • .. note – This also may modify the hucs and waterbody objects in-place.

watershed_workflow.densify(objct, target, objct_orig=None, rivers=None, **kwargs)[source]

Redensify a river, huc, or waterbodies object, meeting a provided target or target resolution function.

Parameters:
  • objct (SplitHUCs, list(River), or list(shapely.Polygon)) – The object to be densified.

  • target (float, list[float]) – Parameters for the target density – either a float target length or a list of floats used in watershed_workflow.densification.limit_from_river_distance object.

  • objct_orig (same as objct, optional) – The object with original coordinates. The original, unsimplified object, if provided, allows better interpolation between the coarsened coordinates.

  • rivers (optional) – If target is a list of floats, the rivers used in the signed distance function.

  • **kwargs (optional) – Passed along to the densify function.

watershed_workflow.get_waterbodies(source, huc, bounds_or_shp=None, in_crs=None, out_crs=None, digits=None, tol=None, prune_by_area=None, clip=True, **kwargs)[source]

Get waterbodies from NHDPlus hydrography source within a given HUC and/or bounding box.

Collects waterbody datasets within a HUC and/or a bounding box. If bounds are provided, a containing HUC must still be provided to give a hint for file downloads. If bounds are not provided, then all bodies that intersect the HUC are included.

If bounds is provided, crs must not be None and is the crs of the bounding box.

Parameters:
  • source (source-type) – Source object providing a get_hydro() method.

  • huc (str) – HUC containing reaches. If bounds are provided, a hint to help the source find the file containing the bounds. For NHD, this is a HUC4 or smaller.

  • bounds_or_shp ([xmin, ymin, xmax, ymax] bounds or shply, optional) – Bounding box or shapely shape to filter the river network.

  • in_crs (crs-type, optional) – Coordinate system of the bounds.

  • out_crs (crs-type, optional) – Coordinate system of the output reaches. Default is the native crs of the source.

  • digits (int, optional) – Number of digits to round coordinates to.

  • tol (float, optional) – Tolerance used in filtering the reaches to the provided shape.

  • prune_by_area (float, optional) – If provided, remove bodies whose total area is less than this tol.

  • clip (bool, optional) – If true, clip the waterbodies to the bounds_or_shape. Some waterbodies are not entirely contained within the HUC, just close. Default is true.

  • **kwargs (dict, optional) – Other arguments are passed to the file manager’s get_waterbodies() method.

Returns:

  • out_crs (crs-type) – Coordinate system of out.

  • out (list(LineString)) – Waterbodies in the HUC and/or intersecting the bounds.

watershed_workflow.triangulate(hucs, rivers=None, river_corrs=None, internal_boundaries=None, hole_points=None, diagnostics=True, verbosity=1, tol=1, refine_max_area=None, refine_distance=None, refine_max_edge_length=None, refine_min_angle=None, enforce_delaunay=False, river_region_dist=None)[source]

Triangulates HUCs and rivers.

Note, refinement of a given triangle is done if any of the provided criteria is met.

Parameters:
  • hucs (SplitHUCs) – A split-form HUC object from, e.g., get_split_form_hucs()

  • rivers (list[watershed_workflow.river_tree.River], optional) – List of rivers, used to refine the triangulation in conjunction with refine_distance.

  • river_corrs (list[shapely.geometry.Polygon], optional) – List of rivers corridor polygons.

  • internal_boundaries (list[shapely.geometry.Polygon, watershed_workflow.river_tree.River], optional) – List of objects, whose boundary (in the case of polygons/waterbodies) or reaches (in the case of River) will be present in the edges of the triangulation.

  • hole_points (list(shapely.Point), optional) – List of points inside the polygons to be left as holes/voids (excluded from mesh)

  • diagnostics (bool, optional) – Plot diagnostics graphs of the triangle refinement.

  • tol (float, optional) – Set tolerance for minimum distance between two nodes. The unit is the same as that of the watershed’s CRS. The default is 1.

  • refine_max_area (float, optional) – Refine a triangle if its area is greater than this area.

  • refine_distance (list(float), optional) –

    Refine a triangle if its area is greater than a function of its centroid’s distance from the nearest point on the river network. The argument is given by:

    [near_distance, near_area, far_distance, far_area]

    Defining d as the distance from triangle centroid to the nearest point on the river network and area as the area of the triangle in question, refinement occurs if:

    • d < near_distance and area > near_area

    • d > far_distance and area > far_area

    • otherwise, defining d’ = (d - near_distance) / (far_distance - near_distance), refining occurs if area > near_area + (far_area - near_area) * d’

    Effectively this simply writes a piecewise linear function of triangle distance from centroid and uses that as a max area criteria.

  • refine_max_edge_length (float, optional) – Refine a triangle if its max edge length is greater than this length.

  • refine_min_angle (float, optional) – Try to ensure that all triangles have a minimum edge length greater than this value.

  • enforce_delaunay (bool,optional, experimental) –

    Attempt to ensure all triangles are proper Delaunay triangles.

  • river_region_dist (float, optional) – Create river region based on the distance from river networks. This is useful if explicit representation of riverbed is desired. Default is None.

Returns:

  • vertices (np.array((n_points, 2), ‘d’)) – Array of triangle vertices.

  • triangles (np.array((n_tris, 3), ‘i’)) – For each triangle, a list of 3 indices into the vertex array that make up that triangle.

  • areas (_only if diagnostics=True_, np.array((n_tris), ‘d’)) – Array of triangle areas.

watershed_workflow.tessalate_river_aligned(hucs, rivers, river_width, river_n_quads=1, internal_boundaries=None, hole_points=None, diagnostics=False, ax=None, **kwargs)[source]

Tessalate HUCs using river-aligned quads along the corridor and triangles away from it.

Parameters:
  • hucs (SplitHUCs) – The huc geometry to tessalate. Note this will be adjusted if required by the river corridor.

  • rivers (list[River]) – The rivers to mesh with quads

  • river_width (float or dict or callable or boolean) – Width of the quads, either a float or a dictionary providing a {StreamOrder : width} mapping. Or a function (callable) that computer width using node properties Or boolean, where True means, width for each reach is explicitely provided properties as “width”

  • river_n_quads (int, optional) – Number of quads across the river. Currently only 1 is supported (the default).

  • hole_points (list(shapely.Point), optional) – List of points inside the polygons to be left as holes/voids (excluded from mesh)

  • internal_boundaries (list[shapely.Polygon], optional) – List of internal boundaries to embed in the domain, e.g. waterbodies.

  • diagnostics (bool, optional) – If true, prints extra diagnostic info.

  • ax (matplotlib Axes object, optional) – For debugging – plots troublesome reaches as quad elements are generated to find tricky areas.

  • kwargs – All other arguments are passed to the triangulation function for refinement.

Returns:

  • vertices (np.array((n_vertices, 2), ‘d’)) – Array of triangle vertices.

  • cell_conn (list[list[int]]) – For each cell, an ordered list of indices into the vertices array that make up that cell.

  • areas (_only if diagnostics=True_, np.array((n_cell_vertices), ‘d’)) – Array of areas.

watershed_workflow.get_raster_on_shape(source, shape, in_crs, out_crs=None, buffer=0, mask=False, nodata=None, **kwargs)[source]

Collects a raster that covers the requested shape.

Parameters:
  • source (str or source-type) – Filename to parse, or a source object providing the get_raster() method.

  • shape (polygon) – shapely or fiona polygon on which to get the raster.

  • in_crs (crs-type) – CRS of shape.

  • out_crs (crs-type, optional) – CRS of the raster. Defaults to the source’s CRS.

  • buffer (double, optional) – Size of a buffer, in units of the shape’s CRS, added to shape to ensure pixels cover the entire shape. Default is 0.

  • mask (bool, optional=False) – If True, mask the raster outside of shape.

  • nodata (dtype, optional=raster nodata) – Value to place outside of shape.

  • kwargs – All extra arguments are passed to the source’s get_raster() method.

Returns:

  • profile (dict) – Rasterio profile of the image including rasterio CRS and transform

  • raster (ndarray) – The raster data in a 2D-array.

watershed_workflow.values_from_raster(points, points_crs, raster, raster_profile, algorithm='nearest')[source]

Interpolate a raster onto a collection of unstructured points.

Parameters:
  • points (np.array((n_points, 2), 'd')) – Array of points to interpolate onto.

  • points_crs (crs-type) – Coordinate system of the points.

  • raster (np.array) – 2D array forming the raster.

  • raster_profile (dict) – rasterio profile for the raster.

  • algorithm (str, optional) – Algorithm used for interpolation. One of: * “nearest” for nearest neighbor pixels (default) * “piecewise bilinear” for interpolation

Returns:

out – Array of raster values interpolated onto the points.

Return type:

np.array((n_points,))

watershed_workflow.elevate(mesh_points, mesh_crs, dem, dem_profile, algorithm='piecewise bilinear')[source]

Elevate mesh_points onto the provided dem.

Parameters:
  • mesh_points (np.array((n_points, 2), 'd')) – Array of triangle vertices.

  • mesh_crs (crs-type) – Mesh coordinate system.

  • dem (np.array) – 2D array forming an elevation raster.

  • dem_profile (dict) – rasterio profile for the elevation raster.

  • algorithm (str, optional) – Algorithm used for interpolation. One of: * “nearest” for nearest-neighbor pixels * “piecewise bilinear” for interpolation (default)

Returns:

out – Array of triangle vertices, including a z-dimension.

Return type:

np.array((n_points, 3), ‘d’)

watershed_workflow.color_raster_from_shapes(shapes, shapes_crs, shape_colors, raster_bounds, raster_dx, raster_crs=None, nodata=None)[source]

Color in a raster by filling in a collection of shapes.

Given a canvas specified by bounds and pixel size, color a raster by, for each shape, finding the intersection of that shape with the canvas and coloring it by a provided value. Paint by numbers.

Note, if the shapes overlap, the last shape containing a pixel gives the color of that pixel.

Parameters:
  • shapes (list(Polygon)) – Collection of shapes (likely) overlapping the canvas.

  • shapes_crs (crs-type) – Coordinate system of the shapes.

  • shapes_colors (iterable[]) – Color to label the interior of each polygon with.

  • raster_bounds ([xmin, ymin, xmax, ymax]) – Bounding box for the output raster, in the given CRS.

  • raster_dx (float) – Pixel size (assumed the same in both x and y).

  • raster_crs (crs-type, optional=shapes_crs) – Coordinate system of the raster.

  • nodata (dtype, optional={-1 (int), nan (float)}) – Value to place in pixels which intersect no shape. Note the type of this should be the same as the type of shape_colors.

Returns:

  • out_profile (dict) – rasterio profile of the color raster.

  • out (np.array(raster_bounds, dtype)) – Raster of colors.

watershed_workflow.color_existing_raster_from_shapes(shapes, shapes_crs, shape_colors, raster, raster_profile)[source]

Color in a raster by filling in a collection of shapes.

Given a canvas, find the intersection of that shape with the canvas and coloring it by a provided value. Paint by numbers.

Note, if the shapes overlap, the last shape containing a pixel gives the color of that pixel.

Parameters:
  • shapes (list(Polygon)) – Collection of shapes (likely) overlapping the canvas.

  • shapes_crs (crs-type) – Coordinate system of the shapes.

  • shapes_colors (iterable[]) – Color to label the interior of each polygon with.

  • raster (np.ndarray) – The canvas to color on.

  • raster_profile (dict) – Rasterio style profile including at least CRS, nodata, and transform.