Skip to main content

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.

Beta

The current list of use cases is a snap shot of what this page will contain in the future. The team is working on improving the use case pages to provide meaningful examples of how active WMA projects are using data and tools. Tell us about your experience using the catalog!

Use Cases


3dep-grid-to-line

Keywords: Interpolation; grid-to-line; 3DEP, Topography; NHD; Stream-segments

Domain: Topography; Stream Characteristics

Language: Python

Description:

In this example, a set of NHD stream-segments accessed using the HyRiver package. A USGS 3DEP Elevation dataset is pulled in using the ClimateR-Catalog. Finally, gdptools is used to interpolate the 3DEP elevation to the NHD stream-segments.


3DEP Elevation Grid-to-line intersection

Introduction

In this example, a set of NHD stream-segments accessed using the HyRiver package. A USGS 3DEP Elevation dataset is pulled in using the ClimateR-Catalog. Finally, gdptools is used to interpolate the 3DEP elevation to the NHD stream-segments.

Figure 1

Geo Data Processing Tools (gdptools)

gdptools is a python package for calculating area-weighted statistics of grid-to-polygon. grid-to-line, and polygon-to-polygon interpolations. gdptools utilizes metadata from the OPeNDAP catalog to aid generalizing gdptools processing capabilities. There are also methods for calculating area-weighted statistics for any gridded dataset that can be opened by xarray, however some of the metadata encapsulated in the json entries will have to be added by the user. Examples can be found in Non-Catalog Examples.

Open Source Python Tools

  • xarray: Package to work with labelled multi-dimensional arrays.
  • geopandas: Package to work with geospatial data.

Data

  • 3DEP: The USGS 3D Elevation Program (3DEP) Datasets are the primary elevation data product produced and distributed by the USGS.
  • NHDPlus_V2: National Hydrography Dataset Plus Version 2, a selection of flowline geometries from HUC 0708

Maintenance Schedule

  • The intent of this notebook is simply a demonstration of an application or workflow. As we progress in the build-out of NGHF project tools, this notebook may be updated or replaced. There is no other planned maintenance at this time.

Description

This workflow demonstrates the grid to line interpoltion functionality from the gdptools package. The purpose of this tool is to attribute gridded data to line geometries. This is done by interpolating points along each line geometry and extracting the underlaying gridded data values at those points. From this point values, statistics are calculated and attributed to the original input lines.

The inputs to the line interpolation tool are a Geopandas geodataframe containing the line geometries, and a Pandas dataframe containing information about the gridMET gridded data. The line interpolation tool can process multi-dimensional xarray data, outputing the statistics organized by each time segment. Also, multiple gridded datasets can be processed, so for a gridMET example, three different climate variables can be accessed and processed.

The outputs are a dataframe containing the statistics for each line geometries calculated for each gridded dataset at each time segment, as well as a geodataframe containing the interpolated point geometries which have the linearly interpolated values from the gridded dataset. Depending on the user’s inputs, the sample points can be interpolated from the line geometry at regular intervals, or the line vertices can be used instead.

Create conda environment

  • gdptools-examples can be downloaded from the project repo

Imports

import warnings
warnings.filterwarnings("ignore")
import geopandas as gpd
import pandas as pd
import hvplot.pandas
import hvplot.xarray

import xarray as xr
import pynhd
from pynhd import NLDI, NHDPlusHR, WaterData
from gdptools import UserTiffData
from gdptools import InterpGen
import geoviews as gv

gv.extension('bokeh', 'matplotlib')
# gv.extension()
# gv.output(dpi=120, fig='png')
# gv.output(backend='bokeh')

Create target lines

nldi = NLDI()
station_id = "06730200"

basin = nldi.get_basins(station_id)

flw_main = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamMain",
    source="flowlines",
    distance=1000,
)

nhdp_mr = WaterData("nhdflowline_network")
comids = [int(c) for c in flw_main.nhdplus_comid.to_list()]
nhd_mr_fl = nhdp_mr.byid("comid", comids)
nhd_mr_fl.columns.to_list()
flw = pynhd.prepare_nhdplus(nhd_mr_fl, 0, 0, purge_non_dendritic=True)
flw2 = flw.rename(columns={"comid": "ID", "tocomid": "toID"})
strm_segs = pynhd.topoogical_sort(flw2)

# copy the sorted comids into a list and remove the last element
# List will be used to sort ouput into topological order later
sort_ids = strm_segs[0]

# create a new dataframe with the sorted comids
mr_fl = nhd_mr_fl.set_index("comid").loc[sort_ids[:-1]]
mr_fl = mr_fl.reset_index()

mr_fl.head()
comidgeometryfdateresolutiongnis_idgnis_namelengthkmreachcodeflowdirwbareacomi...qc_12vc_12qe_12ve_12lakefractsurfarearareahloadrpuidvpuidenabled
02889852MULTILINESTRING Z ((-105.63211 40.05203 0.0000...2000-12-22T05:00:00ZMedium178467North Boulder Creek0.37310190005000122With Digitized2888392...0.376-9998.000000.376-9998.000001.050667.04271510.00454410c10L1
12889186MULTILINESTRING Z ((-105.62824 40.05363 0.0000...1999-05-28T04:00:00ZMedium178467North Boulder Creek0.55210190005000121With Digitized0...0.4980.573060.4980.57306NaNNaNNaN10c10L1
22889770MULTILINESTRING Z ((-105.62207 40.05389 0.0000...1999-05-28T04:00:00ZMedium178467North Boulder Creek0.47810190005000120With Digitized2888388...0.592-9998.000000.592-9998.000001.052090.1367276.69109810c10L1
32889188MULTILINESTRING Z ((-105.61719 40.05556 0.0000...1999-05-28T04:00:00ZMedium178467North Boulder Creek0.39610190005000119With Digitized0...0.6590.628360.6590.62836NaNNaNNaN10c10L1
42889772MULTILINESTRING Z ((-105.61573 40.05258 0.0000...1999-05-28T04:00:00ZMedium178467North Boulder Creek0.25710190005000118With Digitized2888396...0.785-9998.000000.785-9998.000001.048345.9116114.70828610c10L1

5 rows × 138 columns

Plot the target lines

from bokeh.models import PrintfTickFormatter
formatter = PrintfTickFormatter(format="%i")
gv.tile_sources.EsriTerrain * gv.Path(mr_fl.to_crs(4326), vdims="comid").opts(color='comid',cmap='category20', width=600, height=300, aspect="equal", line_width=5, show_legend=True, colorbar=True, colorbar_opts={'formatter': formatter}, clabel="segment ID", title="NHDPlus MR Flowlines upstream of USGS Gage 06730200")

Figure 2

Define source data

import pprint
climater_cat = "https://mikejohnson51.github.io/climateR-catalogs/catalog.parquet"
cat = pd.read_parquet(climater_cat)

_id = "USGS 3DEP"
_asset = "10m CONUS DEM"
_vars = ["elevation"]
# create list of catalog parameters for each variable
cat_params = [
    cat.query("id == @_id & asset == @_asset & variable == @_vars").to_dict(orient="records")[0]
    for _var in _vars
]
# create dictionary of variable names and catalog parameters
cat_dict = dict(zip(_vars, cat_params))
pprint.pprint(cat_dict)
{'elevation': {'T_name': None,
               'URL': '/vsicurl/https://prd-tnm.s3.amazonaws.com/StagedProducts/Elevation/13/TIFF/USGS_Seamless_DEM_13.vrt',
               'X1': -180.0005555560936,
               'X_name': None,
               'Xn': 180.00000026239937,
               'Y1': -180.0005555560936,
               'Y_name': None,
               'Yn': 72.0005555562949,
               'asset': '10m CONUS DEM',
               'crs': '+proj=longlat +datum=NAD83 +no_defs',
               'description': '10m CONUS DEM',
               'duration': None,
               'ensemble': None,
               'id': 'USGS 3DEP',
               'interval': None,
               'model': None,
               'nT': nan,
               'ncols': 3888006.0,
               'nrows': 939612.0,
               'resX': 9.259259266022042e-05,
               'resY': 9.25925926599094e-05,
               'scenario': None,
               'tiled': '',
               'toptobottom': None,
               'type': 'vrt',
               'units': 'meters',
               'variable': 'elevation',
               'varname': 'elevation'}}

Load source data

import rioxarray as rxr
ds_3dep = rxr.open_rasterio(cat_dict["elevation"]["URL"])
ds_3dep
<xarray.DataArray (band: 1, y: 939612, x: 3888006)> Size: 15TB
[3653217093672 values with dtype=float32]
Coordinates:
  * band         (band) int64 8B 1
  * x            (x) float64 31MB -180.0 -180.0 -180.0 ... 180.0 180.0 180.0
  * y            (y) float64 8MB 72.0 72.0 72.0 72.0 ... -15.0 -15.0 -15.0 -15.0
    spatial_ref  int64 8B 0
Attributes:
    _FillValue:    -999999.0
    scale_factor:  1.0
    add_offset:    0.0

Define UserTiffData object

varname = "elevation"
tx_name = "x"
ty_name = "y"
band = 1
bandname = "band"
crs = 4269
f_crs = 4326
id_feature = "comid"
user_data = UserTiffData(
    var=varname,
    ds=ds_3dep,
    proj_ds=crs,
    x_coord=tx_name,
    y_coord=ty_name,
    band=band,
    bname=bandname,
    f_feature=mr_fl,
    id_feature=id_feature,
    proj_feature=crs
)

Define InterpGen object and generate interpolation

interpObject1 = InterpGen(user_data, pt_spacing=100, stat='all', interp_method="linear")
# stats1, pts1 = interpObject1.calc_interp()
stats1, pts1 = interpObject1.calc_interp()

Plot the sub-setted 3DEP 10m DEM

ds_ss = user_data.get_source_subset(key="").to_dataset(name="elevation")
ds_ss.elevation.hvplot.quadmesh(x="x", y="y", geo=True, project=True, rasterize=True,  cmap="viridis", width=600, height=300, aspect="equal", clabel="3DEP 10m DEM", xlabel="Longitude", ylabel="Latitude", title="3DEP 10m DEM")

Figure 3

Show the interpolated points

pts1.head(10)
bandspatial_refvaluesdistcomidgeometryvarname
pt
0103620.1482400.02889186POINT (-105.62824 40.05363)elevation
1103620.536633100.02889186POINT (-105.62721 40.05347)elevation
2103605.426694200.02889186POINT (-105.62617 40.05330)elevation
3103589.268663300.02889186POINT (-105.62513 40.05317)elevation
4103578.737594400.02889186POINT (-105.62408 40.05319)elevation
5103571.558704500.02889186POINT (-105.62307 40.05350)elevation
6103561.002180600.02889186POINT (-105.62208 40.05386)elevation
0103562.4259440.02889188POINT (-105.61719 40.05555)elevation
1103560.132325100.02889188POINT (-105.61623 40.05523)elevation
2103524.454683200.02889188POINT (-105.61593 40.05429)elevation

Show The statistics

stats1.head(10)
comidvarnamemeanmedianstdminmax
02889186elevation3592.3826733589.26866321.8696693561.0021803620.536633
02889188elevation3530.4345883542.29350435.5129393474.7253993562.425944
02889198elevation3448.1634133448.1634135.0887963443.0746173453.252209
02889214elevation1558.9487751558.9098721.1771881556.9959571560.632596
02889222elevation3383.0791573384.11194720.1861743357.8560333407.269491
02889228elevation1562.4676561562.3295101.2007231560.5146571564.282919
02889258elevation1566.9948471567.2764081.4055141564.5516181569.605378
02889260elevation1570.0614881569.9269301.1771321568.6648251572.449691
02889264elevation1572.7802861572.7802860.2216801572.5586071573.001966
02889276elevation1574.9208871574.8831091.2827721572.9880321576.906272

Plot the mean elevation by comid

# sort stats by topological stream order
stats = stats1.set_index("comid").loc[sort_ids[:-1]]
mr_fl["tmean"] = stats["mean"].values
pmin = mr_fl["tmean"].min()
pmax = mr_fl["tmean"].max()
pmin, pmax
import geoviews as gv

gv.extension("bokeh")
gv.tile_sources.EsriTerrain * gv.Path(mr_fl.to_crs(4326), vdims=["tmean"]).opts(width=600, height=300, aspect="equal",color="tmean", cmap="viridis", line_width=5, colorbar=True, clim=(pmin, pmax), clabel="Mean Elevation (m)", title="Mean Elevation by Stream Segment")

Figure 4

Plot the interpolated data as a scatter plot by comid

import plotly.graph_objects as go
fig = go.Figure()
pts = pts1.groupby("comid")
for index in range(len(pts)):
    group = pts.get_group(sort_ids[index])
    fig.add_trace(go.Scatter(x=group["dist"], y=group["values"], mode='markers', marker=dict(size=5, color=group["values"], colorscale='Viridis', cmin=pmin, cmax=pmax), name=str(sort_ids[index])))
fig.update_layout(title="Stream-segment Interpolated Elevation 10m DEM", xaxis_title="Distance (m)", yaxis_title="Elevation (m)")
fig.show()

Figure 5