Pipeline¶

Purpose¶

The Pipeline command is a high-level orchestrator that runs a complete hydrological analysis workflow in a single command. It chains terrain conditioning, flow routing, and feature extraction, and it manages intermediate files and step dependencies for you.

When to Use¶

Use the pipeline when you need to derive standard hydrographic features from a raw DEM without manually managing the sequence of individual operations. It is ideal for:

Batch Processing: Automating analysis across many DEMs.
Standard Workflows: Ensuring consistent processing steps.
Quick Analysis: Getting from a raw DEM to streams and basins with minimal configuration.

Workflow Steps¶

The pipeline executes the following operations in order:

1. Breach (optional): Removes pits by carving paths. Skipped when search_radius_ft is 0.
2. Fill: Fills any remaining depressions so that the DEM is hydrologically conditioned.
3. Flow Direction: Computes D8 flow directions and resolves flat areas.
4. Flow Accumulation: Calculates the upstream drainage area of each cell.
5. Stream Extraction: Vectorizes the stream network using the drainage area threshold.
6. Basin Delineation (optional): Delineates watersheds for stream junctions. Runs only when basins is set.
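
The conditional steps above can be summarized in a short sketch. `planned_steps` is a hypothetical helper written for illustration and is not part of the overflow API; it only mirrors the rules stated on this page (breaching is skipped when the search radius is 0, and basins are delineated only when requested).

```python
def planned_steps(search_radius_ft: float = 200, basins: bool = False) -> list[str]:
    """Return the pipeline steps that would run for the given options.

    Hypothetical helper for illustration; it mirrors the documented rules only.
    """
    steps = []
    if search_radius_ft > 0:  # a radius of 0 skips the breaching step
        steps.append("breach")
    steps += ["fill", "flow_direction", "flow_accumulation", "stream_extraction"]
    if basins:  # basin delineation is opt-in
        steps.append("basin_delineation")
    return steps


print(planned_steps(search_radius_ft=0))  # fill-only conditioning
print(planned_steps(basins=True))         # full workflow including basins
```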

Parameters¶

dem_file¶

Path to the input raw DEM raster. The file must be a single-band Float32 raster in a GDAL-readable format.

output_dir¶

Directory where all output files will be written. The directory must exist. Existing files with generated names (e.g., fdr.tif) will be overwritten.

chunk_size¶

Tile dimension in pixels for processing. Default: 2048. Set to 0 to process the entire raster in memory.
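
To illustrate what a tile dimension means in practice, here is a minimal sketch of how a raster could be partitioned into square tiles. `tile_windows` is a hypothetical function, and overflow's actual tiling (for example, any overlap or buffering between tiles) may differ.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (row_offset, col_offset, height, width)


def tile_windows(rows: int, cols: int, chunk_size: int = 2048) -> Iterator[Window]:
    """Yield windows that cover a rows x cols raster.

    Illustrative only: a chunk_size of 0 yields a single window covering
    the whole raster, which corresponds to in-memory processing.
    """
    if chunk_size <= 0:
        yield (0, 0, rows, cols)
        return
    for row_off in range(0, rows, chunk_size):
        for col_off in range(0, cols, chunk_size):
            height = min(chunk_size, rows - row_off)  # edge tiles are smaller
            width = min(chunk_size, cols - col_off)
            yield (row_off, col_off, height, width)


windows = list(tile_windows(5000, 3000, chunk_size=2048))
print(len(windows), windows[-1])
```

Smaller tiles lower peak memory use at the cost of more tile boundaries to reconcile, which is the usual trade-off behind a setting like this.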

search_radius_ft¶

Search radius in feet for the pit breaching step. Default: 200. If set to 0, the breaching step is skipped and the pipeline relies solely on filling. The value is converted internally to a cell count based on the raster's spatial reference.
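
The feet-to-cells conversion can be approximated as follows when calling the Python API, which expects a cell count. This is a sketch under stated assumptions: `feet_to_cells` and its `units_to_feet` parameter are hypothetical, and the rounding rule used by the pipeline itself is not documented here, so nearest-integer rounding is assumed.

```python
FEET_PER_METER = 1.0 / 0.3048  # international foot


def feet_to_cells(radius_ft: float, cell_size: float,
                  units_to_feet: float = FEET_PER_METER) -> int:
    """Convert a distance in feet to a whole number of raster cells.

    cell_size is the cell edge length in the raster's linear units;
    units_to_feet converts those units to feet (defaults to meters).
    Rounding to the nearest cell is an assumption of this sketch.
    """
    if radius_ft <= 0:
        return 0  # a radius of 0 means "skip breaching"
    cell_size_ft = cell_size * units_to_feet
    return max(1, round(radius_ft / cell_size_ft))


# A 200 ft radius on a DEM with 4 ft cells (units already in feet)
print(feet_to_cells(200, cell_size=4.0, units_to_feet=1.0))
# The same radius on a DEM with 1 m cells
print(feet_to_cells(200, cell_size=1.0))
```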

max_cost¶

Maximum elevation cost for the breaching step. Default: infinity. See Breach.

da_sqmi¶

Drainage area threshold in square miles for stream extraction. Default: 1.0. This value determines the density of the resulting stream network: a smaller threshold produces a denser network. It is converted internally to a cell count based on the raster's spatial reference.
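
When working with the Python API, the equivalent cell-count threshold can be estimated from the cell size. The helper below is a hypothetical sketch, not the pipeline's own conversion; it assumes square cells and nearest-integer rounding.

```python
SQFT_PER_SQMI = 5280.0 ** 2    # 27,878,400 square feet in a square mile
FEET_PER_METER = 1.0 / 0.3048  # international foot


def sqmi_to_cells(da_sqmi: float, cell_size: float,
                  units_to_feet: float = FEET_PER_METER) -> int:
    """Convert a drainage area in square miles to a number of raster cells.

    cell_size is the cell edge length in the raster's linear units;
    units_to_feet converts those units to feet (defaults to meters).
    Square cells and nearest-integer rounding are assumptions.
    """
    cell_area_sqft = (cell_size * units_to_feet) ** 2
    return round(da_sqmi * SQFT_PER_SQMI / cell_area_sqft)


# 1 sq mi and 0.1 sq mi thresholds on a DEM with 10 ft cells
print(sqmi_to_cells(1.0, cell_size=10.0, units_to_feet=1.0))
print(sqmi_to_cells(0.1, cell_size=10.0, units_to_feet=1.0))
```

Because the cell count scales with the inverse square of the cell size, the same `da_sqmi` maps to very different thresholds on DEMs of different resolution.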

basins¶

Flag to enable watershed delineation. If set, basins are delineated for every junction node in the extracted stream network, and the pipeline writes a basins.tif raster and a basins.gpkg vector file.

fill_holes¶

Flag to fill nodata holes in the DEM during the conditioning phase. See Fill.

Outputs¶

The pipeline generates the following files in the output_dir:

Filename	Description	Data Type
`dem_corrected.tif`	Hydrologically conditioned DEM (Breached & Filled)	Float32
`fdr.tif`	D8 Flow Direction Raster	UInt8
`accum.tif`	Flow Accumulation Raster	Int64
`streams.gpkg`	GeoPackage containing `streams` (lines) and `junctions` (points)	Vector
`basins.tif`	Basin ID Raster (created only if `--basins` is used)	Int64
`basins.gpkg`	GeoPackage containing `basins` (created only if `--basins` is used)	Vector
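
A quick way to confirm that a run completed is to check for the files listed above. The helpers below are hypothetical and use only the standard library; they check that the files exist, not that their contents are valid.

```python
from pathlib import Path


def expected_outputs(output_dir: str, basins: bool = False) -> list[Path]:
    """List the files the pipeline is documented to write into output_dir."""
    names = ["dem_corrected.tif", "fdr.tif", "accum.tif", "streams.gpkg"]
    if basins:  # basin outputs exist only when --basins was used
        names += ["basins.tif", "basins.gpkg"]
    return [Path(output_dir) / name for name in names]


def missing_outputs(output_dir: str, basins: bool = False) -> list[Path]:
    """Return the expected output files that are not present on disk."""
    return [path for path in expected_outputs(output_dir, basins) if not path.exists()]


# Report anything missing from a finished run (path is an example)
for path in missing_outputs("./results", basins=True):
    print(f"missing: {path}")
```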

CLI Usage¶

Run the full pipeline with breaching, stream extraction (1 sq mi threshold), and basin delineation:

overflow pipeline \
    --dem_file raw_dem.tif \
    --output_dir ./results \
    --search_radius_ft 200 \
    --da_sqmi 1.0 \
    --basins

Run a fill-only pipeline (skip breaching) for a dense network (0.1 sq mi):

overflow pipeline \
    --dem_file raw_dem.tif \
    --output_dir ./results \
    --search_radius_ft 0 \
    --da_sqmi 0.1
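
For batch processing, the CLI can be driven from a short script. The sketch below assumes the `overflow` executable is on your PATH and uses only the flags shown in the examples above; `build_pipeline_command` and `run_batch` are hypothetical helper names. Each DEM gets its own output directory because the output filenames are fixed and would otherwise be overwritten.

```python
import subprocess
from pathlib import Path


def build_pipeline_command(dem_file, output_dir, search_radius_ft=200,
                           da_sqmi=1.0, basins=False):
    """Build the argument list for one `overflow pipeline` invocation."""
    cmd = [
        "overflow", "pipeline",
        "--dem_file", str(dem_file),
        "--output_dir", str(output_dir),
        "--search_radius_ft", str(search_radius_ft),
        "--da_sqmi", str(da_sqmi),
    ]
    if basins:
        cmd.append("--basins")
    return cmd


def run_batch(dem_dir, results_root, **options):
    """Run the pipeline once per .tif file in dem_dir."""
    for dem_file in sorted(Path(dem_dir).glob("*.tif")):
        # One directory per DEM: the pipeline requires it to exist beforehand
        output_dir = Path(results_root) / dem_file.stem
        output_dir.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if the pipeline fails
        subprocess.run(build_pipeline_command(dem_file, output_dir, **options), check=True)


# Example call (directory names are placeholders):
# run_batch("./dems", "./results", da_sqmi=0.1, basins=True)
```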

Python API Usage¶

The pipeline function is primarily a CLI convenience wrapper. To replicate the pipeline logic in Python, invoke the individual core functions sequentially. This allows for greater control over intermediate filenames and parameters.

import overflow

# 1. Setup paths
dem_file = "raw_dem.tif"
output_dir = "./results"

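# 2. Terrain Conditioning (Breach + Fill)
# Note: the Python API takes the search radius as a cell count, not feet.
# 50 cells is an example value; derive it from the DEM's cell size.
radius_cells = 50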

overflow.breach(dem_file, f"{output_dir}/dem_breached.tif", search_radius=radius_cells)
overflow.fill(f"{output_dir}/dem_breached.tif", f"{output_dir}/dem_corrected.tif")

# 3. Flow Routing
overflow.flow_direction(f"{output_dir}/dem_corrected.tif", f"{output_dir}/fdr.tif")
overflow.accumulation(f"{output_dir}/fdr.tif", f"{output_dir}/accum.tif")

# 4. Feature Extraction
# Drainage area threshold as a cell count (example value; depends on cell size)
threshold_cells = 50000

overflow.streams(
    fac_path=f"{output_dir}/accum.tif",
    fdr_path=f"{output_dir}/fdr.tif",
    output_dir=output_dir,
    threshold=threshold_cells
)

# 5. Basins (optional; equivalent to the CLI --basins flag)
# Uses the junctions layer from the generated streams.gpkg
overflow.basins(
    fdr_path=f"{output_dir}/fdr.tif",
    drainage_points_path=f"{output_dir}/streams.gpkg",
    output_path=f"{output_dir}/basins.tif",
    layer_name="junctions"
)