Basin Delineation Algorithm¶

Overview¶

Basin delineation assigns each cell a label identifying its drainage outlet. The algorithm traces flow paths upstream from outlets, creating watershed boundaries that define the contributing area for each drainage point.

Core Concept¶

The algorithm uses upstream breadth-first search (BFS) from outlet cells. Starting at each outlet, it propagates upstream through the flow direction network, assigning basin labels to all cells that drain to that outlet.

Definitions¶

Outlet cell: A cell where flow exits the domain or terminates:

\[ \text{is\_outlet}(c) = \text{downstream}(c) \text{ is nodata or out of bounds} \]

Drainage point: A user-specified outlet location with an associated basin ID.

Upstream neighbor: A cell \(n\) is upstream of \(c\) if \(n\) flows directly to \(c\):

\[ \text{upstream}(n, c) \iff \text{downstream}(n) = c \]

Data Structures¶

Structure	Description
Labels Array	Integer grid storing basin ID for each cell
Drainage Points	Map from (row, col) to basin ID
Basin Graph	Directed graph of basin connectivity
Queue	FIFO queue for BFS traversal

Algorithm¶

Phase 1: Outlet Identification¶

Scan all cells to find outlets:

def find_outlets(fdr):
    outlets = []

    for cell in all_cells():
        if is_nodata(fdr[cell]):
            continue

        downstream = get_downstream(cell, fdr)

        if downstream is None or is_nodata(fdr[downstream]):
            outlets.append(cell)

    return outlets

Phase 2: Upstream Labeling¶

For each outlet, propagate labels upstream using BFS:

def label_watersheds(fdr, drainage_points):
    labels = zeros(fdr.shape, dtype=int64)
    basin_graph = {}
    queue = Queue()

    # Start from all outlets
    for outlet in find_outlets(fdr):
        basin_id = generate_unique_id(outlet)
        labels[outlet] = basin_id
        queue.push((outlet, basin_id))

    while not queue.empty():
        cell, current_basin = queue.pop()

        # Check if this cell is a drainage point
        if cell in drainage_points:
            new_basin = drainage_points[cell]
            basin_graph[new_basin] = current_basin  # Record downstream connection
            current_basin = new_basin

        # Find and label upstream neighbors
        for neighbor in neighbors_8(cell):
            if labels[neighbor] != 0:
                continue  # Already labeled

            if not is_upstream(neighbor, cell, fdr):
                continue  # Not flowing to current cell

            labels[neighbor] = current_basin
            queue.push((neighbor, current_basin))

    return labels, basin_graph

Upstream Neighbor Generator¶

Iterate through neighbors and check if they flow to the current cell:

def upstream_neighbors(cell, fdr):
    row, col = cell

    for direction in range(8):
        dr, dc = OFFSETS[direction]
        neighbor = (row + dr, col + dc)

        if out_of_bounds(neighbor):
            continue

        # Check if neighbor's flow direction points to cell
        neighbor_direction = fdr[neighbor]
        if neighbor_direction == NODATA or neighbor_direction == UNDEFINED:
            continue

        # Opposite direction should point back to cell
        opposite = (direction + 4) % 8
        if neighbor_direction == opposite:
            yield neighbor

Drainage Point Handling¶

Coordinate Transformation¶

Drainage points are typically provided in geographic coordinates. Convert to raster indices:

\[ c = \lfloor (x - x_0) / \Delta x \rfloor \]

\[ r = \lfloor (y_0 - y) / |\Delta y| \rfloor \]

where \((x_0, y_0)\) is the raster origin and \((\Delta x, \Delta y)\) are pixel dimensions.

Snapping to Flow Accumulation¶

Drainage points may not fall exactly on stream channels. Snapping moves them to the highest accumulation cell within a search radius:

def snap_to_stream(point, fac, radius):
    best_cell = point
    max_acc = fac[point]

    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            cell = (point[0] + dr, point[1] + dc)

            if fac[cell] > max_acc:
                max_acc = fac[cell]
                best_cell = cell

    return best_cell

Basin ID Assignment¶

Each drainage point receives a unique basin ID. The graph records downstream relationships:

\[ \text{graph}[B_i] = B_j \iff \text{basin } B_i \text{ drains to basin } B_j \]

Basin Graph¶

The basin graph is a directed acyclic graph (DAG) representing drainage relationships:

Vertices: Basin IDs
Edges: Downstream connections

For drainage points \(D = \{d_1, d_2, \ldots, d_k\}\) with basin IDs \(B = \{b_1, b_2, \ldots, b_k\}\):

\[ (b_i, b_j) \in E \iff \exists \text{ flow path from } d_i \text{ to } d_j \]

Polygon Generation¶

Convert labeled raster to vector polygons using boundary tracing.

Boundary Cell Detection¶

A cell is on the basin boundary if any 8-connected neighbor has a different label:

\[ \text{is\_boundary}(c) = \exists n \in N_8(c) : \text{label}(n) \neq \text{label}(c) \]

Moore-Neighbor Tracing¶

Trace the boundary of each basin using the Moore-Neighbor algorithm:

Find a boundary cell (leftmost cell in topmost row)
Start facing "up" (north)
Turn clockwise until finding a cell with the same label
Move to that cell, record its position
Repeat until returning to the starting cell

def trace_boundary(start, labels, basin_id):
    polygon = [start]
    current = start
    direction = 0  # Start facing up

    while True:
        # Search clockwise for next boundary cell
        for i in range(8):
            check_dir = (direction + 5 + i) % 8  # Start from back-left
            neighbor = get_neighbor(current, check_dir)

            if labels[neighbor] == basin_id:
                current = neighbor
                direction = check_dir
                break

        if current == start and len(polygon) > 1:
            break

        polygon.append(current)

    return polygon

Reference: Moore-Neighbor Tracing

Tiled Processing¶

For large rasters, basins are labeled per-tile and then connected:

Phase 1: Local Labeling¶

Each tile runs the watershed labeling algorithm independently:

Labels offset by tile index to ensure uniqueness
Perimeter cells recorded for cross-tile connection

Phase 2: Cross-Tile Connection¶

Examine adjacent tile boundaries:

For tiles \(A\) and \(B\) sharing an edge, for each pair of adjacent cells \((a, b)\):

if fdr[a] points to b:
    graph[labels[a]] = labels[b]
elif fdr[b] points to a:
    graph[labels[b]] = labels[a]

Phase 3: Label Finalization¶

Walk the graph to assign final basin IDs:

def finalize_labels(labels, graph, drainage_points):
    # Find final basin for each intermediate label
    final_basin = {}

    for label in all_labels:
        current = label

        while current in graph and current not in drainage_points:
            current = graph[current]

        final_basin[label] = current

    # Apply final labels
    for cell in all_cells():
        if labels[cell] in final_basin:
            labels[cell] = final_basin[labels[cell]]

Output Modes¶

Drainage Point Basins Only¶

Label only cells draining to specified drainage points. Other cells receive nodata.

All Basins¶

Label all cells, including those draining off-map:

\[ \text{label}(c) = \begin{cases} b_i & \text{if } c \text{ drains to drainage point } d_i \\ \text{outlet\_id}(c) & \text{if } c \text{ drains off-map} \end{cases} \]

Complexity¶

Let \(n\) be total cells and \(d\) be drainage point count.

Operation	Time Complexity
Outlet finding	\(O(n)\)
Upstream BFS	\(O(n)\)
Graph walking	\(O(d)\)
Polygon tracing	\(O(p)\)

Where \(p\) is total boundary perimeter.