inland_vulnerability

Inland flood vulnerability function implementation.

This module provides vulnerability calculations specific to inland flooding using FLSBT (Flood-Specific Building Types) based damage function lookup.

Classes

InlandFloodVulnerability

InlandFloodVulnerability(wildcard_fields: Optional[List[str]] = None)

Bases: AbstractVulnerabilityFunction

Inland flood vulnerability function implementation.

This class implements flood-specific vulnerability calculations using a two-stage lookup process:

  1. Match building attributes to damage function IDs via complete lookup tables.
  2. Interpolate damage percentages from depth-damage curves using those IDs.

The lookup process considers:

  • Construction type (C, M, S, W)
  • Occupancy type (RES1, COM1, etc.)
  • Number of stories (with min/max ranges)
  • Square footage (optional filtering)
  • Foundation type (BASE, PILE, SHAL, SLAB)
  • Flood peril type (RLS, RHS, RLL, RHL, CST, CMV, CHW)

Wildcard Configuration

Use wildcard_fields to force certain attributes to be treated as wildcards even when values are present:

  • [] (default): match on all available attributes
  • ['occupancy_type']: ignore occupancy, match on other attributes
  • ['foundation_type', 'number_stories']: match only on occupancy + construction
  • ['occupancy_type', 'foundation_type', 'number_stories', 'area', 'general_building_type']: match ALL curves regardless of building attributes
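The effect of wildcard_fields can be sketched in plain Python. This is a simplified stand-in (not the class itself) mirroring the per-attribute flags that gather_damage_functions derives from self.wildcard_fields:

```python
# Attribute fields that participate in damage function matching,
# as listed in gather_damage_functions / gather_missing_functions.
MATCHABLE_FIELDS = [
    "occupancy_type",
    "foundation_type",
    "number_stories",
    "area",
    "general_building_type",
    "flood_peril_type",
]

def match_flags(wildcard_fields=None):
    """Return {field: True if the field participates in matching}.

    Uses the same ``wildcard_fields or []`` normalization as __init__:
    a field listed in wildcard_fields is skipped during matching even
    when the building has a value for it.
    """
    wildcards = wildcard_fields or []
    return {field: field not in wildcards for field in MATCHABLE_FIELDS}

# Default configuration: every attribute participates in matching.
default_flags = match_flags()

# Ignore occupancy only; all other attributes still constrain the match.
no_occupancy = match_flags(["occupancy_type"])
```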

Initialize InlandFloodVulnerability.

Parameters:

  • wildcard_fields (list, optional): Building attribute fields to ignore during damage function matching even when values are present. Defaults to [] (match on all attributes). Default: None.
Source code in src/inland_consequences/inland_vulnerability.py
def __init__(
    self,
    wildcard_fields: Optional[List[str]] = None,
):
    """
    Initialize InlandFloodVulnerability.

    Args:
        wildcard_fields (list, optional): Building attribute fields to ignore
            during damage function matching even when values are present.
            Defaults to ``[]`` (match on all attributes).
    """
    self.wildcard_fields = wildcard_fields or []
Functions
calculate_vulnerability
calculate_vulnerability()

Not used in the DuckDB pipeline.

The DuckDB-based vulnerability pipeline is driven by gather_damage_functions and compute_damage_function_statistics, which are called directly by InlandFloodAnalysis.

Source code in src/inland_consequences/inland_vulnerability.py
def calculate_vulnerability(self):
    """Not used in the DuckDB pipeline.

    The DuckDB-based vulnerability pipeline is driven by
    :meth:`gather_damage_functions` and
    :meth:`compute_damage_function_statistics`, which are called directly
    by :class:`~inland_consequences.InlandFloodAnalysis`.
    """
    raise NotImplementedError(
        "Use gather_damage_functions() and compute_damage_function_statistics() "
        "with a DuckDB connection instead."
    )
apply_damage_percentages
apply_damage_percentages()

Not used in the DuckDB pipeline.

See compute_damage_function_statistics.

Source code in src/inland_consequences/inland_vulnerability.py
def apply_damage_percentages(self):
    """Not used in the DuckDB pipeline.

    See :meth:`compute_damage_function_statistics`.
    """
    raise NotImplementedError(
        "Use compute_damage_function_statistics() with a DuckDB connection instead."
    )
create_vulnerability_tables
create_vulnerability_tables(connection: DuckDBPyConnection) -> None

Create tables from the vulnerability data CSVs.

Loads the cross-reference tables (xref_contents, xref_inventory, xref_structures) and the damage function structure table (ddf_structure) into the provided DuckDB connection.

Parameters:

  • connection (DuckDBPyConnection): Active DuckDB connection. Required.
Source code in src/inland_consequences/inland_vulnerability.py
def create_vulnerability_tables(self, connection: duckdb.DuckDBPyConnection) -> None:
    """Create tables from the vulnerability data CSVs.

    Loads the cross-reference tables (xref_contents, xref_inventory,
    xref_structures) and the damage function structure table (ddf_structure)
    into the provided DuckDB connection.

    Args:
        connection: Active DuckDB connection.
    """
    base_path = Path(__file__).parent / "data"
    csv_files = {
        "xref_contents": base_path / "df_lookup_contents.csv",
        "xref_inventory": base_path / "df_lookup_inventory.csv",
        "xref_structures": base_path / "df_lookup_structures.csv",
    }

    for table_name, csv_path in csv_files.items():
        df = pd.read_csv(csv_path)
        arrow_table = pa.Table.from_pandas(df)
        connection.execute(f"DROP TABLE IF EXISTS {table_name}")
        connection.execute(f"CREATE TABLE {table_name} AS SELECT * FROM arrow_table")

    df_structure = pd.read_csv(base_path / "df_structure.csv")
    structure_arrow_table = pa.Table.from_pandas(df_structure)
    connection.execute("DROP TABLE IF EXISTS ddf_structure")
    connection.execute("CREATE TABLE ddf_structure AS SELECT * FROM structure_arrow_table")
gather_damage_functions
gather_damage_functions(connection: DuckDBPyConnection) -> None

Match buildings to damage functions based on building attributes.

Performs multi-attribute matching that assigns appropriate damage function IDs to each building. When multiple damage functions match a building, each match receives an equal probability weight; a damage function that matches n times receives n times that weight, so the weights for each building sum to 1.0.

Matching Logic:

  • All attributes are optional (NULL = wildcard, matches any value):
      • occupancy_type: building use type (e.g., RES1, COM1)
      • foundation_type: 4-letter codes (BASE=basement, SLAB=slab, PILE=pile, SHAL=shallow)
      • number_stories: building height in stories
      • general_building_type: construction material (e.g., wood, masonry, concrete)

Wildcard Configuration: use self.wildcard_fields to force certain attributes to be treated as wildcards even when values are present.

Output: creates structure_damage_functions, content_damage_functions, and inventory_damage_functions tables with building_id, damage_function_id, and weight (probability). Weights sum to 1.0 for each building.
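The NULL-as-wildcard matching and the weight normalization described above can be sketched in plain Python. This is a simplified stand-in for the DuckDB SQL (the building and curve dicts are illustrative, not real lookup-table rows):

```python
from collections import Counter

def curve_matches(building, curve):
    """NULL (None) on either side acts as a wildcard and matches anything."""
    # Exact-match attributes: a mismatch only disqualifies when both sides have values.
    for field in ("occupancy_type", "foundation_type", "flood_peril_type"):
        b, c = building.get(field), curve.get(field)
        if b is not None and c is not None and b != c:
            return False
    # Ranged attribute: story count must fall within the curve's min/max range.
    stories = building.get("number_stories")
    if stories is not None and curve.get("story_min") is not None:
        if not (curve["story_min"] <= stories <= curve["story_max"]):
            return False
    return True

def assign_weights(building, curves):
    """Each match gets weight 1/total; a curve matched n times gets n/total."""
    ids = [c["damage_function_id"] for c in curves if curve_matches(building, c)]
    total = len(ids)
    return {dfid: n / total for dfid, n in Counter(ids).items()}

building = {"occupancy_type": "RES1", "number_stories": 2, "foundation_type": "SLAB"}
curves = [
    {"damage_function_id": 101, "occupancy_type": "RES1", "story_min": 1, "story_max": 3},
    {"damage_function_id": 102, "occupancy_type": "RES1", "story_min": 1, "story_max": 3},
    {"damage_function_id": 103, "occupancy_type": "COM1", "story_min": 1, "story_max": 3},
]
weights = assign_weights(building, curves)  # {101: 0.5, 102: 0.5}; sums to 1.0
```

The curves carry no foundation_type, so that attribute acts as a wildcard; curve 103 is disqualified by its occupancy mismatch, and the two surviving curves split the probability evenly.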

Source code in src/inland_consequences/inland_vulnerability.py
def gather_damage_functions(self, connection: duckdb.DuckDBPyConnection) -> None:
    """Match buildings to damage functions based on building attributes.

    Performs a multi-attribute matching algorithm that assigns appropriate
    damage function IDs to each building. When multiple damage functions match
    a building, they are assigned with equal probability weights. Damage
    functions matched n-times receive n*(weight) to ensure total weights sum
    to 1.0.

    Matching Logic:
    - All attributes are optional (NULL = wildcard, matches any value):
      * occupancy_type: building use type (e.g., RES1, COM1)
      * foundation_type: 4-letter codes (BASE=basement, SLAB=slab, PILE=pile, SHAL=shallow)
      * number_stories: building height in stories
      * general_building_type: construction material (e.g., wood, masonry, concrete)

    Wildcard Configuration:
    - Use self.wildcard_fields to force certain attributes to be treated as
      wildcards even when values are present.

    Output:
    - Creates structure_damage_functions, content_damage_functions, and
      inventory_damage_functions tables with building_id, damage_function_id,
      and weight (probability). Weights sum to 1.0 for each building.
    """
    # Build conditional checks for wildcarded fields
    # If a field is in wildcard_fields, we skip the matching check for it
    use_occupancy = 'occupancy_type' not in self.wildcard_fields
    use_foundation = 'foundation_type' not in self.wildcard_fields
    use_stories = 'number_stories' not in self.wildcard_fields
    use_sqft = 'area' not in self.wildcard_fields
    use_construction = 'general_building_type' not in self.wildcard_fields
    use_flood_peril_type = 'flood_peril_type' not in self.wildcard_fields

    structure_query = '''
    CREATE TABLE structure_damage_functions AS
    WITH 
    -- STEP 1: Generate all potential building-to-curve matches
    -- Cross join creates all possible combinations, then filter by attribute matching
    curve_matches AS (
        SELECT 
            b.ID,
            c.damage_function_id,
            b.first_floor_height,
            0 as ffh_sig,  -- Fixed uncertainty for first floor height (could be extended)

            -- Attribute Matching Logic:
            -- Each WHEN clause checks if an attribute mismatch should disqualify the curve.
            -- NULL values in either building or curve act as wildcards (match anything).
            -- Wildcarded fields (via wildcard_fields parameter) are skipped entirely.
            -- If all attributes match (or are NULL/wildcarded), is_match = 1; otherwise 0.
            CASE 
                -- Occupancy Type Matching (conditionally enabled):
                -- Direct comparison of occupancy codes (e.g., RES1, RES2, COM1, AGR1)
                WHEN {use_occupancy} AND b.occupancy_type IS NOT NULL AND c.occupancy_type IS NOT NULL
                    AND b.occupancy_type != c.occupancy_type THEN 0

                -- Foundation Type Matching (conditionally enabled):
                -- Direct comparison of 4-letter HAZUS foundation codes (BASE, PILE, SHAL, SLAB)
                -- Buildings are preprocessed to 4-letter codes matching lookup tables
                WHEN {use_foundation} AND b.foundation_type IS NOT NULL AND c.foundation_type IS NOT NULL
                    AND b.foundation_type != c.foundation_type THEN 0

                -- Story Count Matching (conditionally enabled):
                -- Check if building story count falls within curve's min/max range
                -- Example: 2-story building matches curves with story_min=1, story_max=3
                WHEN {use_stories} AND b.number_stories IS NOT NULL AND c.story_min IS NOT NULL AND c.story_max IS NOT NULL 
                    AND NOT (b.number_stories BETWEEN c.story_min AND c.story_max) THEN 0

                -- Sqft (Area) Matching (conditionally enabled):
                -- Check whether the building's area falls within the curve's sqft range.
                -- A NULL sqft_min means no lower bound; a NULL sqft_max means no upper bound.
                -- Both NULL -> curve applies to any area (wildcard).
                WHEN {use_sqft} AND b.area IS NOT NULL
                    AND NOT (
                        (c.sqft_min IS NULL OR b.area >= c.sqft_min)
                        AND (c.sqft_max IS NULL OR b.area <= c.sqft_max)
                    ) THEN 0

                -- Construction Type Matching (conditionally enabled):
                -- Direct comparison of construction material codes (W=Wood, M=Masonry, C=Concrete, S=Steel, MH=Manufactured Housing)
                -- Preprocessed from numeric codes in Milliman data, direct field in NSI data
                WHEN {use_construction} AND b.general_building_type IS NOT NULL 
                    AND b.general_building_type != c.construction_type THEN 0

                -- Flood Peril Type Matching (conditionally enabled):
                -- Direct comparison of flood peril type codes (RLS, RLL, RHS, RHL, CHW, CMV, CST)
                -- Assigned to each building by _assign_flood_peril_type based on velocity/duration
                WHEN {use_flood_peril_type} AND b.flood_peril_type IS NOT NULL AND c.flood_peril_type IS NOT NULL
                    AND b.flood_peril_type != c.flood_peril_type THEN 0

                -- If none of the mismatch conditions triggered, it's a valid match
                ELSE 1
            END AS is_match
        FROM buildings b
        CROSS JOIN xref_structures c
    ),

    -- STEP 2: Filter to only valid matches
    -- Remove all building-curve pairs where is_match = 0
    filtered_matches AS (
        SELECT ID, damage_function_id, first_floor_height, ffh_sig
        FROM curve_matches
        WHERE is_match = 1
    ),

    -- STEP 3: Calculate match frequencies for weight assignment
    -- Count how many total curves match each building, and how many times each specific curve appears
    -- (Some curves may appear multiple times due to different flood peril types, etc.)
    curve_frequencies AS (
        SELECT 
            ID, damage_function_id, first_floor_height, ffh_sig,
            COUNT(*) OVER (PARTITION BY ID) AS total_matches,                    -- Total curves for this building
            COUNT(*) OVER (PARTITION BY ID, damage_function_id) AS curve_count   -- Times this curve appears
        FROM filtered_matches
    ),

    -- STEP 4: Calculate probability weights and deduplicate
    -- Each curve gets weight = (# times it appears) / (total # of curve matches)
    -- This ensures weights sum to 1.0
    unique_scenarios AS (
        SELECT DISTINCT
            ID, damage_function_id, first_floor_height, ffh_sig,
            CAST(curve_count AS DOUBLE) / NULLIF(total_matches, 0) AS weight
        FROM curve_frequencies
    )

    -- STEP 5: Final output with renamed columns
    SELECT 
        ID AS building_id,
        damage_function_id,
        first_floor_height,
        ffh_sig,
        weight
    FROM unique_scenarios;
    '''.format(
        use_occupancy=use_occupancy,
        use_foundation=use_foundation,
        use_stories=use_stories,
        use_sqft=use_sqft,
        use_construction=use_construction,
        use_flood_peril_type=use_flood_peril_type
    )
    connection.execute("DROP TABLE IF EXISTS structure_damage_functions")
    connection.execute(structure_query)

    # Content Damage Functions - matches on construction, occupancy, stories, sqft, foundation, and flood peril
    content_query = '''
    CREATE TABLE content_damage_functions AS
    WITH 
    -- STEP 1: Generate all potential building-to-curve matches for contents
    curve_matches AS (
        SELECT 
            b.ID,
            c.damage_function_id,

            -- Content matching includes all structure attributes plus flood peril type
            CASE 
                -- Occupancy Type Matching
                WHEN {use_occupancy} AND b.occupancy_type IS NOT NULL AND c.occupancy_type IS NOT NULL
                    AND b.occupancy_type != c.occupancy_type THEN 0

                -- Foundation Type Matching (4-letter HAZUS codes: BASE, PILE, SHAL, SLAB)
                WHEN {use_foundation} AND b.foundation_type IS NOT NULL AND c.foundation_type IS NOT NULL
                    AND b.foundation_type != c.foundation_type THEN 0

                -- Story Count Matching
                WHEN {use_stories} AND b.number_stories IS NOT NULL AND c.story_min IS NOT NULL AND c.story_max IS NOT NULL 
                    AND NOT (b.number_stories BETWEEN c.story_min AND c.story_max) THEN 0

                -- Construction Type Matching
                WHEN {use_construction} AND b.general_building_type IS NOT NULL 
                    AND b.general_building_type != c.construction_type THEN 0

                -- Flood Peril Type Matching (new for contents)
                WHEN b.flood_peril_type IS NOT NULL AND c.flood_peril_type IS NOT NULL
                    AND b.flood_peril_type != c.flood_peril_type THEN 0

                -- SQFT Matching (if specified in lookup table)
                WHEN b.area IS NOT NULL AND c.sqft_min IS NOT NULL AND b.area < c.sqft_min THEN 0
                WHEN b.area IS NOT NULL AND c.sqft_max IS NOT NULL AND b.area > c.sqft_max THEN 0

                ELSE 1
            END AS is_match
        FROM buildings b
        CROSS JOIN xref_contents c
    ),

    -- STEP 2: Filter to only valid matches
    filtered_matches AS (
        SELECT ID, damage_function_id
        FROM curve_matches
        WHERE is_match = 1
    ),

    -- STEP 3: Calculate match frequencies for weight assignment
    curve_frequencies AS (
        SELECT 
            ID, damage_function_id,
            COUNT(*) OVER (PARTITION BY ID) AS total_matches,
            COUNT(*) OVER (PARTITION BY ID, damage_function_id) AS curve_count
        FROM filtered_matches
    ),

    -- STEP 4: Calculate probability weights and deduplicate
    unique_scenarios AS (
        SELECT DISTINCT
            ID, damage_function_id,
            CAST(curve_count AS DOUBLE) / NULLIF(total_matches, 0) AS weight
        FROM curve_frequencies
    )

    -- STEP 5: Final output
    SELECT 
        ID AS building_id,
        damage_function_id,
        weight
    FROM unique_scenarios;
    '''.format(
        use_occupancy=use_occupancy,
        use_foundation=use_foundation,
        use_stories=use_stories,
        use_construction=use_construction
    )
    connection.execute("DROP TABLE IF EXISTS content_damage_functions")
    connection.execute(content_query)

    # Inventory Damage Functions - simpler matching on occupancy, foundation, and flood peril only
    inventory_query = '''
    CREATE TABLE inventory_damage_functions AS
    WITH 
    -- STEP 1: Generate all potential building-to-curve matches for inventory
    curve_matches AS (
        SELECT 
            b.ID,
            c.damage_function_id,

            -- Inventory matching is simpler: occupancy, foundation, flood peril
            CASE 
                -- Occupancy Type Matching
                WHEN {use_occupancy} AND b.occupancy_type IS NOT NULL AND c.occupancy_type IS NOT NULL
                    AND b.occupancy_type != c.occupancy_type THEN 0

                -- Foundation Type Matching (4-letter HAZUS codes: BASE, PILE, SHAL, SLAB)
                WHEN {use_foundation} AND b.foundation_type IS NOT NULL AND c.foundation_type IS NOT NULL
                    AND b.foundation_type != c.foundation_type THEN 0

                -- Flood Peril Type Matching
                WHEN b.flood_peril_type IS NOT NULL AND c.flood_peril_type IS NOT NULL
                    AND b.flood_peril_type != c.flood_peril_type THEN 0

                ELSE 1
            END AS is_match
        FROM buildings b
        CROSS JOIN xref_inventory c
    ),

    -- STEP 2: Filter to only valid matches
    filtered_matches AS (
        SELECT ID, damage_function_id
        FROM curve_matches
        WHERE is_match = 1
    ),

    -- STEP 3: Calculate match frequencies for weight assignment
    curve_frequencies AS (
        SELECT 
            ID, damage_function_id,
            COUNT(*) OVER (PARTITION BY ID) AS total_matches,
            COUNT(*) OVER (PARTITION BY ID, damage_function_id) AS curve_count
        FROM filtered_matches
    ),

    -- STEP 4: Calculate probability weights and deduplicate
    unique_scenarios AS (
        SELECT DISTINCT
            ID, damage_function_id,
            CAST(curve_count AS DOUBLE) / NULLIF(total_matches, 0) AS weight
        FROM curve_frequencies
    )

    -- STEP 5: Final output
    SELECT 
        ID AS building_id,
        damage_function_id,
        weight
    FROM unique_scenarios;
    '''.format(
        use_occupancy=use_occupancy,
        use_foundation=use_foundation
    )
    connection.execute("DROP TABLE IF EXISTS inventory_damage_functions")
    connection.execute(inventory_query)
gather_missing_functions
gather_missing_functions(connection: DuckDBPyConnection) -> None

Add damage functions for buildings not matched by gather_damage_functions.

Buildings may be unmatched when their ranged attributes (number_stories, area) fall outside the min/max bounds defined in xref_structures. This method clamps those values to the nearest available bound:

  • Values that are NULL or below the minimum range use the minimum bound curve
  • Values above the maximum range use the maximum bound curve

Matched records are inserted into structure_damage_functions with appropriate probability weights. Validation log entries are recorded for each building with a clamped attribute.
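The clamp-to-nearest-bound rule for a single ranged attribute can be sketched as follows (a simplified stand-in for the SQL CASE expressions; the reason strings are illustrative shorthand for the rule codes logged to validation_log):

```python
def clamp_to_bounds(value, lo, hi):
    """Clamp a ranged building attribute to the available curve bounds.

    Returns (effective_value, reason), mirroring gather_missing_functions:
    NULL or below-minimum values fall back to the minimum bound, and
    above-maximum values fall back to the maximum bound.
    """
    if lo is None:                        # no bounded curves: nothing to clamp
        return value, None
    if value is None:
        return lo, "NULL_CLAMPED_TO_MIN"
    if value < lo:
        return lo, "BELOW_MIN"
    if value > hi:
        return hi, "ABOVE_MAX"
    return value, None

# A 9-story building when the matching curves only cover 1..4 stories:
clamp_to_bounds(9, 1, 4)     # -> (4, "ABOVE_MAX")
clamp_to_bounds(None, 1, 4)  # -> (1, "NULL_CLAMPED_TO_MIN")
clamp_to_bounds(2, 1, 4)     # -> (2, None)
```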

Source code in src/inland_consequences/inland_vulnerability.py
def gather_missing_functions(self, connection: duckdb.DuckDBPyConnection) -> None:
    """Add damage functions for buildings not matched by gather_damage_functions.

    Buildings may be unmatched when their ranged attributes (number_stories, area)
    fall outside the min/max bounds defined in xref_structures. This method clamps
    those values to the nearest available bound:
    - Values that are NULL or below the minimum range: use the minimum bound curve
    - Values above the maximum range: use the maximum bound curve

    Matched records are inserted into structure_damage_functions with appropriate
    probability weights. Validation log entries are recorded for each building
    with a clamped attribute.
    """
    use_occupancy = 'occupancy_type' not in self.wildcard_fields
    use_foundation = 'foundation_type' not in self.wildcard_fields
    use_stories = 'number_stories' not in self.wildcard_fields
    use_sqft = 'area' not in self.wildcard_fields
    use_construction = 'general_building_type' not in self.wildcard_fields
    use_flood_peril_type = 'flood_peril_type' not in self.wildcard_fields

    # Build a temp table with clamped matches and clamp reasons so we can
    # insert into both structure_damage_functions and validation_log.
    create_sql = '''
    CREATE OR REPLACE TEMP TABLE _clamped_matches AS
    WITH
    -- Identify buildings with no existing damage function match
    missing_buildings AS (
        SELECT b.*
        FROM buildings b
        LEFT JOIN structure_damage_functions sdf ON b.id = sdf.building_id
        WHERE sdf.building_id IS NULL
    ),

    -- Match missing buildings on non-range attributes only (same attribute
    -- logic as gather_damage_functions but without story/sqft range checks)
    non_range_candidates AS (
        SELECT
            b.id,
            b.number_stories,
            b.area,
            b.first_floor_height,
            c.damage_function_id,
            c.story_min,
            c.story_max,
            c.sqft_min,
            c.sqft_max
        FROM missing_buildings b
        CROSS JOIN xref_structures c
        WHERE (
            CASE
                WHEN {use_occupancy} AND b.occupancy_type IS NOT NULL AND c.occupancy_type IS NOT NULL
                    AND b.occupancy_type != c.occupancy_type THEN 0

                WHEN {use_foundation} AND b.foundation_type IS NOT NULL AND c.foundation_type IS NOT NULL AND
                    CASE
                        WHEN b.foundation_type = 'I' THEN 'PILE'
                        WHEN b.foundation_type = 'B' THEN 'BASE'
                        WHEN b.foundation_type = 'S' THEN 'SLAB'
                        WHEN b.foundation_type = 'P' THEN 'PILE'
                        WHEN b.foundation_type = 'W' THEN 'BASE'
                        WHEN b.foundation_type = 'C' THEN 'SHAL'
                        WHEN b.foundation_type = 'F' THEN 'SHAL'
                        ELSE NULL
                    END != c.foundation_type THEN 0

                WHEN {use_construction} AND b.general_building_type IS NOT NULL
                    AND b.general_building_type != c.construction_type THEN 0

                WHEN {use_flood_peril_type} AND b.flood_peril_type IS NOT NULL AND c.flood_peril_type IS NOT NULL
                    AND b.flood_peril_type != c.flood_peril_type THEN 0

                ELSE 1
            END = 1
        )
    ),

    -- Compute global range bounds across all matching curves per building
    with_bounds AS (
        SELECT
            nrc.*,
            MIN(CASE WHEN story_min IS NOT NULL THEN story_min END) OVER (PARTITION BY id) AS global_story_min,
            MAX(CASE WHEN story_max IS NOT NULL THEN story_max END) OVER (PARTITION BY id) AS global_story_max,
            MIN(CASE WHEN sqft_min IS NOT NULL THEN sqft_min END) OVER (PARTITION BY id) AS global_sqft_min,
            MAX(CASE WHEN sqft_max IS NOT NULL THEN sqft_max END) OVER (PARTITION BY id) AS global_sqft_max
        FROM non_range_candidates nrc
    ),

    -- Calculate effective (clamped) values and clamp reasons
    with_effective AS (
        SELECT
            wb.*,
            -- Effective stories: clamp to [global_story_min, global_story_max]
            CASE
                WHEN NOT {use_stories} THEN number_stories
                WHEN global_story_min IS NULL THEN number_stories
                WHEN number_stories IS NULL OR number_stories < global_story_min THEN global_story_min
                WHEN number_stories > global_story_max THEN global_story_max
                ELSE number_stories
            END AS effective_stories,
            -- Effective sqft: clamp to [global_sqft_min, global_sqft_max]
            CASE
                WHEN NOT {use_sqft} THEN area
                WHEN global_sqft_min IS NULL THEN area
                WHEN area IS NULL OR area < global_sqft_min THEN global_sqft_min
                WHEN area > global_sqft_max THEN global_sqft_max
                ELSE area
            END AS effective_sqft,
            -- Story clamp reason
            CASE
                WHEN NOT {use_stories} THEN NULL
                WHEN global_story_min IS NULL THEN NULL
                WHEN number_stories IS NULL THEN 'STORIES_NULL_CLAMPED_TO_MIN'
                WHEN number_stories < global_story_min THEN 'STORIES_BELOW_MIN'
                WHEN number_stories > global_story_max THEN 'STORIES_ABOVE_MAX'
                ELSE NULL
            END AS story_clamp,
            -- Sqft clamp reason
            CASE
                WHEN NOT {use_sqft} THEN NULL
                WHEN global_sqft_min IS NULL THEN NULL
                WHEN area IS NULL THEN 'SQFT_NULL_CLAMPED_TO_MIN'
                WHEN area < global_sqft_min THEN 'SQFT_BELOW_MIN'
                WHEN area > global_sqft_max THEN 'SQFT_ABOVE_MAX'
                ELSE NULL
            END AS sqft_clamp
        FROM with_bounds wb
    ),

    -- Filter using effective (clamped) range values
    range_matched AS (
        SELECT *
        FROM with_effective
        WHERE
            (NOT {use_stories} OR story_min IS NULL OR story_max IS NULL
             OR effective_stories BETWEEN story_min AND story_max)
            AND (NOT {use_sqft}
                 OR (sqft_min IS NULL OR effective_sqft >= sqft_min)
                    AND (sqft_max IS NULL OR effective_sqft <= sqft_max))
    ),

    -- Calculate match frequencies for weight assignment
    frequencies AS (
        SELECT
            id, damage_function_id, first_floor_height,
            0 AS ffh_sig,
            number_stories, area,
            global_story_min, global_story_max,
            global_sqft_min, global_sqft_max,
            story_clamp, sqft_clamp,
            COUNT(*) OVER (PARTITION BY id) AS total_matches,
            COUNT(*) OVER (PARTITION BY id, damage_function_id) AS curve_count
        FROM range_matched
    )

    SELECT DISTINCT
        id AS building_id,
        damage_function_id,
        first_floor_height,
        ffh_sig,
        CAST(curve_count AS DOUBLE) / NULLIF(total_matches, 0) AS weight,
        number_stories, area,
        global_story_min, global_story_max,
        global_sqft_min, global_sqft_max,
        story_clamp, sqft_clamp
    FROM frequencies;
    '''.format(
        use_occupancy=use_occupancy,
        use_foundation=use_foundation,
        use_stories=use_stories,
        use_sqft=use_sqft,
        use_construction=use_construction,
        use_flood_peril_type=use_flood_peril_type
    )

    connection.execute(create_sql)

    # Insert clamped matches into structure_damage_functions
    connection.execute('''
        INSERT INTO structure_damage_functions
            (building_id, damage_function_id, first_floor_height, ffh_sig, weight)
        SELECT building_id, damage_function_id, first_floor_height, ffh_sig, weight
        FROM _clamped_matches
    ''')

    # Log validation entries for story clamping
    connection.execute('''
        INSERT INTO validation_log
            (building_id, table_name, source, rule, message, severity)
        SELECT DISTINCT
            building_id,
            'structure_damage_functions',
            'gather_missing_functions',
            story_clamp,
            CASE story_clamp
                WHEN 'STORIES_NULL_CLAMPED_TO_MIN'
                    THEN 'Building has no story count; damage function matched using minimum story range (' || global_story_min || ')'
                WHEN 'STORIES_BELOW_MIN'
                    THEN 'Building story count (' || number_stories || ') below available curve range; clamped to minimum (' || global_story_min || ')'
                WHEN 'STORIES_ABOVE_MAX'
                    THEN 'Building story count (' || number_stories || ') above available curve range; clamped to maximum (' || global_story_max || ')'
            END,
            'WARNING'
        FROM _clamped_matches
        WHERE story_clamp IS NOT NULL
    ''')

    # Log validation entries for sqft clamping
    connection.execute('''
        INSERT INTO validation_log
            (building_id, table_name, source, rule, message, severity)
        SELECT DISTINCT
            building_id,
            'structure_damage_functions',
            'gather_missing_functions',
            sqft_clamp,
            CASE sqft_clamp
                WHEN 'SQFT_NULL_CLAMPED_TO_MIN'
                    THEN 'Building has no area; damage function matched using minimum sqft range (' || global_sqft_min || ')'
                WHEN 'SQFT_BELOW_MIN'
                    THEN 'Building area (' || CAST(ROUND(area, 0) AS INTEGER) || ') below available curve sqft range; clamped to minimum (' || global_sqft_min || ')'
                WHEN 'SQFT_ABOVE_MAX'
                    THEN 'Building area (' || CAST(ROUND(area, 0) AS INTEGER) || ') above available curve sqft range; clamped to maximum (' || global_sqft_max || ')'
            END,
            'WARNING'
        FROM _clamped_matches
        WHERE sqft_clamp IS NOT NULL
    ''')

    # Cleanup
    connection.execute("DROP TABLE IF EXISTS _clamped_matches")
compute_damage_function_statistics
compute_damage_function_statistics(connection: DuckDBPyConnection) -> None

Compute the mean and standard deviation of the damage functions.
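The triangular-distribution arithmetic in the final SELECT can be checked in plain Python. The mode recovery uses the triangular identity mean = (a + b + c) / 3, so c = 3*mean - a - b, clamped into [min, max] exactly as the SQL does:

```python
import math

def triangular_stats(d_min, d_mean, d_max):
    """Recover the triangular-distribution mode and standard deviation
    from (min, mean, max), as in compute_damage_function_statistics.

    mode = (3 * mean) - min - max, clamped into [min, max] to guard
    against extreme skew; variance uses the standard triangular formula
    (a^2 + b^2 + c^2 - ab - ac - bc) / 18.
    """
    mode = max(d_min, min(d_max, 3.0 * d_mean - d_min - d_max))
    a, b, c = d_min, d_max, mode
    variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18.0
    return mode, math.sqrt(abs(variance))

# Symmetric case: the mean sits midway between min and max, so the mode
# equals the mean and no clamping occurs.
mode, std = triangular_stats(0.10, 0.25, 0.40)  # mode -> 0.25
```

For strongly skewed inputs (e.g. min=0.0, mean=0.5, max=0.6) the raw mode 0.9 exceeds the max, so the clamp pins it to 0.6, keeping the distribution well defined.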

Source code in src/inland_consequences/inland_vulnerability.py
def compute_damage_function_statistics(self, connection: duckdb.DuckDBPyConnection) -> None:
    """Compute the mean and standard deviation of the damage functions."""
    sql_statement = '''
        CREATE OR REPLACE TABLE damage_function_statistics AS

        WITH 
        -- 1. Unpivot DDF Structure
        ddf_points AS (
            SELECT 
                ddf_id,
                CAST(REPLACE(REPLACE(REPLACE(variable_name, 'depth_', ''), 'm', '-'), '_', '.') AS DOUBLE) AS depth_ft,
                value_column as value
            FROM ddf_structure 
            UNPIVOT EXCLUDE NULLS (
                value_column FOR variable_name IN (COLUMNS('^depth_'))
            )
        ),

        -- 2. Create Interpolation Segments
        ddf_segments AS (
            SELECT 
                ddf_id,
                depth_ft as x1,
                value as y1,
                LEAD(depth_ft) OVER (PARTITION BY ddf_id ORDER BY depth_ft) as x2,
                LEAD(value) OVER (PARTITION BY ddf_id ORDER BY depth_ft) as y2
            FROM ddf_points
        ),

        -- 3. Generate Hazard Evaluation Points (With Tagging)
        hazard_points AS (
            SELECT h.id, h.return_period, sdf.damage_function_id as ddf_id, h.depth - sdf.first_floor_height as eval_depth, 'mean' as point_type
            FROM hazard h JOIN structure_damage_functions sdf ON h.ID = sdf.building_id
            UNION ALL
            SELECT h.id, h.return_period, sdf.damage_function_id as ddf_id, h.depth + h.std_dev - sdf.first_floor_height as eval_depth, 'max' as point_type
            FROM hazard h JOIN structure_damage_functions sdf ON h.ID = sdf.building_id
            UNION ALL
            SELECT h.id, h.return_period, sdf.damage_function_id as ddf_id, h.depth - h.std_dev - sdf.first_floor_height as eval_depth, 'min' as point_type
            FROM hazard h JOIN structure_damage_functions sdf ON h.ID = sdf.building_id
        ),

        -- 4. Perform Linear Interpolation
        interpolated_results AS (
            SELECT 
                hp.id,
                hp.return_period,
                hp.point_type,
                CASE 
                    WHEN ds.x1 IS NULL THEN 0 
                    WHEN ds.x2 IS NULL THEN ds.y1
                    ELSE ds.y1 + (hp.eval_depth - ds.x1) * (ds.y2 - ds.y1) / (ds.x2 - ds.x1)
                END as calc_damage
            FROM hazard_points hp
            ASOF LEFT JOIN ddf_segments ds 
                ON hp.ddf_id = ds.ddf_id AND hp.eval_depth >= ds.x1
        ),

        -- 5. Pivot Statistics
        pivoted_stats AS (
            SELECT 
                id as building_id,
                return_period,
                MAX(CASE WHEN point_type = 'mean' THEN calc_damage END) as d_mean,
                MAX(CASE WHEN point_type = 'min'  THEN calc_damage END) as d_min,
                MAX(CASE WHEN point_type = 'max'  THEN calc_damage END) as d_max
            FROM interpolated_results
            GROUP BY id, return_period
        ),

        -- 6. Apply Triangular Distribution Logic
        triangular_calc AS (
            SELECT 
                building_id,
                return_period,
                d_mean,
                d_min,
                d_max,
                -- Calculate Mode: (3 * Mean) - Min - Max
                -- We clamp it between Min and Max to ensure the math doesn't break if skew is extreme
                GREATEST(d_min, LEAST(d_max, (3 * d_mean) - d_min - d_max)) as mode_clamped
            FROM pivoted_stats
        )

        -- 7. Final Output
        SELECT 
            building_id,
            return_period,
            d_mean as damage_percent,
            d_min,
            d_max,

            -- New Column: The "Peak" of the distribution
            mode_clamped as d_mode,
            -- 1. Hinge Corrected Mean
            -- Corrects for the asymmetric slopes pulling the average down
            d_mean + ( (d_min + d_max - 2 * d_mean) * (1.0 / SQRT(2 * PI())) ) as damage_percent_mean,

            ABS(d_max - d_min) / 2.0 as damage_percent_std,
            -- Standard Deviation (Square root of the Variance)
            SQRT(ABS(
                (POWER(d_min, 2) + POWER(d_max, 2) + POWER(mode_clamped, 2) 
                - (d_min * d_max) 
                - (d_min * mode_clamped) 
                - (d_max * mode_clamped)) / 18
            )) as triangular_std_dev,
            NULLIF(d_max - d_min, 0) / 4 AS range_std_dev
        FROM triangular_calc;
    '''
    connection.execute(sql_statement)