Pipeline

The high-level run() function is the primary entry point for most users.

climagrid.run

climagrid.pipeline.orchestrator.run(assets, start_dt, end_dt, *, sources=None, features='all', bbox_radius_km=50.0, max_join_distance_km=100.0, source_kwargs=None)

Fetch environmental data, join it to assets, and compute stress features.

Parameters:
  • assets (AssetRegistry | str | Path) – An AssetRegistry instance, or a path to an asset CSV/GeoJSON file.

  • start_dt (datetime) – Start of the time range (UTC-aware recommended).

  • end_dt (datetime) – End of the time range (UTC-aware recommended).

  • sources (list[str] | None) – List of source names to fetch from. Valid values: "nasa_power", "noaa_hrrr", "noaa_ncei", "usda_nrcs", "usfs_wfigs". Defaults to ["nasa_power"].

  • features (list[str] | str) – List of feature names to compute, or "all" (default). Valid values: "thermal", "freeze_thaw", "ice_loading", "soil", "wildfire", "conductor_sag".

  • bbox_radius_km (float) – Margin in km added around the full asset extent when building the bounding box for grid and station sources. Point-based sources (e.g. NASA POWER) ignore this and fetch one location per asset. Default 50 km.

  • max_join_distance_km (float) – Maximum distance in km for the spatial join. Assets whose nearest data point is farther away than this get NaN environmental values. Default 100 km.

  • source_kwargs (dict[str, dict] | None) – Optional dict of keyword arguments passed to each source adapter. E.g. {"noaa_ncei": {"token": "my-cdo-token"}}.
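The effect of max_join_distance_km can be pictured with a small nearest-point join. This is only a sketch under stated assumptions: it uses great-circle (haversine) distance and a linear scan, and the helper names haversine_km and nearest_within are hypothetical. The join method climagrid actually uses is not described here.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_within(asset, points, max_join_distance_km=100.0):
    """Hypothetical helper, not part of climagrid.

    asset  -- (lat, lon) of one asset
    points -- iterable of (lat, lon, value) data points
    Returns the value of the nearest point, or NaN when even the
    nearest point lies farther away than max_join_distance_km.
    """
    best_value, best_dist = math.nan, math.inf
    for lat, lon, value in points:
        dist = haversine_km(asset[0], asset[1], lat, lon)
        if dist < best_dist:
            best_value, best_dist = value, dist
    return best_value if best_dist <= max_join_distance_km else math.nan
```

With the default cutoff, an asset about 11 km from a station picks up that station's value. Tightening the cutoff to 5 km leaves the same asset with NaN.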

Returns:

Wide-form DataFrame: one row per (asset_id, timestamp), with all environmental and feature columns present.

Return type:

DataFrame

Example

>>> import climagrid
>>> from datetime import datetime, timezone
>>> df = climagrid.run(
...     "assets.csv",
...     start_dt=datetime(2024, 7, 1, tzinfo=timezone.utc),
...     end_dt=datetime(2024, 7, 2, tzinfo=timezone.utc),
...     sources=["nasa_power"],
...     features="all",
... )
>>> df.columns.tolist()
['asset_id', 'timestamp', 'lat', 'lon', 'nasa_temperature_2m', ...]

Source and feature names

Valid sources values:

  • "nasa_power": NASA POWER adapter

  • "noaa_hrrr": NOAA HRRR adapter (requires pip install "climagrid[noaa-nwp]")

  • "noaa_ncei": NOAA NCEI CDO adapter (requires a free API token)

  • "usda_nrcs": USDA NRCS AWDB adapter

  • "usfs_wfigs": USFS NIFC WFIGS wildfire adapter
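Because source names are plain strings, it can help to check them before starting a long fetch. The following is a minimal sketch of such a pre-flight check. The check_sources helper is hypothetical and not part of climagrid, and rejecting source_kwargs entries for sources that were not requested is this helper's own choice, not documented library behavior.

```python
# Source names copied from the list above.
VALID_SOURCES = {"nasa_power", "noaa_hrrr", "noaa_ncei", "usda_nrcs", "usfs_wfigs"}


def check_sources(sources=None, source_kwargs=None):
    """Hypothetical pre-flight check for the sources/source_kwargs arguments."""
    # Mirror the documented default of ["nasa_power"].
    sources = ["nasa_power"] if sources is None else list(sources)
    unknown = sorted(set(sources) - VALID_SOURCES)
    if unknown:
        raise ValueError(f"unknown source name(s): {unknown}")

    source_kwargs = dict(source_kwargs or {})
    # Flag per-source options that would never be used (helper's own rule).
    unused = sorted(set(source_kwargs) - set(sources))
    if unused:
        raise ValueError(f"source_kwargs given for sources not requested: {unused}")
    return sources, source_kwargs
```

The returned pair can then be passed to run() as sources= and source_kwargs=, for example with {"noaa_ncei": {"token": "my-cdo-token"}} when "noaa_ncei" is among the requested sources.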

Valid features values:

  • "thermal": feat_thermal_aging_factor, feat_heat_hours_above_35c

  • "conductor_sag": feat_conductor_sag_index

  • "freeze_thaw": feat_freeze_thaw_cycles

  • "ice_loading": feat_ice_loading_risk

  • "soil": feat_soil_saturation_index

  • "wildfire": feat_wildfire_proximity
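The mapping above can be written down as a lookup table, which is convenient for predicting which feat_ columns a given features argument should produce. This is a sketch only: the column names are copied from the list above, while the FEATURE_COLUMNS table and the expected_feature_columns helper are hypothetical and not part of climagrid.

```python
# Feature name -> output columns, copied from the list above.
FEATURE_COLUMNS = {
    "thermal": ["feat_thermal_aging_factor", "feat_heat_hours_above_35c"],
    "conductor_sag": ["feat_conductor_sag_index"],
    "freeze_thaw": ["feat_freeze_thaw_cycles"],
    "ice_loading": ["feat_ice_loading_risk"],
    "soil": ["feat_soil_saturation_index"],
    "wildfire": ["feat_wildfire_proximity"],
}


def expected_feature_columns(features="all"):
    """Hypothetical helper: list the feat_ columns for a features argument."""
    if features == "all":
        names = list(FEATURE_COLUMNS)
    elif isinstance(features, str):
        names = [features]  # tolerate a single name given as a bare string
    else:
        names = list(features)

    unknown = [name for name in names if name not in FEATURE_COLUMNS]
    if unknown:
        raise ValueError(f"unknown feature name(s): {unknown}")

    columns = []
    for name in names:
        columns.extend(FEATURE_COLUMNS[name])
    return columns
```

One possible use is comparing this list against df.columns after a run to spot feature columns that are missing from the result.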