transform

geobricks.transform(data, *, src_crs=None, base_template=None, bounds=None, bounds_crs=None, x0=None, y0=None, mask=None, to_file=False, export_extension=None, rasterize=False, main_var_list=None, rasterize_mode=['sum', 'dominant', 'and'], **rio_kwargs)

Reproject, clip, rasterize or convert space-time data. transform(), reproject() and convert() are three aliases of the same function.

Parameters

data : str, pathlib.Path, xarray.Dataset, xarray.DataArray, geopandas.GeoDataFrame or pandas.DataFrame

Data to transform. Supported file formats are .tif, .asc, .nc, vector formats supported by geopandas (.shp, .json, …), and .csv.

src_crs : int or str or rasterio.crs.CRS, optional

Coordinate reference system of the source (data). When passed as an integer, src_crs refers to the EPSG code. When passed as a string, src_crs can be an OGC WKT string or a PROJ.4 string.
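To illustrate the two accepted forms, the sketch below normalizes a CRS given either as an EPSG integer or as a string. The helper name crs_to_string is hypothetical and is not part of geobricks; with rasterio installed, rasterio.crs.CRS.from_user_input() would normally do this kind of normalization.

```python
def crs_to_string(crs):
    """Hypothetical helper (not a geobricks function).

    Integers are read as EPSG codes; strings (WKT or PROJ.4)
    are returned unchanged apart from surrounding whitespace.
    """
    if isinstance(crs, bool):
        # bool is a subclass of int, so reject it explicitly
        raise TypeError("a CRS cannot be a boolean")
    if isinstance(crs, int):
        return f"EPSG:{crs}"
    if isinstance(crs, str):
        return crs.strip()
    raise TypeError(f"unsupported CRS type: {type(crs).__name__}")


epsg_form = crs_to_string(2154)
proj_form = crs_to_string("+proj=longlat +datum=WGS84 +no_defs")
```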

base_template : str, pathlib.Path, xarray.DataArray or geopandas.GeoDataFrame, optional

Filepath or in-memory object used as a template for the spatial profile. Supported file formats are .tif, .nc and vector formats supported by geopandas (.shp, .json, …).

bounds : iterable or None, optional, default None

Boundaries of the target domain as a tuple (x_min, y_min, x_max, y_max). The values are expected to be given in bounds_crs if it is not None. If bounds_crs is None, bounds are expected to be given in the destination CRS dst_crs if it is not None. If dst_crs is also None, bounds are expected to be given in the source CRS (src_crs or the CRS of data).
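The fallback order described above can be summarized with a small sketch. The function name resolve_bounds_crs is hypothetical and only mirrors the documented order (bounds_crs, then dst_crs, then the source CRS); it is not the library's implementation.

```python
def resolve_bounds_crs(bounds_crs=None, dst_crs=None, src_crs=None):
    """Hypothetical helper: return the CRS in which bounds are interpreted.

    Follows the documented priority: bounds_crs first, then dst_crs,
    then the source CRS. Returns None if all three are None.
    """
    for candidate in (bounds_crs, dst_crs, src_crs):
        if candidate is not None:
            return candidate
    return None


explicit = resolve_bounds_crs(bounds_crs=4326, dst_crs=2154, src_crs=27572)
from_dst = resolve_bounds_crs(dst_crs=2154, src_crs=27572)
from_src = resolve_bounds_crs(src_crs=27572)
```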

bounds_crs : int or str or rasterio.crs.CRS, optional

Coordinate reference system of the bounds (if bounds is not None). When passed as an integer, bounds_crs refers to the EPSG code. When passed as a string, bounds_crs can be an OGC WKT string or a PROJ.4 string.

x0 : number, optional, default None

Origin of the X-axis, used to align the reprojection grid.

y0 : number, optional, default None

Origin of the Y-axis, used to align the reprojection grid.
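One common way to align a grid on an origin is to snap the target bounds outward to the nearest cell edges counted from (x0, y0). The sketch below shows that idea only; whether geobricks aligns its grid exactly this way is an assumption, and snap_bounds is a hypothetical name.

```python
import math


def snap_bounds(bounds, resolution, x0=0.0, y0=0.0):
    """Hypothetical helper: expand bounds to cell edges anchored at (x0, y0).

    bounds     : (x_min, y_min, x_max, y_max)
    resolution : (x_res, y_res), both positive
    """
    x_min, y_min, x_max, y_max = bounds
    x_res, y_res = resolution
    return (
        x0 + math.floor((x_min - x0) / x_res) * x_res,  # move left edge outward
        y0 + math.floor((y_min - y0) / y_res) * y_res,  # move bottom edge outward
        x0 + math.ceil((x_max - x0) / x_res) * x_res,   # move right edge outward
        y0 + math.ceil((y_max - y0) / y_res) * y_res,   # move top edge outward
    )


on_zero_origin = snap_bounds((12.3, 7.8, 48.1, 31.2), (10, 10))
on_shifted_origin = snap_bounds((12.3, 7.8, 48.1, 31.2), (10, 10), x0=5, y0=5)
```

With the same bounds and resolution, changing the origin shifts every cell edge by the same offset, which is why two datasets reprojected with identical x0 and y0 end up on matching grids.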

mask : str, pathlib.Path, shapely.geometry, xarray.DataArray or geopandas.GeoDataFrame, optional

Mask used to clip the data, given as a filepath or as an in-memory geometry or dataset.

to_file : bool or path (str or pathlib.Path), default False

If True and if data is a path (str or pathlib.Path), the resulting dataset will be exported to a file with the same pathname and the suffix ‘_geop4th’. If to_file is a path, the resulting dataset will be exported to this specified filepath.

export_extension : str, optional

Extension to which the data will be converted and exported. Only used when the specified data is a filepath. If data is a variable and not a file, it will not be exported.

If rasterize=True and export_extension is not specified, it will be set to ‘.tif’ by default.
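The naming rules for exported files can be sketched with pathlib. The helper default_export_path is hypothetical, and it assumes that the '_geop4th' suffix is inserted between the file stem and the extension, which the description above suggests but does not state explicitly.

```python
from pathlib import Path


def default_export_path(src, export_extension=None, rasterize=False):
    """Hypothetical helper: build the output path used when to_file=True.

    Assumption: the '_geop4th' suffix goes between the stem and the extension.
    """
    src = Path(src)
    if export_extension is None:
        # Documented default: rasterized outputs are written as .tif
        export_extension = ".tif" if rasterize else src.suffix
    return src.with_name(f"{src.stem}_geop4th{export_extension}")


same_format = default_export_path("data/dem.asc")
converted = default_export_path("data/dem.asc", export_extension=".nc")
rasterized = default_export_path("data/basins.shp", rasterize=True)
```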

rasterize : bool, default False

Whether to rasterize data (only applies if data is vector data).

main_var_list : iterable, default None

Data variables to rasterize. Only used if rasterize is True. If None, all variables in data are rasterized.

rasterize_mode : str or list of str, or dict, default ['sum', 'dominant', 'and']

Defines the mode to rasterize data:

  • for numeric variables: 'mean' or 'sum' (default)

  • for categorical variables: 'percent' or 'dominant' (default)

    • 'dominant' returns the most frequent level for each cell

    • 'percent' creates a new variable per level, which stores the percentage (from 0 to 100) of occurrence of this level compared to all levels, for each cell.

  • for boolean variables: 'or' or 'and' (default)
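The difference between the two categorical modes can be shown on a single cell. This is a minimal sketch that counts each feature once; an actual rasterization may weight features (for example by covered area), so the unweighted counting here is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def dominant(levels):
    """Return the most frequent level among the features in one cell."""
    return Counter(levels).most_common(1)[0][0]


def percent(levels):
    """Return, for each level, its share (0 to 100) of the features in one cell."""
    counts = Counter(levels)
    total = len(levels)
    return {level: 100 * n / total for level, n in counts.items()}


# Levels of the vector features that fall into one raster cell
cell_levels = ["forest", "forest", "crop", "urban"]

cell_dominant = dominant(cell_levels)
cell_percent = percent(cell_levels)
```

Here 'dominant' yields one value per cell ("forest"), whereas 'percent' yields one value per level and per cell, which is why it creates as many output variables as there are levels.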

The modes can be specified for each variable by passing rasterize_mode as a dict: {'<var1>': 'mean', '<var2>': 'percent', ...}. This argument specification makes it possible to force a numeric variable to be rasterized as a categorical variable. Unspecified variables will be rasterized with the default mode.
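The per-variable behaviour can be sketched as a lookup with defaults. The function resolve_modes and the variable names are hypothetical; the defaults are the documented ones ('sum', 'dominant', 'and'), and the kind of each variable is passed in explicitly rather than detected.

```python
# Documented default mode for each kind of variable
DEFAULT_MODES = {"numeric": "sum", "categorical": "dominant", "boolean": "and"}


def resolve_modes(var_kinds, rasterize_mode=None):
    """Hypothetical helper: choose a rasterization mode for each variable.

    var_kinds      : dict mapping variable name -> 'numeric', 'categorical' or 'boolean'
    rasterize_mode : optional dict mapping variable name -> mode
    """
    overrides = rasterize_mode if isinstance(rasterize_mode, dict) else {}
    return {
        var: overrides.get(var, DEFAULT_MODES[kind])
        for var, kind in var_kinds.items()
    }


var_kinds = {
    "population": "numeric",
    "soil_code": "numeric",      # numeric codes that really describe classes
    "landuse": "categorical",
    "protected": "boolean",
}
modes = resolve_modes(var_kinds, {"population": "mean", "soil_code": "percent"})
```

In this example soil_code is stored as a number but is rasterized with a categorical mode, and the two unlisted variables keep their defaults.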

**rio_kwargs : keyword args, optional

Arguments passed to the xarray.Dataset.rio.reproject() function call.

Note: These arguments take priority over base_template attributes.

May contain:

  • dst_crs : str

  • resolution : float or tuple

  • shape : tuple (int, int)

  • transform : Affine

  • nodata : float or None

  • resampling :

    • see help(rasterio.enums.Resampling)

    • most common are: 5 (average), 13 (sum), 0 (nearest), 9 (min), 8 (max), 1 (bilinear), 2 (cubic)…

    • the functionality 'std' (standard deviation) is also available

  • see help(xarray.Dataset.rio.reproject)
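The priority rule and the resampling codes listed above can be illustrated together. This sketch uses plain dictionaries: template_profile stands in for the attributes read from base_template, and merge_reproject_kwargs is a hypothetical name. The integer codes are the ones quoted in the list and match rasterio.enums.Resampling.

```python
# Resampling codes quoted above (values of rasterio.enums.Resampling)
RESAMPLING_CODES = {
    "nearest": 0, "bilinear": 1, "cubic": 2,
    "average": 5, "max": 8, "min": 9, "sum": 13,
}


def merge_reproject_kwargs(template_profile, **rio_kwargs):
    """Hypothetical helper: explicit keyword arguments override template values."""
    return {**template_profile, **rio_kwargs}


# Stand-in for the spatial profile read from base_template
template_profile = {"dst_crs": "EPSG:2154", "resolution": 1000}

merged = merge_reproject_kwargs(
    template_profile,
    resolution=250,                          # overrides the template resolution
    resampling=RESAMPLING_CODES["average"],  # 5
)
```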

Returns

Transformed data : xarray.Dataset or geopandas.GeoDataFrame

The type of the resulting variable depends on the type of the input data and on the conversion operations (such as rasterize):

  • all vector data will be output as a geopandas.GeoDataFrame

  • all raster data and netCDF will be output as an xarray.Dataset
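The rule above can be sketched as a lookup on the input file extension. The helper expected_output_type is hypothetical, it only covers the extensions named in this page, and it assumes that rasterizing vector data yields an xarray.Dataset.

```python
from pathlib import Path

RASTER_EXTENSIONS = {".tif", ".asc", ".nc"}
VECTOR_EXTENSIONS = {".shp", ".json"}


def expected_output_type(path, rasterize=False):
    """Hypothetical helper: name the type expected for a given input file."""
    ext = Path(path).suffix.lower()
    if ext in RASTER_EXTENSIONS:
        return "xarray.Dataset"
    if ext in VECTOR_EXTENSIONS:
        # Assumption: rasterized vector data comes back as a raster dataset
        return "xarray.Dataset" if rasterize else "geopandas.GeoDataFrame"
    raise ValueError(f"extension not covered by this sketch: {ext!r}")


raster_out = expected_output_type("dem.tif")
vector_out = expected_output_type("basins.shp")
rasterized_out = expected_output_type("basins.shp", rasterize=True)
```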

If data is a file, the resulting dataset will be exported to a file as well (with the suffix ‘_geop4th’), except if the parameter to_file=False is passed.