transform
- geobricks.transform(data, *, src_crs=None, base_template=None, bounds=None, bounds_crs=None, x0=None, y0=None, mask=None, drop=False, to_file=False, export_extension=None, rasterize=False, main_var_list=None, rasterize_mode=['mean', 'coverage', 'and'], force_polygon=False, force_point=False, **rio_kwargs)
Reproject, clip, rasterize or convert space-time data.
transform(), reproject() and convert() are three aliases of the same function.
Parameters
- data: str, pathlib.Path, xarray.Dataset, xarray.DataArray, geopandas.GeoDataFrame or pandas.DataFrame
Data to transform. Supported file formats are .tif, .asc, .nc, vector formats supported by geopandas (.shp, .json, …), and .csv.
- src_crs: int or str or rasterio.crs.CRS, optional
Coordinate reference system of the source (data). When passed as an integer, src_crs refers to the EPSG code. When passed as a string, src_crs can be an OGC WKT string or a Proj.4 string.
- base_template: str, Path, xarray.DataArray or geopandas.GeoDataFrame, optional
Filepath or object used as a template for the spatial profile. Supported file formats are .tif, .nc and vector formats supported by geopandas (.shp, .json, …).
- bounds: iterable or None, optional, default None
Boundaries of the target domain as a tuple (x_min, y_min, x_max, y_max). The values are expected to be given according to bounds_crs if it is not None. If bounds_crs is None, bounds are expected to be given according to the destination CRS dst_crs if it is not None. If dst_crs is also None, bounds are then expected to be given according to the source CRS (src_crs or data’s CRS).
- bounds_crs: int or str or rasterio.crs.CRS, optional
Coordinate reference system of the bounds (if bounds is not None). When passed as an integer, bounds_crs refers to the EPSG code. When passed as a string, bounds_crs can be an OGC WKT string or a Proj.4 string.
- x0: number, optional, default None
Origin of the X-axis, used to align the reprojection grid.
- y0: number, optional, default None
Origin of the Y-axis, used to align the reprojection grid.
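The two rules above (which CRS bounds are read in, and how x0 and y0 align the grid) can be sketched in plain Python. This is only an illustration: `pick_bounds_crs` and `snap_bounds` are hypothetical helpers, not part of geobricks, and expanding the bounds outward to the nearest grid line is an assumption about how the alignment behaves.

```python
import math


def pick_bounds_crs(bounds_crs=None, dst_crs=None, src_crs=None):
    """Return the CRS in which `bounds` are read.

    Order described in the text: bounds_crs, then dst_crs, then src_crs.
    (Hypothetical helper, for illustration only.)
    """
    for crs in (bounds_crs, dst_crs, src_crs):
        if crs is not None:
            return crs
    return None


def snap_bounds(bounds, resolution, x0=0.0, y0=0.0):
    """Expand (x_min, y_min, x_max, y_max) outward onto the grid whose
    lines are x0 + i * resolution and y0 + j * resolution.

    Assumption: the grid is square and bounds are expanded, never shrunk.
    """
    x_min, y_min, x_max, y_max = bounds
    return (
        x0 + math.floor((x_min - x0) / resolution) * resolution,
        y0 + math.floor((y_min - y0) / resolution) * resolution,
        x0 + math.ceil((x_max - x0) / resolution) * resolution,
        y0 + math.ceil((y_max - y0) / resolution) * resolution,
    )


print(pick_bounds_crs(None, 2154, 4326))  # 2154
print(snap_bounds((1012.0, 2030.0, 1988.0, 2971.0), 100, x0=50, y0=0))
# (950, 2000, 2050, 3000)
```

With x0=50, the X grid lines fall on 50, 150, 250, … so the snapped X bounds end in 50, while the Y bounds (y0=0) fall on multiples of 100.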
- mask: str, Path, shapely.geometry, xarray.DataArray or geopandas.GeoDataFrame, optional
Filepath of mask used to clip the data.
- drop: bool, default False
Only applicable for raster/xarray.Dataset types. If True, coordinate labels that only correspond to NaN values are dropped from the result.
- to_file: bool or path (str or pathlib.Path), default False
If True and if data is a path (str or pathlib.Path), the resulting dataset will be exported to a file with the same pathname and the suffix ‘_geop4th’. If to_file is a path, the resulting dataset will be exported to this specified filepath.
- export_extension: str, optional
Extension to which the data will be converted and exported. Only used when the specified data is a filepath. If data is a variable and not a file, it will not be exported.
If rasterize=True and export_extension is not specified, it will be set to ‘.tif’ by default.
- rasterize: bool, default False
Option to rasterize data (if data is vector data).
- main_var_list: iterable, default None
Data variables to rasterize. Only used if rasterize is True. If None, all variables in data are rasterized.
- rasterize_mode: str or list of str, or dict, default ['mean', 'coverage', 'and']
Defines the mode to rasterize data:
- for numeric variables: 'count', 'sum' or 'mean' (default)
  - 'mean' refers to:
    - the sum of polygon values weighted by their relative coverage on each cell, when the vector data contains Polygons (appropriate for intensive quantities)
    - the average value of points on each cell, when the vector data contains Points
  - 'sum' refers to:
    - the sum of polygon values downscaled to each cell (appropriate for extensive quantities)
    - the sum of the values of points on each cell, when the vector data contains Points
  - 'count' refers to the number of points or polygons intersecting each cell
- for categorical variables: 'fraction', 'dominant' or 'coverage' (default)
  - 'coverage' refers to:
    - the area covered by each level on each cell, when the vector data contains Polygons
    - the count of points for each level on each cell, when the vector data contains Points
  - 'dominant' returns the most frequent level for each cell
  - 'fraction' creates a new variable per level, which stores the fraction (from 0 to 1) of the coverage of this level compared to all levels, for each cell
- for boolean variables: 'or' or 'and' (default)
The modes can be specified for each variable by passing rasterize_mode as a dict: {'<var1>': 'mean', '<var2>': 'percent', ...}. This argument specification makes it possible to force a numeric variable to be rasterized as a categorical variable. Unspecified variables will be rasterized with the default mode. If data contains no variable other than ‘geometry’, the arbitrary name ‘data’ can be used to specify a mode for the whole data.
- force_polygon: bool, default False
Only Polygon geometry types will be kept when rasterizing.
- force_point: bool, default False
Only Point geometry types will be kept when rasterizing.
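To make the rasterize_mode options for polygon data more concrete, here is a small worked sketch for a single grid cell. It does not call geobricks: `rasterize_cell` and `categorical_cell` are hypothetical helpers. The formulas are one reading of the descriptions above ('mean' normalized by the covered area, 'sum' splitting each polygon value in proportion to area, 'dominant' taken as the level with the largest coverage), so treat them as assumptions rather than the library's exact implementation.

```python
def rasterize_cell(pieces, mode="mean"):
    """Aggregate numeric polygon values over one cell.

    pieces: list of (value, area_inside_cell, total_polygon_area).
    """
    if mode == "count":
        # Number of polygons intersecting the cell
        return len(pieces)
    if mode == "sum":
        # Extensive quantity: each polygon gives the cell a share of its
        # value proportional to the share of its area lying in the cell
        return sum(v * a / total for v, a, total in pieces)
    if mode == "mean":
        # Intensive quantity: area-weighted average over the covered area
        covered = sum(a for _, a, _ in pieces)
        return sum(v * a for v, a, _ in pieces) / covered
    raise ValueError(f"unknown mode: {mode}")


def categorical_cell(pieces, mode="coverage"):
    """Aggregate categorical polygon levels over one cell.

    pieces: list of (level, area_inside_cell).
    """
    coverage = {}
    for level, area in pieces:
        coverage[level] = coverage.get(level, 0.0) + area
    if mode == "coverage":
        return coverage
    if mode == "dominant":
        return max(coverage, key=coverage.get)
    if mode == "fraction":
        total = sum(coverage.values())
        return {level: area / total for level, area in coverage.items()}
    raise ValueError(f"unknown mode: {mode}")


# Two polygons overlapping one cell (made-up numbers)
pieces = [(10.0, 60.0, 120.0), (20.0, 40.0, 400.0)]
print(rasterize_cell(pieces, "mean"))   # 14.0
print(rasterize_cell(pieces, "sum"))    # 7.0
print(rasterize_cell(pieces, "count"))  # 2

levels = [("forest", 30.0), ("crop", 60.0), ("forest", 10.0)]
print(categorical_cell(levels, "dominant"))  # crop
print(categorical_cell(levels, "fraction"))  # {'forest': 0.4, 'crop': 0.6}
```

The same cell gives 14.0 under 'mean' and 7.0 under 'sum', which is why the choice between the two should follow whether the variable is intensive (for example a concentration) or extensive (for example a population count).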
- **rio_kwargs: keyword args, optional
Arguments passed to the xarray.Dataset.rio.reproject() function call.
Note: These arguments take priority over base_template attributes.
May contain:
- dst_crs: str
- resolution: float or tuple
- shape: tuple (int, int) of (height, width)
- transform: Affine
- nodata: float or None
- resampling: see help(rasterio.enums.Resampling). The most common are: 5 (average), 13 (sum), 0 (nearest), 9 (min), 8 (max), 1 (bilinear), 2 (cubic)… The functionality 'std' (standard deviation) is also available.
See help(xarray.Dataset.rio.reproject).
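The priority of **rio_kwargs over base_template attributes can be pictured as a dictionary merge in which explicit keywords win. The sketch below is an assumption about the behaviour, not the library's code: `merge_profile` and the `template_profile` dict are hypothetical stand-ins for whatever geobricks reads from the template. The resampling codes are those listed above.

```python
# Resampling codes listed in the text (values of rasterio.enums.Resampling)
RESAMPLING = {
    "nearest": 0, "bilinear": 1, "cubic": 2,
    "average": 5, "max": 8, "min": 9, "sum": 13,
}


def merge_profile(template_profile, **rio_kwargs):
    """Explicit keyword arguments override values taken from the template."""
    return {**template_profile, **rio_kwargs}


# Hypothetical spatial profile extracted from a base_template
template_profile = {
    "dst_crs": "EPSG:2154",
    "resolution": 1000,
    "resampling": RESAMPLING["nearest"],
}

profile = merge_profile(
    template_profile,
    resolution=(75, 75),
    resampling=RESAMPLING["average"],
)
print(profile)
# {'dst_crs': 'EPSG:2154', 'resolution': (75, 75), 'resampling': 5}

# A corresponding call might look like this (not executed here; the file
# names are placeholders):
# geobricks.transform("dem.tif", base_template="grid.tif",
#                     resolution=(75, 75), resampling=5)
```

Here dst_crs is inherited from the template, while resolution and resampling come from the explicit keywords.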
Returns
- Transformed data: xarray.Dataset or geopandas.GeoDataFrame
The type of the resulting variable depends on the type of the input data and on the conversion operations (such as rasterize):
- all vector data will be output as a geopandas.GeoDataFrame
- all raster data and netCDF will be output as an xarray.Dataset
If data is a file, the resulting dataset will be exported to a file as well (with the suffix ‘_geop4th’), except if the parameter to_file=False is passed.
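The export naming described for to_file and export_extension can be sketched with pathlib. This is a guess at the convention, not the library's code: `default_export_path` is a hypothetical helper, and placing ‘_geop4th’ between the file stem and the extension is an assumption.

```python
from pathlib import Path


def default_export_path(src, export_extension=None, rasterize=False,
                        suffix="_geop4th"):
    """Build an output path next to `src`, with `suffix` added to the stem.

    If no extension is given, '.tif' is used when rasterizing (as stated
    in the text); otherwise the source extension is kept.
    """
    src = Path(src)
    if export_extension is None:
        export_extension = ".tif" if rasterize else src.suffix
    return src.with_name(src.stem + suffix + export_extension)


print(default_export_path("data/basins.shp", rasterize=True).name)
# basins_geop4th.tif
print(default_export_path("data/dem.tif").name)
# dem_geop4th.tif
print(default_export_path("data/dem.tif", export_extension=".nc").name)
# dem_geop4th.nc
```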