merge_data

geobricks.merge_data(data, *, extension=None, tag='', flatten=False, spatial_intersect=False, **kwargs)

Merge several datasets, or all the files inside a folder, into a single dataset.

Parameters

data: (list of) str or pathlib.Path, or variable (xarray.Dataset, xarray.DataArray, geopandas.GeoDataFrame or pandas.DataFrame)

Iterable of data to merge, or a path to a folder containing the data to merge. When a folder path is given, the arguments extension and tag can be used to select files.

When datasets overlap, the last dataset has the highest priority.
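The priority rule can be illustrated outside of geobricks. The sketch below is not the library's implementation: it assumes plain xarray objects and uses `combine_first` to reproduce a "last one wins" merge, and `merge_last_wins` is a hypothetical helper name.

```python
from functools import reduce

import xarray as xr


def merge_last_wins(arrays):
    """Hypothetical helper: later arrays override earlier ones where they overlap."""
    # nxt.combine_first(merged) keeps the values of nxt and only falls back
    # to the already merged data where nxt has no value.
    return reduce(lambda merged, nxt: nxt.combine_first(merged), arrays)


a = xr.DataArray([1.0, 2.0, 3.0], coords={"x": [0, 1, 2]}, dims="x")
b = xr.DataArray([20.0, 30.0, 40.0], coords={"x": [1, 2, 3]}, dims="x")

merged = merge_last_wins([a, b])
# On the overlap (x = 1 and x = 2) the values come from b, the last array.
```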

extension: str, optional

Only the files with this extension will be retrieved (when data is a folder path).

tag: str, optional

Only the files containing this tag in their names will be retrieved (when data is a folder path).
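As a rough illustration of how extension and tag narrow down the files of a folder, the sketch below filters a directory listing with pathlib. `list_candidates` is a hypothetical helper, and the details (non-recursive search, case-insensitive extension match) are assumptions rather than documented geobricks behaviour.

```python
import tempfile
from pathlib import Path


def list_candidates(folder, extension=None, tag=""):
    """Hypothetical helper: files in `folder` matching `extension` and `tag`."""
    folder = Path(folder)
    # Accept both "nc" and ".nc" (assumption, for convenience).
    suffix = None if extension is None else "." + extension.lstrip(".")
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file()
        and (suffix is None or p.suffix.lower() == suffix.lower())
        and tag in p.name
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ("rain_2020.nc", "rain_2021.nc", "temp_2020.nc", "rain_2020.csv"):
        (Path(tmp) / name).touch()
    selected = [p.name for p in list_candidates(tmp, extension=".nc", tag="rain")]
```

Only the two netCDF files whose names contain "rain" are retained here.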

flatten: bool, default False

If True, data will be flattened over the time axis.

spatial_intersect: bool, default False

If True, datasets with different spatial extents are merged over the intersection of their spatial coordinates. If False, they are merged over the union of their spatial coordinates.

Note that for all other dimensions (time, …), the datasets will always be merged according to the union of their indexes.
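The difference between the two modes can be sketched with plain xarray. This is only an illustration under assumed names (a spatial dimension `x` and a `time` dimension), not the geobricks implementation: `xarray.align` with `join="inner"` and `exclude=["time"]` restricts the spatial coordinates to their intersection while leaving the time index untouched.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(
    np.ones((2, 3)),
    coords={"time": [0, 1], "x": [0, 1, 2]},
    dims=("time", "x"),
)
b = xr.DataArray(
    np.full((2, 3), 2.0),
    coords={"time": [1, 2], "x": [1, 2, 3]},
    dims=("time", "x"),
)

# Union of all coordinates (analogous to spatial_intersect=False):
# cells covered by neither array are filled with NaN.
union = b.combine_first(a)

# Intersection on the spatial dimension only (analogous to
# spatial_intersect=True): x is aligned with an inner join,
# while time is excluded and still merged as a union.
a_i, b_i = xr.align(a, b, join="inner", exclude=["time"])
inter = b_i.combine_first(a_i)
```

In both results the time index is the union of the input time indexes; only the extent along `x` differs.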

**kwargs

Other optional arguments passed to geo.load_any() (arguments for xarray.open_dataset, pandas.read_csv, pandas.DataFrame.to_csv, pandas.read_json or pandas.DataFrame.to_json function calls).

May contain:

  • decode_cf

  • sep

  • encoding

  • force_ascii

>>> help(xarray.open_dataset)
>>> help(pandas.read_csv)
>>> help(pandas.DataFrame.to_csv)
>>> ...

Returns

geopandas.GeoDataFrame, xarray.Dataset, pandas.DataFrame or numpy.ndarray

The merged data is returned in a variable whose type depends on the type of the input data:

  • all vector data will be loaded as a geopandas.GeoDataFrame

  • all raster data and netCDF will be loaded as an xarray.Dataset

  • other data will be loaded either as a pandas.DataFrame (CSV and JSON) or as a numpy.ndarray (TIFF)