scan_files

geobricks.scan_files(paths: str | Path | List[str | Path], *, variables_to_find: str | List[str] | None = None, bbox: Tuple[float, float, float, float] | None = None, mask: str | Path | Dataset | GeoDataFrame | None = None, start_date: str | Timestamp | None = None, end_date: str | Timestamp | None = None, frequency: str | None = None, file_extension: str = '*.nc') -> Dict[str, List[Dict[str, Any]]]

Find and analyze existing files across one or more paths, extracting metadata including spatial bounds, temporal range, and variables.

Parameters

paths : Union[str, Path, List[Union[str, Path]]]

Directory path(s) to scan. Can be a single path or list of paths.

variables_to_find : Optional[Union[str, List[str]]]

Variable name(s) to search for. Can be a single variable or list of variables. If None, returns all files.

bbox : Optional[Tuple[float, float, float, float]]

Target bounding box, given as (North, West, South, East).

mask : Optional[Union[str, Path, xr.Dataset, gpd.GeoDataFrame]]

Mask from which the bounding box is extracted. Can be a file path, an xarray Dataset, or a GeoDataFrame. Overrides bbox if provided.

start_date : Optional[Union[str, pd.Timestamp]]

Start date for filtering.

end_date : Optional[Union[str, pd.Timestamp]]

End date for filtering.

frequency : Optional[str]

Target frequency for filtering.

file_extension : str, default "*.nc"

File pattern to search. Accepts formats like "*.nc", ".nc", or "nc".
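
Two of the parameter conventions above are easy to get wrong, so here is a minimal sketch of them. It is not geobricks's internal code: `normalize_pattern`, `from_min_max`, and `bboxes_overlap` are hypothetical helpers written for illustration. The first shows how the three accepted `file_extension` spellings could collapse to one glob pattern. The other two show how a (North, West, South, East) tuple relates to the more common (minx, miny, maxx, maxy) order, assuming boxes that do not cross the antimeridian.

```python
from typing import Tuple

# (north, west, south, east), the order documented for `bbox`
BBox = Tuple[float, float, float, float]


def normalize_pattern(file_extension: str = "*.nc") -> str:
    """Hypothetical helper: map "*.nc", ".nc", or "nc" to the glob "*.nc"."""
    ext = file_extension.strip()
    if ext.startswith("*"):
        return ext  # already a glob pattern
    if not ext.startswith("."):
        ext = "." + ext  # "nc" -> ".nc"
    return "*" + ext  # ".nc" -> "*.nc"


def from_min_max(minx: float, miny: float, maxx: float, maxy: float) -> BBox:
    """Hypothetical helper: reorder (minx, miny, maxx, maxy) into (N, W, S, E)."""
    return (maxy, minx, miny, maxx)


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """Hypothetical helper: True if two (N, W, S, E) boxes intersect.

    Assumes neither box crosses the antimeridian.
    """
    a_n, a_w, a_s, a_e = a
    b_n, b_w, b_s, b_e = b
    return a_s <= b_n and b_s <= a_n and a_w <= b_e and b_w <= a_e


patterns = [normalize_pattern(p) for p in ("*.nc", ".nc", "nc")]
target = from_min_max(-10.0, 35.0, 30.0, 60.0)
```

Here every entry of `patterns` is `"*.nc"`, and `target` is `(60.0, -10.0, 35.0, 30.0)`, which is the shape `bbox` expects.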

Returns

Dict[str, List[Dict[str, Any]]]

Dictionary mapping each variable name to a list of file metadata dictionaries.
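
As a rough illustration of calling the function and consuming its result, the sketch below uses only the signature and return shape documented here. The variable names `"tas"` and `"pr"`, the date strings, and the `"path"` key in the stand-in data are assumptions made for the example; the actual metadata keys are not listed in this section. `geobricks` is imported lazily inside `scan_example` so the rest of the snippet runs without it.

```python
from typing import Any, Dict, List

# Documented return shape: variable name -> list of per-file metadata dicts
ScanResult = Dict[str, List[Dict[str, Any]]]


def count_files_per_variable(result: ScanResult) -> Dict[str, int]:
    """Count how many metadata entries were returned for each variable."""
    return {variable: len(files) for variable, files in result.items()}


def scan_example(root: str) -> Dict[str, int]:
    """Call scan_files on one directory and summarize the result.

    Requires geobricks to be installed; the argument values are illustrative.
    """
    import geobricks  # lazy import: only needed when this function is called

    result = geobricks.scan_files(
        root,
        variables_to_find=["tas", "pr"],   # assumed variable names
        bbox=(60.0, -10.0, 35.0, 30.0),    # (North, West, South, East)
        start_date="2000-01-01",           # assumed date string format
        end_date="2000-12-31",
        file_extension="*.nc",
    )
    return count_files_per_variable(result)


# Hand-built stand-in for a scan result; the "path" key is hypothetical.
fake_result: ScanResult = {
    "tas": [{"path": "a.nc"}, {"path": "b.nc"}],
    "pr": [],
}
counts = count_files_per_variable(fake_result)
print(counts)  # {'tas': 2, 'pr': 0}
```

Because the summary relies only on the outer dictionary and list structure, it does not depend on which metadata fields each entry actually carries.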