Code standards#
Work in progress…
This section is yet to be completed.
GEOP4TH’s philosophy#
GEOP4TH relies on generic, format-agnostic elementary functions called geobricks, which are defined in the geobricks.py script.
The philosophy of geobricks is to aim for a high level of abstraction: the choices required from the user are kept to a minimum, while
advanced users can still access advanced options. To reconcile these two goals, we use smart default values and smart procedures
that infer parameter values when the user does not pass them (for examples, have a look at geobricks.transform(), geobricks.load(),
geobricks.compare()…).
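The "smart default" idea can be illustrated with a minimal sketch. The function name infer_kind, its kind parameter and the extension table below are hypothetical and chosen only for illustration; they are not taken from geobricks.py.

```python
from pathlib import Path
from typing import Optional

# Hypothetical lookup table, for illustration only (not GEOP4TH's actual one).
_KIND_BY_EXTENSION = {
    ".tif": "raster",
    ".asc": "raster",
    ".shp": "vector",
    ".gpkg": "vector",
}


def infer_kind(path: str, kind: Optional[str] = None) -> str:
    """Return the data kind, inferring it from the file name when not given."""
    if kind is not None:
        # Advanced users: an explicit choice always wins over inference.
        return kind
    extension = Path(path).suffix.lower()
    if extension not in _KIND_BY_EXTENSION:
        raise ValueError(
            f"Cannot infer the data kind from extension {extension!r}; "
            "pass `kind` explicitly."
        )
    return _KIND_BY_EXTENSION[extension]
```

Most users would call infer_kind("dem.tif") and let the function decide, while an advanced user can still force the result with kind="raster".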
Geobricks form the foundation of GEOP4TH. Special attention is therefore given to them, and they are expected to remain as stable as possible.
Workflows, on the other hand, form the cooperative branch of GEOP4TH and benefit from a high degree of freedom and flexibility. Workflows can
be developed with procedural programming (functions) or object-oriented programming (classes and methods). For inspiration, you can have a look
at standardize_fr.bdalti() or standardize_fr.sim2() (standardize) or download_fr.bnpe() (download) for the procedural approach, or at
standardize_wl.ERA5StandardizerWL (standardize) or download_wl.ERA5LandDownloader (download) for the object-oriented approach.
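To make the two styles concrete, here is a deliberately tiny sketch of the same step written both ways. The names standardize_example and ExampleStandardizer, and the rescaling they perform, are hypothetical and do not correspond to any existing GEOP4TH workflow.

```python
# Procedural style: the workflow step is a plain function.
def standardize_example(values, scale=1.0):
    """Hypothetical step: rescale a list of values."""
    return [v * scale for v in values]


# Object-oriented style: the same step, with its settings stored on the instance.
class ExampleStandardizer:
    """Hypothetical standardizer holding its own configuration."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def standardize(self, values):
        return [v * self.scale for v in values]
```

The procedural form is usually enough for a single, self-contained step; the class form tends to pay off when several methods share the same settings.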
Work in progress…
More information on script organization is to come…
Artificial Intelligence statement#
GEOP4TH advocates for a fairer world. This calls for a more equal distribution of wealth, more collective decisions and more sustainable societies. Generative artificial intelligence blatantly runs counter to these goals in multiple ways (here is a manifesto with more detailed views (French)). The use of AI in GEOP4TH development is not prohibited, but contributors are strongly advised to take the time to step back and weigh the ins and outs of AI before using it. If AI is used, for the sake of transparency and contributor accountability, this must be explicitly stated in the docstrings and in the documentation of the modules where it has been used.
File formats and extensions#
In a nutshell
when referring to the file extension (for instance ‘.json’), extension is used, and would ideally include the ‘.’ character
when referring to the file format (for instance ‘GeoJSON’), file_format is used, and would ideally not include the ‘.’ character
When developing new pre-processing steps, it is very likely that you will have to deal with file formats and extensions!
In GEOP4TH code, we distinguish between the two. An extension refers to the suffix of the file name (‘.asc’, ‘.tif’, ‘.json’…).
We usually need this variable to differentiate between file types such as raster files, vector files and so on. In that case we use
the variable name extension, as in geobricks.get_filelist() or geobricks.load(). We also consider that extension values
should include the ‘.’ character. Of course, that is the theory; in practice we add a safeguard to deal with values missing the ‘.’:
if isinstance(extension, str):
    if extension[0] != '.': extension = '.' + extension
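The same safeguard can be wrapped in a small helper. This is only a sketch: normalize_extension is a hypothetical name, not a geobricks function, and unlike the snippet above it also tolerates an empty string, which is an added assumption.

```python
def normalize_extension(extension):
    """Return `extension` with a leading '.'.

    Non-string values and empty strings are returned unchanged.
    """
    if isinstance(extension, str) and extension and not extension.startswith("."):
        return "." + extension
    return extension
```

With this helper, both normalize_extension("tif") and normalize_extension(".tif") yield the same value, so the rest of the code can rely on the leading ‘.’ being present.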
The tricky part to keep in mind is that a single extension can refer to different file formats: for example,
.json can refer to ‘JSON’, ‘GeoJSON’ or ‘TopoJSON’ files, and .tif can refer to ‘TIFF’ or ‘GeoTIFF’… (see
Wikipedia file formats).
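To illustrate why the extension alone is not enough, the sketch below guesses the format of a .json document from its content, using only the standard library. It is not how GEOP4TH proceeds; sniff_json_format is a hypothetical name, and the check relies solely on the top-level "type" member ("Topology" for TopoJSON, the object types listed in the GeoJSON specification for GeoJSON).

```python
import json

# Values of the top-level "type" member defined by the GeoJSON specification.
_GEOJSON_TYPES = {
    "FeatureCollection", "Feature", "GeometryCollection",
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon",
}


def sniff_json_format(text: str) -> str:
    """Guess whether a JSON document is 'GeoJSON', 'TopoJSON' or plain 'JSON'."""
    content = json.loads(text)
    kind = content.get("type") if isinstance(content, dict) else None
    if not isinstance(kind, str):
        # No usable "type" member: treat the document as plain JSON.
        return "JSON"
    if kind == "Topology":
        return "TopoJSON"
    if kind in _GEOJSON_TYPES:
        return "GeoJSON"
    return "JSON"
```

This is a heuristic: a plain JSON file that happens to contain a matching "type" member would be misclassified.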
In the situations mentioned above (geobricks.get_filelist(), geobricks.load(), …), we do not mind whether
the extension refers to one file format or another, because we try our best to handle these potential issues
in the code itself, so that users do not have to worry about them. For instance, in geobricks.load()
we differentiate between ‘JSON’ and ‘GeoJSON’ during the loading step:
try:
    data_ds = gpd.read_file(data, **kwargs)
except: # DataSourceError
    try:
        data_ds = pd.read_json(data, **kwargs)
    except:
        data_ds = json.load(open(data, "r"))
        print(" Warning: The JSON file could not be loaded as a pandas.DataFrame and was loaded as a dict")
However, in some other cases, we do want to differentiate between file formats that share the same extension. This is typically
the case in download_fr.bnpe(). In the BNPE water withdrawal API that this function queries, the user can explicitly
choose between the ‘JSON’ and ‘GeoJSON’ formats, and of course we wanted to keep this choice available. In that type of situation, we use the variable
name file_format and we consider that its value should not include the ‘.’. Once again, we still add a safeguard:
if file_formats[i][0] == '.': file_formats[i] = file_formats[i][1:]
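Applied to a whole list, the safeguard above could look like the following sketch. normalize_file_formats is a hypothetical helper, not part of download_fr; it returns a new list instead of modifying the input in place, and it also tolerates empty strings, which is an added assumption.

```python
def normalize_file_formats(file_formats):
    """Return a new list in which each format name has lost any leading '.'."""
    return [
        fmt[1:] if isinstance(fmt, str) and fmt.startswith(".") else fmt
        for fmt in file_formats
    ]
```

For example, a user passing [".geojson", "json"] would end up with ["geojson", "json"].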