ESMValTool: Preprocessor naming nomenclature

Created on 12 Feb 2019 · 15 Comments · Source: ESMValGroup/ESMValTool

This is a comment about preprocessor naming.

There doesn't appear to be a standard nomenclature for naming preprocessors. In general, names consist of a descriptor and an operation, e.g. volume and mean. Sometimes the order is reversed, e.g. mean_volume, and the terms average and mean are used interchangeably.

In addition, several preprocessors are imported with different names in __init__.py. This is probably bad practice and will make it harder for users to find the relevant source code, especially if they are using the API which only includes the original documentation.
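To illustrate why renaming on import makes the source harder to find, here is a minimal sketch. The function and alias names are illustrative assumptions, not a copy of what __init__.py actually contains.

```python
def area_average(cube):
    """Hypothetical preprocessor function; the body is a placeholder."""
    return cube


# Same effect as `from ._some_module import area_average as average_region`
# in a package __init__.py (module and names are assumed for illustration).
average_region = area_average

# The name users call and the name defined in the source file differ,
# so searching the code base for "average_region" finds no definition.
print(average_region.__name__)
```

The alias still points at the original function, so its documentation and `__name__` refer to the original name rather than the one used in recipes.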

I think we should adopt a standard naming scheme for these terms. I propose that we use mean instead of average, that we stick to the order descriptor_operation (i.e. volume_mean, not mean_volume), and that we rename the preprocessors so that they no longer need to be renamed when imported into __init__.py.
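The proposed convention could be sketched as a small helper that maps an existing name to its standardised form. This is only an illustration of the rule (mean instead of average, operation last); the helper `standardise_name` is hypothetical and not part of ESMValTool.

```python
def standardise_name(name: str) -> str:
    """Apply the proposed convention to a preprocessor name.

    Hypothetical helper: replaces "average" with "mean" and moves a
    leading operation to the end (descriptor_operation order).
    """
    parts = ["mean" if part == "average" else part for part in name.split("_")]
    # Move a leading "mean" behind the descriptor, e.g. mean_volume -> volume_mean.
    if len(parts) > 1 and parts[0] == "mean":
        parts = parts[1:] + parts[:1]
    return "_".join(parts)


for old in ("mean_volume", "average_area", "volume_mean"):
    print(old, "->", standardise_name(old))
```

Names that already follow the convention are returned unchanged, so the helper can be applied to every name without special cases.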

This means that the preprocessor code and some recipes will need to be changed. However, these should be cosmetic changes only.

standards

All 15 comments

Hi Lee, I think this is a good idea, so if you have some time to work on it, please go ahead.

Also, some files inside the esmvaltool/preprocessor directory end with _pp, which is an unnecessary addition: they're already in the preprocessor directory, so you know they contain preprocessor code. It would be nice to fix this too. And the documentation on readthedocs is flaky of course, but we have another issue for that, I believe.

good call @ledm ! cosmetic until ImportError: strikes :grin:

I suspect that it may well be too late to do this. If people are using ESMValTool to prepare their IPCC chapters this month, then changing the ground under their feet would be a terrible idea.

If we make these changes, then we should also provide a bash (or similar) script that can perform a find-replace on all the new preprocessor names for their recipes.
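As a rough sketch of what such a migration script might look like, the function below renames preprocessor keys in the text of a recipe. The rename table is a made-up example, not an agreed list of new names, and the approach (matching only YAML keys at the start of a line) is an assumption about how recipes reference preprocessor functions.

```python
import re

# Hypothetical old -> new names; the real mapping would come from the rename.
RENAMES = {
    "average_volume": "volume_mean",
    "average_area": "area_mean",
}


def rename_preprocessors(recipe_text: str, renames: dict) -> str:
    """Rename preprocessor keys in recipe text.

    Only matches a name that starts a line (after indentation) and is
    followed by a colon, so mentions inside descriptions are left alone.
    """
    names = "|".join(re.escape(old) for old in renames)
    pattern = re.compile(r"^([ \t]*)(" + names + r")([ \t]*:)", flags=re.MULTILINE)
    return pattern.sub(
        lambda m: m.group(1) + renames[m.group(2)] + m.group(3), recipe_text
    )


example = (
    "preprocessors:\n"
    "  prep_map:\n"
    "    average_area:\n"
    "      coord1: latitude\n"
)
print(rename_preprocessors(example, RENAMES))
```

A text-based replacement like this keeps comments and formatting in the recipe intact, at the cost of not validating the YAML; checking the result by running the recipe would still be needed.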

We can keep this for a later (quieter) stage.

I made an early start here: https://github.com/ESMValGroup/ESMValTool/tree/version2_development_preprocessor_nomenclature_847

But this only changes the filenames:

  • _area_pp.py to _area.py
  • _volume_pp.py to _volume.py
  • _time_area_pp.py to _time.py

As @bouweandela noted above, the preprocessor documentation is also not complete, so if you have time and resources to invest I think that's a more urgent issue to address.

Agreed with Lee wrt the AR6 folk and with Mattia for a (bit) later stage.
But please no bash regexp stuff - that is not only prone to errors but also
peasant-y and makes us look like some Microsoft application :)

Dr Valeriu Predoi.
Computational scientist
NCAS-CMS
University of Reading
Department of Meteorology
Reading RG6 6BB
United Kingdom

Brilliant, thanks for volunteering to write something clever, @valeriupredoi. That's really helpful. :1st_place_medal:

ugh :man_facepalming:

I suspect that it may well be too late to do this. If people are using ESMValTool to prepare their IPCC chapters this month, then changing the ground under their feet would be a terrible idea.

Note that all of the changes that do not change the names used in recipes, i.e. the re-imports under a different name, removing _pp, adding documentation, can already be safely done and merged.

Just adding a reminder that the preprocessors average_area and zonal_mean are both much more generic than their names suggest now. Both can be used to calculate mean, median, min, max, standard deviation and variance. Also, zonal_mean can perform both zonal and meridional operations.

@bouweandela suggested the term region_statistic for average_region here.

can we close this @ledm ?

Nope, it's still open as we've only half addressed it.

Replaced by #24.
