Currently all diagnostics are in a single directory, esmvaltool/diag_scripts, and bits of NCL code are scattered throughout the Python library. It would be good to take the diag_scripts folder out of the esmvaltool Python package and add some organization, with a standard location for library functions. I discussed this with Niels and we came up with the following proposal for a safer and cleaner directory structure.
Example:
esmdiagnostics/
├── ncl
│   ├── my_diag
│   │   ├── test_pr.ncl
│   │   ├── test_ta.ncl
│   │   └── test_ta_no2.ncl
│   ├── perfmetrics
│   │   ├── grading_collect.ncl
│   │   ├── grading.ncl
│   │   ├── main.ncl
│   │   ├── taylor_collect.ncl
│   │   └── taylor.ncl
│   └── shared
│       └── aux_perfmetrics.ncl
├── python
│   ├── my_diag
│   │   ├── test_pr.py
│   │   ├── test_ta_no2.py
│   │   └── test_ta.py
│   ├── shared
│   │   ├── esmdiaglib.py
│   │   └── mylib
│   │       └── __init__.py
│   └── some_other_diag
│       ├── test_pr.py
│       ├── test_ta_no2.py
│       └── test_ta.py
└── r
    ├── shared
    └── some_r_diagnostic
        └── xxx.r
The idea is to create a single directory, esmdiagnostics, which contains all diagnostics and related code, organized in one directory per language. Every diagnostic will live in its own directory, and per language there is one directory named shared in which all shared code is placed. Diagnostic developers can then create a file (or a subdirectory containing a collection of related files) in the shared directory if something is shared between multiple diagnostics.
When running a diagnostic from ESMValTool via a namelist, the path of the diagnostic itself will be available in the environment variable ESM_DIAG_PATH. The path of the shared directory for the language that the diagnostic is written in will be available in the environment variable ESM_DIAG_SHARED_PATH, e.g. for an NCL diagnostic ESM_DIAG_SHARED_PATH=
NCL diagnostics
In NCL, load paths starting with ./ can thus be eliminated in favour of paths starting with either "$ESM_DIAG_PATH" or "$ESM_DIAG_SHARED_PATH", to avoid having to set the current working directory to the location of the NCL diagnostic scripts when executing.
Python diagnostics
ESM_DIAG_SHARED_PATH will also be appended to PYTHONPATH, so in the example above it is possible to do import mylib in the Python diagnostic scripts, e.g. in test_pr.py. A library function esmdiaglib.get_diag_path() will be made available that returns the value of ESM_DIAG_PATH, so that diagnostic scripts do not have to read the environment variables directly. Python functions from the esmvaltool module can also be used in Python diagnostics via import esmvaltool once the package is installed.
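A minimal sketch of what such a helper in esmdiaglib.py could look like, assuming it simply reads the environment variable set by ESMValTool. Only get_diag_path() is named in the proposal; get_diag_shared_path() and the error handling are assumptions added for illustration.

```python
import os


def get_diag_path():
    """Return the directory of the running diagnostic (value of ESM_DIAG_PATH)."""
    try:
        return os.environ["ESM_DIAG_PATH"]
    except KeyError:
        # Assumption: fail loudly when the script is not started by ESMValTool.
        raise RuntimeError(
            "ESM_DIAG_PATH is not set; was this script started by ESMValTool?"
        ) from None


def get_diag_shared_path():
    """Hypothetical companion helper: return the value of ESM_DIAG_SHARED_PATH."""
    try:
        return os.environ["ESM_DIAG_SHARED_PATH"]
    except KeyError:
        raise RuntimeError("ESM_DIAG_SHARED_PATH is not set") from None
```

A diagnostic such as test_pr.py could then call esmdiaglib.get_diag_path() to locate files that sit next to it, without touching os.environ itself.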
All diagnostics should be executable scripts that take no command line arguments.
Preferably all paths should be lower case only.
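The conventions above could be implemented on the ESMValTool side roughly as follows. This is only a sketch under stated assumptions: the function names build_diag_env and run_diagnostic are hypothetical, and the layout is assumed to be esmdiagnostics/<language>/<diagnostic>/<script> with a sibling shared directory per language.

```python
import os
import subprocess
from pathlib import Path


def build_diag_env(script, base_env=None):
    """Build the environment for one diagnostic script (hypothetical helper).

    Assumes the layout esmdiagnostics/<language>/<diagnostic>/<script>, so the
    shared directory is a sibling of the diagnostic's own directory.
    """
    script = Path(script).resolve()
    diag_dir = script.parent
    shared_dir = diag_dir.parent / "shared"

    env = dict(os.environ if base_env is None else base_env)
    env["ESM_DIAG_PATH"] = str(diag_dir)
    env["ESM_DIAG_SHARED_PATH"] = str(shared_dir)

    # Append (not prepend) the shared directory to any existing PYTHONPATH.
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = os.pathsep.join(
        part for part in (existing, str(shared_dir)) if part
    )
    return env


def run_diagnostic(script):
    """Run an executable diagnostic script without command line arguments."""
    return subprocess.run([str(script)], env=build_diag_env(script), check=True)
```

Because all information is passed through the environment, the script itself is started with no arguments and the current working directory does not matter.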
Example with more content:
├── ncl
│   ├── my_diag
│   │   ├── test_pr.ncl
│   │   ├── test_ta.ncl
│   │   └── test_ta_no2.ncl
│   ├── perfmetrics
│   │   ├── grading_collect.ncl
│   │   ├── grading.ncl
│   │   ├── main.ncl
│   │   ├── taylor_collect.ncl
│   │   └── taylor.ncl
│   ├── shared
│   │   ├── aux_perfmetrics.ncl
│   │   ├── ensemble.ncl
│   │   ├── latlon.ncl
│   │   ├── legacy_code
│   │   │   ├── ensemble.ncl
│   │   │   └── statistics.ncl
│   │   ├── misc_function.ncl
│   │   ├── regridding.ncl
│   │   ├── rgb
│   │   │   ├── amwg.rgb
│   │   │   ├── eyring_toz.rgb
│   │   │   ├── ipcc-od550aer-delta.rgb
│   │   │   ├── ipcc-od550aer.rgb
│   │   │   ├── ipcc-precip-delta.rgb
│   │   │   ├── ipcc-precip.rgb
│   │   │   ├── ipcc-tas-delta.rgb
│   │   │   ├── ipcc-tas.rgb
│   │   │   ├── qcm3.rgb
│   │   │   ├── rainbow.rgb
│   │   │   └── red-blue.rgb
│   │   ├── scaling.ncl
│   │   ├── set_operators.ncl
│   │   ├── statistics.ncl
│   │   ├── style.ncl
│   │   └── styles
│   │       ├── CCMVal1.style
│   │       ├── CCMVal2.style
│   │       ├── CMIP5.style
│   │       ├── DEFAULT.style
│   │       └── righi15gmd.style
│   └── WAMonsoon
│       ├── 10W10E_1D_basic.ncl
│       ├── 10W10E_3D_basic.ncl
│       ├── autocorr.ncl
│       ├── cmap_difference_pres.rgb
│       ├── cmap_difference.rgb
│       ├── cmap_difference_tas.rgb
│       ├── cmap_difference_theta.rgb
│       ├── cmap_difference_wind.rgb
│       ├── contour_basic.ncl
│       ├── exact_panel_positions_pres.ncl
│       ├── exact_panel_positions_pr-mmday.ncl
│       ├── exact_panel_positions_tas.ncl
│       ├── exact_panel_positions_wind.ncl
│       ├── isv_filtered.ncl
│       ├── precip_IAV.ncl
│       ├── precip_seasonal.ncl
│       └── wind_basic.ncl
├── python
│   ├── my_diag
│   │   ├── test_pr.py
│   │   ├── test_ta_no2.py
│   │   └── test_ta.py
│   ├── shared
│   │   ├── esmdiaglib.py
│   │   └── mylib
│   │       └── __init__.py
│   ├── sm_pr_diag
│   │   ├── global_rain_sm.f90
│   │   ├── sample_events.f90
│   │   ├── sm_pr_diag_nml.py
│   │   └── sm_pr_diag_test.py
│   └── some_other_diag
│       ├── test_pr.py
│       ├── test_ta_no2.py
│       └── test_ta.py
└── r
    ├── shared
    └── some_r_diagnostic
        └── xxx.r
This would be fine if no overlap exists between languages, e.g. an NCL script that runs a Python script to perform part of its calculations. I support this idea.
Another possibility would be the following. This would make it easier for users to find the diagnostic they are looking for, because diagnostics are no longer organized per language, which also allows using multiple languages related to a single diagnostic theme. This solution also addresses the issue raised by @BenMGeo .
esmdiagnostics/
├── my_diag
│   ├── test_pr.py
│   ├── test_ta_no2.py
│   └── test_ta.py
├── my_ncl_diag
│   ├── test_pr.ncl
│   ├── test_ta.ncl
│   └── test_ta_no2.ncl
├── perfmetrics
│   ├── grading_collect.ncl
│   ├── grading.ncl
│   ├── main.ncl
│   ├── taylor_collect.ncl
│   └── taylor.ncl
├── shared
│   ├── ncl
│   │   ├── aux_perfmetrics.ncl
│   │   ├── ensemble.ncl
│   │   ├── latlon.ncl
│   │   ├── legacy_code
│   │   │   ├── ensemble.ncl
│   │   │   └── statistics.ncl
│   │   ├── misc_function.ncl
│   │   ├── plot
│   │   │   ├── aux_plotting.ncl
│   │   │   ├── contour_maps.ncl
│   │   │   ├── contourplot.ncl
│   │   │   ├── GO_panels.ncl
│   │   │   ├── legacy_code
│   │   │   │   └── aux_plotting.ncl
│   │   │   ├── legends.ncl
│   │   │   ├── mjo_level1.ncl
│   │   │   ├── mjo_level2.ncl
│   │   │   ├── monsoon_domain_panels.ncl
│   │   │   ├── monsoon_panels.ncl
│   │   │   ├── portrait_plot.ncl
│   │   │   ├── rgb
│   │   │   │   ├── amwg.rgb
│   │   │   │   ├── eyring_toz.rgb
│   │   │   │   ├── ipcc-od550aer-delta.rgb
│   │   │   │   ├── ipcc-od550aer.rgb
│   │   │   │   ├── ipcc-precip-delta.rgb
│   │   │   │   ├── ipcc-precip.rgb
│   │   │   │   ├── ipcc-tas-delta.rgb
│   │   │   │   ├── ipcc-tas.rgb
│   │   │   │   ├── qcm3.rgb
│   │   │   │   ├── rainbow.rgb
│   │   │   │   └── red-blue.rgb
│   │   │   ├── scatterplot.ncl
│   │   │   ├── styles
│   │   │   │   ├── CCMVal1.style
│   │   │   │   ├── CCMVal2.style
│   │   │   │   ├── CMIP5.style
│   │   │   │   ├── DEFAULT.style
│   │   │   │   └── righi15gmd.style
│   │   │   ├── taylor_diagram_less_hardcoded.ncl
│   │   │   ├── taylor_plot.ncl
│   │   │   ├── vector_scalar_map_polar.ncl
│   │   │   ├── xy_line.ncl
│   │   │   └── zonalmean_profile.ncl
│   │   ├── regridding.ncl
│   │   ├── scaling.ncl
│   │   ├── set_operators.ncl
│   │   ├── statistics.ncl
│   │   └── style.ncl
│   ├── python
│   │   ├── esmdiaglib.py
│   │   └── mylib
│   │       └── __init__.py
│   └── r
├── sm_pr_diag
│   ├── global_rain_sm.f90
│   ├── sample_events.f90
│   ├── sm_pr_diag_nml.py
│   └── sm_pr_diag_test.py
├── some_other_diag
│   ├── test_pr.py
│   ├── test_ta_no2.py
│   └── test_ta.py
├── some_r_diagnostic
│   └── xxx.r
└── WAMonsoon
    ├── 10W10E_1D_basic.ncl
    ├── 10W10E_3D_basic.ncl
    ├── autocorr.ncl
    ├── cmap_difference_pres.rgb
    ├── cmap_difference.rgb
    ├── cmap_difference_tas.rgb
    ├── cmap_difference_theta.rgb
    ├── cmap_difference_wind.rgb
    ├── contour_basic.ncl
    ├── exact_panel_positions_pres.ncl
    ├── exact_panel_positions_pr-mmday.ncl
    ├── exact_panel_positions_tas.ncl
    ├── exact_panel_positions_wind.ncl
    ├── isv_filtered.ncl
    ├── precip_IAV.ncl
    ├── precip_seasonal.ncl
    └── wind_basic.ncl
In principle, a better organization of the diagnostic scripts would be nice. The proposed ideas sound good for the easy cases, but I guess things become a bit more complicated for some other cases. Some code of the SAMonsoon diagnostic, for instance, is also used by the diagnostics DiurnalCycle, Evapotranspiration, Global Ocean, IPCC Ch. 9, and WAMonsoon, which might make sorting the diagnostics in the proposed way difficult if not confusing. I have no better idea yet, but I guess this might need some more fine tuning...
@axel-lauer The case you describe above seems to be covered by the proposed directory structure.
The WAMonsoon code that is shared with other diagnostics could for example go into
esmdiagnostics/shared/ncl/monsoon
or if it is code for plotting it could go into
esmdiagnostics/shared/ncl/plot
This would make it easier to see that this code is shared.
Maybe it would be better to also get rid of the language specific subdirectories in the shared directory and instead create a subdirectory for scripts that are related for some reason, e.g. like this:
esmdiagnostics/shared/monsoon
esmdiagnostics/shared/plot
esmdiagnostics/shared/statistics
This would make it easier to find tools related to a particular task, plus a single set of tools might use multiple languages.
We discussed this internally.
We find the idea good in principle but we are also afraid that it could generate a lot of issues (see above comment by Axel). Since our main focus now is to get the new backend and interface running, we would propose to postpone it.
Once the backend is finalized, we plan to have a coding workshop where we will focus on testing all namelists and diagnostics in the new version. That could be the opportunity to discuss this issue again, but for the moment we should focus on the other issues, which are more urgent.
@ESMValGroup/esmvaltool-coreteam
I fully second this.
Ben
I think this issue needs to be addressed before starting to port the diagnostics from version 1.x to version 2 of esmvaltool, as it will be much harder to change this afterwards.
As discussed above, we should avoid overly complicated nesting.
My suggestion would be to simply sort diagnostics by language (as in plots_scripts):
diag_scripts\ncl\*.ncl
diag_scripts\ncl\lib\*.ncl
diag_scripts\python\*.py
diag_scripts\python\lib\*.py
diag_scripts\R\*.r
diag_scripts\R\lib\*.r
Since we are porting the diag scripts to v2 anyway, we might also think about introducing a naming convention for the scripts (e.g., avoid upper-case, see #12).
@axel-lauer
One of the concerns raised at the sprint at KNMI last week was that we may end up writing the same functionality in many different languages (Python, NCL, R, ..). Therefore I would recommend against creating language specific directories, as this makes it harder for diagnostic developers to find functionality that is already implemented.
With the new namelist format, it is quite easy to use functionality implemented in a different language: a single diagnostic section in the namelist can use diagnostic scripts in multiple languages, the ancestors keyword can be used to run multiple diagnostic scripts one after the other, while the output of one or more scripts is used as the input of the next script. See e.g. the perfmetrics namelist for an example of this.
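The ordering implied by ancestors can be illustrated with a small sketch. This is not ESMValTool's implementation: the dictionary below is a hypothetical stand-in for the namelist content, the script labels are made up, and the standard-library graphlib module (Python 3.9+) is used here only to show the idea that every script runs after the scripts whose output it consumes.

```python
from graphlib import TopologicalSorter


def execution_order(ancestors):
    """Return script names ordered so that each script follows its ancestors.

    `ancestors` maps a script name to the names of the scripts whose output
    it uses as input (a hypothetical, simplified view of the namelist).
    """
    return list(TopologicalSorter(ancestors).static_order())


# Hypothetical example: scripts in different languages chained together.
ancestors = {
    "compute_metric.ncl": [],
    "collect_metrics.ncl": ["compute_metric.ncl"],
    "plot_summary.py": ["collect_metrics.ncl"],
}

for name in execution_order(ancestors):
    print(name)
```

A cyclic dependency would raise graphlib.CycleError, which is a reasonable place to report a malformed namelist.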
There seemed to be some agreement on the structure I proposed on October 9. I would propose to go for that, but drop the language specific directories. It does impose some grouping of diagnostic scripts, but not much:
diag_scripts/
├── my_diag
│   ├── test_pr.py
│   ├── test_ta_no2.py
│   └── test_ta.py
├── my_ncl_diag
│   ├── test_pr.ncl
│   ├── test_ta.ncl
│   └── test_ta_no2.ncl
├── perfmetrics
│   ├── grading_collect.ncl
│   ├── grading.ncl
│   ├── main.ncl
│   ├── taylor_collect.ncl
│   └── taylor.ncl
├── shared
│   ├── aux_perfmetrics.ncl
│   ├── ensemble.ncl
│   ├── latlon.ncl
│   ├── legacy_code
│   │   ├── ensemble.ncl
│   │   └── statistics.ncl
│   ├── misc_function.ncl
│   ├── plot
│   │   ├── aux_plotting.ncl
│   │   ├── contour_maps.ncl
│   │   ├── contourplot.ncl
│   │   ├── GO_panels.ncl
│   │   ├── legacy_code
│   │   │   └── aux_plotting.ncl
│   │   ├── legends.ncl
│   │   ├── mjo_level1.ncl
│   │   ├── mjo_level2.ncl
│   │   ├── monsoon_domain_panels.ncl
│   │   ├── monsoon_panels.ncl
│   │   ├── portrait_plot.ncl
│   │   ├── rgb
│   │   │   ├── amwg.rgb
│   │   │   ├── eyring_toz.rgb
│   │   │   ├── ipcc-od550aer-delta.rgb
│   │   │   ├── ipcc-od550aer.rgb
│   │   │   ├── ipcc-precip-delta.rgb
│   │   │   ├── ipcc-precip.rgb
│   │   │   ├── ipcc-tas-delta.rgb
│   │   │   ├── ipcc-tas.rgb
│   │   │   ├── qcm3.rgb
│   │   │   ├── rainbow.rgb
│   │   │   └── red-blue.rgb
│   │   ├── scatterplot.ncl
│   │   ├── styles
│   │   │   ├── CCMVal1.style
│   │   │   ├── CCMVal2.style
│   │   │   ├── CMIP5.style
│   │   │   ├── DEFAULT.style
│   │   │   └── righi15gmd.style
│   │   ├── taylor_diagram_less_hardcoded.ncl
│   │   ├── taylor_plot.ncl
│   │   ├── vector_scalar_map_polar.ncl
│   │   ├── xy_line.ncl
│   │   └── zonalmean_profile.ncl
│   ├── regridding.ncl
│   ├── scaling.ncl
│   ├── set_operators.ncl
│   ├── statistics.ncl
│   ├── style.ncl
│   ├── esmdiaglib.py
│   └── mylib
│       └── __init__.py
├── sm_pr_diag
│   ├── global_rain_sm.f90
│   ├── sample_events.f90
│   ├── sm_pr_diag_nml.py
│   └── sm_pr_diag_test.py
├── some_other_diag
│   ├── test_pr.py
│   ├── test_ta_no2.py
│   └── test_ta.py
├── some_r_diagnostic
│   └── xxx.r
└── WAMonsoon
    ├── 10W10E_1D_basic.ncl
    ├── 10W10E_3D_basic.ncl
    ├── autocorr.ncl
    ├── cmap_difference_pres.rgb
    ├── cmap_difference.rgb
    ├── cmap_difference_tas.rgb
    ├── cmap_difference_theta.rgb
    ├── cmap_difference_wind.rgb
    ├── contour_basic.ncl
    ├── exact_panel_positions_pres.ncl
    ├── exact_panel_positions_pr-mmday.ncl
    ├── exact_panel_positions_tas.ncl
    ├── exact_panel_positions_wind.ncl
    ├── isv_filtered.ncl
    ├── precip_IAV.ncl
    ├── precip_seasonal.ncl
    └── wind_basic.ncl
Basically this structure is as simple as this: if all required functionality for a diagnostic is implemented in a single file, create a file; if it is implemented in multiple files, put them together in a directory. If functionality in a diagnostic script is shared across multiple diagnostic scripts, move those functions to the shared directory, again either as a single file or as a set of files in a directory. This makes it easy to discover available functionality and helps to keep track of what is shared by multiple diagnostics. It also helps to discover what is already implemented, as the shared code is much more likely to be relevant for new diagnostic development than the code in other diagnostic scripts.
I had a discussion with @nielsdrost and we both thought that shared was a slightly clearer name than lib for the location where code shared between multiple diagnostics is stored, but if this is an issue for you we can keep lib of course.
Let me discuss this again with @axel-lauer et al. and then take a decision.
As you said above, we need to come up with a structure before we start including other diags.
I am fine with the last proposed option including no language specific folders and moving code from "lib" and "aux" to "shared".
Working on this issue here.