This issue is about porting the namelist_landcover diagnostic to version 2. The new branch is available here: https://github.com/ESMValGroup/ESMValTool/tree/version2_landcover .
@BenMGeo I think you were also involved in developing the original landcover diagnostic, right? For this port, I skipped all the infrastructure related to Alex's classes (as this is now done by the pre-processing) and only included the diagnostic which was implemented by Stefan Hagemann (and is included in the draft of the esmvaltool2 paper). I am aware that there are many more diagnostics included in the original version. Do you have any plans/interest to add them to this diagnostic as well? I would not do this myself, as they were just passive and not callable by the namelist.
I'm interested, but I do not have a schedule this would fit in yet. Can you provide any deadlines, so I can adjust to them?
I basically finished porting Stefans plots and am currently writing some documentation + cleaning up the code. However, I am not aware of any reason to rush. Thus, we can wait with the merge request until it fits your schedule. I'd propose to get this finished until end of January, but this is just to have it off my desk. If you would need more time, I am fine with it.
You could also follow your timeline and merge it in time. And I will add to it and extend this at a given point. If you think this is extra effort, I will try to follow your schedule, though I will be off for 4 weeks (harmless surgery).
Ok, let's do it like that. I think it's no problem to add to the diagnostic later on. Best of luck for your surgery!
@bouweandela I'd like to merge this soon, but I have an issue with the land table: I use the variable baresoilFrac for the analysis and according to definition it should have the dimensions
longitude latitude time typebare (see esmvaltool/cmor/tables/cmip5/Tables/CMIP5_Lmon). However, all the experiments I downloaded from ESGF have only the dimensions longitude latitude time for baresoilFrac. I could work around this by simply changing the definition in the CMIP5_Lmon file, but I guess this is the official definition and should not be changed. How can we take care of this? Would this be a model fix? And if yes, could you help me implementing it?
@jvegasbsc Can you comment on this? I think a cmor metadata fix could be the right solution.
I did the fix already in the latest commit and commented on it in #770 . Sorry, should have posted it here es well. I added a model fix for the ESA-CCI data itself and the inmcm4 model which I noted had the issue as well. I did not check the other model though.
OBS data should not be corrected via fix files, but via the corresponding reformat script (v1) or cmorizer (v2, see #232).
I can try to fix the reformat script, generate a revised reformatted dataset for ESACCI and send it to you for testing. If it works, you can remove the fix file before merging #770.
Sounds great! Let's try it like that.
You can donwload the files here, the link will be active for 7 days.
version key has to be changed in your recipe (you used 1, I used a more detailed description)basesoilFrac. I guess you need the cmorized ESACCI data for them as well, right?Great, thanks for that! From a technical point, I don't assume it is necessary to cmorize the other ESACCI landcover variables as well. They don't include the the typebare dimension and did run fine with my recipe. It might make sense in terms of consistency, though..?
Is there a specific reason you provided the file in native resolution? Originally, we decided to use 0.5 deg as a compromise between the vastly different resolution of the LC observations and the models. Thus, I assume it would produce different results due to the larger amount of missing values which are reduced by the interpolation.
I tried to apply your cmorization script to a 0.5 deg version of the file (generated with the lc tools by adding -PnumRows=360 to the aggregate_map parameters, but run into issues. After adding RAWOBS to my config file and calling
./esmvaltool/utils/cmorize_obs.py -c esmvaltool/config-m214103.yml -o ESACCI-LANDCOVER
it just complained about
import: unable to grab mouse ': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9199.
from: can't read /var/mail/.._task
from: can't read /var/mail/.._config
./esmvaltool/utils/cmorize_obs.py: line 25: syntax error near unexpected token '('
./esmvaltool/utils/cmorize_obs.py: line 25: logger = logging.getLogger(__name__)'
Could you point me to what I was doing wrong (I am not even sure I got the call right), or alternatively provide me with the 0.5 deg version of the file you uploaded?
Great, thanks for that! From a technical point, I don't assume it is necessary to cmorize the other ESACCI landcover variables as well. They don't include the the typebare dimension and did run fine with my recipe. It might make sense in terms of consistency, though..?
Yes, and I also want the CMORization script to be able to generate all the variables needed in your recipe, in case other users require them.
Which data are you using at the moment for these other Lmon variables?
Could you point me to what I was doing wrong (I am not even sure I got the call right), or alternatively provide me with the 0.5 deg version of the file you uploaded?
That's a problem of the cmorize tool, not of the script, apparently. Did you run python setup.py install in the version2_reformat_obs_workflow branch?
But I can also regenerate the data, of course.
Did you run python setup.py install in the version2_reformat_obs_workflow branch?
Nope, I use the conda env update and pip install -e .[devlope] from the readme. You way is for sure more convenient. Does this do the same?
Still, I get the error message. I just noted a No such file or directory where I don't know what it relates to (I checked my RAWOBS Path and the one to the config file, both are fine). Thus, I will probably wait for your files. Thanks for preparing them!
Which data are you using at the moment for these other Lmon variables?
Variable | ESA-CCI Landcover Classes
-----------|-------------------------------------
baresoilFrac | Bare_Soil
cropFrac | Managed_Grass
grassFrac | Natural_Grass |
shrubFrac | Shrub_Broadleaf_Deciduous + Shrub_Broadleaf_Evergreen + Shrub_Needleleaf_Evergreen
treeFrac | Tree_Broadleaf_Deciduous + Tree_Broadleaf_Evergreen + Tree_Needleleaf_Deciduous + Tree_Needleleaf_Evergreen
Does this do the same?
It installs it in development mode, that's even better.
Thanks for the table, I will generate the data, stay tuned :radio:
Just to make sure, there are four shrub variable in the input data, but in your table only three of them are added up (Shrub_Needleleaf_Deciduous is not included). Is this done on purpose?
Yes, this is on purpose. The fourth shrub type 'Shrub_Needleleaf_Deciduous' contains only missing values. And I have to admit, I'd have to think a bit on that. Currently, I have no clue what plant this would be anyway.
Thanks! I run the file with the landcover recipe. They work for all but baresoilFrac as the typebare is missing. Looking at your script, there is a small typo
https://github.com/ESMValGroup/ESMValTool/blob/4fedf2a94d7b4167e7d716ac8e33999bdf547ba4/esmvaltool/utils/cmorizers/obs/cmorize_obs_ESACCI-LANDCOVER.ncl#L160-L162
in the variable name (should be baresoilFrac).
The others run through and are almost identical apart from the treeFrac. Seems like your script only considers grid cells which have values for all the variables which are combined for it. Thus, if any one of the grid cell has a missing value, it will have a missing value in the aggregate variable. I think it should be the other way around, right?
Thanks, I will fix the typo.
Concerning missing value, what you say is correct: that's the way NCL handles missing values in arithmetic operations. As a solution, I would set all missing to zero before adding the variable and set back the zeros to missing afterwards. Would that make sense?
Yes. But maybe as a boundary condition that at least one of the fields needs to have a value. Thus, the e.g. the land sea mask is preserved.
Both problems should be fixed now, see here.
treeFrac missing values looks better now. shrubFrac has also slightly changed (according to cdo diff).
Nice! Now all files work work with the recipe and the results are the same using your cmorized files. Thanks for preparing the script! I just pushed the new recipe version with removed obs data fix and the new version tag.
Btw: I noticed when downloading your files the filename is a bit messed up like "filename.ncfilename.nc" I am pretty sure this has nothing to do with your script as the filenames are show correctly on the swiftbrowser. Just wanted to mention it though.
One more question: I use area conservative remapping and thus had to add continuous grid cell boundaries to the esa files. I did this in
https://github.com/ESMValGroup/ESMValTool/blob/ef072c92ec21cdf27443ee697909702214166537/esmvaltool/preprocessor/_regrid.py#L198-L207
Is this to correct place or should this be done as part of the cmorizing? And if the latter, what about models? Would all of them provide continuous grid cell bounds or would it have to be corrected with a fix file?
Good question. I think it is fine to have it in _regrid.py, since the bounds are required only by that module and for a specific scheme. Fixing this at the cmorization level sounds like overkill.
What do you mean "esa files"? If you put it there it will be applied to all datasets satisfying the conditions. It would be need to be tested of course, to make sure it does not affect the functionality of other recipes.
Sorry, that was a bit unclear. I don't remember anymore whether these missing bounds were for the esacci observations (esa files) or a model or both. But as you agree with having this in the _regrid.py, there is no need to think how to best apply it to observations and model data separately.
From the preprocessor perspective there is no difference between models and observations, they are all datasets.