Esmvaltool: Get MIP name from higher level for CMOR checks

Created on 23 Nov 2017 · 15 comments · Source: ESMValGroup/ESMValTool

Hey folks, currently (in preprocess.py) there is a kludgey way to get the MIP value:
if project_name == 'CMIP5':
    table = model['mip']
else:
    table = project_info['MODELS'][0]['mip']

-- tis ugly and inefficient (what if the first model doesn't have a mip key?) -- could Javier have a look and change things so we get the MIP from a higher level position? Cheers, V


valeriupredoi

All 15 comments

Maybe the MIP is not even needed, as the CMOR table should be unique given the variable name.

mattiarighi on 23 Nov 2017

Agreed, it should probably go altogether at some point. @bouweandela is working on centralising the namelist parser; perhaps something to add there.

nielsdrost on 23 Nov 2017

Wait, the MIP is still needed in the namelist to fully define the CMIP5 input (together with the other keys).
It is (maybe) not needed for searching the cmor table of the given variable.

mattiarighi on 23 Nov 2017

👍1

Can't we do a lookup in the CMIP5 tables (which are included in ESMValTool) to determine the MIP given a variable name? Would get rid of the MIP=* trick used when using variables from multiple MIPs?

Regardless, something to take care of while reading in the namelist, I think :-)

nielsdrost on 23 Nov 2017

👍1

Can't we do a lookup in the CMIP5 tables (which are included in ESMValTool) to determine the MIP given a variable name? Would get rid of the MIP=* trick used when using variables from multiple MIPs?

Yes, that's what I was thinking of, but @jvegasbsc should confirm whether it is doable at the preprocessor/cmorization level.

mattiarighi on 23 Nov 2017

I'm sorry to inform you that this is not possible in all cases: there are some variables in more than one table. For example, sithick in CMIP6 is in SIday and SImon.

Anyway, I think it is a good idea to deduce it for all the variables we are going to use.

And, by the way, the mip parameter should be moved from models configuration to variable config, because it is possible to use variable from different MIPs in some diagnostics
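The ambiguity described above can be sketched with a reverse index from variable name to the tables that define it. This is only an illustration: `TOY_TABLES`, `build_variable_index` and `deduce_mip` are hypothetical names, not part of ESMValTool, and the toy data lists only table memberships mentioned in this thread.

```python
from collections import defaultdict

# Toy stand-in for the CMOR tables: table (MIP) name -> variable names.
# Assumption: only memberships mentioned in this thread are listed;
# the real tables contain many more variables.
TOY_TABLES = {
    "SIday": ["sithick"],
    "SImon": ["sithick"],
    "Amon": ["tas", "ta"],
    "day": ["tas", "ta"],
}


def build_variable_index(tables):
    """Map each variable name to the list of tables that define it."""
    index = defaultdict(list)
    for table_name, variables in tables.items():
        for var_name in variables:
            index[var_name].append(table_name)
    return dict(index)


def deduce_mip(var_name, index):
    """Return the table for var_name only if exactly one table defines it.

    Returns None when the variable is unknown or defined in several
    tables, in which case the MIP has to be given explicitly.
    """
    candidates = index.get(var_name, [])
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    index = build_variable_index(TOY_TABLES)
    print(index["sithick"])              # both sea-ice tables
    print(deduce_mip("sithick", index))  # ambiguous, so no deduction
```

With such an index the MIP can be deduced only for variables that appear in exactly one table; for the rest an explicit setting is still needed.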

jvegasbsc on 24 Nov 2017

I'm sorry to inform you that this is not possible in all cases: there are some variables in more than one table. For example, sithick in CMIP6 is in SIday and SImon.

That's tricky. I guess the only difference between the two tables is the time coordinate.
Is there any other way we can deduce it given the namelist settings?
Something like checking all mips in the MODELS section and taking the most frequent entry?
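A minimal sketch of the "most frequent entry" idea, assuming the MODELS section has been parsed into a list of dicts that may or may not carry a `mip` key. The function name `most_frequent_mip` is hypothetical.

```python
from collections import Counter


def most_frequent_mip(models):
    """Return the most common 'mip' value among the models, or None.

    Models without a 'mip' key are ignored. On a tie, Counter.most_common
    returns the value that was encountered first.
    """
    mips = [model["mip"] for model in models if "mip" in model]
    if not mips:
        return None
    return Counter(mips).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical MODELS section, for illustration only.
    models = [
        {"name": "model_a", "mip": "Amon"},
        {"name": "model_b", "mip": "day"},
        {"name": "model_c", "mip": "Amon"},
        {"name": "model_d"},  # no mip key at all
    ]
    print(most_frequent_mip(models))
```

Note that this is a heuristic: it silently picks a table when the models disagree, which is the kind of guess an explicit per-variable setting avoids.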

And, by the way, the mip parameter should be moved from models configuration to variable config, because it is possible to use variable from different MIPs in some diagnostics

That case is already covered by using wildcards in the MODELS section and moving the mip definition to the variable dict (see the yaml concept document). But for the general case I would stick to the current solution.

Let's also see what @axel-lauer thinks about this...

mattiarighi on 24 Nov 2017

I think all variables that are available for a particular mip are included in the corresponding table (e.g. tas is in the tables 3hr, 6hrPlev, 6hrPlevPt, day and Amon), but usually those definitions are (or should be) identical. Anyway, to be on the safe side it might be best to have the mip definition in the variable dict and then simply use the table corresponding to the mip of the variable being processed.

axel-lauer on 24 Nov 2017

I've checked the CMIP5 tables for the variable ta.

This is defined in 7 mips (6hrLev, 6hrPlev, Amon, cf3hr, cfDay, cfMon, day) and unfortunately there are some differences across these definitions, mostly in the additional variable information section (things like dimensions, valid_min, valid_max, etc.), which could be critical for the preprocessor.

So a mip definition is definitely necessary here. @jvegasbsc's suggestion has the advantage that it is clear, safe and flexible (i.e., it allows using the same models list with multiple variables from different mips).

The only concern I have is for those namelists which do not use CMIP models; in that case the mip entry might be confusing.

Other ideas?
Otherwise I would suggest that @jvegasbsc go ahead and implement his suggestion.

mattiarighi on 24 Nov 2017

The only concern I have is for those namelists which do not use CMIP models; in that case the mip entry might be confusing.

This can be confusing only if you are mixing cmorized models (all of them will have the mip, even if they are not part of any CMIP) and raw data models, and I think this is something that only "advanced" users will do.

Anyway, having parameters that only apply to certain types of projects, and managing that kind of thing with ease, is part of the magic of YAML. And if we are thinking of using the tool with non-cmorized models, we will need this kind of thing for most of the models. For example, for Nemo output we will probably need the frequency and the file type (gridT, gridV, icemod, pisces).

jvegasbsc on 24 Nov 2017

👍1

Ok!

Let's move the mip key from the model to the variable dictionary then.
I would suggest waiting for PR159 and continuing from that.
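As a rough sketch of what moving the key could look like on the lookup side (not the code that was eventually merged): prefer the mip on the variable dict and fall back to the model dict. The function name `resolve_mip`, the dict layout, and treating `'*'` as "not set" are assumptions made for illustration.

```python
def resolve_mip(variable, model):
    """Return the mip for a variable, preferring the variable-level key.

    Assumption: a missing key or the wildcard '*' means "not set here",
    so the next source is tried. Raises KeyError if no mip is found.
    """
    for source in (variable, model):
        mip = source.get("mip")
        if mip and mip != "*":
            return mip
    raise KeyError(
        "no mip defined for variable {!r} or model {!r}".format(
            variable.get("name"), model.get("name")
        )
    )


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    print(resolve_mip({"name": "ta", "mip": "Amon"},
                      {"name": "model_a", "mip": "*"}))
    print(resolve_mip({"name": "ta"},
                      {"name": "model_a", "mip": "day"}))
```

Failing loudly when neither dict carries a mip avoids the silent `project_info['MODELS'][0]` fallback from the original workaround.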

mattiarighi on 24 Nov 2017

I've been working on a namelist parser in the REFACTORING_preprocessor branch that already works with this.

bouweandela on 24 Nov 2017

This has been implemented in #172, I think this issue can be closed.

bouweandela on 15 Dec 2017

Has the workaround by @valeriupredoi (see above) also been revised accordingly?

mattiarighi on 15 Dec 2017

PR #172 contained new code for running the preprocessor, independent of preprocess.py. I kept the file around for future reference, as there may still be useful information in there, but I think it doesn't need to be updated anymore.

bouweandela on 15 Dec 2017

👍1
