Esmvaltool: Refactor lazy_regrid.py for wflow diagnostics

Created on 12 Feb 2021 · 7 Comments · Source: ESMValGroup/ESMValTool

The recipe wflow.yml fails with a memory error when it processes, for example, 10 years of data. The diagnostic wflow.py uses the regrid function with the area_weighted scheme. The memory error is explained in an open issue, SciTools/iris/issues/3808.
With the new version of iris, the lazy_regrid script would benefit from refactoring. Note that the regrid preprocessor cannot be moved to the recipe because the wflow target grid file is in .map format.
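For readers unfamiliar with the area_weighted scheme, the idea can be sketched in one dimension: each target cell receives the mean of the source cells it overlaps, weighted by the size of the overlap. This is only an illustration under simplifying assumptions (1-D cells, plain Python lists); the function name `area_weighted_regrid_1d` is hypothetical and this is not the code used in iris or in lazy_regrid.py.

```python
def area_weighted_regrid_1d(src_bounds, src_values, tgt_bounds):
    """Regrid 1-D cell values using overlap-weighted averaging.

    src_bounds and tgt_bounds are ascending lists of cell edges
    (n + 1 edges for n cells). Target cells that overlap no source
    cell get None.
    """
    src_cells = list(zip(src_bounds[:-1], src_bounds[1:]))
    result = []
    for t_lo, t_hi in zip(tgt_bounds[:-1], tgt_bounds[1:]):
        weighted_sum = 0.0
        total_overlap = 0.0
        for (s_lo, s_hi), value in zip(src_cells, src_values):
            # Length of the intersection of the source and target cell.
            overlap = min(t_hi, s_hi) - max(t_lo, s_lo)
            if overlap > 0:
                weighted_sum += overlap * value
                total_overlap += overlap
        if total_overlap > 0:
            result.append(weighted_sum / total_overlap)
        else:
            result.append(None)
    return result


# Four unit-width source cells regridded onto two cells of width 2.
coarse = area_weighted_regrid_1d([0, 1, 2, 3, 4], [1.0, 2.0, 3.0, 4.0], [0, 2, 4])
```

In the real 2-D case the weights are cell areas on the latitude/longitude grid, and the weight matrix for a long time series is what makes memory use a concern.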

diagnostic enhancement

All 7 comments

Comparing the performance of the wflow recipe before (wflow_master) and after (wflow_pr) refactoring the lazy_regrid script:

  • Differences in one of the outputs, wflow_ERA-Interim_Meuse_1990_2001.nc, which includes three variables: as shown below, the differences are zero or very small.
  • Cube shapes: they are the same in both versions.
  • File sizes in the work directory: as shown below, the .nc and .XML files are smaller after refactoring.
  • resource_usage.txt: as shown below, running the recipe after refactoring takes about half an hour longer.
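A check like the first two bullets can be sketched as follows. This is a minimal illustration, assuming the values of one variable have already been read from the before and after files into flat sequences of floats; the helper name `summarize_difference` and the tolerance are hypothetical, not part of the recipe.

```python
def summarize_difference(before, after, abs_tol=1e-6):
    """Compare two flat sequences of values from the two runs.

    Raises ValueError if the lengths differ (the analogue of a cube
    shape mismatch); otherwise reports the largest absolute difference.
    """
    if len(before) != len(after):
        raise ValueError("shapes differ: %d vs %d" % (len(before), len(after)))
    max_diff = max((abs(a - b) for a, b in zip(after, before)), default=0.0)
    return {"max_abs_diff": max_diff, "within_tolerance": max_diff <= abs_tol}


identical = summarize_difference([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
changed = summarize_difference([1.0, 2.0], [1.0, 2.5])
```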

Differences in one of the outputs, wflow_ERA-Interim_Meuse_1990_2001.nc:
pr: [image: wflow_diff_pr]
tas: [image: wflow_diff_tas]
pet: [image: wflow_diff_pet]

Cube shapes: [image: wflow_diff_cubes]

File sizes: [image: wflow_diff_file_sizes]

resource_usage.txt: before resource_usage.txt and after resource_usage.txt

@SarahAlidoost Thanks for the nice report! What are your conclusions from this?

Ah, sorry, didn't see the conclusions, but they are there already. It's a bit worrying that there is a difference at all in the runtime and memory use. Do you think the runtimes could vary per run?

Nice comparison @SarahAlidoost ! Half an hour more on 1-1.5 hours in total is quite a big difference I'd say. I'd like to know how this is possible, but that might be out of scope for your current objective, or not?

> Ah, sorry, didn't see the conclusions, but they are there already. It's a bit worrying that there is a difference at all in the runtime and memory use. Do you think the runtimes could vary per run?

With the new commits, performance has improved. The differences in the variable pet are now zero too (see below). It seems that the runtime varies per run: this time it took less than one hour (see the new resource_usage.txt).

Differences in one of the outputs, wflow_ERA-Interim_Meuse_1990_2001.nc:
pet: [image: wflow_diff_pet_2]

> Nice comparison @SarahAlidoost ! Half an hour more on 1-1.5 hours in total is quite a big difference I'd say. I'd like to know how this is possible, but that might be out of scope for your current objective, or not?

Please see my comment here.

> It seems that the runtime varies per run. Now, it took less than one hour

OK, that's good news. Because the implementation I did in iris is almost identical to what was here in ESMValTool, I would expect it to be only slightly more efficient, so it would have been strange if there were big differences in runtime. Maybe the difference in runtime/memory use arises because not all the nodes have the same hardware (assuming you're running on Cartesius: https://userinfo.surfsara.nl/systems/cartesius/usage/batch-usage#heading7) or because other users are also accessing the shared file system.
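One way to tell run-to-run noise from a real slowdown is to repeat a timing several times and look at the spread. The following is a minimal sketch using only the standard library; the helper name `time_runs` is hypothetical, and for a recipe that takes about an hour it would be more practical to time a small, representative piece of the workload than the whole run.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Call func repeatedly and summarize the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(durations),
        # stdev needs at least two samples.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
        "min": min(durations),
        "max": max(durations),
    }


# Stand-in workload; replace with the code section of interest.
stats = time_runs(lambda: sum(range(100_000)), repeats=3)
```

If the difference between the before and after versions is smaller than the spread within each version, it is probably not attributable to the refactoring.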
