LightGBM: [R-package] CRAN note about example timings > 5 seconds

Created on 9 Apr 2020 · 4 Comments · Source: microsoft/LightGBM

See https://github.com/microsoft/LightGBM/pull/2985#issuecomment-611582338.

I guess the R package has been just barely under this time limit until now:

* checking examples ... NOTE
Examples with CPU or elapsed time > 5s
     user system elapsed
dim 5.426  0.264    5.79

Hopefully we can get back under it by making some of the examples less costly to run or by using \dontrun guards. I would prefer to make the examples simpler, since \dontrun guards mean that we lose testing of the documentation.
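
To make the trade-off concrete, here is a minimal roxygen sketch of the two options. `sum_of_squares()` is a hypothetical function invented for illustration; it is not part of LightGBM.

```r
#' Hypothetical exported function, used only to illustrate the two options
#'
#' @param x numeric vector
#' @return the sum of squares of x
#' @examples
#' # Option 1: keep the example cheap so that R CMD check still runs it
#' sum_of_squares(seq_len(10L))
#'
#' \dontrun{
#' # Option 2: shown in the rendered docs, but not executed by R CMD check,
#' # so a bug in this code would go unnoticed
#' sum_of_squares(seq_len(1e8))
#' }
#' @export
sum_of_squares <- function(x) {
    sum(x^2)
}
```

With option 1 the example doubles as a small test on every check run; with option 2 the code is only ever read, never executed.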

I ran the following on my Mac to get timings for the examples:

Rscript build_r.R --skip-install
R CMD check lightgbm*.tar.gz --as-cran --timings
Rscript -e "
    timing_df <- read.delim('lightgbm.Rcheck/lightgbm-Ex.timings');
    print(timing_df);
    print(paste0('total time (s) ', sum(timing_df[['name']])))
    "

The timings are as shown:

                             name  user system elapsed
dim                         0.695 0.036  0.731      NA
dimnames.lgb.Dataset        0.056 0.007  0.053      NA
getinfo                     0.060 0.002  0.056      NA
lgb.Dataset                 0.049 0.007  0.049      NA
lgb.Dataset.construct       0.043 0.001  0.035      NA
lgb.Dataset.create.valid    0.034 0.002  0.036      NA
lgb.Dataset.save            0.043 0.002  0.036      NA
lgb.Dataset.set.categorical 0.044 0.002  0.037      NA
lgb.Dataset.set.reference   0.034 0.001  0.036      NA
lgb.cv                      0.284 0.084  0.188      NA
lgb.dump                    0.070 0.012  0.059      NA
lgb.get.eval.result         0.066 0.013  0.060      NA
lgb.importance              0.264 0.042  0.214      NA
lgb.interprete              0.951 0.048  0.507      NA
lgb.load                    0.148 0.021  0.079      NA
lgb.model.dt.tree           0.118 0.029  0.116      NA
lgb.plot.importance         0.130 0.028  0.124      NA
lgb.plot.interpretation     0.926 0.042  0.479      NA
lgb.prepare                 0.020 0.001  0.012      NA
lgb.prepare2                0.010 0.001  0.006      NA
lgb.prepare_rules           0.015 0.002  0.019      NA
lgb.prepare_rules2          0.011 0.002  0.014      NA
lgb.save                    0.064 0.014  0.061      NA
lgb.train                   0.064 0.012  0.060      NA
lgb.unloader                0.066 0.013  0.063      NA
predict.lgb.Booster         0.068 0.013  0.060      NA
readRDS.lgb.Booster         0.076 0.013  0.077      NA
saveRDS.lgb.Booster         0.073 0.014  0.067      NA
setinfo                     0.052 0.003  0.046      NA
slice                       0.052 0.002  0.045      NA
[1] "total time (s) 4.586"
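
A note on reading the table above: the header appears shifted by one column, so the values printed under `name` are really user times (they add up to the 4.586 shown) and the `elapsed` column is all NA. This is what `read.delim()` does when the data rows have one more field than the header: it treats the first field as row names. Below is a minimal sketch of one way to read such a file with correctly labelled columns. It assumes each data row ends in a trailing tab, which is inferred from the output and not confirmed. The temp file and its two rows are a stand-in for lightgbm.Rcheck/lightgbm-Ex.timings, and the `extra` column name is made up for illustration.

```r
# Stand-in for 'lightgbm.Rcheck/lightgbm-Ex.timings': a tab-separated file
# whose data rows carry a trailing tab (an assumption, see above)
timings_file <- tempfile(fileext = ".timings")
writeLines(
    c(
        "name\tuser\tsystem\telapsed",
        "dim\t0.695\t0.036\t0.731\t",
        "lgb.interprete\t0.951\t0.048\t0.507\t"
    ),
    con = timings_file
)

# Skip the header line and name the columns explicitly, with a placeholder
# ('extra') for the empty trailing field
timing_df <- read.delim(
    timings_file,
    header = FALSE,
    skip = 1L,
    col.names = c("name", "user", "system", "elapsed", "extra"),
    stringsAsFactors = FALSE
)
timing_df[["extra"]] <- NULL

# Slowest examples first
timing_df <- timing_df[order(timing_df[["elapsed"]], decreasing = TRUE), ]
print(timing_df)
print(paste0("total elapsed time (s) ", sum(timing_df[["elapsed"]])))
```

Summing `elapsed` rather than the shifted `name` column gives the wall-clock figure, which is one of the two quantities the CRAN NOTE checks ("CPU or elapsed time > 5s").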

It looks like the examples for lgb.interprete() and lgb.plot.interpretation() together take almost 2 seconds to run. I'll see if I can speed them up or just \dontrun them.

bug good first issue r-package

jameslamb

All 4 comments

Is this worth it at all? Maybe just count this NOTE as allowed? I think in the future it would be good to have some complete, larger examples to help users understand how LightGBM works. Or would it be better to move as much of the example content as possible to demos?

StrikerRUS on 9 Apr 2020

The best solution would be to write proper vignettes. If we don't have an issue for that, I'll make one.

Demo stuff is tough for users to discover: you literally have to call demo() in an R console. If you need evidence that demos aren't used much in R projects these days, note that pkgdown doesn't render them anywhere: https://lightgbm.readthedocs.io/en/latest/R/reference/

Vignettes are the right place for long-form documentation.

Example, from the future package:

https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html

These vignettes get indexed by search engines, are more expressive because they're written in R markdown (so you can mix in formatting and long-form text), and automatically get put into an 'Articles' section in pkgdown (example: https://uptake.github.io/pkgnet//articles/pkgnet-intro.html).

> Is this worth it at all?

I think it's worthwhile to have one example for every exported object in a package. I think it's powerful to have one copy-pastable example in the documentation from ?<object-name>. It gives users a sense of what the valid values are.

So I am going to create a PR today that speeds up the current examples.

jameslamb on 9 Apr 2020

Oh, I meant vignettes. I remember your comment https://github.com/microsoft/LightGBM/issues/1944#issuecomment-599140770.

> Is this worth it at all?

Will it be possible to keep all future examples under 5s? Or will we have to ignore this NOTE sooner or later anyway?

StrikerRUS on 9 Apr 2020

I think it will be possible to keep them well under 5s. I think I can get the examples down to a bare minimum, and otherwise we can wrap them in \dontrun blocks. The fewer special cases we have to rely on CRAN accepting, the better.

jameslamb on 9 Apr 2020
