Lubridate: duration() interprets 0[units] as 1[units] when the argument is a single string

Created on 13 Jan 2017 · 5 Comments · Source: tidyverse/lubridate

I only tested this for hours and minutes (as you can see below), so I'm not sure how general the issue is.
I'm also not sure if indeed it is an issue or just outside the intended use of the function (or if I'm doing something stupid).

> duration("2d 0H 0M 1s")
[1] "176461s (~2.04 days)"
> as.numeric(duration("2d 0H 0M 1s"))
[1] 176461
> 2*24*3600 + 1
[1] 172801
> 
> duration("2d 1s")
[1] "172801s (~2 days)"
> as.numeric(duration("2d 1s"))
[1] 172801
>
> as.numeric(duration("2d 0H 0M 1s")) - as.numeric(duration("2d 1s")) == 1*3600 + 1*60
[1] TRUE
>
>
> session_info()
Session info ---------------------------------------------------------------
 setting  value                       
 version  R version 3.3.2 (2016-10-31)
 system   x86_64, mingw32             
 ui       RStudio (1.0.136)           
 language (EN)                        
 collate  English_Ireland.1252        
 tz       Europe/Berlin               
 date     2017-01-13                  

Packages -------------------------------------------------------------------
 package   * version date       source        
 devtools  * 1.12.0  2016-06-24 CRAN (R 3.3.2)
 digest      0.6.11  2017-01-03 CRAN (R 3.3.2)
 lubridate * 1.6.0   2016-09-13 CRAN (R 3.3.2)
 magrittr    1.5     2014-11-22 CRAN (R 3.3.2)
 memoise     1.0.0   2016-01-29 CRAN (R 3.3.2)
 stringi     1.1.2   2016-10-01 CRAN (R 3.3.2)
 stringr     1.1.0   2016-08-19 CRAN (R 3.3.2)
 withr       1.0.2   2016-06-20 CRAN (R 3.3.2)
>
bug

All 5 comments

You are right. It's a bug in parsing 0 units. Will check this asap.

Hi Vitalie,

I had some free time on my hands recently so I decided to take a look at the parsing code.

Lines 45 and 46 of src/period.c, inside function parse_period_unit(), read:

out.val = parse_int(c, 100, FALSE);
if(out.val == 0) out.val = 1;

I guess this is to enforce the statement, on line 197 of R/durations.r, that

# Missing numerals default to 1.

i.e. to take care of the case where the numeral is missing but the unit is there (e.g. "1days 3mins" and "days 3mins" will yield the same result).

As noted in a pertinent comment inside function parse_int() (line 50 of src/utils.c)

// maybe: fixme: this returns 0 if no parsing happened and strict = FALSE

it seems that the current way of detecting missing numerals (receiving a zero from parse_int()) does not actually distinguish missing numerals from the numeral zero.
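The ambiguity can be reproduced in isolation. The sketch below is a simplified stand-in, not lubridate's actual code: `toy_parse_int()` and `toy_unit_count()` are hypothetical names, and only the `== 0` check is taken from the lines quoted above.

```c
#include <ctype.h>

/* Hypothetical stand-in for a lenient integer parser; NOT lubridate's
   actual parse_int(). Reads leading decimal digits from *s, advances *s
   past them, and returns their value. Returns 0 when no digit is found. */
static int toy_parse_int(const char **s) {
  int val = 0;
  while (isdigit((unsigned char) **s)) {
    val = val * 10 + (**s - '0');
    (*s)++;
  }
  return val;
}

/* Mirrors the check quoted from parse_period_unit(): a result of 0 is
   treated as "numeral missing" and replaced by 1. */
static int toy_unit_count(const char *s) {
  int val = toy_parse_int(&s);
  if (val == 0) val = 1;  /* cannot tell "H" apart from "0H" */
  return val;
}
```

With this stand-in, `toy_unit_count("H")` and `toy_unit_count("0H")` both give 1, which matches the extra hour and minute seen in the report for "2d 0H 0M 1s".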

If the current (missing numeral) behavior is to be kept, maybe a counter should be implemented to check if at least 1 character has been parsed? You could then make the function return an int different than 0 (e.g. -2) if no parsing has occurred, which would then be caught in parse_period_unit() with

if(out.val == -2) out.val = 1;
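A minimal sketch of that idea, assuming a digit counter and the -2 sentinel suggested above. The names `NO_NUMERAL`, `toy_parse_int_checked()` and `toy_unit_count_checked()` are made up for illustration and do not come from the package; this is not necessarily how the issue was eventually fixed.

```c
#include <ctype.h>

/* Hypothetical sentinel meaning "no digits were parsed". */
#define NO_NUMERAL (-2)

/* Hypothetical variant of a lenient integer parser that counts the digits
   it consumes, so that a missing numeral is distinguishable from "0". */
static int toy_parse_int_checked(const char **s) {
  int val = 0;
  int ndigits = 0;
  while (isdigit((unsigned char) **s)) {
    val = val * 10 + (**s - '0');
    (*s)++;
    ndigits++;
  }
  return ndigits > 0 ? val : NO_NUMERAL;
}

/* Only a genuinely missing numeral defaults to 1; an explicit 0 stays 0. */
static int toy_unit_count_checked(const char *s) {
  int val = toy_parse_int_checked(&s);
  if (val == NO_NUMERAL) val = 1;
  return val;
}
```

Here `toy_unit_count_checked("H")` still defaults to 1, while `toy_unit_count_checked("0H")` yields 0. A negative sentinel is only safe as long as the parser never returns negative values for valid input, which is an assumption of this sketch.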

I'm not familiar enough with the package to judge whether the current missing numeral behavior is the most desirable, but, from a user's perspective, I think I would prefer to be informed if I forgot to pass a numeral next to my unit; it's also more likely (I would think) that, if the input was produced by another function/program/software, a missing numeral really "meant to denote" 0 units and not 1. Or is this not your experience? Like I said, I'm not really sure what the big picture is in this situation.
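For comparison, the stricter alternative (reporting a missing numeral instead of defaulting it) could look roughly like this. `toy_parse_count_strict()` is a hypothetical helper, not part of lubridate, and how the caller reports the failure is left open.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical strict parser: returns true and stores the count in *out
   when s starts with at least one digit; returns false and leaves *out
   untouched when the numeral is missing. */
static bool toy_parse_count_strict(const char *s, int *out) {
  if (!isdigit((unsigned char) *s)) return false;  /* e.g. "H" */
  int val = 0;
  while (isdigit((unsigned char) *s)) {
    val = val * 10 + (*s - '0');
    s++;
  }
  *out = val;
  return true;
}
```

A caller would then turn a `false` result into an error or warning, so that "days 3mins" is flagged rather than silently read as one day.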

Don't know if this is entirely correct or actually useful to you (it was certainly not a very extensive investigation) but I just thought I'd share. I didn't run any code/tests, just followed the function trail.

Best regards,
Dulce

Your identification of the problem is correct. There must be an easy fix but I have been very busy recently.

I think I made the decision to default missing units to 1 because that parsing code is likely to be used in other places, such as rounding or conversion, where the most common case is to specify a single unit of time ("day", "minute" etc). I also didn't think much about 0 units back then. Let me have a fresh look.

I have recently encountered this issue in my work and would appreciate a fix if possible.

This issue was fixed back in February. Please use the dev version or wait for the new CRAN release (which should be there soon, depending on when the CRAN folks find time to process it).
