Hello there!
Consider this wonderful reprex:
library(tibble)
#> Warning: package 'tibble' was built under R version 3.4.4
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.4.4
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(nanotime)
#> Warning: package 'nanotime' was built under R version 3.4.4
df <- tibble(
  mytimestamp = c(
    nanotime('2011-12-05 08:30:00.000', format = "%Y-%m-%d %H:%M:%E9S", tz = "GMT"),
    nanotime('2011-12-05 08:30:00.100', format = "%Y-%m-%d %H:%M:%E9S", tz = "GMT"),
    nanotime('2011-12-05 08:30:00.825', format = "%Y-%m-%d %H:%M:%E9S", tz = "GMT")
  ),
  var = c(1, 1, 2)
)
df
#> # A tibble: 3 x 2
#> mytimestamp var
#> <nanotime> <dbl>
#> 1 2011-12-05T08:30:00.000000000+00:00 1
#> 2 2011-12-05T08:30:00.100000000+00:00 1
#> 3 2011-12-05T08:30:00.825000000+00:00 2
df %>% group_by(var) %>% summarize_all(., last)
#> # A tibble: 2 x 2
#> var mytimestamp
#> <dbl> <integr64>
#> 1 1 1323073800100000000
#> 2 2 1323073800825000000
Created on 2019-05-21 by the reprex package (v0.2.1)
As you can see, a simple aggregation on a nanotime column strips the column of its nanotime class: the result is a bare integer64 instead of a formatted timestamp.
Is this related to how tibble manages S4 classes? Is this a bug?
Thanks!!
I realize this might be related to dplyr instead. Posting there. Apologies for cross-posting if this is not relevant for tibble as well.
Thanks!
@romainfrancois this is just incredible. As I was about to post on dplyr, you transferred the issue here. What an amazing coincidence.
@romainfrancois interestingly, different types of aggregation lead to different outputs. Please let me know if I can help or test in any way, as the nanotime format is vital for many data processing tasks where the timestamp has to be extremely accurate. I do not want to move my workflow to data.table.
Thanks!!!
> df %>% mutate_all(., max)
# A tibble: 3 x 2
mytimestamp var
<S4: nanotime> <dbl>
1 2011-12-05T08:30:00.825000000+00:00 2
2 2011-12-05T08:30:00.825000000+00:00 2
3 2011-12-05T08:30:00.825000000+00:00 2
> df %>% mutate_all(., mean)
# A tibble: 3 x 2
mytimestamp var
<S3: integer64> <dbl>
1 1323073800308333333 1.333333333333
2 1323073800308333333 1.333333333333
3 1323073800308333333 1.333333333333
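The problem is that there is no [[ method for nanotime objects, so extracting a single element with [[ returns a bare integer64, while [ keeps the nanotime class: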
library(nanotime)
times <- c(nanotime('2011-12-05 08:30:00.000',format ="%Y-%m-%d %H:%M:%E9S", tz ="GMT"),
nanotime('2011-12-05 08:30:00.100',format ="%Y-%m-%d %H:%M:%E9S", tz ="GMT"),
nanotime('2011-12-05 08:30:00.825',format ="%Y-%m-%d %H:%M:%E9S", tz ="GMT"))
times[1]
#> [1] "2011-12-05T08:30:00.000000000+00:00"
times[[1]]
#> integer64
#> [1] 1323073800000000000
I filed an issue in the nanotime repo: https://github.com/eddelbuettel/nanotime/issues/44
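For readers who want to see the mechanism in isolation, here is a minimal sketch that does not use nanotime at all. The class `stamp` is a hypothetical stand-in (an S4 class wrapping a numeric vector), and the methods below are only an illustration of the general pattern, not the code nanotime actually uses or the fix that was adopted there.

```r
library(methods)

# Hypothetical toy class standing in for nanotime:
# an S4 class whose data part is a plain numeric vector.
setClass("stamp", contains = "numeric")

# Give the class a `[` method that re-wraps the subset in the class.
setMethod("[", "stamp", function(x, i, j, ..., drop = TRUE) {
  new("stamp", x@.Data[i])
})

x <- new("stamp", c(1, 2, 3))

# `[` has a method, so the class survives single-bracket subsetting.
keeps_class <- is(x[1], "stamp")

# `[[` has no method yet, so the internal default runs and
# returns a bare numeric without the class.
drops_class <- !is(x[[1]], "stamp")

# Adding a `[[` method (again, an assumption about what a fix
# could look like) makes element extraction keep the class.
setMethod("[[", "stamp", function(x, i, j, ...) {
  new("stamp", x@.Data[[i]])
})

fixed <- is(x[[1]], "stamp")
```

Any code that extracts elements with [[ internally would lose the class in the same way until such a method exists, which would be consistent with the summarize_all(., last) result above.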