na.omit does not seem to be working as expected when invert=TRUE.
Here is an example:
library(data.table)
#data.table 1.10.5 IN DEVELOPMENT built 2018-03-02 08:25:10 UTC; travis
#The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
#Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
#Release notes, videos and slides: http://r-datatable.com
iris <- data.table(iris)
na.omit(iris, invert=TRUE)
Output
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1: 5.1 3.5 1.4 0.2 setosa
2: 4.9 3.0 1.4 0.2 setosa
3: 4.7 3.2 1.3 0.2 setosa
4: 4.6 3.1 1.5 0.2 setosa
5: 5.0 3.6 1.4 0.2 setosa
---
146: 6.7 3.0 5.2 2.3 virginica
147: 6.3 2.5 5.0 1.9 virginica
148: 6.5 3.0 5.2 2.0 virginica
149: 6.2 3.4 5.4 2.3 virginica
150: 5.9 3.0 5.1 1.8 virginica
Session Info
sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.10.5
loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1 yaml_2.1.14
It's not clear what behavior you expect, and anyway, the behavior is identical for data.table and base R's data.frames, so I think this is not related to data.table; closing pending further details.
all.equal(na.omit(setDT(copy(iris)), invert=TRUE),
setDT(na.omit(setDF(copy(iris)), invert = TRUE)))
# [1] TRUE
Thanks for the quick response! And I apologize that I was unclear. My expected behavior was an empty data.table, which is what the CRAN version of data.table returns:
Example
library(data.table)
#data.table 1.10.4.3
#The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
#Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
#Release notes, videos and slides: http://r-datatable.com
iris <- data.table(iris)
na.omit(iris, invert=TRUE)
Output
Empty data.table (0 rows) of 5 cols: Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
It seems like this was the behavior for quite a few versions going back.
Is this going to be the result moving forward? If so, it at least seems NEWS.md worthy, as it definitely caught me by surprise.
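For reference, a minimal sketch of what invert=TRUE is meant to do, as I understand the documented behavior of na.omit for data.tables: it keeps the rows that contain at least one NA instead of dropping them. The toy table `dt` and the variable names are made up for illustration, and the row counts in the comments assume a release where invert behaves as on CRAN.

```r
library(data.table)

# Hypothetical toy table: row 1 is complete, rows 2 and 3 each contain an NA.
dt <- data.table(x = c(1, NA, 3), y = c("a", "b", NA))

complete <- na.omit(dt)                 # keeps only rows without any NA
with_na  <- na.omit(dt, invert = TRUE)  # keeps only rows with at least one NA

# iris has no missing values, so inverting should leave no rows at all.
iris_dt <- as.data.table(iris)
inverted_iris <- na.omit(iris_dt, invert = TRUE)

nrow(complete)       # 1 on the CRAN version
nrow(with_na)        # 2 on the CRAN version
nrow(inverted_iris)  # 0 on the CRAN version; 150 would reproduce this report
```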
After some digging, it seems like this behavior changed in the following commit:
https://github.com/Rdatatable/data.table/commit/1fd38629ec81af80c2ff57e475ff2e7f2c55f844
In the commit right before it (below), the behavior matches the CRAN version.
https://github.com/Rdatatable/data.table/commit/e871a4ffbbe3e67cdcf6912c7b24d165cd9ec6ab
Both commits were made on January 12th.
Oh, I see, na.omit.data.frame doesn't have any invert argument, so a difference vis-a-vis base is expected.
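To illustrate that point, here is a small sketch, assuming only base R: the data.frame method takes just `object` and `...`, so invert=TRUE is silently absorbed and has no effect. The toy data.frame `df` is made up for illustration, and `complete.cases` is one possible base way to get the inverted selection.

```r
# The base method's formals are just `object` and `...`;
# there is no `invert` argument to honor.
base_args <- names(formals(stats:::na.omit.data.frame))

# Hypothetical toy data.frame: rows 2 and 3 each contain an NA.
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

# `invert = TRUE` falls into `...`, so this is the same as na.omit(df).
ignored <- na.omit(df, invert = TRUE)

# A base R way to select the rows that do contain an NA.
rows_with_na <- df[!complete.cases(df), ]
```

This is why comparing against a data.frame with all.equal says nothing about whether invert works in data.table.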
I identified the issue and filed PR #2661; it should be integrated soon. Thanks.
Thanks @MichaelChirico