data.table: When subsetting dt with row indices, dt returns NA rows when a row index is greater than .N

Created on 12 Apr 2020 · 4 Comments · Source: Rdatatable/data.table

Hi,

I have a question: when I subset by row indices and some of the indices are greater than .N, why doesn't data.table throw an error or a warning instead of creating NA rows?

library(data.table)
dt <- data.table(Num = rnorm(10))
dt[1:11]  # 11 row indices requested, but dt has only 10 rows

Output is:
           Num
 1: -0.5613457
 2:  1.7967747
 3:  0.3145488
 4:  1.1318019
 5:  1.0750366
 6: -1.5978017
 7: -1.0894657
 8: -1.2805362
 9: -1.7455068
10:  0.1769249
11:         NA

# Output of sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] data.table_1.12.8

loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3 packrat_0.5.0

Most helpful comment

@PetoLau I think the relevant part of the help text is in ?data.table about the i argument:

integer and logical vectors work the same way they do in [.data.frame

The next obvious question is then: how does out-of-bounds (OOB) indexing work for data.frame? I once tried to answer the question "Why does 'out of bounds' indexing differ between a matrix and a data.frame?", and it seems that OOB indexing for data.frame is not very well documented.

All 4 comments

Hi, thank you for submitting the issue. The behaviour you describe doesn't raise an error or a warning because it is the standard behaviour in R.

data.frame(x=1:2)[1:3,,drop=FALSE]
#    x
#1   1
#2   2
#NA NA
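
For comparison, here is a small base R sketch of how out-of-bounds positive indices behave for an atomic vector, a data.frame, and a matrix. The object names (`x`, `df`, `m`, `oob_df`, `oob_m`) are made up for illustration and do not come from the thread.

```r
# Atomic vector: an out-of-range positive index yields NA.
x <- c(10, 20)
x[3]

# data.frame: an out-of-range row index yields a row of NA,
# which is the behaviour data.table follows for integer i.
df <- data.frame(x = 1:2)
oob_df <- df[1:3, , drop = FALSE]
nrow(oob_df)        # 3 rows, the last one filled with NA

# matrix: the same kind of index is an error instead of NA.
m <- matrix(1:4, nrow = 2)
oob_m <- tryCatch(
  m[3, ],
  error = function(e) conditionMessage(e)  # capture the error message as a string
)
oob_m
```

So the NA row is what `[.data.frame` does, while matrices are stricter and stop with a "subscript out of bounds" error.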

You may be interested in a feature request that gives better control over this, #3109. There is already a PR for it, #4353, so it is likely to be available in the next CRAN release.
Then you just add nomatch=NULL and the extra rows are excluded. I think this addresses your issue well, so I am going to close it; we can always re-open it later if needed.
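
Until a release with that change is installed, one workaround that does not depend on the version is to filter the indices before subsetting. The sketch below assumes positive row numbers only. `strict_rows` is a hypothetical helper, not part of data.table, for callers who would rather get an error than NA rows.

```r
library(data.table)

dt <- data.table(Num = 1:10)
idx <- 1:11                      # one index past nrow(dt)

# Default behaviour: the out-of-range index becomes an NA row.
with_na <- dt[idx]

# Workaround: keep only indices that fall inside 1..nrow(dt).
in_range <- idx[!is.na(idx) & idx >= 1L & idx <= nrow(dt)]
without_na <- dt[in_range]

# Hypothetical strict helper (assumes positive row numbers):
# stop with an error instead of silently returning NA rows.
strict_rows <- function(x, i) {
  bad <- is.na(i) | i < 1L | i > nrow(x)
  if (any(bad)) {
    stop("row index out of bounds: ", paste(i[bad], collapse = ", "))
  }
  x[i]
}
```

With these definitions, `strict_rows(dt, 1:3)` returns the first three rows, while `strict_rows(dt, idx)` stops with an error that names the offending index.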

Thanks for the fast reply, @jangorecki. Yes, I hope options(datatable.nomatch=NULL) will do the trick (can't wait for it) :)

Actually, this option is going to be deprecated in the long term, and we advise against using it. It is only safe if you can be sure that none of your code depends on a package that depends on data.table, including recursive dependencies. It is a global option that affects every package that uses data.table, so it is likely to break packages that expect the default behaviour.
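
To illustrate the difference in scope, here is a minimal sketch that passes nomatch per call rather than setting the global option. It uses a join, where the nomatch argument already applies in the version shown in the session info above. The table and column names are invented for the example.

```r
library(data.table)

dt <- data.table(id = c("a", "b", "c"), val = 1:3)

# Default (nomatch = NA): the unmatched key "z" comes back as an NA row.
default <- dt[.(c("a", "z")), on = "id"]

# Per-call nomatch = NULL: unmatched keys are dropped in this call only,
# so code in other packages keeps seeing the default behaviour.
matched <- dt[.(c("a", "z")), on = "id", nomatch = NULL]
```

Because the argument is local to one call, it cannot change how other packages that use data.table behave, which is the risk described above for the global option.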
