I am trying to read a gzipped CSV file with fread. According to the docs, this should work with newer versions of data.table, but it does not work for me.
Minimal reproducible example to create the file:
library(data.table)
library(R.utils)
mat = matrix(rnorm(10), 10, 10)
fwrite(as.data.table(mat), file = "test.csv")
gzip("test.csv")
Reading the file (command and output):
> fr_tbl = fread(file = "test.csv.gz")
Warning message:
In fread(file = "test.csv.gz") :
Stopped early on line 2. Expected 1 fields but found 1. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<>>
> dim(fr_tbl)
[1] 0 1
> fr_tbl = fread(cmd = "gunzip -c test.csv.gz")
> dim(fr_tbl)
[1] 10 10
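For comparison, base R can read the same kind of file through a gzfile() connection, which is another way to confirm that the compressed file itself is intact. This is only a minimal sketch: it writes its own temporary file (the tempdir() path is an assumption, not the file from the session above) and uses read.csv rather than fread, so it says nothing about fread's own gz handling.

```r
# Write a small CSV through a gzip connection, then read it back with base R.
path <- file.path(tempdir(), "test.csv.gz")   # temporary path, for illustration only
mat <- matrix(rnorm(10), 10, 10)              # 10 values recycled into a 10 x 10 matrix

con <- gzfile(path, open = "wt")
write.csv(as.data.frame(mat), con, row.names = FALSE)
close(con)

# gzfile() decompresses transparently, so no external gunzip call is needed.
df <- read.csv(gzfile(path))
dim(df)
# [1] 10 10
```

If this reads 10 rows and 10 columns while fread(file = ...) returns an empty table, the problem is in how fread handles the .gz file rather than in the file.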
Output of sessionInfo():
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /local/apps/R/3.4.1-cairo/lib64/R/lib/libRblas.so
LAPACK: /local/apps/R/3.4.1-cairo/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] R.utils_2.7.0 R.oo_1.21.0 R.methodsS3_1.7.1 data.table_1.11.8
loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1
> fr_tbl = fread(file = "test.csv.gz")
> fr_tbl
V1 V2 V3 V4 V5
<num> <num> <num> <num> <num>
1: 0.4153061951 0.4153061951 0.4153061951 0.4153061951 0.4153061951
...
I am not able to reproduce this on the latest development version. Please re-open if you can.
BTW, thanks for the very good reproducible example!
I just tried the development version and it works.
Is there an estimate of when the development version will make it to CRAN?
@igordot I think it is a matter of days, @mattdowle already completed revdeps checks.
If you are asking because you need package binaries, be aware that we publish devel binaries for Windows on every commit. You can install them with the command below; no Rtools needed.
install.packages("data.table", type="win.binary", repos="https://rdatatable.gitlab.io/data.table")
I am just more comfortable with CRAN versions, and they make sharing code easier (since you can specify a minimum version).
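As an illustration of what specifying a minimum version can look like in shared code, here is a minimal sketch. The helper name require_min_version is hypothetical and not part of data.table or base R; it relies only on utils::packageVersion. The version number to require for the gz fix is not stated in this thread, so any value passed for data.table would be an assumption to check against the release notes.

```r
# Hypothetical helper: stop early if an installed package is older than required.
require_min_version <- function(pkg, min_version) {
  installed <- utils::packageVersion(pkg)
  # package_version objects can be compared directly against a version string.
  if (installed < min_version) {
    stop(sprintf("%s >= %s is required, but %s is installed",
                 pkg, min_version, as.character(installed)),
         call. = FALSE)
  }
  invisible(installed)
}
```

A script that depends on reading .gz files directly could call require_min_version("data.table", "<first CRAN version with the fix>") at the top, so that collaborators on an older version get a clear error instead of an empty table.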