Data.table: fread error reading gzipped files

Created on 10 Jan 2019 · 4 Comments · Source: Rdatatable/data.table

I am trying to read a gzipped CSV file. According to the docs, it should work with newer versions of data.table. It does not for me.

Minimal reproducible example to create the file:

library(data.table)
library(R.utils)
mat = matrix(rnorm(10), 10, 10)
fwrite(as.data.table(mat), file = "test.csv")
gzip("test.csv")

Reading the file (command and output):

> fr_tbl = fread(file = "test.csv.gz")
Warning message:
In fread(file = "test.csv.gz") :
  Stopped early on line 2. Expected 1 fields but found 1. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<>>
> dim(fr_tbl)
[1] 0 1

> fr_tbl = fread(cmd = "gunzip -c test.csv.gz")
> dim(fr_tbl)
[1] 10 10
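
If `gunzip` is not on the PATH (for example on Windows), a similar workaround can be sketched with base R connections only. This is a minimal sketch, not something data.table provides: `gunzip_to_temp` is a hypothetical helper name, and it reads the whole file into memory, so it is only suitable for files of moderate size.

```r
# Hypothetical helper (not part of data.table or R.utils): decompress a
# gzipped text file into a temporary plain-text file using base R only.
gunzip_to_temp <- function(gz_path) {
  stopifnot(file.exists(gz_path))
  out_path <- tempfile(fileext = ".csv")
  con <- gzfile(gz_path, open = "rt")   # gzfile() decompresses while reading
  on.exit(close(con))
  writeLines(readLines(con), out_path)  # loads every line into memory first
  out_path
}

# Usage, assuming test.csv.gz from the example above exists:
# fr_tbl <- data.table::fread(file = gunzip_to_temp("test.csv.gz"))
```

The temporary file lives in the session's temp directory and is removed when the R session ends.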

Output of sessionInfo()

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /local/apps/R/3.4.1-cairo/lib64/R/lib/libRblas.so
LAPACK: /local/apps/R/3.4.1-cairo/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] R.utils_2.7.0     R.oo_1.21.0       R.methodsS3_1.7.1 data.table_1.11.8

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1  

All 4 comments

> fr_tbl = fread(file = "test.csv.gz")
> fr_tbl
               V1            V2            V3            V4            V5
            <num>         <num>         <num>         <num>         <num>
 1:  0.4153061951  0.4153061951  0.4153061951  0.4153061951  0.4153061951
...

I am not able to reproduce this on the latest development version. Please re-open if you can.
BTW, thanks for the very good reproducible example!

I just tried the development version and it works.

Is there an estimate of when the development version will make it to CRAN?

@igordot I think it is a matter of days, @mattdowle already completed revdeps checks.
If you are asking because you need package binaries, be aware that we publish devel binaries for Windows on every commit. You can install them with the following command; Rtools is not needed.

install.packages("data.table", type="win.binary", repos="https://rdatatable.gitlab.io/data.table")

I am just more comfortable with CRAN versions and it's easier for sharing code (since you can specify minimum version).
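
For shared scripts, the minimum-version idea can be sketched as a small guard at the top of the code. `has_min_version` is a hypothetical helper name, and the version string "1.12.0" in the usage comment is an assumption; the thread does not state which CRAN release contains the fix, so replace it with the actual one.

```r
# Hypothetical helper: TRUE if `pkg` is installed and its version is at
# least `min_version`, FALSE otherwise (including when it is not installed).
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= package_version(min_version)
}

# Usage ("1.12.0" is an assumed placeholder, not confirmed in this thread):
# if (!has_min_version("data.table", "1.12.0")) {
#   stop("Please update data.table before reading .gz files with fread(file = ...)")
# }
```

In a package, the same constraint would normally go in the `Imports` field of DESCRIPTION, e.g. `data.table (>= x.y.z)`, instead of a runtime check.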
