Data.table: [Regression] fread can no longer read all files from the extended test suite

Created on 6 Aug 2017 · 1 Comment · Source: Rdatatable/data.table

Testing the latest fread.c against the "large test suite" of files shows problems with the following files:

fread("h2o-3/smalldata/jira/pubdev_2455.csv")
  Internal error: Last field of last field should select quote rule 2

fread("h2o-3/smalldata/jira/pubdev_2336.csv")
  Internal error: Last field of last field should select quote rule 2

fread("h2o-3/smalldata/glm_test/prostate_cat_train.csv")
  Line 290 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<380       0  69   R2     b     a   1.90 20.70       >>

fread("h2o-3/smalldata/glm_test/prostate_cat_test.csv")
  Line 90 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<378       1  76   R2     b     a   5.5 53.9       >>

fread("h2o-3/smalldata/glm_test/abcd.csv")
  Line 4 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 6 fields but found 5: <<1 1 0 1 1 >>

All of these files were read without errors by the previous version of fread.c.
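For comparing two builds, a batch check along the following lines may be handy. This is only a sketch: `check_files` and its `reader` argument are hypothetical names introduced here, not part of data.table, and the paths assume an h2o-3 checkout relative to the working directory.

```r
# Hypothetical helper (not part of data.table): try to read each file and
# record the error message, if any. The result is a named character vector
# in which NA means the file was read without an error.
check_files <- function(paths, reader = data.table::fread) {
  vapply(paths, function(p) {
    tryCatch({
      reader(p)          # the parsed result is discarded; only errors matter
      NA_character_
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Paths as listed in the report above; adjust to the local checkout.
files <- c(
  "h2o-3/smalldata/jira/pubdev_2455.csv",
  "h2o-3/smalldata/jira/pubdev_2336.csv",
  "h2o-3/smalldata/glm_test/prostate_cat_train.csv",
  "h2o-3/smalldata/glm_test/prostate_cat_test.csv",
  "h2o-3/smalldata/glm_test/abcd.csv"
)

results <- check_files(files)
print(results[!is.na(results)])  # only the files that failed to read
```

Running this once against each build and comparing the two `results` vectors should show which files regressed. Note that a missing file is also reported as an error, so the paths need to resolve before the output means anything.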

bug dev fread

Most helpful comment

Here's a minified test case for these:

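require(data.table)
# Space-separated file with no newline after the last line
cat("A  B  C\n1  2  3\n4  5  6", file=f<-tempfile())
data.table:::test(9999.1, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,6L)))

# Comma-separated file whose last field is empty and has no trailing newline
cat("A,B,C\n1,2,3\n4,5,", file=f<-tempfile())
data.table:::test(9999.2, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,NA)))

# Quoted fields mixed with empty ones; reading from the file should match reading the same text directly
t = '"b","bc8d5",\n"c",,"2f685"\n"d",,\n,"cdfb9",\n'
cat(t, file=f<-tempfile())
data.table:::test(9999.3, fread(f), fread(t))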
