Testing the latest fread.c against the "large test suite" of files shows problems with the following files:
fread("h2o-3/smalldata/jira/pubdev_2455.csv")
Internal error: Last field of last field should select quote rule 2
fread("h2o-3/smalldata/jira/pubdev_2336.csv")
Internal error: Last field of last field should select quote rule 2
fread("h2o-3/smalldata/glm_test/prostate_cat_train.csv")
Line 290 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<380 0 69 R2 b a 1.90 20.70 >>
fread("h2o-3/smalldata/glm_test/prostate_cat_test.csv")
Line 90 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<378 1 76 R2 b a 5.5 53.9 >>
fread("h2o-3/smalldata/glm_test/abcd.csv")
Line 4 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 6 fields but found 5: <<1 1 0 1 1 >>
All of these files were read without errors by the previous version of fread.c.
Here's a minimal test case for these:
require(data.table)
cat("A B C\n1 2 3\n4 5 6", file=f<-tempfile())
data.table:::test(9999.1, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,6L)))
cat("A,B,C\n1,2,3\n4,5,", file=f<-tempfile())
data.table:::test(9999.2, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,NA)))
t = '"b","bc8d5",\n"c",,"2f685"\n"d",,\n,"cdfb9",\n'
cat(t, file=f<-tempfile())
data.table:::test(9999.3, fread(f), fread(t))