Data.table: when will sep2 in fread be implemented?

Created on 27 May 2015 · 11 Comments · Source: Rdatatable/data.table

I can't wait :grinning:

enhancement fread

All 11 comments

@mattdowle, @arunsrinivasan, do you have any suggestions for implementing this feature? I will try to work on it.

:+1:

Would this feature help read a file like the one below, which uses two different delimiters?

1:87595672-88977120
1:92704584-94054584

For example:

fread("myFile.txt", sep = ":", sep2 = "-")

Or would it be possible to pass multiple delimiters to sep?

fread("myFile.txt", sep=":|-")

I think the idea is that it would work something like this:

1:87595672-88977120
1:92704584-94054584
dt <- fread("myFile.txt", sep = ":", sep2 = "-")
> dt
   V1                V2
1:  1 87595672,88977120
2:  1 92704584,94054584
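Since sep2 is still only a request in this thread, a result of that shape can be approximated in two steps: read with the primary separator only, then split the remaining column in R. This is a minimal sketch, assuming the intended result is a list column whose cells hold the pieces between the secondary separators (that shape is inferred from the printed output above, not confirmed anywhere in the thread). The sample lines are the ones quoted above; `lines` is just an illustrative name.

```r
library(data.table)

# Sample lines copied from the thread.
lines <- c("1:87595672-88977120", "1:92704584-94054584")

# Step 1: split on the primary separator only.
# V2 is read as a character column such as "87595672-88977120".
dt <- fread(text = lines, sep = ":", header = FALSE)

# Step 2: split V2 on the secondary separator and store the pieces
# as a list column (assumed shape of a sep2 result).
# Wrapping the right-hand side in .() makes it unambiguous that the
# whole list is one column.
dt[, V2 := .(lapply(strsplit(V2, "-", fixed = TRUE), as.numeric))]

print(dt)
```

If separate columns are preferred over a list column, `tstrsplit()` from data.table can be used in step 2 instead, for example `dt[, c("start", "end") := tstrsplit(V2, "-", fixed = TRUE, type.convert = TRUE)]` applied while V2 is still a character column (the column names here are made up for illustration).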

@zx8754, no. In your case you can replace the second delimiter in the shell and then use fread as usual. See, for example, this thread.
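The shell approach suggested here might look like the sketch below. It assumes a Unix-like system where `tr` is available, and it writes the two sample lines to a temporary file only to stay self-contained (the temporary file stands in for myFile.txt). The `cmd` argument of fread runs the shell command and reads its output.

```r
library(data.table)

# Write the sample lines from the thread to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1:87595672-88977120", "1:92704584-94054584"), tmp)

# Translate every "-" into ":" in the shell, so the file has a single
# delimiter, then let fread parse the command's output as usual.
dt <- fread(cmd = paste("tr '-' ':' <", shQuote(tmp)),
            sep = ":", header = FALSE)

print(dt)  # three columns: V1, V2, V3

unlink(tmp)
```

This gives three ordinary columns rather than a nested one, which is enough when every line has the same number of fields. It would also rewrite any "-" that is part of the data (negative numbers, dates), so it is only safe when the second delimiter never appears inside a value.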

Hello! Just wondering if there is an update on sep2. Is it currently available? I would love to use it!

Any update on sep2?

Contributions welcome!
To potential contributors generally: I understand that fread.c is one of the hardest pieces of C to work on (for example), especially now that it's multi-threaded. But I would have thought that would make it _more_ attractive to learn C and contribute to, precisely because it is impressive; e.g. for enhancing your CV.
I would prefer to help someone else implement sep2 than do it myself.

@mattdowle it might be worth spending some time writing an internals vignette to help people get oriented on where to start when working on something like fread.

The code is decently commented, but I still find myself overwhelmed trying to get the gist of what's going on / the general order things are run in.

So some vignettes that give new readers the broad strokes of how the internal code works could go a long way 😄

A related example: while writing coverage tests for non-equi joins in #3195, I came across a branch that hits Cnestedid, which has no comments, so it would have taken a lot of time to figure out what it's doing...

This is part of #944

@MichaelChirico it is much easier to answer specific questions about the code than to write something general enough for readers to follow yet low-level enough to be useful.
It is also much easier to experiment with C code when you are equipped with good tools. cc() and dd() are very useful.
